c++ - abstract class and alternative to virtual constructor - c++

Let say I have the following code:
class Block{
private:
data Data;
public:
data getData();
Block(arg3 Arg3, arg4 Arg4);
};
Actually, there are several ways to build a block, but always with the same member Data and method getData(), the only difference is how to build the block. In other words, the only difference is the constructor...
Instead of writing a different class for each building process, I could factorize parts of my code, defining and declaring getData in an abstract class, if there were such thing as a virtual constructor in c++ that I could write differently for each derived class corresponding to a different building process.
I do not have a lot experience for this kind of things, so I wondered if there was an alternative to a virtual constructor ? or may be a different way to do this factorization ?
PS: I am aware of https://isocpp.org/wiki/faq/virtual-functions#virtual-ctors but it seems quite complex regarding what I want to do, which seems quite common... I just want to factorize shared code between several classes, which corresponds to everything except the constructor. And I want to force new classes corresponding to other building processes to implement a new constructor.
More details about my particular situation:
I have an algorithm where I use blocks and it does not depend on their building process, so I have implemented the algorithm using a template argument to represent a block indifferently of its building process. But I use a few methods and its constructor, so I need my classes representing blocks to all have the same kind of methods I need and the same constructor to use them as a template argument of my algorithm implementation. That is why I thought of abstract class, to force a newly implemented class representing blocks to have the methods and the constructor I need in the algorithm I implemented. May be it is a bad design pattern and that is why I am stuck...
EDIT
Thank you for your answers so far. I tried to be a little generic but I feel that it is actually too vague, even with the details I gave at the end. So here is what I thought to do: I have a Matrix class as follows
// Matrix.hpp
template<typename GenericBlock> class Matrix{
std::vector<GenericBlock> blocks;
Matrix(arg1 Arg1, arg2 Arg2);
};
template<typename GenericBlock>
Matrix<GenericBlock>::Matrix(arg1 Arg1, arg2 Arg2){
// Do stuff
GenericBlock B(arg3 Arg3, arg4 Arg4);
B.getData();
}
The blocks are actually compressed, and there exists several ways to compress them and it does not change anything in the class Matrix. To avoid writing a matrix class for each compression technics, I used a template argument as you saw. So I just need to write a class for each compression technics, but they must have the same methods and constructor arguments to be compatible with Matrix.
That is why I thought of doing an abstract class, for writing a class for each compression technics. In the abstract class, I would write everything needed in Matrix so that every derived class would be compatible with Matrix. My problem now in my example is: I can define getData in the abstract class because it is always the same (for example, Datacan be the number of rows). The only thing derived classes would really need to define is the constructor.
One solution would be to not have an abstract class and use a protected constructor may be. But it does not force newly derived class to reimplement a constructor. That is why I am stuck. But I think this problem is generic enough to interest other people. So is there an alternative to a virtual constructor in this case ? (may be a factory pattern but it seems quite complex for a such common problem) If not, is there a better way to implement a matrix class whose blocks can be built in different manners, i.e. whose constructor can be different from each other, while having the same data and a few method in common ?
PS: I am interested in compression technics that produce low rank matrices, that is why the data is always the same, but not the building process.

From what you have shared so far it is not clear why you need an abstract class or virtual constructors. A factory function for each way of building a block will do:
class Block {
Data data;
public:
Block(Data d) : data(std::move(d)) {}
Data getData();
};
Block createABlock() { return Block{Data{1.0, 2.0, 3.0}}; }
Block createBBlock() { return Block{Data{42.0, 3.14, 11.6}}; }
int main() {
auto b1 = createABlock();
auto b2 = createBBlock();
}
Live demo.
Perhaps this needs to be extended with an abstract factory so you can pass a generic block factory around:
using BlockFactory = std::function<Block()>;
int main() {
BlockFactory f = createABlock;
auto b3 = f();
}
EDIT:
Regarding your EDIT, what you have suggested works fine. You don't need a virtual constructor. The template type GenericBlock just has to satisfy the implicit interface defined by the template. It doesn't need to derive from a particular base class (although it could do). The only thing required of it, is it must have a constructor that takes a particular set of arguments and a getData method. What you have is compile time static polymorphism, virtual functions are for run time dynamic polymorphism.
Inheritance will work fine but as I said above I'd be tempted to use some sort of factory. You may not need to template the whole Matrix class as only the constructor needs the factory. If the factory is known at compile-time this can be passed as a template parameter:
class Matrix {
std::vector<Block> blocks;
public:
template<typename BlockFactory>
Matrix(BlockFactory f);
};
template<typename BlockFactory>
Matrix::Matrix(BlockFactory f){
// Do stuff...
Block B = f();
auto data = B.getData();
for (auto v : data)
std::cout << v << " ";
std::cout << "\n";
}
int main() {
Matrix ma(createABlock);
Matrix mb(createBBlock);
}
Live demo.

TL:DR, but if the data are the same for all Blocks, you don't even need multiple classes, but only multiple constructors.
class Block
{
enum { type1, type2, type3 };
int type;
data Data;
public:
Block(int x)
: type(type1), Data(x) {}
Block(std::string const& str)
: type(type2), Data(str) {}
Block(data const*x)
: type(type3), Data(data) {}
/* ... */
};

template<class T>struct tag_t{constexpr tag_t(){}; usong type=T;};
template<class T>constexpr tag_t<T> tag{};
this lets you pass types as values.
struct BlockA{};
struct BlockB{};
class Block {
enum BlockType { typeA, typeB };;
BlockType type;
data Data;
public:
Block(tag_t<BlockA>, int x)
: type(typeA), Data(x) {}
Block(tag_t<BlockB>, int x)
: type(typeB), Data(2*x+7) {}
/* ... */
};
the blocks are all the same type. The tag determines how they are constructed.

There is no alternative to a virtual constructor, because there is no virtual constructor to begin with. I know it can be hard to accept, but it is the truth.
Anyhow, you do not need anything like what a virtual constructor would be if it existed....
[..] the only difference is how to build the block. In other words, the
only difference is the constructor...
If the only difference is in the constructor, then simply make the constructor take a parameter that tells what type of block is required. Alternatively you can have some functions that construct blocks in different ways:
struct Block {
private:
Block(){}
friend Block createWoodenBlock();
friend Block createStoneBlock();
};
Block createWoodenBlock(){ return Block(); }
Block createStoneBlock(){ return Block(); }
int main() {
Block woody = createWoodenBlock();
Block stony = createStoneBlock();
}

Related

Storing templated objects in a vector (Storing Class<int>, Class<double> in a single vector)

There is a templated class, let it be
template<typename T> class A { std::vector<T> data; };
The problem I am facing here is, users can create several types of this class, but I need to track them, best case is I have a reference of these objects in another vector, but that would not work since all types are different.
Can you recommend a good design pattern which can encapsulate this.
I can store pointers and then typecast it, but its not elegant.
I can change the architecture as well, if the solution provided is good enough.
The basic question I am trying to solve is, I have a class of vector of custom types, how do I store them.
As previous comments stated - you first need to make sure this is what you need.
With that been said, I had a similar requirement in a project of mine, which I eventually solved with inheritance and PIMPL, as follows:
class A{
private:
struct Abstract {
virtual void f() = 0;
};
template <typename T>
struct Implementation : public Abstract {
std::vector<T> data;
virtual void f() {...}
};
std::unique_ptr<Abstract> impl;
public:
template <typename T>
A(): impl(std::make_unique<Implementation<T> >()){}
void f() {impl->f();}
};
This allows you to create a container of objects of type A, and access them via the public interface defined therein (the method f). The underlying type T of each A object is specified on construction. All other implementation details specific to the type T are hidden.
The solution suffers the inherent overhead of virtual functions. I'm not sure how it compares to the std::any approach performance-wise.
std::any is the modern c++17 solution. Specifically, you should use
A<int> a;
a.data.push_back(0);
// fill refernces...
std::vector<std::any> refernces;
refernces.push_back(&a.data[0]);
// check which type is active.
if(int** iPtr = std::any_cast<int*>(&references[0]); iPtr != nullptr)
{
// its an int*
int& i = **iPtr;
// do something with i.
}
These pointers can point into the A<int>::data and A<double>::data vectors.
For a complete reference, see here https://en.cppreference.com/w/cpp/utility/any.

Simplify an extensible "Perform Operation X on Data Y" framework

tl;dr
My goal is to conditionally provide implementations for abstract virtual methods in an intermediate workhorse template class (depending on template parameters), but to leave them abstract otherwise so that classes derived from the template are reminded by the compiler to implement them if necessary.
I am also grateful for pointers towards better solutions in general.
Long version
I am working on an extensible framework to perform "operations" on "data". One main goal is to allow XML configs to determine program flow, and allow users to extend both allowed data types and operations at a later date, without having to modify framework code.
If either one (operations or data types) is kept fixed architecturally, there are good patterns to deal with the problem. If allowed operations are known ahead of time, use abstract virtual functions in your data types (new data have to implement all required functionality to be usable). If data types are known ahead of time, use the Visitor pattern (where the operation has to define virtual calls for all data types).
Now if both are meant to be extensible, I could not find a well-established solution.
My solution is to declare them independently from one another and then register "operation X for data type Y" via an operation factory. That way, users can add new data types, or implement additional or alternative operations and they can be produced and configured using the same XML framework.
If you create a matrix of (all data types) x (all operations), you end up with a lot of classes. Hence, they should be as minimal as possible, and eliminate trivial boilerplate code as far as possible, and this is where I could use some inspiration and help.
There are many operations that will often be trivial, but might not be in specific cases, such as Clone() and some more (omitted here for "brevity"). My goal is to conditionally provide implementations for abstract virtual methods if appropriate, but to leave them abstract otherwise.
Some solutions I considered
As in example below: provide default implementation for trivial operations. Consequence: Nontrivial operations need to remember to override with their own methods. Can lead to run-time problems if some future developer forgets to do that.
Do NOT provide defaults. Consequence: Nontrivial functions need to be basically copy & pasted for every final derived class. Lots of useless copy&paste code.
Provide an additional template class derived from cOperation base class that implements the boilerplate functions and nothing else (template parameters similar to specific operation workhorse templates). Derived final classes inherit from their concrete operation base class and that template. Consequence: both concreteOperationBase and boilerplateTemplate need to inherit virtually from cOperation. Potentially some run-time overhead, from what I found on SO. Future developers need to let their operations inherit virtually from cOperation.
std::enable_if magic. Didn't get the combination of virtual functions and templates to work.
Here is a (fairly) minimal compilable example of the situation:
//Base class for all operations on all data types. Will be inherited from. A lot. Base class does not define any concrete operation interface, nor does it necessarily know any concrete data types it might be performed on.
class cOperation
{
public:
virtual ~cOperation() {}
virtual std::unique_ptr<cOperation> Clone() const = 0;
virtual bool Serialize() const = 0;
//... more virtual calls that can be either trivial or quite involved ...
protected:
cOperation(const std::string& strOperationID, const std::string& strOperatesOnType)
: m_strOperationID()
, m_strOperatesOnType(strOperatesOnType)
{
//empty
}
private:
std::string m_strOperationID;
std::string m_strOperatesOnType;
};
//Base class for all data types. Will be inherited from. A lot. Does not know any operations that might be performed on it.
struct cDataTypeBase
{
virtual ~cDataTypeBase() {}
};
Now, I'll define an example data type.
//Some concrete data type. Still does not know any operations that might be performed on it.
struct cDataTypeA : public cDataTypeBase
{
static const std::string& GetDataName()
{
static const std::string strMyName = "cDataTypeA";
return strMyName;
}
};
And here is an example operation. It defines a concrete operation interface, but does not know the data types it might be performed on.
//Some concrete operation. Does not know all data types it might be expected to work on.
class cConcreteOperationX : public cOperation
{
public:
virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) = 0;
protected:
cConcreteOperationX(const std::string& strOperatesOnType)
: cOperation("concreteOperationX", strOperatesOnType)
{
//empty
}
};
The following template is meant to be the boilerplate workhorse. It implements as much trivial and repetitive code as possible and is provided alongside the concrete operation base class - concrete data types are still unknown, but are meant to be provided as template parameters.
//ConcreteOperationTemplate: absorb as much common/trivial code as possible, so concrete derived classes can have minimal code for easy addition of more supported data types
template <typename ConcreteDataType, typename DerivedOperationType, bool bHasTrivialCloneAndSerialize = false>
class cConcreteOperationXTemplate : public cConcreteOperationX
{
public:
//Can perform datatype cast here:
virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) override
{
const ConcreteDataType* pCastData = dynamic_cast<const ConcreteDataType*>(&dataBase);
if (pCastData == nullptr)
{
return false;
}
return doSomeConcreteOperationXOnCastData(*pCastData);
}
protected:
cConcreteOperationXTemplate()
: cConcreteOperationX(ConcreteDataType::GetDataName()) //requires ConcreteDataType to have a static method returning something appropriate
{
//empty
}
private:
//Clone can be implemented here via CRTP
virtual std::unique_ptr<cOperation> Clone() const override
{
return std::unique_ptr<cOperation>(new DerivedOperationType(*static_cast<const DerivedOperationType*>(this)));
}
//TODO: Some Magic here to enable trivial serializations, but leave non-trivials abstract
//Problem with current code is that virtual bool Serialize() override will also be overwritten for bHasTrivialCloneAndSerialize == false
virtual bool Serialize() const override
{
return true;
}
virtual bool doSomeConcreteOperationXOnCastData(const ConcreteDataType& castData) = 0;
};
Here are two implementations of the example operation on the example data type. One of them will be registered as the default operation, to be used if the user does not declare anything else in the config, and the other is a potentially much more involved non-default operation that might take many additional parameters into account (these would then have to be serialized in order to be correctly re-instantiated on the next program run). These operations need to know both the operation and the data type they relate to, but could potentially be implemented at a much later time, or in a different software component where the specific combination of operation and data type are required.
//Implementation of operation X on type A. Needs to know both of these, but can be implemented if and when required.
class cConcreteOperationXOnTypeADefault : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeADefault, true>
{
virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
{
//...do stuff...
return true;
}
};
//Different implementation of operation X on type A.
class cConcreteOperationXOnTypeASpecialSauce : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeASpecialSauce/*, false*/>
{
virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
{
//...do stuff...
return true;
}
//Problem: Compiler does not remind me that cConcreteOperationXOnTypeASpecialSauce might need to implement this method
//virtual bool Serialize() override {}
};
int main(int argc, char* argv[])
{
std::map<std::string, std::map<std::string, std::unique_ptr<cOperation>>> mapOpIDAndDataTypeToOperation;
//...fill map, e.g. via XML config / factory method...
const cOperation& requestedOperation = *mapOpIDAndDataTypeToOperation.at("concreteOperationX").at("cDataTypeA");
//...do stuff...
return 0;
}
if you data types are not virtual (for each operation call you know both operation type and data type at compile time) you may consider following approach:
#include<iostream>
#include<string>
template<class T>
void empty(T t){
std::cout<<"warning about missing implementation"<<std::endl;
}
template<class T>
void simple_plus(T){
std::cout<<"simple plus"<<std::endl;
}
void plus_string(std::string){
std::cout<<"plus string"<<std::endl;
}
template<class Data, void Implementation(Data)>
class Operation{
public:
static void exec(Data d){
Implementation(d);
}
};
#define macro_def(OperationName) template<class T> class OperationName : public Operation<T, empty<T>>{};
#define macro_template_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName<TypeName>>{};
#define macro_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName>{};
// this part may be generated on base of .xml file and put into .h file, and then just #include generated.h
macro_def(Plus)
macro_template_inst(int, Plus, simple_plus)
macro_template_inst(double, Plus, simple_plus)
macro_inst(std::string, Plus, plus_string)
int main() {
Plus<int>::exec(2);
Plus<double>::exec(2.5);
Plus<float>::exec(2.5);
Plus<std::string>::exec("abc");
return 0;
}
Minus of this approach is that you'd have to compile project in 2 steps: 1) transform .xml to .h 2) compile project using generated .h file. On plus side compiler/ide (I use qtcreator with mingw) gives warning about unused parameter t in function
void empty(T t)
and stack trace where from it was called.

Should I prefer mixins or function templates to add behavior to a set of unrelated types?

Mixins and function templates are two different ways of providing a behavior to a wide set of types, as long as these types meet some requirements.
For example, let's assume that I want to write some code that allows me to save an object to a file, as long as this object provides a toString member function (this is a rather silly example, but bear with me). A first solution is to write a function template like the following:
template <typename T>
void toFile(T const & obj, std::string const & filename)
{
std::ofstream file(filename);
file << obj.toString() << '\n';
}
...
SomeClass o1;
toFile(o1, "foo.txt");
SomeOtherType o2;
toFile(o2, "bar.txt");
Another solution is to use a mixin, using CRTP:
template <typename Derived>
struct ToFile
{
void toFile(std::string const & filename) const
{
Derived * that = static_cast<Derived const *>(this);
std::ofstream file(filename);
file << that->toString() << '\n';
}
};
struct SomeClass : public ToFile<SomeClass>
{
void toString() const {...}
};
...
SomeClass o1;
o.toFile("foo.txt");
SomeOtherType o2;
o2.toFile("bar.txt");
What are the pros and cons of these two approaches? Is there a favored one, and if so, why?
The first approach is much more flexible, as it can be made to work with any type that provides any way to be converted to a std::string (this can be achieved using traits-classes) without the need to modify that type. Your second approach would always require modification of a type in order to add functionality.
Pro function templates: the coupling is looser. You don't need to derive from anything to get the functionality in a new class; in your example, you only implement the toString method and that's it. You can even use a limited form of duck typing, since the type of toString isn't specified.
Pro mixins: nothing, strictly; your requirement is for something that works with unrelated classes and mixins cause them to be become related.
Edit: Alright, due to the way the C++ type system works, the mixin solution will strictly produce unrelated classes. I'd go with the template function solution, though.
I would like to propose an alternative, often forgotten because it is a mix of duck-typing and interfaces, and very few languages propose this feat (note: very close to Go's take to interfaces actually).
// 1. Ask for a free function to exist:
void toString(std::string& buffer, SomeClass const& sc);
// 2. Create an interface that exposes this function
class ToString {
public:
virtual ~ToString() {}
virtual void toString(std::string& buffer) const = 0;
}; // class ToString
// 3. Create an adapter class (bit of magic)
template <typename T>
class ToStringT final: public ToString {
public:
ToStringT(T const& t): t(t) {}
virtual void toString(std::string& buffer) const override {
toString(buffer, t);
}
private:
T t; // note: for reference you need a reference wrapper
// I won't delve into this right now, suffice to say
// it's feasible and only require one template overload
// of toString.
}; // class ToStringT
// 4. Create an adapter maker
template <typename T>
ToStringT<T> toString(T const& t) { return std::move(ToStringT<T>(t)); }
And now ? Enjoy!
void print(ToString const& ts); // aka: the most important const
int main() {
SomeClass sc;
print(toString(sc));
};
The two stages is a bit heavyweight, however it gives an astonishing degree of functionality:
No hard-wiring data / interface (thanks to duck-typing)
Low-coupling (thanks to abstract classes)
And also easy integration:
You can write an "adapter" for an already existing interface, and migrate from an OO code base to a more agile one
You can write an "interface" for an already existing set of overloads, and migrate from a Generic code base to a more clustered one
Apart from the amount of boiler-plate, it's really amazing how you seamlessly pick advantages from both worlds.
A few thoughts I had while writing this question:
Arguments in favor of template functions:
A function can be overloaded, so third-party and built-in types can be handled.
Arguments in favor of mixins:
Homogeneous syntax: the added behavior is invoked like any other member functions. However, it is well known that the interface of a C++ class includes not only its public member functions but also the free functions that operates on instances of this type, so this is just an aesthetic improvement.
By adding a non-template base class to the mixins, we obtain an interface (in the Java/C# sense) that can be use to handle all objects providing the behavior. For example, if we make ToFile<T> inherits from FileWritable (declaring a pure virtual toFile member function), we can have a collection of FileWritable without having to resort to complicated heterogeneous data structures.
Regarding usage, I'd say that function templates are more idiomatic in C++.

given abstract base class X, how to create another template class D<T> where T is the type of the class deriving from X?

I want to be able to accept a Message& object which references either a Message1 or Message2 class. I want to be able to create a MessageWithData<Message1> or MessageWithData<Message2> based on the underlying type of the Message& object. For example, see below:
class Message {};
class Message1 : public Message {};
class Message2 : public Message {};
template<typename Message1or2>
class MessageWithData : public Message1or2 { public: int x, y; }
class Handler()
{
public:
void process(const Message& message, int x, int y)
{
// create object messageWithData whose type is
// either a MessageWithData<Message1> or a MessageWithData<Message2>
// based on message's type.. how do I do this?
//
messageWithData.dispatch(...)
}
};
The messageWithData class essentially contains methods inherited from Message which allow it to be dynamically double dispatched back to the handler based on its type. My best solution so far has been to keep the data separate from the message type, and pass it all the way through the dynamic dispatch chain, but I was hoping to come closer to the true idiom of dynamic double dispatch wherein the message type contains the variable data.
(The method I'm more or less following is from http://jogear.net/dynamic-double-dispatch-and-templates)
You're trying to mix runtime and compile-time concepts, namely (runtime-)polymorphism and templates. Sorry, but that is not possible.
Templates operate on types at compile time, also called static types. The static type of message is Message, while its dynamic type might be either Message1 or Message2. Templates don't know anything about dynamic types and they can't operate on them. Go with either runtime polymorphism or compile-time polymorphism, sometimes also called static polymorphism.
The runtime polymorphism approach is the visitor pattern, with double dispatch. Here is an example of compile-time polymorphism, using the CRTP idiom:
template<class TDerived>
class Message{};
class Message1 : public Message<Message1>{};
class Message2 : public Message<Message2>{};
template<class TMessage>
class MessageWithData : public TMessage { public: int x, y; };
class Handler{
public:
template<class TMessage>
void process(Message<TMessage> const& m, int x, int y){
MessageWithData<TMessage> mwd;
mwd.x = 42;
mwd.y = 1337;
}
};
You have
void process(const Message& message, int x, int y)
{
// HERE
messageWithData.dispatch(...)
}
At HERE, you want to create either a MessageWithData<Message1> or a MessageWithData<Message2>, depending on whether message is an instance of Message1 or Message1.
But you cannot do that, because the class template MessageWithData<T> needs to know at compile time what T should be, but that type is not available at that point in the code until runtime by dispatching into message.
As has been mentioned, it is not possible to build your template as is.
I do not see any issue with passing additional parameters, though I would perhaps pack them into a single structure, for ease of manipulation.
Certainly I find it more idiomatic to use a supplementary Data parameter, rather than extending a class hierarchy to shoehorn all this into a pattern.
It is an anti-pattern to try to make a design fit a pattern. The proper way is to adapt the pattern so that it fits the design.
That being said...
There are several alternatives to your solution. Inheritance seems weird, but without the whole design at hand it may be your best bet.
It has been mentioned already that you cannot freely mix compile-time and run-time polymorphisms. I usually use Shims to circumvent the issue:
class Message {};
template <typename T> class MessageShim<T>: public Message {};
class Message1: public MessageShim<Message1> {};
The scheme is simple and allow you to benefit from the best of both worlds:
Message being non-template mean that you can apply traditional OO strategies
MessageShim<T> being template mean that you can apply traditional Generic Programming strategies
Once done, you should be able to get what you want, for better or worse.
As Xeo says, you probably shouldn't do this in this particular case - better design alternatives exist. That said, you can do it with RTTI, but it's generally frowned upon because your process() becomes a centralised maintenance point that needs to be updated as new derived classes are added. That's easily overlooked and prone to run-time errors.
If you must persue this for some reason, then at least generalise the facility so a single function uses RTTI-based runtime type determination to invoke arbitrary behaviour, as in:
#include <iostream>
#include <stdexcept>
struct Base
{
virtual ~Base() { }
template <class Op>
void for_rt_type(Op& op);
};
struct Derived1 : Base
{
void f() { std::cout << "Derived1::f()\n"; }
};
struct Derived2 : Base
{
void f() { std::cout << "Derived2::f()\n"; }
};
template <class Op>
void Base::for_rt_type(Op& op)
{
if (Derived1* p = dynamic_cast<Derived1*>(this))
op(p);
else if (Derived2* p = dynamic_cast<Derived2*>(this))
op(p);
else
throw std::runtime_error("unmatched dynamic type");
}
struct Op
{
template <typename T>
void operator()(T* p)
{
p->f();
}
};
int main()
{
Derived1 d1;
Derived2 d2;
Base* p1 = &d1;
Base* p2 = &d2;
Op op;
p1->for_rt_type(op);
p2->for_rt_type(op);
}
In the code above, you can substitute your own Op and have the same runtime-to-compiletime handover take place. It may or may not help to think of this as a factory method in reverse :-}.
As discussed, for_rt_type has to be updated for each derived type: particularly painful if one team "owns" the base class and other teams write derived classes. As with a lot of slightly hacky things, it's more practical and maintainable in support of private implementation rather than as an API feature of a low-level enterprise library. Wanting to use this is still typically a sign of bad design elsewhere, but not always: occasionally there are algorithms (Ops) that benefit enormously:
compile-time optimisations, dead code removal etc.
derived types only need same semantics, but details can vary
e.g. Derived1::value_type is int, Derived2::value_type is double - allows algorithms for each to be efficient and use appropriate rounding etc.. Similarly for different container types where only a shared API is exercised.
you can use template metaprogramming, SFINAE etc. to customise the behaviours in a derived-type specific way
Personally, I think knowledge of and ability to apply this technique (however rarely) is an important part of mastering polymorphism.

How to implement the factory method pattern in C++ correctly

There's this one thing in C++ which has been making me feel uncomfortable for quite a long time, because I honestly don't know how to do it, even though it sounds simple:
How do I implement Factory Method in C++ correctly?
Goal: to make it possible to allow the client to instantiate some object using factory methods instead of the object's constructors, without unacceptable consequences and a performance hit.
By "Factory method pattern", I mean both static factory methods inside an object or methods defined in another class, or global functions. Just generally "the concept of redirecting the normal way of instantiation of class X to anywhere else than the constructor".
Let me skim through some possible answers which I have thought of.
0) Don't make factories, make constructors.
This sounds nice (and indeed often the best solution), but is not a general remedy. First of all, there are cases when object construction is a task complex enough to justify its extraction to another class. But even putting that fact aside, even for simple objects using just constructors often won't do.
The simplest example I know is a 2-D Vector class. So simple, yet tricky. I want to be able to construct it both from both Cartesian and polar coordinates. Obviously, I cannot do:
struct Vec2 {
Vec2(float x, float y);
Vec2(float angle, float magnitude); // not a valid overload!
// ...
};
My natural way of thinking is then:
struct Vec2 {
static Vec2 fromLinear(float x, float y);
static Vec2 fromPolar(float angle, float magnitude);
// ...
};
Which, instead of constructors, leads me to usage of static factory methods... which essentially means that I'm implementing the factory pattern, in some way ("the class becomes its own factory"). This looks nice (and would suit this particular case), but fails in some cases, which I'm going to describe in point 2. Do read on.
another case: trying to overload by two opaque typedefs of some API (such as GUIDs of unrelated domains, or a GUID and a bitfield), types semantically totally different (so - in theory - valid overloads) but which actually turn out to be the same thing - like unsigned ints or void pointers.
1) The Java Way
Java has it simple, as we only have dynamic-allocated objects. Making a factory is as trivial as:
class FooFactory {
public Foo createFooInSomeWay() {
// can be a static method as well,
// if we don't need the factory to provide its own object semantics
// and just serve as a group of methods
return new Foo(some, args);
}
}
In C++, this translates to:
class FooFactory {
public:
Foo* createFooInSomeWay() {
return new Foo(some, args);
}
};
Cool? Often, indeed. But then- this forces the user to only use dynamic allocation. Static allocation is what makes C++ complex, but is also what often makes it powerful. Also, I believe that there exist some targets (keyword: embedded) which don't allow for dynamic allocation. And that doesn't imply that the users of those platforms like to write clean OOP.
Anyway, philosophy aside: In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.
2) Return-by-value
OK, so we know that 1) is cool when we want dynamic allocation. Why won't we add static allocation on top of that?
class FooFactory {
public:
Foo* createFooInSomeWay() {
return new Foo(some, args);
}
Foo createFooInSomeWay() {
return Foo(some, args);
}
};
What? We can't overload by the return type? Oh, of course we can't. So let's change the method names to reflect that. And yes, I've written the invalid code example above just to stress how much I dislike the need to change the method name, for example because we cannot implement a language-agnostic factory design properly now, since we have to change names - and every user of this code will need to remember that difference of the implementation from the specification.
class FooFactory {
public:
Foo* createDynamicFooInSomeWay() {
return new Foo(some, args);
}
Foo createFooObjectInSomeWay() {
return Foo(some, args);
}
};
OK... there we have it. It's ugly, as we need to change the method name. It's imperfect, since we need to write the same code twice. But once done, it works. Right?
Well, usually. But sometimes it does not. When creating Foo, we actually depend on the compiler to do the return value optimisation for us, because the C++ standard is benevolent enough for the compiler vendors not to specify when will the object created in-place and when will it be copied when returning a temporary object by value in C++. So if Foo is expensive to copy, this approach is risky.
And what if Foo is not copiable at all? Well, doh. (Note that in C++17 with guaranteed copy elision, not-being-copiable is no problem anymore for the code above)
Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.
3) Two-phase construction
Another thing that someone would probably come up with is separating the issue of object allocation and its initialisation. This usually results in code like this:
class Foo {
public:
Foo() {
// empty or almost empty
}
// ...
};
class FooFactory {
public:
void createFooInSomeWay(Foo& foo, some, args);
};
void clientCode() {
Foo staticFoo;
auto_ptr<Foo> dynamicFoo = new Foo();
FooFactory factory;
factory.createFooInSomeWay(&staticFoo);
factory.createFooInSomeWay(&dynamicFoo.get());
// ...
}
One may think it works like a charm. The only price we pay for in our code...
Since I've written all of this and left this as the last, I must dislike it too. :) Why?
First of all... I sincerely dislike the concept of two-phase construction and I feel guilty when I use it. If I design my objects with the assertion that "if it exists, it is in valid state", I feel that my code is safer and less error-prone. I like it that way.
Having to drop that convention AND changing the design of my object just for the purpose of making factory of it is.. well, unwieldy.
I know that the above won't convince many people, so let's me give some more solid arguments. Using two-phase construction, you cannot:
initialise const or reference member variables,
pass arguments to base class constructors and member object constructors.
And probably there could be some more drawbacks which I can't think of right now, and I don't even feel particularly obliged to since the above bullet points convince me already.
So: not even close to a good general solution for implementing a factory.
Conclusions:
We want to have a way of object instantiation which would:
allow for uniform instantiation regardless of allocation,
give different, meaningful names to construction methods (thus not relying on by-argument overloading),
not introduce a significant performance hit and, preferably, a significant code bloat hit, especially at client side,
be general, as in: possible to be introduced for any class.
I believe I have proven that the ways I have mentioned don't fulfil those requirements.
Any hints? Please provide me with a solution, I don't want to think that this language won't allow me to properly implement such a trivial concept.
First of all, there are cases when
object construction is a task complex
enough to justify its extraction to
another class.
I believe this point is incorrect. The complexity doesn't really matter. The relevance is what does. If an object can be constructed in one step (not like in the builder pattern), the constructor is the right place to do it. If you really need another class to perform the job, then it should be a helper class that is used from the constructor anyway.
Vec2(float x, float y);
Vec2(float angle, float magnitude); // not a valid overload!
There is an easy workaround for this:
struct Cartesian {
inline Cartesian(float x, float y): x(x), y(y) {}
float x, y;
};
struct Polar {
inline Polar(float angle, float magnitude): angle(angle), magnitude(magnitude) {}
float angle, magnitude;
};
Vec2(const Cartesian &cartesian);
Vec2(const Polar &polar);
The only disadvantage is that it looks a bit verbose:
Vec2 v2(Vec2::Cartesian(3.0f, 4.0f));
But the good thing is that you can immediately see what coordinate type you're using, and at the same time you don't have to worry about copying. If you want copying, and it's expensive (as proven by profiling, of course), you may wish to use something like Qt's shared classes to avoid copying overhead.
As for the allocation type, the main reason to use the factory pattern is usually polymorphism. Constructors can't be virtual, and even if they could, it wouldn't make much sense. When using static or stack allocation, you can't create objects in a polymorphic way because the compiler needs to know the exact size. So it works only with pointers and references. And returning a reference from a factory doesn't work too, because while an object technically can be deleted by reference, it could be rather confusing and bug-prone, see Is the practice of returning a C++ reference variable, evil? for example. So pointers are the only thing that's left, and that includes smart pointers too. In other words, factories are most useful when used with dynamic allocation, so you can do things like this:
class Abstract {
public:
virtual void do() = 0;
};
class Factory {
public:
Abstract *create();
};
Factory f;
Abstract *a = f.create();
a->do();
In other cases, factories just help to solve minor problems like those with overloads you have mentioned. It would be nice if it was possible to use them in a uniform way, but it doesn't hurt much that it is probably impossible.
Simple Factory Example:
// Factory returns object and ownership
// Caller responsible for deletion.
#include <memory>
class FactoryReleaseOwnership{
public:
std::unique_ptr<Foo> createFooInSomeWay(){
return std::unique_ptr<Foo>(new Foo(some, args));
}
};
// Factory retains object ownership
// Thus returning a reference.
#include <boost/ptr_container/ptr_vector.hpp>
class FactoryRetainOwnership{
boost::ptr_vector<Foo> myFoo;
public:
Foo& createFooInSomeWay(){
// Must take care that factory last longer than all references.
// Could make myFoo static so it last as long as the application.
myFoo.push_back(new Foo(some, args));
return myFoo.back();
}
};
Have you thought about not using a factory at all, and instead making nice use of the type system? I can think of two different approaches which do this sort of thing:
Option 1:
struct linear {
linear(float x, float y) : x_(x), y_(y){}
float x_;
float y_;
};
struct polar {
polar(float angle, float magnitude) : angle_(angle), magnitude_(magnitude) {}
float angle_;
float magnitude_;
};
struct Vec2 {
explicit Vec2(const linear &l) { /* ... */ }
explicit Vec2(const polar &p) { /* ... */ }
};
Which lets you write things like:
Vec2 v(linear(1.0, 2.0));
Option 2:
you can use "tags" like the STL does with iterators and such. For example:
struct linear_coord_tag linear_coord {}; // declare type and a global
struct polar_coord_tag polar_coord {};
struct Vec2 {
Vec2(float x, float y, const linear_coord_tag &) { /* ... */ }
Vec2(float angle, float magnitude, const polar_coord_tag &) { /* ... */ }
};
This second approach lets you write code which looks like this:
Vec2 v(1.0, 2.0, linear_coord);
which is also nice and expressive while allowing you to have unique prototypes for each constructor.
You can read a very good solution in: http://www.codeproject.com/Articles/363338/Factory-Pattern-in-Cplusplus
The best solution is on the "comments and discussions", see the "No need for static Create methods".
From this idea, I've done a factory. Note that I'm using Qt, but you can change QMap and QString for std equivalents.
#ifndef FACTORY_H
#define FACTORY_H
#include <QMap>
#include <QString>
template <typename T>
class Factory
{
public:
template <typename TDerived>
void registerType(QString name)
{
static_assert(std::is_base_of<T, TDerived>::value, "Factory::registerType doesn't accept this type because doesn't derive from base class");
_createFuncs[name] = &createFunc<TDerived>;
}
T* create(QString name) {
typename QMap<QString,PCreateFunc>::const_iterator it = _createFuncs.find(name);
if (it != _createFuncs.end()) {
return it.value()();
}
return nullptr;
}
private:
template <typename TDerived>
static T* createFunc()
{
return new TDerived();
}
typedef T* (*PCreateFunc)();
QMap<QString,PCreateFunc> _createFuncs;
};
#endif // FACTORY_H
Sample usage:
Factory<BaseClass> f;
f.registerType<Descendant1>("Descendant1");
f.registerType<Descendant2>("Descendant2");
Descendant1* d1 = static_cast<Descendant1*>(f.create("Descendant1"));
Descendant2* d2 = static_cast<Descendant2*>(f.create("Descendant2"));
BaseClass *b1 = f.create("Descendant1");
BaseClass *b2 = f.create("Descendant2");
I mostly agree with the accepted answer, but there is a C++11 option that has not been covered in existing answers:
Return factory method results by value, and
Provide a cheap move constructor.
Example:
struct sandwich {
// Factory methods.
static sandwich ham();
static sandwich spam();
// Move constructor.
sandwich(sandwich &&);
// etc.
};
Then you can construct objects on the stack:
sandwich mine{sandwich::ham()};
As subobjects of other things:
auto lunch = std::make_pair(sandwich::spam(), apple{});
Or dynamically allocated:
auto ptr = std::make_shared<sandwich>(sandwich::ham());
When might I use this?
If, on a public constructor, it is not possible to give meaningful initialisers for all class members without some preliminary calculation, then I might convert that constructor to a static method. The static method performs the preliminary calculations, then returns a value result via a private constructor which just does a member-wise initialisation.
I say 'might' because it depends on which approach gives the clearest code without being unnecessarily inefficient.
Loki has both a Factory Method and an Abstract Factory. Both are documented (extensively) in Modern C++ Design, by Andei Alexandrescu. The factory method is probably closer to what you seem to be after, though it's still a bit different (at least if memory serves, it requires you to register a type before the factory can create objects of that type).
I don't try to answer all of my questions, as I believe it is too broad. Just a couple of notes:
there are cases when object construction is a task complex enough to justify its extraction to another class.
That class is in fact a Builder, rather than a Factory.
In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.
Then you could have your factory encapsulate it in a smart pointer. I believe this way you can have your cake and eat it too.
This also eliminates the issues related to return-by-value.
Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.
Indeed. All design patterns have their (language specific) constraints and drawbacks. It is recommended to use them only when they help you solve your problem, not for their own sake.
If you are after the "perfect" factory implementation, well, good luck.
This is my c++11 style solution. parameter 'base' is for base class of all sub-classes. creators, are std::function objects to create sub-class instances, might be a binding to your sub-class' static member function 'create(some args)'. This maybe not perfect but works for me. And it is kinda 'general' solution.
template <class base, class... params> class factory {
public:
factory() {}
factory(const factory &) = delete;
factory &operator=(const factory &) = delete;
auto create(const std::string name, params... args) {
auto key = your_hash_func(name.c_str(), name.size());
return std::move(create(key, args...));
}
auto create(key_t key, params... args) {
std::unique_ptr<base> obj{creators_[key](args...)};
return obj;
}
void register_creator(const std::string name,
std::function<base *(params...)> &&creator) {
auto key = your_hash_func(name.c_str(), name.size());
creators_[key] = std::move(creator);
}
protected:
std::unordered_map<key_t, std::function<base *(params...)>> creators_;
};
An example on usage.
class base {
public:
base(int val) : val_(val) {}
virtual ~base() { std::cout << "base destroyed\n"; }
protected:
int val_ = 0;
};
class foo : public base {
public:
foo(int val) : base(val) { std::cout << "foo " << val << " \n"; }
static foo *create(int val) { return new foo(val); }
virtual ~foo() { std::cout << "foo destroyed\n"; }
};
class bar : public base {
public:
bar(int val) : base(val) { std::cout << "bar " << val << "\n"; }
static bar *create(int val) { return new bar(val); }
virtual ~bar() { std::cout << "bar destroyed\n"; }
};
int main() {
common::factory<base, int> factory;
auto foo_creator = std::bind(&foo::create, std::placeholders::_1);
auto bar_creator = std::bind(&bar::create, std::placeholders::_1);
factory.register_creator("foo", foo_creator);
factory.register_creator("bar", bar_creator);
{
auto foo_obj = std::move(factory.create("foo", 80));
foo_obj.reset();
}
{
auto bar_obj = std::move(factory.create("bar", 90));
bar_obj.reset();
}
}
Factory Pattern
class Point
{
public:
static Point Cartesian(double x, double y);
private:
};
And if you compiler does not support Return Value Optimization, ditch it, it probably does not contain much optimization at all...
extern std::pair<std::string_view, Base*(*)()> const factories[2];
decltype(factories) factories{
{"blah", []() -> Base*{return new Blah;}},
{"foo", []() -> Base*{return new Foo;}}
};
I know this question has been answered 3 years ago, but this may be what your were looking for.
Google has released a couple of weeks ago a library allowing easy and flexible dynamic object allocations. Here it is: http://google-opensource.blogspot.fr/2014/01/introducing-infact-library.html