Simplify an extensible "Perform Operation X on Data Y" framework - c++

tl;dr
My goal is to conditionally provide implementations for abstract virtual methods in an intermediate workhorse template class (depending on template parameters), but to leave them abstract otherwise so that classes derived from the template are reminded by the compiler to implement them if necessary.
I am also grateful for pointers towards better solutions in general.
Long version
I am working on an extensible framework to perform "operations" on "data". One main goal is to allow XML configs to determine program flow, and allow users to extend both allowed data types and operations at a later date, without having to modify framework code.
If either one (operations or data types) is kept fixed architecturally, there are good patterns to deal with the problem. If allowed operations are known ahead of time, use abstract virtual functions in your data types (new data have to implement all required functionality to be usable). If data types are known ahead of time, use the Visitor pattern (where the operation has to define virtual calls for all data types).
Now if both are meant to be extensible, I could not find a well-established solution.
My solution is to declare them independently from one another and then register "operation X for data type Y" via an operation factory. That way, users can add new data types, or implement additional or alternative operations and they can be produced and configured using the same XML framework.
If you create a matrix of (all data types) x (all operations), you end up with a lot of classes. Hence, they should be as minimal as possible, and eliminate trivial boilerplate code as far as possible, and this is where I could use some inspiration and help.
There are many operations that will often be trivial, but might not be in specific cases, such as Clone() and some more (omitted here for "brevity"). My goal is to conditionally provide implementations for abstract virtual methods if appropriate, but to leave them abstract otherwise.
Some solutions I considered
As in example below: provide default implementation for trivial operations. Consequence: Nontrivial operations need to remember to override with their own methods. Can lead to run-time problems if some future developer forgets to do that.
Do NOT provide defaults. Consequence: Nontrivial functions need to be basically copy & pasted for every final derived class. Lots of useless copy&paste code.
Provide an additional template class derived from cOperation base class that implements the boilerplate functions and nothing else (template parameters similar to specific operation workhorse templates). Derived final classes inherit from their concrete operation base class and that template. Consequence: both concreteOperationBase and boilerplateTemplate need to inherit virtually from cOperation. Potentially some run-time overhead, from what I found on SO. Future developers need to let their operations inherit virtually from cOperation.
std::enable_if magic. Didn't get the combination of virtual functions and templates to work.
Here is a (fairly) minimal compilable example of the situation:
//Base class for all operations on all data types. Will be inherited from. A lot. Base class does not define any concrete operation interface, nor does it necessarily know any concrete data types it might be performed on.
class cOperation
{
public:
virtual ~cOperation() {}
virtual std::unique_ptr<cOperation> Clone() const = 0;
virtual bool Serialize() const = 0;
//... more virtual calls that can be either trivial or quite involved ...
protected:
cOperation(const std::string& strOperationID, const std::string& strOperatesOnType)
: m_strOperationID()
, m_strOperatesOnType(strOperatesOnType)
{
//empty
}
private:
std::string m_strOperationID;
std::string m_strOperatesOnType;
};
//Base class for all data types. Will be inherited from. A lot. Does not know any operations that might be performed on it.
struct cDataTypeBase
{
virtual ~cDataTypeBase() {}
};
Now, I'll define an example data type.
//Some concrete data type. Still does not know any operations that might be performed on it.
struct cDataTypeA : public cDataTypeBase
{
static const std::string& GetDataName()
{
static const std::string strMyName = "cDataTypeA";
return strMyName;
}
};
And here is an example operation. It defines a concrete operation interface, but does not know the data types it might be performed on.
//Some concrete operation. Does not know all data types it might be expected to work on.
class cConcreteOperationX : public cOperation
{
public:
virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) = 0;
protected:
cConcreteOperationX(const std::string& strOperatesOnType)
: cOperation("concreteOperationX", strOperatesOnType)
{
//empty
}
};
The following template is meant to be the boilerplate workhorse. It implements as much trivial and repetitive code as possible and is provided alongside the concrete operation base class - concrete data types are still unknown, but are meant to be provided as template parameters.
//ConcreteOperationTemplate: absorb as much common/trivial code as possible, so concrete derived classes can have minimal code for easy addition of more supported data types
template <typename ConcreteDataType, typename DerivedOperationType, bool bHasTrivialCloneAndSerialize = false>
class cConcreteOperationXTemplate : public cConcreteOperationX
{
public:
//Can perform datatype cast here:
virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) override
{
const ConcreteDataType* pCastData = dynamic_cast<const ConcreteDataType*>(&dataBase);
if (pCastData == nullptr)
{
return false;
}
return doSomeConcreteOperationXOnCastData(*pCastData);
}
protected:
cConcreteOperationXTemplate()
: cConcreteOperationX(ConcreteDataType::GetDataName()) //requires ConcreteDataType to have a static method returning something appropriate
{
//empty
}
private:
//Clone can be implemented here via CRTP
virtual std::unique_ptr<cOperation> Clone() const override
{
return std::unique_ptr<cOperation>(new DerivedOperationType(*static_cast<const DerivedOperationType*>(this)));
}
//TODO: Some Magic here to enable trivial serializations, but leave non-trivials abstract
//Problem with current code is that virtual bool Serialize() override will also be overwritten for bHasTrivialCloneAndSerialize == false
virtual bool Serialize() const override
{
return true;
}
virtual bool doSomeConcreteOperationXOnCastData(const ConcreteDataType& castData) = 0;
};
Here are two implementations of the example operation on the example data type. One of them will be registered as the default operation, to be used if the user does not declare anything else in the config, and the other is a potentially much more involved non-default operation that might take many additional parameters into account (these would then have to be serialized in order to be correctly re-instantiated on the next program run). These operations need to know both the operation and the data type they relate to, but could potentially be implemented at a much later time, or in a different software component where the specific combination of operation and data type are required.
//Implementation of operation X on type A. Needs to know both of these, but can be implemented if and when required.
class cConcreteOperationXOnTypeADefault : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeADefault, true>
{
virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
{
//...do stuff...
return true;
}
};
//Different implementation of operation X on type A.
class cConcreteOperationXOnTypeASpecialSauce : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeASpecialSauce/*, false*/>
{
virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
{
//...do stuff...
return true;
}
//Problem: Compiler does not remind me that cConcreteOperationXOnTypeASpecialSauce might need to implement this method
//virtual bool Serialize() override {}
};
int main(int argc, char* argv[])
{
std::map<std::string, std::map<std::string, std::unique_ptr<cOperation>>> mapOpIDAndDataTypeToOperation;
//...fill map, e.g. via XML config / factory method...
const cOperation& requestedOperation = *mapOpIDAndDataTypeToOperation.at("concreteOperationX").at("cDataTypeA");
//...do stuff...
return 0;
}

if you data types are not virtual (for each operation call you know both operation type and data type at compile time) you may consider following approach:
#include<iostream>
#include<string>
template<class T>
void empty(T t){
std::cout<<"warning about missing implementation"<<std::endl;
}
template<class T>
void simple_plus(T){
std::cout<<"simple plus"<<std::endl;
}
void plus_string(std::string){
std::cout<<"plus string"<<std::endl;
}
template<class Data, void Implementation(Data)>
class Operation{
public:
static void exec(Data d){
Implementation(d);
}
};
#define macro_def(OperationName) template<class T> class OperationName : public Operation<T, empty<T>>{};
#define macro_template_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName<TypeName>>{};
#define macro_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName>{};
// this part may be generated on base of .xml file and put into .h file, and then just #include generated.h
macro_def(Plus)
macro_template_inst(int, Plus, simple_plus)
macro_template_inst(double, Plus, simple_plus)
macro_inst(std::string, Plus, plus_string)
int main() {
Plus<int>::exec(2);
Plus<double>::exec(2.5);
Plus<float>::exec(2.5);
Plus<std::string>::exec("abc");
return 0;
}
Minus of this approach is that you'd have to compile project in 2 steps: 1) transform .xml to .h 2) compile project using generated .h file. On plus side compiler/ide (I use qtcreator with mingw) gives warning about unused parameter t in function
void empty(T t)
and stack trace where from it was called.

Related

Separate class ownership and use, generate optimal (fast) code

In general, my question was simple, I want to imlement some design pattern, which allows following:
there is exists some predefined interface (Interface class);
and exists class (Utilizer), which accepts another class (via pointer, reference, smart-pointer, whatever else...) implementing predefined interface, and stars using this class via the interface;
class Utilizer should be able to own other class passed to it (which implements Interface) and delete it when Utilizer is destroyed.
In managed languages (like C#, Java) this can be implemented in simple way: class Utilizer might accept reference to base class (Interface) and hold this reference in the class, and use interface via the reference. On destruction of Utilizer class, the garbage collector might delete class, which implements `Interface'.
In C++ we have no garbage collector... Ok, we can use some smart_pointer, but this might be not generic smart pointer, but smart pointer of some particular type (for example, unique_ptr with user specified deleter, because class, which implements Interface is resided in shared memory and regular operator delete() can't be applied to this class...)
And second nuisance: virtual functions. Of course, when you are using managed languages you may not notice this. But if you made Interface class as abstract base class (with virtual keyword), then you will notice, that in test function (see the code below) compiler performs indirect calls (via function pointers). This happens because compiler needs to access virtual functions table. The call via function pointer is not very heavy (few processor ticks, or event tens of ticks), but the major issue is that compiler doesn't see that happens next, after the indirection. Optimizer stops here. Functions can't be inlined anymore. And we get not optimal code, which doesn't reduces to few machine instructions (for example test function reduces in the example to loading of two constant and calling printf function), we get unoptimal "generic" implementation, which effectively nullifies all the benefits of C++.
There is typical solution to avoid getting of unoptimal code -- avoid using virtual functions (prefer CRTP pattern instead), avoid type erasure (in the example, Utilizer class might store not Accessor, but std::function<Interface<T>&()> -- this solution is nice, but indirection in std::function leads to generation of unoptimal code again).
And the essence of the question, how to implement the logic described above (class which owns other abstract, non some particular, class and uses it) in C++ effectively?
Not sure if I was able to clearly express my thought. Below is the my implementation with the comments. It generates optimal code (see disassembly of test function in live demo live demo), all is inlined as expected. But the whole implementation looks cumbersome.
I would like to hear how can I improve the code.
#include <utility>
#include <memory>
#include <functional>
#include <stdio.h>
#include <math.h>
// This type implements interface: later Utilizer class
// accept Accessor type, which was able to return reference
// to object of some type, which implements this interface,
// and Utilizer class uses returned object via this interface.
template <typename Impl> class Interface
{
public:
int oper(int arg) { return static_cast<Impl*>(this)->oper(arg); }
const char *name() const { return static_cast<const Impl*>(this)->name(); }
};
// Class which uses object, returned by Accessor class, via
// predefined interface of type Interface<Impl>.
// Utilizer class can perform operations on any class
// which inherited from Interface class, but Utilizer
// doesn't directly owns parficular instance of the
// class implementing Interface: Accessor serves for
// getting of particular implementation of Interface
// from somewhere.
template <typename Accessor> class Utilizer
{
private:
typedef typename std::remove_reference<decltype(std::declval<Accessor>()())>::type Impl;
Accessor accessor;
// This static_cast allows only such Accessor types, for
// which operator() returns class inherited from Interface
Interface<Impl>& get() const { return static_cast<Interface<Impl>&>(accessor()); }
public:
template <typename...Args> Utilizer(Args&& ...args) : accessor(std::forward<Args>(args)...) {}
// Following functions is the public interface of Utilizer class
// (this interface have no relations with Interface class,
// except of the fact, that implementation uses Interface class):
double func(int a, int b)
{
if (a > 0) return sqrt(get().oper(a) + b);
else return get().oper(b) * a;
}
const char *text() const
{
const char *result = get().name();
if (result == nullptr) return "unknown";
return result;
}
};
// This is implementation of Interface<Impl> interface
// (program may have multiple similar classes and Utilizer
// can work with any of these classes).
struct Implementation : public Interface<Implementation>
{
Implementation() { puts("Implementation()"); }
Implementation(const Implementation&) { puts("copy Implementation"); }
~Implementation() { puts("~Implementation()"); }
// Following functions are implementation of functions
// defined in Interface<Impl>:
int oper(int arg) { return arg + 42; }
const char *name() const { return "implementation"; }
};
// This is class which owns some particular implementation
// of the class inherited from Interface. This class only
// owns the class which was given to it and allows accessing
// this class via operator(). This class is intendent to be
// template argument for Utilizer class.
template <typename SmartPointer> struct Owner
{
SmartPointer p;
Owner(Owner&& other) : p(std::move(other.p)) {}
template <typename... Args> Owner(Args&&...args) : p(std::forward<Args>(args)...) {}
Implementation& operator()() const { return *p; }
};
typedef std::unique_ptr<Implementation> PtrType;
typedef Utilizer<Owner<PtrType> > UtilType;
void test(UtilType& utilizer)
{
printf("%f %s\n", utilizer.func(1, 2), utilizer.text());
}
int main()
{
PtrType t(new Implementation);
UtilType utilizer(std::move(t));
test(utilizer);
return 0;
}
Your CPU is smarter than you think. Modern CPUs are absolutely capable of guessing the target of, and speculatively executing through, an indirect branch. The speed of the L1 cache, and register renaming, often remove most or all of the extra cost of a non-inlined call. And the 80/20 rule applies in spades: Your test code's bottleneck is the internal processing done by puts, not the late binding you're trying to avoid.
To answer your question, you could improve your code by removing all that template stuff: it would be just as fast, and more maintainable (hence more practical to do actual optimization). Optimization of algorithms and data structures should often be done up-front; optimization of low-level instruction streams should never, ever, ever be done except after analyzing profiling results.

C++ Double dispatch with runtime polymorphism?

Is it possible to perform double dispatch with runtime polymorphism?
Say I have some classes, and some of those classes can be added/multiplied/etc., and I want to store those dynamically within another class that performs type erasure at runtime. Then say I want to perform basic operations on the data held within that class.
The way to handle this (as far as I'm aware) is to use double dispatch to specialize the operation. However, all of the solutions I have encountered rely on the fact that you have a numerable amount of types, and then use virtual function calls or dynamic_casts, if-else, and RTTI to deduce the type at runtime. Because the data held within the class isn't known until runtime, I can't create a bunch of virtual methods or do a brute force check on the types. So I figured the visitor pattern would be the best solution, but even then, I can't seem to get whether or not this is possible.
I have a wrapper class that holds a smart pointer to a nested polymorphic class to implement the type erasure and runtime polymorphism, but I can't figure out if it's possible to use the visitor pattern to do double dispatch on this.
Note that the code below is incomplete, it just shows my thought process.
class Wrapper {
private:
class Concept;
template<typename T> class Model;
class BaseVisitor {
public:
virtual ~Visitor() = default;
virtual void visit(Concept &) = 0;
};
template<typename T>
class Visitor : public BaseVisitor {
private:
T first_;
public:
Visitor(T first) : first_(first) {}
virtual void visit(Concept &other) override {
// perform addition
}
};
class Concept {
public:
virtual ~Concept() = default;
virtual void add(Concept &m) const = 0;
virtual void accept(BaseVisitor &visitor) const = 0;
};
template<typename T>
class Model final : public Concept {
private:
T data_;
public:
Model(T m)
: data_(m) {}
virtual void add(Concept &m) const override {
Visitor<T> v(data_);
m.accept(v);
};
virtual void accept(BaseVisitor &visitor) const override {
visitor.visit(*this);
};
};
std::shared_ptr<const Concept> ptr_;
// This isn't right, it just illustrates what I'm trying to do.
// friend Something operator+(Wrapper &lhs, Wrapper &rhs) {
// return (*lhs.ptr_).add(*rhs.ptr_);
// }
public:
template<typename T>
Wrapper(T value) : ptr_(std::make_shared<Model<T>>(value)) {}
};
I've looked into implementing double dispatch using function pointers, template specialization, and static type IDs as well, but I can't seem to figure out how to make it work.
Is this even possible?
EDIT
Based on the comments below, in order to be more specific and to give a little more background, I am using templated classes that use template functions to perform basic operations like addition and multiplication. However, I would also like to store those templated classes within a vector, hence the type erasure. Now, if I wanted to do operations on those classes after I perform the type erasure, I need some way to deduce the type for the templated function. However, since I can't easily get the internal held type back from the Wrapper, I am hoping that there is a way I can call the correct template function on the data held within the Wrapper::Model<T> class, whether that is a visitor pattern, static type IDs, whatever.
To be even more specific, I am working with classes to do delayed evaluation and symbolic computations, meaning I have classes such as Number<T>, which can be Number<int>, Number<double>, etc. and classes such as Variable, Complex<T> and all of the TMP combinations for various operations, such as Add<Mul<Variable, Variable>, Number<double>>, etc.
I can work with all of these fine at compile-time, but then I need to be able to store these in a vector -- something like std::vector<Wrapper> x = {Number<int>, Variable, Add<Number<double>, Variable>};. My best guess at this was to perform type erasure to store the expressions inside the polymorphic Wrapper. This serves double-duty to enable runtime parsing support of symbolic expressions.
However, the functions I wrote to handle the addition, such as
template<typename T1, typename T2>
const Add<T1, T2> operator+(const T1 &lhs, const T2 &rhs)
{ return Add<T1, T2>(lhs, rhs); }
can't accept Wrapper and pull the type out (due to the type erasure). I can, however, insert the Wrapper into the Add expression class, meaning I can carry around the hidden types. The problem is when I actually get down to evaluating the result of something like Add<Wrapper, Wrapper>. In order to know what this comes out to, I'd need to figure out what's actually inside or to do something along the lines of double dispatch.
The main problem is that the examples for double dispatch that most closely match my problem, like this question on SO, rely on the fact that I can write out all of the classes, such as Shapes, Rectangles. Since I can't explicitly do that, I'm wondering if there's a way to perform double dispatch to evaluate the expression based on the data held inside the Model<T> class above.

c++ - abstract class and alternative to virtual constructor

Let say I have the following code:
class Block{
private:
data Data;
public:
data getData();
Block(arg3 Arg3, arg4 Arg4);
};
Actually, there are several ways to build a block, but always with the same member Data and method getData(), the only difference is how to build the block. In other words, the only difference is the constructor...
Instead of writing a different class for each building process, I could factorize parts of my code, defining and declaring getData in an abstract class, if there were such thing as a virtual constructor in c++ that I could write differently for each derived class corresponding to a different building process.
I do not have a lot experience for this kind of things, so I wondered if there was an alternative to a virtual constructor ? or may be a different way to do this factorization ?
PS: I am aware of https://isocpp.org/wiki/faq/virtual-functions#virtual-ctors but it seems quite complex regarding what I want to do, which seems quite common... I just want to factorize shared code between several classes, which corresponds to everything except the constructor. And I want to force new classes corresponding to other building processes to implement a new constructor.
More details about my particular situation:
I have an algorithm where I use blocks and it does not depend on their building process, so I have implemented the algorithm using a template argument to represent a block indifferently of its building process. But I use a few methods and its constructor, so I need my classes representing blocks to all have the same kind of methods I need and the same constructor to use them as a template argument of my algorithm implementation. That is why I thought of abstract class, to force a newly implemented class representing blocks to have the methods and the constructor I need in the algorithm I implemented. May be it is a bad design pattern and that is why I am stuck...
EDIT
Thank you for your answers so far. I tried to be a little generic but I feel that it is actually too vague, even with the details I gave at the end. So here is what I thought to do: I have a Matrix class as follows
// Matrix.hpp
template<typename GenericBlock> class Matrix{
std::vector<GenericBlock> blocks;
Matrix(arg1 Arg1, arg2 Arg2);
};
template<typename GenericBlock>
Matrix<GenericBlock>::Matrix(arg1 Arg1, arg2 Arg2){
// Do stuff
GenericBlock B(arg3 Arg3, arg4 Arg4);
B.getData();
}
The blocks are actually compressed, and there exists several ways to compress them and it does not change anything in the class Matrix. To avoid writing a matrix class for each compression technics, I used a template argument as you saw. So I just need to write a class for each compression technics, but they must have the same methods and constructor arguments to be compatible with Matrix.
That is why I thought of doing an abstract class, for writing a class for each compression technics. In the abstract class, I would write everything needed in Matrix so that every derived class would be compatible with Matrix. My problem now in my example is: I can define getData in the abstract class because it is always the same (for example, Datacan be the number of rows). The only thing derived classes would really need to define is the constructor.
One solution would be to not have an abstract class and use a protected constructor may be. But it does not force newly derived class to reimplement a constructor. That is why I am stuck. But I think this problem is generic enough to interest other people. So is there an alternative to a virtual constructor in this case ? (may be a factory pattern but it seems quite complex for a such common problem) If not, is there a better way to implement a matrix class whose blocks can be built in different manners, i.e. whose constructor can be different from each other, while having the same data and a few method in common ?
PS: I am interested in compression technics that produce low rank matrices, that is why the data is always the same, but not the building process.
From what you have shared so far it is not clear why you need an abstract class or virtual constructors. A factory function for each way of building a block will do:
class Block {
Data data;
public:
Block(Data d) : data(std::move(d)) {}
Data getData();
};
Block createABlock() { return Block{Data{1.0, 2.0, 3.0}}; }
Block createBBlock() { return Block{Data{42.0, 3.14, 11.6}}; }
int main() {
auto b1 = createABlock();
auto b2 = createBBlock();
}
Live demo.
Perhaps this needs to be extended with an abstract factory so you can pass a generic block factory around:
using BlockFactory = std::function<Block()>;
int main() {
BlockFactory f = createABlock;
auto b3 = f();
}
EDIT:
Regarding your EDIT, what you have suggested works fine. You don't need a virtual constructor. The template type GenericBlock just has to satisfy the implicit interface defined by the template. It doesn't need to derive from a particular base class (although it could do). The only thing required of it, is it must have a constructor that takes a particular set of arguments and a getData method. What you have is compile time static polymorphism, virtual functions are for run time dynamic polymorphism.
Inheritance will work fine but as I said above I'd be tempted to use some sort of factory. You may not need to template the whole Matrix class as only the constructor needs the factory. If the factory is known at compile-time this can be passed as a template parameter:
class Matrix {
std::vector<Block> blocks;
public:
template<typename BlockFactory>
Matrix(BlockFactory f);
};
template<typename BlockFactory>
Matrix::Matrix(BlockFactory f){
// Do stuff...
Block B = f();
auto data = B.getData();
for (auto v : data)
std::cout << v << " ";
std::cout << "\n";
}
int main() {
Matrix ma(createABlock);
Matrix mb(createBBlock);
}
Live demo.
TL:DR, but if the data are the same for all Blocks, you don't even need multiple classes, but only multiple constructors.
class Block
{
enum { type1, type2, type3 };
int type;
data Data;
public:
Block(int x)
: type(type1), Data(x) {}
Block(std::string const& str)
: type(type2), Data(str) {}
Block(data const*x)
: type(type3), Data(data) {}
/* ... */
};
template<class T>struct tag_t{constexpr tag_t(){}; usong type=T;};
template<class T>constexpr tag_t<T> tag{};
this lets you pass types as values.
struct BlockA{};
struct BlockB{};
class Block {
enum BlockType { typeA, typeB };;
BlockType type;
data Data;
public:
Block(tag_t<BlockA>, int x)
: type(typeA), Data(x) {}
Block(tag_t<BlockB>, int x)
: type(typeB), Data(2*x+7) {}
/* ... */
};
the blocks are all the same type. The tag determines how they are constructed.
There is no alternative to a virtual constructor, because there is no virtual constructor to begin with. I know it can be hard to accept, but it is the truth.
Anyhow, you do not need anything like what a virtual constructor would be if it existed....
[..] the only difference is how to build the block. In other words, the
only difference is the constructor...
If the only difference is in the constructor, then simply make the constructor take a parameter that tells what type of block is required. Alternatively you can have some functions that construct blocks in different ways:
struct Block {
private:
Block(){}
friend Block createWoodenBlock();
friend Block createStoneBlock();
};
Block createWoodenBlock(){ return Block(); }
Block createStoneBlock(){ return Block(); }
int main() {
Block woody = createWoodenBlock();
Block stony = createStoneBlock();
}

Should I prefer mixins or function templates to add behavior to a set of unrelated types?

Mixins and function templates are two different ways of providing a behavior to a wide set of types, as long as these types meet some requirements.
For example, let's assume that I want to write some code that allows me to save an object to a file, as long as this object provides a toString member function (this is a rather silly example, but bear with me). A first solution is to write a function template like the following:
template <typename T>
void toFile(T const & obj, std::string const & filename)
{
std::ofstream file(filename);
file << obj.toString() << '\n';
}
...
SomeClass o1;
toFile(o1, "foo.txt");
SomeOtherType o2;
toFile(o2, "bar.txt");
Another solution is to use a mixin, using CRTP:
template <typename Derived>
struct ToFile
{
void toFile(std::string const & filename) const
{
Derived * that = static_cast<Derived const *>(this);
std::ofstream file(filename);
file << that->toString() << '\n';
}
};
struct SomeClass : public ToFile<SomeClass>
{
void toString() const {...}
};
...
SomeClass o1;
o.toFile("foo.txt");
SomeOtherType o2;
o2.toFile("bar.txt");
What are the pros and cons of these two approaches? Is there a favored one, and if so, why?
The first approach is much more flexible, as it can be made to work with any type that provides any way to be converted to a std::string (this can be achieved using traits-classes) without the need to modify that type. Your second approach would always require modification of a type in order to add functionality.
Pro function templates: the coupling is looser. You don't need to derive from anything to get the functionality in a new class; in your example, you only implement the toString method and that's it. You can even use a limited form of duck typing, since the type of toString isn't specified.
Pro mixins: nothing, strictly; your requirement is for something that works with unrelated classes and mixins cause them to be become related.
Edit: Alright, due to the way the C++ type system works, the mixin solution will strictly produce unrelated classes. I'd go with the template function solution, though.
I would like to propose an alternative, often forgotten because it is a mix of duck-typing and interfaces, and very few languages propose this feat (note: very close to Go's take to interfaces actually).
// 1. Ask for a free function to exist:
void toString(std::string& buffer, SomeClass const& sc);
// 2. Create an interface that exposes this function
class ToString {
public:
virtual ~ToString() {}
virtual void toString(std::string& buffer) const = 0;
}; // class ToString
// 3. Create an adapter class (bit of magic)
template <typename T>
class ToStringT final: public ToString {
public:
ToStringT(T const& t): t(t) {}
virtual void toString(std::string& buffer) const override {
toString(buffer, t);
}
private:
T t; // note: for reference you need a reference wrapper
// I won't delve into this right now, suffice to say
// it's feasible and only require one template overload
// of toString.
}; // class ToStringT
// 4. Create an adapter maker
template <typename T>
ToStringT<T> toString(T const& t) { return std::move(ToStringT<T>(t)); }
And now ? Enjoy!
void print(ToString const& ts); // aka: the most important const
int main() {
SomeClass sc;
print(toString(sc));
};
The two stages is a bit heavyweight, however it gives an astonishing degree of functionality:
No hard-wiring data / interface (thanks to duck-typing)
Low-coupling (thanks to abstract classes)
And also easy integration:
You can write an "adapter" for an already existing interface, and migrate from an OO code base to a more agile one
You can write an "interface" for an already existing set of overloads, and migrate from a Generic code base to a more clustered one
Apart from the amount of boiler-plate, it's really amazing how you seamlessly pick advantages from both worlds.
A few thoughts I had while writing this question:
Arguments in favor of template functions:
A function can be overloaded, so third-party and built-in types can be handled.
Arguments in favor of mixins:
Homogeneous syntax: the added behavior is invoked like any other member functions. However, it is well known that the interface of a C++ class includes not only its public member functions but also the free functions that operates on instances of this type, so this is just an aesthetic improvement.
By adding a non-template base class to the mixins, we obtain an interface (in the Java/C# sense) that can be use to handle all objects providing the behavior. For example, if we make ToFile<T> inherits from FileWritable (declaring a pure virtual toFile member function), we can have a collection of FileWritable without having to resort to complicated heterogeneous data structures.
Regarding usage, I'd say that function templates are more idiomatic in C++.

given abstract base class X, how to create another template class D<T> where T is the type of the class deriving from X?

I want to be able to accept a Message& object which references either a Message1 or Message2 class. I want to be able to create a MessageWithData<Message1> or MessageWithData<Message2> based on the underlying type of the Message& object. For example, see below:
class Message {};
class Message1 : public Message {};
class Message2 : public Message {};
template<typename Message1or2>
class MessageWithData : public Message1or2 { public: int x, y; }
class Handler()
{
public:
void process(const Message& message, int x, int y)
{
// create object messageWithData whose type is
// either a MessageWithData<Message1> or a MessageWithData<Message2>
// based on message's type.. how do I do this?
//
messageWithData.dispatch(...)
}
};
The messageWithData class essentially contains methods inherited from Message which allow it to be dynamically double dispatched back to the handler based on its type. My best solution so far has been to keep the data separate from the message type, and pass it all the way through the dynamic dispatch chain, but I was hoping to come closer to the true idiom of dynamic double dispatch wherein the message type contains the variable data.
(The method I'm more or less following is from http://jogear.net/dynamic-double-dispatch-and-templates)
You're trying to mix runtime and compile-time concepts, namely (runtime-)polymorphism and templates. Sorry, but that is not possible.
Templates operate on types at compile time, also called static types. The static type of message is Message, while its dynamic type might be either Message1 or Message2. Templates don't know anything about dynamic types and they can't operate on them. Go with either runtime polymorphism or compile-time polymorphism, sometimes also called static polymorphism.
The runtime polymorphism approach is the visitor pattern, with double dispatch. Here is an example of compile-time polymorphism, using the CRTP idiom:
template<class TDerived>
class Message{};
class Message1 : public Message<Message1>{};
class Message2 : public Message<Message2>{};
template<class TMessage>
class MessageWithData : public TMessage { public: int x, y; };
class Handler{
public:
template<class TMessage>
void process(Message<TMessage> const& m, int x, int y){
MessageWithData<TMessage> mwd;
mwd.x = 42;
mwd.y = 1337;
}
};
You have
void process(const Message& message, int x, int y)
{
// HERE
messageWithData.dispatch(...)
}
At HERE, you want to create either a MessageWithData<Message1> or a MessageWithData<Message2>, depending on whether message is an instance of Message1 or Message1.
But you cannot do that, because the class template MessageWithData<T> needs to know at compile time what T should be, but that type is not available at that point in the code until runtime by dispatching into message.
As has been mentioned, it is not possible to build your template as is.
I do not see any issue with passing additional parameters, though I would perhaps pack them into a single structure, for ease of manipulation.
Certainly I find it more idiomatic to use a supplementary Data parameter, rather than extending a class hierarchy to shoehorn all this into a pattern.
It is an anti-pattern to try to make a design fit a pattern. The proper way is to adapt the pattern so that it fits the design.
That being said...
There are several alternatives to your solution. Inheritance seems weird, but without the whole design at hand it may be your best bet.
It has been mentioned already that you cannot freely mix compile-time and run-time polymorphisms. I usually use Shims to circumvent the issue:
class Message {};
template <typename T> class MessageShim<T>: public Message {};
class Message1: public MessageShim<Message1> {};
The scheme is simple and allow you to benefit from the best of both worlds:
Message being non-template mean that you can apply traditional OO strategies
MessageShim<T> being template mean that you can apply traditional Generic Programming strategies
Once done, you should be able to get what you want, for better or worse.
As Xeo says, you probably shouldn't do this in this particular case - better design alternatives exist. That said, you can do it with RTTI, but it's generally frowned upon because your process() becomes a centralised maintenance point that needs to be updated as new derived classes are added. That's easily overlooked and prone to run-time errors.
If you must persue this for some reason, then at least generalise the facility so a single function uses RTTI-based runtime type determination to invoke arbitrary behaviour, as in:
#include <iostream>
#include <stdexcept>
struct Base
{
virtual ~Base() { }
template <class Op>
void for_rt_type(Op& op);
};
struct Derived1 : Base
{
void f() { std::cout << "Derived1::f()\n"; }
};
struct Derived2 : Base
{
void f() { std::cout << "Derived2::f()\n"; }
};
template <class Op>
void Base::for_rt_type(Op& op)
{
if (Derived1* p = dynamic_cast<Derived1*>(this))
op(p);
else if (Derived2* p = dynamic_cast<Derived2*>(this))
op(p);
else
throw std::runtime_error("unmatched dynamic type");
}
struct Op
{
template <typename T>
void operator()(T* p)
{
p->f();
}
};
int main()
{
Derived1 d1;
Derived2 d2;
Base* p1 = &d1;
Base* p2 = &d2;
Op op;
p1->for_rt_type(op);
p2->for_rt_type(op);
}
In the code above, you can substitute your own Op and have the same runtime-to-compiletime handover take place. It may or may not help to think of this as a factory method in reverse :-}.
As discussed, for_rt_type has to be updated for each derived type: particularly painful if one team "owns" the base class and other teams write derived classes. As with a lot of slightly hacky things, it's more practical and maintainable in support of private implementation rather than as an API feature of a low-level enterprise library. Wanting to use this is still typically a sign of bad design elsewhere, but not always: occasionally there are algorithms (Ops) that benefit enormously:
compile-time optimisations, dead code removal etc.
derived types only need same semantics, but details can vary
e.g. Derived1::value_type is int, Derived2::value_type is double - allows algorithms for each to be efficient and use appropriate rounding etc.. Similarly for different container types where only a shared API is exercised.
you can use template metaprogramming, SFINAE etc. to customise the behaviours in a derived-type specific way
Personally, I think knowledge of and ability to apply this technique (however rarely) is an important part of mastering polymorphism.