Separate class ownership and use, generate optimal (fast) code - c++

In general, my question is simple. I want to implement a design pattern that allows the following:
there is a predefined interface (the Interface class);
there is a class (Utilizer) that accepts another class implementing the predefined interface (via pointer, reference, smart pointer, or whatever else) and starts using that class through the interface;
the Utilizer class should be able to own the class passed to it (which implements Interface) and delete it when Utilizer is destroyed.
In managed languages (like C#, Java) this can be implemented in a simple way: the Utilizer class accepts a reference to the base class (Interface), holds this reference, and uses the interface through it. After Utilizer is destroyed, the garbage collector can reclaim the object that implements Interface.
In C++ we have no garbage collector. OK, we can use a smart pointer, but it might not be a generic smart pointer; it might be a smart pointer of some particular type (for example, unique_ptr with a user-specified deleter, because the class that implements Interface resides in shared memory and the regular operator delete() can't be applied to it).
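To make the "unique_ptr with a user-specified deleter" case concrete, here is a minimal sketch. The names Widget, DestroyOnly, g_storage and aliveWhileOwned are hypothetical, and a static buffer stands in for the shared-memory region, which is an assumption made only to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Hypothetical stand-in for a class that implements the interface.
struct Widget {
    static int alive;            // counts live objects, for demonstration only
    Widget() { ++alive; }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Deleter that runs the destructor but never calls operator delete,
// because the storage does not come from the heap allocator.
struct DestroyOnly {
    template <typename T>
    void operator()(T* p) const { p->~T(); }
};

// Static buffer standing in for a shared-memory region (assumption).
alignas(Widget) unsigned char g_storage[sizeof(Widget)];

// Returns the number of live Widgets observed while the pointer owns one.
int aliveWhileOwned() {
    std::unique_ptr<Widget, DestroyOnly> p(new (g_storage) Widget);
    return Widget::alive;
}   // p goes out of scope here: ~Widget() runs, the buffer itself is left alone
```

The deleter type is part of the smart pointer's type, which is why a class that wants to own "any" such pointer ends up being a template, as in the code further down.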
And the second nuisance: virtual functions. Of course, in managed languages you may not notice this. But if you make the Interface class an abstract base class (with the virtual keyword), you will notice that in the test function (see the code below) the compiler performs indirect calls (via function pointers). This happens because the compiler needs to go through the virtual function table. A call via a function pointer is not very heavy (a few processor ticks, or even tens of ticks), but the major issue is that the compiler doesn't see what happens after the indirection. The optimizer stops here, and functions can't be inlined anymore. Instead of code that reduces to a few machine instructions (in the example, the test function reduces to loading two constants and calling printf), we get a suboptimal "generic" implementation, which effectively nullifies the benefits of C++.
The typical solution to avoid suboptimal code is to avoid virtual functions (prefer the CRTP pattern instead) and to avoid type erasure (in the example, the Utilizer class could store not an Accessor but a std::function<Interface<T>&()>; this solution is nice, but the indirection in std::function leads to suboptimal code again).
And the essence of the question: how can the logic described above (a class that owns another abstract, not some particular, class and uses it) be implemented efficiently in C++?
I'm not sure I was able to express my thought clearly. Below is my implementation with comments. It generates optimal code (see the disassembly of the test function in the live demo); everything is inlined as expected. But the whole implementation looks cumbersome.
I would like to hear how I can improve the code.
#include <utility>
#include <memory>
#include <functional>
#include <type_traits>
#include <stdio.h>
#include <math.h>
// This type defines the interface. The Utilizer class below
// accepts an Accessor type that returns a reference to an
// object of some type implementing this interface, and
// Utilizer uses the returned object through this interface.
template <typename Impl> class Interface
{
public:
int oper(int arg) { return static_cast<Impl*>(this)->oper(arg); }
const char *name() const { return static_cast<const Impl*>(this)->name(); }
};
// This class uses the object returned by the Accessor class
// through the predefined interface Interface<Impl>.
// Utilizer can operate on any class that inherits
// from Interface, but Utilizer does not directly
// own a particular instance of the class
// implementing Interface: the Accessor is responsible
// for obtaining a particular implementation of Interface
// from somewhere.
template <typename Accessor> class Utilizer
{
private:
typedef typename std::remove_reference<decltype(std::declval<Accessor>()())>::type Impl;
Accessor accessor;
// This static_cast allows only such Accessor types, for
// which operator() returns class inherited from Interface
Interface<Impl>& get() const { return static_cast<Interface<Impl>&>(accessor()); }
public:
template <typename...Args> Utilizer(Args&& ...args) : accessor(std::forward<Args>(args)...) {}
// The following functions are the public interface of the Utilizer class
// (this interface has no relation to the Interface class,
// except that the implementation uses the Interface class):
double func(int a, int b)
{
if (a > 0) return sqrt(get().oper(a) + b);
else return get().oper(b) * a;
}
const char *text() const
{
const char *result = get().name();
if (result == nullptr) return "unknown";
return result;
}
};
// This is implementation of Interface<Impl> interface
// (program may have multiple similar classes and Utilizer
// can work with any of these classes).
struct Implementation : public Interface<Implementation>
{
Implementation() { puts("Implementation()"); }
Implementation(const Implementation&) { puts("copy Implementation"); }
~Implementation() { puts("~Implementation()"); }
// Following functions are implementation of functions
// defined in Interface<Impl>:
int oper(int arg) { return arg + 42; }
const char *name() const { return "implementation"; }
};
// This class owns a particular implementation
// of a class inherited from Interface. It only
// owns the object given to it and allows access to
// it via operator(). This class is intended to be
// the template argument for the Utilizer class.
template <typename SmartPointer> struct Owner
{
SmartPointer p;
Owner(Owner&& other) : p(std::move(other.p)) {}
template <typename... Args> Owner(Args&&...args) : p(std::forward<Args>(args)...) {}
Implementation& operator()() const { return *p; }
};
typedef std::unique_ptr<Implementation> PtrType;
typedef Utilizer<Owner<PtrType> > UtilType;
void test(UtilType& utilizer)
{
printf("%f %s\n", utilizer.func(1, 2), utilizer.text());
}
int main()
{
PtrType t(new Implementation);
UtilType utilizer(std::move(t));
test(utilizer);
return 0;
}

Your CPU is smarter than you think. Modern CPUs are absolutely capable of guessing the target of, and speculatively executing through, an indirect branch. The speed of the L1 cache, and register renaming, often remove most or all of the extra cost of a non-inlined call. And the 80/20 rule applies in spades: Your test code's bottleneck is the internal processing done by puts, not the late binding you're trying to avoid.
To answer your question, you could improve your code by removing all that template stuff: it would be just as fast, and more maintainable (hence more practical to do actual optimization). Optimization of algorithms and data structures should often be done up-front; optimization of low-level instruction streams should never, ever, ever be done except after analyzing profiling results.
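For comparison, a version "without all that template stuff" could look roughly like the sketch below. It reuses the names Interface, Utilizer and Implementation from the question; makeUtilizer is a hypothetical helper added only for the example. Whether it is really as fast in a given program is something only profiling can tell.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>
#include <memory>
#include <utility>

// Plain abstract base class instead of the CRTP Interface<Impl>.
class Interface {
public:
    virtual ~Interface() {}
    virtual int oper(int arg) = 0;
    virtual const char* name() const = 0;
};

// Utilizer owns the implementation through a unique_ptr to the base class.
class Utilizer {
public:
    explicit Utilizer(std::unique_ptr<Interface> impl) : impl_(std::move(impl)) {}
    double func(int a, int b) {
        if (a > 0) return std::sqrt(static_cast<double>(impl_->oper(a) + b));
        return impl_->oper(b) * a;
    }
    const char* text() const {
        const char* result = impl_->name();
        return result ? result : "unknown";
    }
private:
    std::unique_ptr<Interface> impl_;
};

struct Implementation : Interface {
    int oper(int arg) override { return arg + 42; }
    const char* name() const override { return "implementation"; }
};

// Hypothetical helper: builds a Utilizer that owns a fresh Implementation.
Utilizer makeUtilizer() {
    return Utilizer(std::unique_ptr<Interface>(new Implementation));
}
```

If the object needs a special deleter, the member can become std::unique_ptr<Interface, SomeDeleter> without touching the rest of Utilizer.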

Related

Simplify an extensible "Perform Operation X on Data Y" framework

tl;dr
My goal is to conditionally provide implementations for abstract virtual methods in an intermediate workhorse template class (depending on template parameters), but to leave them abstract otherwise so that classes derived from the template are reminded by the compiler to implement them if necessary.
I am also grateful for pointers towards better solutions in general.
Long version
I am working on an extensible framework to perform "operations" on "data". One main goal is to allow XML configs to determine program flow, and allow users to extend both allowed data types and operations at a later date, without having to modify framework code.
If either one (operations or data types) is kept fixed architecturally, there are good patterns to deal with the problem. If allowed operations are known ahead of time, use abstract virtual functions in your data types (new data have to implement all required functionality to be usable). If data types are known ahead of time, use the Visitor pattern (where the operation has to define virtual calls for all data types).
Now if both are meant to be extensible, I could not find a well-established solution.
My solution is to declare them independently from one another and then register "operation X for data type Y" via an operation factory. That way, users can add new data types, or implement additional or alternative operations and they can be produced and configured using the same XML framework.
If you create a matrix of (all data types) x (all operations), you end up with a lot of classes. Hence, they should be as minimal as possible, and eliminate trivial boilerplate code as far as possible, and this is where I could use some inspiration and help.
There are many operations that will often be trivial, but might not be in specific cases, such as Clone() and some more (omitted here for "brevity"). My goal is to conditionally provide implementations for abstract virtual methods if appropriate, but to leave them abstract otherwise.
Some solutions I considered
As in example below: provide default implementation for trivial operations. Consequence: Nontrivial operations need to remember to override with their own methods. Can lead to run-time problems if some future developer forgets to do that.
Do NOT provide defaults. Consequence: Nontrivial functions need to be basically copy & pasted for every final derived class. Lots of useless copy&paste code.
Provide an additional template class derived from cOperation base class that implements the boilerplate functions and nothing else (template parameters similar to specific operation workhorse templates). Derived final classes inherit from their concrete operation base class and that template. Consequence: both concreteOperationBase and boilerplateTemplate need to inherit virtually from cOperation. Potentially some run-time overhead, from what I found on SO. Future developers need to let their operations inherit virtually from cOperation.
std::enable_if magic. Didn't get the combination of virtual functions and templates to work.
Here is a (fairly) minimal compilable example of the situation:
#include <map>
#include <memory>
#include <string>
//Base class for all operations on all data types. Will be inherited from. A lot. Base class does not define any concrete operation interface, nor does it necessarily know any concrete data types it might be performed on.
class cOperation
{
public:
virtual ~cOperation() {}
virtual std::unique_ptr<cOperation> Clone() const = 0;
virtual bool Serialize() const = 0;
//... more virtual calls that can be either trivial or quite involved ...
protected:
cOperation(const std::string& strOperationID, const std::string& strOperatesOnType)
: m_strOperationID(strOperationID)
, m_strOperatesOnType(strOperatesOnType)
{
//empty
}
private:
std::string m_strOperationID;
std::string m_strOperatesOnType;
};
//Base class for all data types. Will be inherited from. A lot. Does not know any operations that might be performed on it.
struct cDataTypeBase
{
virtual ~cDataTypeBase() {}
};
Now, I'll define an example data type.
//Some concrete data type. Still does not know any operations that might be performed on it.
struct cDataTypeA : public cDataTypeBase
{
static const std::string& GetDataName()
{
static const std::string strMyName = "cDataTypeA";
return strMyName;
}
};
And here is an example operation. It defines a concrete operation interface, but does not know the data types it might be performed on.
//Some concrete operation. Does not know all data types it might be expected to work on.
class cConcreteOperationX : public cOperation
{
public:
virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) = 0;
protected:
cConcreteOperationX(const std::string& strOperatesOnType)
: cOperation("concreteOperationX", strOperatesOnType)
{
//empty
}
};
The following template is meant to be the boilerplate workhorse. It implements as much trivial and repetitive code as possible and is provided alongside the concrete operation base class - concrete data types are still unknown, but are meant to be provided as template parameters.
//ConcreteOperationTemplate: absorb as much common/trivial code as possible, so concrete derived classes can have minimal code for easy addition of more supported data types
template <typename ConcreteDataType, typename DerivedOperationType, bool bHasTrivialCloneAndSerialize = false>
class cConcreteOperationXTemplate : public cConcreteOperationX
{
public:
//Can perform datatype cast here:
virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) override
{
const ConcreteDataType* pCastData = dynamic_cast<const ConcreteDataType*>(&dataBase);
if (pCastData == nullptr)
{
return false;
}
return doSomeConcreteOperationXOnCastData(*pCastData);
}
protected:
cConcreteOperationXTemplate()
: cConcreteOperationX(ConcreteDataType::GetDataName()) //requires ConcreteDataType to have a static method returning something appropriate
{
//empty
}
private:
//Clone can be implemented here via CRTP
virtual std::unique_ptr<cOperation> Clone() const override
{
return std::unique_ptr<cOperation>(new DerivedOperationType(*static_cast<const DerivedOperationType*>(this)));
}
//TODO: Some magic here to enable trivial serializations, but leave non-trivial ones abstract
//Problem with the current code: virtual bool Serialize() const override is also provided when bHasTrivialCloneAndSerialize == false
virtual bool Serialize() const override
{
return true;
}
virtual bool doSomeConcreteOperationXOnCastData(const ConcreteDataType& castData) = 0;
};
Here are two implementations of the example operation on the example data type. One of them will be registered as the default operation, to be used if the user does not declare anything else in the config, and the other is a potentially much more involved non-default operation that might take many additional parameters into account (these would then have to be serialized in order to be correctly re-instantiated on the next program run). These operations need to know both the operation and the data type they relate to, but could potentially be implemented at a much later time, or in a different software component where the specific combination of operation and data type are required.
//Implementation of operation X on type A. Needs to know both of these, but can be implemented if and when required.
class cConcreteOperationXOnTypeADefault : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeADefault, true>
{
virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
{
//...do stuff...
return true;
}
};
//Different implementation of operation X on type A.
class cConcreteOperationXOnTypeASpecialSauce : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeASpecialSauce/*, false*/>
{
virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
{
//...do stuff...
return true;
}
//Problem: Compiler does not remind me that cConcreteOperationXOnTypeASpecialSauce might need to implement this method
//virtual bool Serialize() const override {}
};
int main(int argc, char* argv[])
{
std::map<std::string, std::map<std::string, std::unique_ptr<cOperation>>> mapOpIDAndDataTypeToOperation;
//...fill map, e.g. via XML config / factory method...
const cOperation& requestedOperation = *mapOpIDAndDataTypeToOperation.at("concreteOperationX").at("cDataTypeA");
//...do stuff...
return 0;
}
If your data types are not virtual (that is, for each operation call you know both the operation type and the data type at compile time), you may consider the following approach:
#include<iostream>
#include<string>
template<class T>
void empty(T t){
std::cout<<"warning about missing implementation"<<std::endl;
}
template<class T>
void simple_plus(T){
std::cout<<"simple plus"<<std::endl;
}
void plus_string(std::string){
std::cout<<"plus string"<<std::endl;
}
template<class Data, void Implementation(Data)>
class Operation{
public:
static void exec(Data d){
Implementation(d);
}
};
#define macro_def(OperationName) template<class T> class OperationName : public Operation<T, empty<T>>{};
#define macro_template_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName<TypeName>>{};
#define macro_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName>{};
// this part may be generated on base of .xml file and put into .h file, and then just #include generated.h
macro_def(Plus)
macro_template_inst(int, Plus, simple_plus)
macro_template_inst(double, Plus, simple_plus)
macro_inst(std::string, Plus, plus_string)
int main() {
Plus<int>::exec(2);
Plus<double>::exec(2.5);
Plus<float>::exec(2.5);
Plus<std::string>::exec("abc");
return 0;
}
The downside of this approach is that you have to build the project in two steps: 1) transform the .xml into a .h file, 2) compile the project using the generated .h file. On the plus side, the compiler/IDE (I use Qt Creator with MinGW) gives a warning about the unused parameter t in the function
void empty(T t)
and shows the trace of where it was called from.
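Coming back to the question's original goal (provide Serialize() only when the bool template parameter says it is trivial, leave it pure virtual otherwise), one possible sketch is to put the conditional part into a small helper layer that is specialized on the flag. The names cSerializeLayer, cOperationXTemplate, cOperationXDefault and cOperationXSpecialSauce are hypothetical, and the constructors and Clone() from the question are left out to keep it short.

```cpp
#include <cassert>
#include <type_traits>

struct cOperation {
    virtual ~cOperation() {}
    virtual bool Serialize() const = 0;
};

// Primary template: does not override Serialize(), so it stays pure virtual.
template <typename Base, bool bHasTrivialSerialize>
struct cSerializeLayer : Base {};

// Specialization for the trivial case: supplies the boilerplate override.
template <typename Base>
struct cSerializeLayer<Base, true> : Base {
    bool Serialize() const override { return true; }
};

// Workhorse template: picks the layer according to the flag.
template <bool bHasTrivialSerialize = false>
struct cOperationXTemplate : cSerializeLayer<cOperation, bHasTrivialSerialize> {};

// Trivial operation: gets Serialize() for free.
struct cOperationXDefault : cOperationXTemplate<true> {};

// Non-trivial operation: must provide Serialize() itself,
// otherwise the class stays abstract and cannot be instantiated.
struct cOperationXSpecialSauce : cOperationXTemplate<> {
    bool Serialize() const override { return false; } // stands in for real work
};

static_assert(!std::is_abstract<cOperationXDefault>::value,
              "trivial case is complete");
static_assert(std::is_abstract<cOperationXTemplate<false> >::value,
              "non-trivial case is still abstract");
static_assert(!std::is_abstract<cOperationXSpecialSauce>::value,
              "override makes it concrete");
```

With this layout the compiler rejects any attempt to instantiate a non-trivial operation that forgot Serialize(), and no virtual inheritance is needed because the layer sits in a single inheritance chain.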

How to store type information, gathered from a constructor, at the class level to use in casting

I am trying to write a class that I can store and use type information in without the need for a template parameter.
I want to write something like this:
class Example
{
public:
template<typename T>
Example(T* ptr)
: ptr(ptr)
{
// typedef T EnclosedType; I want this to be available at the class level.
}
void operator()()
{
if(ptr == NULL)
return;
(*(EnclosedType*)ptr)(); // so i can cast the pointer and call the () operator if the class has one.
}
private:
void* ptr;
};
I am not asking how to write an is_functor() class.
I want to know how to get type information in a constructor and store it at the class level. If that is impossible, a different solution to this would be appreciated.
I consider this a good and valid question; however, there is no general solution besides using a template parameter at the class level. What you tried to achieve in your question (using a typedef inside a function and then accessing it in the whole class) is not possible.
Type erasure
Alternatives exist only if you impose certain restrictions on your constructor parameters. In this respect, here is an example of type erasure where the operator() of some given object is stored inside a std::function<void()> variable.
#include <functional>
#include <iostream>
struct A
{
template<typename T>
A(T const& t) : f (std::bind(&T::operator(), t)) {}
void operator()() const
{
f();
}
std::function<void()> f;
};
struct B
{
void operator()() const
{
std::cout<<"hello"<<std::endl;
}
};
int main()
{
A(B{}).operator()(); //prints "hello"
}
Note, however, the assumptions underlying this approach: one assumes that all passed objects have an operator() of a given signature (here void operator()() const), which is stored inside a std::function<void()> (with respect to storing the member function, see here).
Inheritance
In a sense, type erasure is thus like "inheriting without a base class" -- one could instead use a common base class for all constructor parameter classes with a virtual bracket operator, and then pass a base class pointer to your constructor.
#include <iostream>
#include <memory>
struct A_parameter_base
{
virtual ~A_parameter_base() {}
virtual void operator()() const = 0;
};
struct B : public A_parameter_base
{
void operator()() const override { std::cout<<"hello"<<std::endl; }
};
struct A
{
A(std::shared_ptr<A_parameter_base> _p) : p(_p) {}
void operator()()
{
(*p)();
}
std::shared_ptr<A_parameter_base> p;
};
That is similar to the code in your question, only that it does not use a void-pointer but a pointer to a specific base class.
Both approaches, type erasure and inheritance, are similar in their applications, but type erasure might be more convenient as one gets rid of a common base class. However, the inheritance approach has the further advantage that you can restore the original object via multiple dispatch.
This also shows the limitations of both approaches. If your operator did not return void but some unknown, varying type, you cannot use the above approach but have to use templates. The inheritance parallel is: you cannot have a virtual function template.
The practical answer is to store either a copy of your class, or a std::ref wrapped pseudo-reference to your class, in a std::function<void()>.
std::function type erases things it stores down to 3 concepts: copy, destroy and invoke with a fixed signature. (also, cast-back-to-original-type and typeid, more obscurely)
What it does is it remembers, at construction, how to do these operations to the passed in type, and stores a copy in a way it can perform those operations on it, then forgets everything else about the type.
You cannot remember everything about a type this way. But almost any operation with a fixed signature, or one that can be routed through a fixed-signature operation, can be type erased.
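As a rough sketch of the std::function route applied to the class from the question: the constructor still takes a T*, but instead of remembering T it stores a std::ref to the object inside a std::function<void()>. The Counter type is a hypothetical functor used only to show the call going through.

```cpp
#include <cassert>
#include <functional>

class Example {
public:
    template <typename T>
    explicit Example(T* ptr) {
        // std::ref keeps a reference to the caller's object instead of copying it.
        if (ptr) call_ = std::ref(*ptr);
    }
    void operator()() const {
        if (call_) call_();   // an empty std::function means a null pointer was passed
    }
private:
    std::function<void()> call_;
};

// Hypothetical functor for the example.
struct Counter {
    int count;
    Counter() : count(0) {}
    void operator()() { ++count; }
};
```

Because only a reference is stored, the object passed in must outlive the Example, just as with the raw void* in the question.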
The first typical way to do this is to create a private pure interface with those operations, then create a template implementation (templated on the type passed to the ctor) that implements each operation for that particular type. The class that does the type erasure then stores a (smart) pointer to the private interface, and forwards its public operations to it.
A second typical way is to store a void*, or a buffer of char, and a set of pointers to functions that implement the operations. The pointers to functions can be either stored locally in the type erasing class, or stored in a helper struct that is created statically for each type erased, and a pointer to the helper struct is stored in the type erasing class. The first way to store the function pointers is like C-style object properties: the second is like a manual vtable.
In any case, the function pointers usually take one (or more) void* and know how to cast them back to the right type. They are created in the ctor that knows the type, either as instances of a template function, or as local stateless lambdas, or the same indirectly.
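The second way (a void* plus a function pointer that knows how to cast it back) can be sketched as follows. ErasedCall, invokeAs and Counter are hypothetical names; only one operation, invoke, is erased here.

```cpp
#include <cassert>

class ErasedCall {
public:
    template <typename T>
    explicit ErasedCall(T* ptr)
        : ptr_(ptr), invoke_(&invokeAs<T>) {}   // the ctor is the only place that knows T

    void operator()() const {
        if (ptr_) invoke_(ptr_);
    }
private:
    // One instance of this function exists per erased type; it restores the type.
    template <typename T>
    static void invokeAs(void* p) { (*static_cast<T*>(p))(); }

    void* ptr_;
    void (*invoke_)(void*);
};

// Hypothetical functor for the example.
struct Counter {
    int count;
    Counter() : count(0) {}
    void operator()() { ++count; }
};
```

This is the "function pointers stored locally" flavour; with several operations one would usually move the pointers into a static per-type table and store a single pointer to that table.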
You could even do a hybrid of the two: static pimpl instance pointers taking a void* or whatever.
Often std::function is enough; writing type erasure by hand is hard to get right by comparison.
Another version to the first two answers we have here - that's closer to your current code:
#include <memory>
class A{
public:
virtual ~A(){}
virtual void operator()()=0;
};
template<class T>
class B: public A{
public:
B(T*t):ptr(t){}
virtual void operator()(){(*ptr)();}
T*ptr;
};
class Example
{
public:
template<typename T>
Example(T* ptr)
: a(new B<T>(ptr))
{
// typedef T EnclosedType; I want this to be available at the class level.
}
void operator()()
{
if(!a)
return;
(*a)();
}
private:
std::unique_ptr<A> a;
};

A standard way to avoid virtual functions

I have a library where there are a lot of small objects, which now all have virtual functions. It goes to such an extent that the size of the pointer to a virtual function table can exceed the size of the useful data in the object (it can often be just a structure with a single float in it). The objects are elements in a numerical simulation on a sparse graph, and as such cannot be easily merged / etc.
I'm not concerned as much about the cost of the virtual function call, rather about the cost of the storage. What is happening is that the pointer to the virtual function table is basically reducing the efficiency of the cache. I'm wondering if I would be better off with a type id stored as an integer, instead of the virtual function.
I cannot use static polymorphism, as all of my objects are in a single list, and I need to be able to perform operations on items, selected by an index (which is a runtime value - therefore there is no way to statically determine the type).
The question is: is there a design pattern or a common algorithm, that can dynamically call a function from an interface, given a list of types (e.g. in a typelist) and a type index?
The interface is defined and does not change much, but new objects will be declared in the future by (possibly less-skilled) users of the library and there should not be a large effort needed in doing so. Performance is paramount. Sadly, no C++11.
So far, I have perhaps a silly proof of concept:
typedef MakeTypelist(ClassA, ClassB, ClassC) TList; // list of types
enum {
num_types = 3 // number of items in TList
};
std::vector<CommonBase*> uniform_list; // pointers to the objects
std::vector<int> type_id_list; // contains type ids in range [0, num_types)
template <class Op, class L>
class Resolver { // helper class to make a list of functions
typedef typename L::Head T;
public: // BuildList must be callable from Resolve() and from the other specializations
// specialized call to op.Op::operator ()<T>(p)
static void Specialize(CommonBase *p, Op op)
{
op(*(T*)p);
}
// add a new item to the list of the functions
static void BuildList(void (**function_list)(CommonBase*, Op))
{
*function_list = &Specialize;
Resolver<Op, typename L::Tail>::BuildList(function_list + 1);
}
};
template <class Op>
class Resolver<Op, TypelistEnd> { // specialization for the end of the list
public:
static void BuildList(void (**function_list)(CommonBase*, Op))
{}
};
/**
* @param[in] i is index of item
* @param[in] op is a STL-style function object with template operator ()
*/
template <class Op>
void Resolve(size_t i, Op op)
{
void (*function_list[num_types])(CommonBase*, Op);
Resolver<Op, TList>::BuildList(function_list);
// fill the list of functions using the typelist
(*function_list[type_id_list[i]])(uniform_list[i], op);
// call the function
}
I have not looked into the assembly yet, but I believe that if made static, the function pointer array creation could be made virtually for free. Another alternative is to use a binary search tree generated on the typelist, which would enable inlining.
I ended up using the "thunk table" concept that I outlined in the question. For each operation, there is a single instance of a thunk table (which is static and is shared through a template - the compiler will therefore automatically make sure that there is only a single table instance per operation type, not per invocation). Thus my objects have no virtual functions whatsoever.
Most importantly - the speed gain from using simple function pointer instead of virtual functions is negligible (but it is not slower, either). What gains a lot of speed is implementing a decision tree and linking all the functions statically - that improved the runtime of some not very compute intensive code by about 40%.
An interesting side effect is being able to have "virtual" template functions, which is not usually possible.
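A stripped-down sketch of such a static thunk table is shown below. It is not the author's actual code: the typelist is replaced by a hand-written table, and ClassA, ClassB, Thunk, ThunkTable, SumOp and sumAll are hypothetical names. It sticks to C++03 constructs because the question rules out C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CommonBase {};                        // deliberately no virtual functions

struct ClassA : CommonBase {
    float x;
    explicit ClassA(float x_) : x(x_) {}
    float Value() const { return x; }
};
struct ClassB : CommonBase {
    float a, b;
    ClassB(float a_, float b_) : a(a_), b(b_) {}
    float Value() const { return a * b; }
};

enum { id_A = 0, id_B = 1, num_types = 2 };  // type ids index into the table

// One thunk per (operation, type) pair: restores the static type, then calls op.
template <class Op, class T>
void Thunk(CommonBase* p, Op& op) { op(*static_cast<T*>(p)); }

// One static table per operation type, shared by all calls.
template <class Op>
struct ThunkTable {
    typedef void (*Fn)(CommonBase*, Op&);
    static const Fn table[num_types];
};
template <class Op>
const typename ThunkTable<Op>::Fn ThunkTable<Op>::table[num_types] = {
    &Thunk<Op, ClassA>, &Thunk<Op, ClassB>   // order must match the type ids
};

template <class Op>
void Resolve(const std::vector<CommonBase*>& objects,
             const std::vector<int>& type_ids, std::size_t i, Op& op)
{
    ThunkTable<Op>::table[type_ids[i]](objects[i], op);
}

// Function object with a template operator(), i.e. a "virtual template function".
struct SumOp {
    float total;
    SumOp() : total(0.0f) {}
    template <class T> void operator()(T& obj) { total += obj.Value(); }
};

float sumAll(const std::vector<CommonBase*>& objects,
             const std::vector<int>& type_ids)
{
    SumOp op;
    for (std::size_t i = 0; i < objects.size(); ++i)
        Resolve(objects, type_ids, i, op);
    return op.total;
}
```

Replacing the table lookup in Resolve with a switch over the type id would give the compiler a chance to inline the calls, which is the decision-tree variant mentioned above.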
One problem that I needed to solve was that all my objects needed to have some interface, as they would end up being accessed by some calls other than the functors. I devised a detached facade for that. A facade is a virtual class, declaring the interface of the objects. A detached facade is an instance of this virtual class, specialized for a given class (for all in the list, operator [] returns the detached facade for the type of the selected item).
class CDetachedFacade_Base {
public:
virtual void DoStuff(BaseType *pthis) = 0;
};
template <class ObjectType>
class CDetachedFacade : public CDetachedFacade_Base {
public:
virtual void DoStuff(BaseType *pthis)
{
static_cast<ObjectType*>(pthis)->DoStuff();
// statically linked, ObjectType is a final type
}
};
class CMakeFacade {
BaseType *pthis;
CDetachedFacade_Base *pfacade;
public:
CMakeFacade(BaseType *p, CDetachedFacade_Base *f)
:pthis(p), pfacade(f)
{}
inline void DoStuff()
{
pfacade->DoStuff(pthis);
}
};
To use this, one needs to do:
static CDetachedFacade<CMyObject> facade;
// this is generated and stored in a templated table
// this needs to be separate to avoid having to call operator new all the time
CMyObject myobj;
myobj.DoStuff(); // statically linked
BaseType *obj = &myobj;
//obj->DoStuff(); // can't do, BaseType does not have virtual functions
CMakeFacade obj_facade(obj, &facade); // choose facade based on type id
obj_facade.DoStuff(); // calls CMyObject::DoStuff()
This allows me to use the optimized thunk table in the high performance portion of the code and still have polymorphically behaving objects to be able to conveniently handle them where performance is not required.
CRTP is a compile time alternative to virtual functions:
template <class Derived>
struct Base
{
void interface()
{
// ...
static_cast<Derived*>(this)->implementation();
// ...
}
static void static_func()
{
// ...
Derived::static_sub_func();
// ...
}
};
struct Derived : Base<Derived>
{
void implementation();
static void static_sub_func();
};
It relies on the fact that the definitions of member functions are not instantiated until they are called. So Base should refer to members of Derived only in the definitions of its member functions, never in prototypes or data members.

Vector of pointers to instances of a templated class

I am implementing a task runtime system that maintains buffers for user-provided objects of various types. In addition, all objects are wrapped before they are stored into the buffers. Since the runtime doesn't know the types of objects that the user will provide, the Wrapper and the Buffer classes are templated:
template <typename T>
class Wrapper {
private:
T mdata;
public:
Wrapper() = default;
Wrapper(T& user_data) : mdata(user_data) {}
T& GetData() { return mdata; }
...
};
template <typename T>
class Buffer {
private:
std::deque<Wrapper<T>> items;
public:
void Write(Wrapper<T> wd) {
items.push_back(wd);
}
Wrapper<T> Read() {
Wrapper<T> tmp = items.front();
items.pop_front();
return tmp;
}
...
};
Now, the runtime system handles the tasks, each of which operates on a subset of aforementioned buffers. Thus, each buffer is operated by one or more tasks. This means that a task must keep references to the buffers since the tasks may share buffers.
This is where my problem is:
1) each task needs to keep references to a number of buffers (this number is unknown at compile time)
2) the buffers are of different types (based on the templated Buffer class).
3) the task needs to use these references to access buffers.
There is no point in having a base class for the Buffer class and then using base class pointers, since the methods Write and Read of the Buffer class are templated and thus cannot be virtual.
So I was thinking to keep references as void pointers, where the Task class would look something like:
class Task {
private:
vector<void *> buffers;
public:
template<typename T>
void AddBuffer(Buffer<T>* bptr) {
buffers.push_back((void *) bptr);
}
template<typename T>
Buffer<T>* GetBufferPtr(int index) {
return some_way_of_cast(buffers[index]);
}
...
};
The problem with this is that I don't know how to get a valid pointer from the void pointer in order to access the Buffer. Namely, I don't know how to retain the type of the object pointed to by buffers[index].
Can you help me with this, or suggest some other solution?
EDIT: The buffers are only the implementation detail of the runtime system and the user is not aware of their existence.
In my experience, when the user types are kept in user code, run-time systems handling buffers do not need to worry about the actual type of these buffers. Users can invoke operations on typed buffers.
class Task {
private:
vector<void *> buffers;
public:
void AddBuffer(char* bptr) {
buffers.push_back((void *) bptr);
}
char *GetBufferPtr(int index) {
return some_way_of_cast(buffers[index]);
}
...
};
class RTTask: public Task {
/* ... */
void do_stuff() {
Buffer<UserType1> b1; b1Id = b1.id();
Buffer<UserType2> b2; b2Id = b2.id();
AddBuffer(cast(&b1));
AddBuffer(cast(&b2));
}
void do_stuff2() {
Buffer<UserType1> *b1 = cast(GetBufferPtr(b1Id));
b1->push(new UserType1());
}
};
In these cases casts are in the user code. But perhaps you have a different problem. Also the Wrapper class may not be necessary if you can switch to pointers.
What you need is something called type erasure. It's way to hide the type(s) in a template.
The basic technique is the following:
- Have an abstract class with the behavior you want in declared in a type independent maner.
- Derive your template class from that class, implement its virtual methods.
Good news: you probably don't need to write your own, since there is boost::any already. Since all you need is to store a pointer and get the object back, that should be enough.
Now, working with void* is a bad idea. As perreal mentioned, though, the code dealing with the buffers should not care about the type. The better choice is to work with char*, which is the type commonly used for buffers (e.g. in socket APIs). It is safer too: the standard has a special rule that allows any object to be accessed through a char* (see the aliasing rules).
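To make the type-erasure suggestion concrete, here is a minimal sketch of the Task class from the question. It assumes C++17 and uses std::any as a stand-in for boost::any; the Buffer template below is only a simplified placeholder, not the asker's real class.

```cpp
#include <any>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the Buffer<T> from the question.
template <typename T>
class Buffer {
public:
    void Write(const T& value) { data_.push_back(value); }
    const T& Read(std::size_t i) const { return data_.at(i); }
private:
    std::vector<T> data_;
};

class Task {
public:
    // std::any remembers that the stored value is a Buffer<T>*.
    template <typename T>
    void AddBuffer(Buffer<T>* bptr) { buffers_.emplace_back(bptr); }

    // Returns the stored pointer if it really is a Buffer<T>*,
    // and nullptr if the caller asked for the wrong T.
    template <typename T>
    Buffer<T>* GetBufferPtr(std::size_t index) {
        Buffer<T>** p = std::any_cast<Buffer<T>*>(&buffers_.at(index));
        return p ? *p : nullptr;
    }
private:
    std::vector<std::any> buffers_;
};
```

The caller still has to name T when retrieving a buffer, but unlike a cast from void*, asking for the wrong type is detected at run time instead of silently producing a bad pointer.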
This isn't exactly an answer to your question, but I just wanted to point out that the way you wrote
Wrapper<T> Read() {
makes it a mutator member function which returns by value, and as such is not good practice, as it forces the user to write exception-unsafe code.
For the same reason the STL stack::pop() member function returns void, not the object that was popped off the stack.
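A rough sketch of what that split could look like for a buffer, in the style of std::stack's top()/pop(). The class and the member names Front, Pop and Size are made up for illustration and are not taken from the question's code.

```cpp
#include <cstddef>
#include <deque>
#include <stdexcept>

// Hypothetical queue-like buffer that separates observing from removing.
template <typename T>
class Buffer {
public:
    void Write(const T& value) { items_.push_back(value); }

    // Observer: does not modify the buffer, so if copying the returned
    // element throws in the caller, the element is still in the buffer.
    const T& Front() const {
        if (items_.empty()) throw std::out_of_range("Buffer is empty");
        return items_.front();
    }

    // Mutator: removes the front element and returns nothing.
    void Pop() {
        if (items_.empty()) throw std::out_of_range("Buffer is empty");
        items_.pop_front();
    }

    std::size_t Size() const { return items_.size(); }
private:
    std::deque<T> items_;
};
```

The caller first copies from Front() and only then calls Pop(), so a throwing copy never loses data.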

Should I prefer mixins or function templates to add behavior to a set of unrelated types?

Mixins and function templates are two different ways of providing a behavior to a wide set of types, as long as these types meet some requirements.
For example, let's assume that I want to write some code that allows me to save an object to a file, as long as this object provides a toString member function (this is a rather silly example, but bear with me). A first solution is to write a function template like the following:
template <typename T>
void toFile(T const & obj, std::string const & filename)
{
std::ofstream file(filename);
file << obj.toString() << '\n';
}
...
SomeClass o1;
toFile(o1, "foo.txt");
SomeOtherType o2;
toFile(o2, "bar.txt");
Another solution is to use a mixin, using CRTP:
template <typename Derived>
struct ToFile
{
void toFile(std::string const & filename) const
{
Derived const * that = static_cast<Derived const *>(this);
std::ofstream file(filename);
file << that->toString() << '\n';
}
};
struct SomeClass : public ToFile<SomeClass>
{
std::string toString() const {...}
};
...
SomeClass o1;
o1.toFile("foo.txt");
SomeOtherType o2;
o2.toFile("bar.txt");
What are the pros and cons of these two approaches? Is there a favored one, and if so, why?
The first approach is much more flexible, as it can be made to work with any type that provides any way to be converted to a std::string (this can be achieved using traits-classes) without the need to modify that type. Your second approach would always require modification of a type in order to add functionality.
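As a sketch of the traits-class idea, assuming a hypothetical trait named StringConverter (the name is mine, not from the question): the primary template falls back to the toString member, and a specialization handles a type that cannot be modified.

```cpp
#include <fstream>
#include <string>

// Primary trait (hypothetical name): assumes T has a toString() member.
template <typename T>
struct StringConverter {
    static std::string convert(T const& obj) { return obj.toString(); }
};

// Specialization for a type we cannot add members to (here: int).
template <>
struct StringConverter<int> {
    static std::string convert(int const& value) { return std::to_string(value); }
};

// Same toFile as in the question, but routed through the trait.
template <typename T>
void toFile(T const& obj, std::string const& filename) {
    std::ofstream file(filename);
    file << StringConverter<T>::convert(obj) << '\n';
}

// Example user type that provides toString() itself.
struct SomeClass {
    std::string toString() const { return "SomeClass"; }
};
```

With this in place, toFile(42, "foo.txt") and toFile(SomeClass{}, "bar.txt") both compile, and neither type had to be changed to support it.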
Pro function templates: the coupling is looser. You don't need to derive from anything to get the functionality in a new class; in your example, you only implement the toString method and that's it. You can even use a limited form of duck typing, since the type of toString isn't specified.
Pro mixins: nothing, strictly; your requirement is for something that works with unrelated classes, and mixins cause them to become related.
Edit: Alright, due to the way the C++ type system works, the mixin solution will strictly produce unrelated classes. I'd go with the template function solution, though.
I would like to propose an alternative, often forgotten because it is a mix of duck-typing and interfaces, and very few languages offer this feature (note: it is actually very close to Go's take on interfaces).
// 1. Ask for a free function to exist:
void toString(std::string& buffer, SomeClass const& sc);
// 2. Create an interface that exposes this function
class ToString {
public:
virtual ~ToString() {}
virtual void toString(std::string& buffer) const = 0;
}; // class ToString
// 3. Create an adapter class (bit of magic)
template <typename T>
class ToStringT final: public ToString {
public:
ToStringT(T const& t): t(t) {}
virtual void toString(std::string& buffer) const override {
using ::toString; // otherwise the member toString hides the free function
toString(buffer, t);
}
private:
T t; // note: for reference you need a reference wrapper
// I won't delve into this right now, suffice to say
// it's feasible and only require one template overload
// of toString.
}; // class ToStringT
// 4. Create an adapter maker
template <typename T>
ToStringT<T> toString(T const& t) { return ToStringT<T>(t); }
And now? Enjoy!
void print(ToString const& ts); // aka: the most important const
int main() {
SomeClass sc;
print(toString(sc));
}
The two-stage approach is a bit heavyweight, however it gives an astonishing degree of functionality:
No hard-wiring data / interface (thanks to duck-typing)
Low-coupling (thanks to abstract classes)
And also easy integration:
You can write an "adapter" for an already existing interface, and migrate from an OO code base to a more agile one
You can write an "interface" for an already existing set of overloads, and migrate from a Generic code base to a more clustered one
Apart from the amount of boiler-plate, it's really amazing how you seamlessly pick advantages from both worlds.
A few thoughts I had while writing this question:
Arguments in favor of template functions:
A function can be overloaded, so third-party and built-in types can be handled.
Arguments in favor of mixins:
Homogeneous syntax: the added behavior is invoked like any other member function. However, it is well known that the interface of a C++ class includes not only its public member functions but also the free functions that operate on instances of this type, so this is just an aesthetic improvement.
By adding a non-template base class to the mixins, we obtain an interface (in the Java/C# sense) that can be used to handle all objects providing the behavior. For example, if we make ToFile<T> inherit from FileWritable (declaring a pure virtual toFile member function), we can have a collection of FileWritable without having to resort to complicated heterogeneous data structures.
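A minimal sketch of that last point, assuming C++11 or later. FileWritable and ToFile follow the names used above; the saveAll helper and its file-naming scheme are invented for the example.

```cpp
#include <cstddef>
#include <fstream>
#include <memory>
#include <string>
#include <vector>

// Non-template interface shared by every class that uses the mixin.
struct FileWritable {
    virtual ~FileWritable() = default;
    virtual void toFile(std::string const& filename) const = 0;
};

// CRTP mixin: implements the interface in terms of Derived::toString().
template <typename Derived>
struct ToFile : FileWritable {
    void toFile(std::string const& filename) const override {
        Derived const* that = static_cast<Derived const*>(this);
        std::ofstream file(filename);
        file << that->toString() << '\n';
    }
};

struct SomeClass : ToFile<SomeClass> {
    std::string toString() const { return "SomeClass"; }
};

struct SomeOtherType : ToFile<SomeOtherType> {
    std::string toString() const { return "SomeOtherType"; }
};

// Hypothetical helper: writes each object to "<prefix><index>.txt".
inline void saveAll(std::vector<std::unique_ptr<FileWritable>> const& objects,
                    std::string const& prefix) {
    for (std::size_t i = 0; i < objects.size(); ++i)
        objects[i]->toFile(prefix + std::to_string(i) + ".txt");
}
```

The price is one virtual call per toFile, which the plain function template does not pay.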
Regarding usage, I'd say that function templates are more idiomatic in C++.