In my work I have a lot of loops with many inner function calls; performance is critical here, and the overhead of virtual function calls is unacceptable, so I try to avoid dynamic polymorphism by using CRTP, like so:
template<class DType>
struct BType {
DType& impl(){ return *static_cast<DType*>(this); }
void Func(){ impl().Func(); }
};
struct MyType : public BType<MyType> {
void Func(){ /* do work */ }
};
template<class DType>
void WorkLoop(BType<DType>* func){
for (int i=0;i<ni;++i){ func->Func(); } // ni: iteration count, defined elsewhere
}
struct Worker {
void DoWork(){ WorkLoop(&thing); }
private:
MyType thing;
};
Worker worker;
worker.DoWork();
Aside: is this the correct way to actually use a CRTP class? Now I need the actual type to depend on a runtime user option. Normally dynamic polymorphism with an abstract base class / strategy pattern would be the right design, but I can't afford the virtual function calls. One way to do this seems to be with some branching:
struct Worker {
void DoWork(){
if (option=="optionA"){
TypeA thing;
WorkLoop(&thing); }
else if (option=="optionB"){
TypeB thing;
WorkLoop(&thing); }
...
But this seems like a lousy design. Passing it as a template parameter here (or using policy based design) seems like an option:
template<class T>
struct Worker {
void DoWork(){ WorkLoop(&thing); }
T thing;
};
if (option=="optionA"){
Worker<TypeA> worker; worker.DoWork(); } ...
but here worker only has scope in the if branch, and I'd need it to have a life the length of the program. Additionally, the relevant user options would probably specify 4+ "policies", each of those with several options (say 4), so it seems like you'd quickly have a nasty problem where a templated class could take 1 of 4*4*4*4 template combinations.
Also, moving the loop logic into the types is not an option - if it were the virtual function call overhead would be negligible and I'd use normal polymorphism. The actual control of the loops could be rather complicated and will vary at runtime.
Would this suggest that I should try and build a custom iterator and pass that as a function argument and use normal polymorphism, or would this incur similar overhead?
What is a good design for selecting classes at run-time without resorting to pointers to abstract base classes?
You have a classic problem of runtime-to-compile-time dispatch: "Additionally, the relevant user options would probably specify extra policies, each of those with several options". Your code has to support many combinations of options which you do not know at compile time.
It means you have to write some code for every possible combination and then dispatch the user's choice onto one of those combinations. That implies some ugly and not-so-efficient piece of code where you parse the user's runtime decisions and dispatch them onto predefined templates.
To keep efficiency as high as possible you want to do this dispatch at a very high level, as close to the entry points as possible. On the other hand, your low-level code can be templatized as much as you like.
This means the dispatch can take several steps down, from non-template code, to a mix of templates and options, to fully templatized code.
Usually it is achieved better with tags and policies, not CRTP, but it depends closely on your algorithms and options.
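To make the "dispatch once, near the entry point" idea concrete, here is a minimal sketch. The policy types `AddOne` and `TimesTwo`, the functions `RunLoop` and `Dispatch`, and the option strings are all hypothetical names chosen for illustration; they are not from the question's code base.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical policies; each exposes a non-virtual, inlinable step().
struct AddOne   { int step(int x) const { return x + 1; } };
struct TimesTwo { int step(int x) const { return x * 2; } };

// Fully templated hot loop: Policy is a compile-time type, so the
// compiler can see (and inline) step() inside the loop.
template <class Policy>
int RunLoop(Policy policy, int start, int iterations) {
    int value = start;
    for (int i = 0; i < iterations; ++i) value = policy.step(value);
    return value;
}

// The only place that inspects the runtime option: one branch per run,
// not one branch per iteration.
int Dispatch(const std::string& option, int start, int iterations) {
    assert(iterations >= 0);
    if (option == "optionA") return RunLoop(AddOne{}, start, iterations);
    if (option == "optionB") return RunLoop(TimesTwo{}, start, iterations);
    throw std::invalid_argument("unknown option: " + option);
}
```

A caller would write something like `Dispatch(userOption, 0, n)` once; everything below that call is resolved at compile time. With several independent policies, each extra policy adds another layer of this branching, so the number of instantiations still multiplies.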
This question might fall into "wanting the best of all worlds" but it is a real design problem that needs at least a better solution.
Structure needed:
In order of importance, here are the requirements that have me stuck:
We need templates, whether on the class or function level. We are highly dependent on template objects in arguments of functions at this point. So if anything leaves the model below, it's virtual functions (to my knowledge).
We want to decouple the call from selection. By that we mean the user declares a Math Object and the background figures out which implementation to use, preferably at runtime.
We want there to be a default, like shown in the above diagram.
In my company's program, we have a crucial algorithm generator that is dependent on both compile-time and runtime polymorphism, namely template classes and virtual inheritance. We have it working, but it is fragile, hard to read and develop and has certain features that won't work on higher optimization levels (meaning we are relying on undefined behavior somewhere). A brief outline of the code is as follows.
// Math.hpp
#include <dataTypes.hpp>
// Base class. Actually handles CPU Version of execution
template <typename T>
class Math {
// ...
// Example function. Parameters vary in type and number
// Variable names commented out to avoid compile warnings
virtual void exFunc ( DataType<T> /*d*/, float /*f*/ )
{
ERROR_NEED_CODE; // Macro defined to throw error with message
}
// 50+ other functions...
};
//============================================================
// exampleFuncs.cpp
#include<Math.hpp>
template <> void Math<float>::exFunc ( DataType<float> d, float f)
{
// Code Here.
}
Already, we can see some problems, and we haven't gotten to the main issue. Due to the sheer number of functions in this class, we don't want to define all in the header file. Template functionality is lost as a result. Second, with the virtual functions with the template class, we need to define each function in the class anyways, but we just shoot an error and return garbage (if return needed).
//============================================================
// GpuMath.hpp
#include <Math.hpp>
// Derived class. Using CUDA to resolve same math issues
class GpuMath_F : public Math<float> { ... };
The functionality here is relatively simple, but I noticed that again, we give up template features. I'm not sure it needs to be that way, but the previous developers felt constrained to declare a new class for each needed type (3 currently; multiply that by 50 or so functions, and we have a severe level of overhead).
Finally, when the functionality is needed, we use a Factory to create an object of the right template type and store it in a Math pointer.
// Some other class, normally template
template <typename T>
class OtherObject {
Math<T>* math_;
OtherObject() {
math_ = Factory::get().template createMath<T> ();
// ...
}
// ...
};
The factory is omitted here. It gets messy and doesn't help us much. The point is that we store all versions of Math Objects in the base class.
Can you point me in the right direction for other techniques that are alternatives to inheritance? Am I looking for a variation of Policy Design? Is there a template trick?
Thanks for reading and thanks in advance for your input.
As has been discussed many times before, templates and virtual features don't jibe well together. It is best to choose one or the other.
Approach 1 : Helper Class
The first and best option we have so far does just that, opting out of the virtual features for a wrapper class.
class MathHelper {
Math cpuMath;
GpuMath gpuMath;
bool cuda_; //True if gpuMath is wanted
template <typename T>
void exFunc ( DataType<T> d, float f )
{
if (cuda_)
gpuMath.exFunc( d, f );
else
cpuMath.exFunc( d, f );
}
// 50+ functions...
};
First, you might have noticed that the functions are templated rather than the class. It structurally is more convenient.
Pros
Gains full access to templates in both CPU and GPU classes.
Improved customization for each and every function. Choice of what is default.
Non-invasive changes to previous structure. For example, if this MathHelper was just called Math and we had CpuMath and GpuMath as the implementation, the instantiation and use can almost be the same as above, and stay exactly the same if we let Factory handle the MathHelper.
Cons
Explicit if/else and declaration of every function.
Mandatory definition of every function in MathHelper AND at least one of the other Math objects.
As a result, repeated code everywhere.
Approach 2: Macro
This one attempts to reduce the repeated code above. Somewhere, we have a Math function.
class Math {
CpuMath cpuMath;
GpuMath gpuMath;
// Some sort of constructor
static Math& math() { /*static getter*/ }
};
This math helper uses a static getter function similar to Exam 1 shown here. We have base class CpuMath that contains no virtual functions and derived class GpuMath. Again, templating is on function level.
Then from there, any time we want a math function we use this macro:
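// Note: no space between MATH and '('; __VA_OPT__ needs C++20 or a compiler extension.
#define MATH(func, ...) \
do { \
if (Math::math().cuda_) \
__VA_ARGS__ __VA_OPT__(=) Math::math().gpuMath.func; \
else \
__VA_ARGS__ __VA_OPT__(=) Math::math().cpuMath.func; \
} while (0)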
Pros
Remove repeat code of previous wrapper.
Again, full power of templates unlocked
Cons
Not as customizable as above wrapper
Initially much more invasive. Every time a Math function is accessed, the call has to change from val = math_.myFunc(...) to MATH (myFunc(...), val). Because editors don't do good error checking on macros, this has the potential to cause many errors in the editing process.
The base class must have every function the derived class has, since it is the default.
Again, any other creative ways to implement this design would be appreciated. I found this to be a fun exercise either way, and would love to continue learning from it.
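One possible way to keep the single if/else of the macro without the macro itself is to pass the call as a generic lambda (C++14). This is only a sketch: `CpuMath`, `GpuMath`, `scale`, `name` and `dispatch` below are simplified, hypothetical stand-ins, and the GPU version does no real CUDA work.

```cpp
#include <cassert>
#include <string>

// Hypothetical back ends; neither has virtual functions, and both keep
// their member templates.
struct CpuMath {
    const char* name() const { return "cpu"; }
    template <typename T>
    T scale(T value, float factor) const { return static_cast<T>(value * factor); }
};
struct GpuMath {
    const char* name() const { return "gpu"; }
    // A real implementation would launch a CUDA kernel here.
    template <typename T>
    T scale(T value, float factor) const { return static_cast<T>(value * factor); }
};

class Math {
public:
    explicit Math(bool useCuda) : cuda_(useCuda) {}

    // The only if/else in the class. 'call' is a generic lambda that is
    // instantiated once per back-end type; both branches must return
    // the same type for 'auto' to be deduced.
    template <typename Call>
    auto dispatch(Call&& call) const {
        if (cuda_) return call(gpuMath_);
        return call(cpuMath_);
    }

private:
    CpuMath cpuMath_;
    GpuMath gpuMath_;
    bool cuda_;
};
```

A call site then reads `float val = math.dispatch([&](const auto& m) { return m.scale(x, 1.5f); });` instead of `MATH(scale(x, 1.5f), val)`. The trade-off is some lambda noise at every call, but the compiler and editor can check it like ordinary code.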
I'm in a situation where I have a class, let's call it Generic. This class has members and attributes, and I plan to use it in a std::vector<Generic> or similar, processing several instances of this class.
Also, I want to specialize this class. The only difference between the generic and specialized objects would be a private method, which does not access any member of the class (but is called by other methods). My first idea was to simply declare it virtual and override it in specialized classes like this:
class Generic
{
// all other members and attributes
private:
virtual float specialFunc(float x) const =0;
};
class Specialized_one : public Generic
{
private:
virtual float specialFunc(float x) const{ return x;}
};
class Specialized_two : public Generic
{
private:
virtual float specialFunc(float x) const{ return 2*x; }
};
And thus I guess I would have to use a std::vector<Generic*>, and create and destroy the objects dynamically.
A friend suggested using a std::function<> attribute for my Generic class and passing the specialFunc as an argument to the constructor, but I am not sure how to do it properly.
What would be the advantages and drawbacks of these two approaches, and are there other (better ?) ways to do the same thing ? I'm quite curious about it.
For the details, the specialization of each object I instantiate would be determined at runtime, depending on user input. And I might end up with a lot of these objects (not yet sure how many), so I would like to avoid any unnecessary overhead.
virtual functions and overriding model an is-a relationship while std::function models a has-a relationship.
Which one to use depends on your specific use case.
Using std::function is perhaps more flexible as you can easily modify the functionality without introducing new types.
Performance should not be the main decision point here unless this code is provably (i.e. you measured it) the tight loop bottleneck in your program.
First of all, let's throw performance out the window.
If you use virtual functions, as you stated, you may end up with a lot of classes with the same interface:
class generic {
virtual void f(float x);
};
class spec1 : public generic {
virtual void f(float x);
};
class spec2 : public generic {
virtual void f(float x);
};
Using std::function<void(float)> as a member would allow you to avoid all the specializations:
class meaningful_class_name {
std::function<void(float)> f;
public:
meaningful_class_name(std::function<void(float)> const& p_f) : f(p_f) {}
};
In fact, if this is the ONLY thing you're using the class for, you might as well just remove it, and use a std::function<void(float)> at the level of the caller.
Advantages of std::function:
1) Less code (1 class for N functions, whereas the virtual method requires N classes for N functions. I'm making the assumption that this function is the only thing that's going to differ between classes).
2) Much more flexibility (You can pass in capturing lambdas that hold state if you want to).
3) If you write the class as a template, you could use it for all kinds of function signatures if needed.
Using std::function solves whatever problem you're attempting to tackle with virtual functions, and it seems to do it better. However, I'm not going to assert that std::function will always be better than a bunch of virtual functions in several classes. Sometimes, these functions have to be private and virtual because their implementation has nothing to do with any outside callers, so flexibility is NOT an advantage.
Disadvantages of std::function:
1) I was about to write that you can't access the private members of the generic class, but then I realized that you can modify the std::function in the class itself with a capturing lambda that holds this. Given the way you outlined the class however, this shouldn't be a problem since it seems to be oblivious to any sort of internal state.
What would be the advantages and drawbacks of these two approaches, and are there other (better ?) ways to do the same thing ?
The issue I can see is "how do you want your class defined?" (as in, what is the public interface?)
Consider creating an API like this:
class Generic
{
// all other members and attributes
explicit Generic(std::function<float(float)> specialFunc);
};
Now, you can create any instance of Generic, without care. If you have no idea what you will place in specialFunc, this is the best alternative ("you have no idea" means that clients of your code may decide in one month to place a function from another library there, an identity function ("receive x, return x"), accessing some database for the value, passing a stateful functor into your function, or whatever else).
Also, if the specialFunc can change for an existing instance (i.e. create instance with specialFunc, use it, change specialFunc, use it again, etc) you should use this variant.
This variant may be imposed on your code base by other constraints (for example, if you want to avoid making Generic virtual, or if you need it to be final for other reasons).
If (on the other hand) your specialFunc can only be a choice from a limited number of implementations, and client code cannot decide later that it wants something else (i.e. you only have the identity function and doubling the value, like in your example), then you should rely on specializations, like in the code in your question.
TLDR: Decide based on the usage scenarios of your class.
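As a rough sketch of the std::function variant with the behaviour chosen at run time, assuming C++11 or later: the option strings, `MakeGeneric`, `MakeAll` and `compute` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

class Generic {
public:
    explicit Generic(std::function<float(float)> specialFunc)
        : specialFunc_(std::move(specialFunc)) {
        assert(specialFunc_ && "specialFunc must be callable");
    }
    // Stand-in for the public methods that use the special function internally.
    float compute(float x) const { return specialFunc_(x) + 1.0f; }

private:
    std::function<float(float)> specialFunc_;
};

// Hypothetical mapping from a user-supplied option to a behaviour.
Generic MakeGeneric(const std::string& option) {
    if (option == "identity") return Generic([](float x) { return x; });
    if (option == "double")   return Generic([](float x) { return 2 * x; });
    throw std::invalid_argument("unknown option: " + option);
}

// Objects with different behaviour are stored side by side, by value,
// so no std::vector<Generic*> or manual new/delete is needed.
std::vector<Generic> MakeAll(const std::vector<std::string>& options) {
    std::vector<Generic> objects;
    objects.reserve(options.size());
    for (const auto& option : options) objects.push_back(MakeGeneric(option));
    return objects;
}
```

Each call through the std::function member is still an indirect call, so this changes how the objects are stored and created rather than removing the dispatch cost.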
Edit: regarding better (or at least alternative) ways to do this ... You could inject the specialFunc in your class on an "as needed" basis:
That is, instead of this:
class Generic
{
public:
Generic(std::function<float(float)> f) : specialFunc{f} {}
float fancy_computation2() { return 2 * specialFunc(2.f); }
float fancy_computation4() { return 4 * specialFunc(4.f); }
private:
std::function<float(float)> specialFunc;
};
You could write this:
class Generic
{
public:
Generic() {}
float fancy_computation2(std::function<float(float)> f) { return 2 * f(2.f); }
float fancy_computation4(std::function<float(float)> f) { return 4 * f(4.f); }
private:
};
This offers you more flexibility (you can use different special functions with single instance), at the cost of more complicated client code. This may also be a level of flexibility that you do not want (too much).
My question is related to the topic here.
Suppose I have the following simplified structure:
struct Base
{/* ... abstract implementation ...*/};
template<int i> //simplified. In my real code, some other classes follow.
struct Derived : public Base
{/* ... implementation ...*/};
Now, for instance in order to obtain random creation at runtime, I can set up an easy factory which takes my integer and returns the corresponding base pointer:
std::unique_ptr<Base> createDerived(int i) //again, in the real code, some more enums follow to determine the other classes
{
if(i==1) {return std::unique_ptr<Derived<1> >(new Derived<1>());}
else if(i==2) {return std::unique_ptr<Derived<2> >(new Derived<2>());}
// ...
else if(i==10000 /*say*/) {return std::unique_ptr<Derived<10000> >(new Derived<10000>());}
return nullptr; // no matching index; avoids falling off the end of a non-void function
}
However, in the linked thread, the answerers advise against doing this.
So, my question is why? Is this already what people call bad design? The only disadvantages I see here are that
the source code may blow up (...however, my classes are "small", i.e. contain only a few and small data members)
I have to maintain the factory (unless I use some of the clever way for automatic registering which are around).
On the other hand, one can draw all the advantages like flexibility and efficiency of the generic design of the derived class, and also use this whole inheritance-Base-class-pointers-thing if required.
To me it somehow seems like getting the best of both worlds ... what are you thinking?
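For what it's worth, the hand-written if/else chain can in principle be generated by the compiler. The sketch below (C++14) builds a table of creator functions with std::index_sequence; `Make`, `MakeTable`, `Maker`, `kMaxIndex` and the `id()` member are hypothetical additions for illustration, and the bound of 16 is an arbitrary assumption. Every index up to the bound still instantiates its own Derived<i>, so the code-size concern does not go away.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

struct Base {
    virtual ~Base() = default;
    virtual int id() const = 0;
};

template <int I>
struct Derived : Base {
    int id() const override { return I; }
};

// One creator function per template argument.
template <int I>
std::unique_ptr<Base> Make() { return std::make_unique<Derived<I>>(); }

using Maker = std::unique_ptr<Base> (*)();

// Expands to { &Make<1>, &Make<2>, ..., &Make<N> } at compile time.
template <std::size_t... Is>
constexpr std::array<Maker, sizeof...(Is)> MakeTable(std::index_sequence<Is...>) {
    return {{ &Make<static_cast<int>(Is) + 1>... }};
}

constexpr int kMaxIndex = 16;  // assumed upper bound for this sketch

std::unique_ptr<Base> createDerived(int i) {
    static constexpr auto table = MakeTable(std::make_index_sequence<kMaxIndex>{});
    if (i < 1 || i > kMaxIndex) return nullptr;  // index outside the table
    auto made = table[static_cast<std::size_t>(i - 1)]();
    assert(made->id() == i);  // sanity check: slot i-1 creates Derived<i>
    return made;
}
```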
I currently have a C++ interface
void foo() ...
and lots of implementations
void A::foo();
void B::foo();
Currently foo is defined as follows
struct Wrapper
{
void foo()
{
if(state == STATE_A) a.foo();
else if(state == STATE_B) b.foo();
}
union {
A a;
B b;
};
State state;
};
//....
Wrapper a(STATE_A, existingA);
Wrapper b(STATE_B, existingB);
a.foo(); b.foo();
Is there a cleaner way to do this? I have multiple foo() like functions and multiple A/B like classes. It's getting tedious/error prone to write all the cases.
Note that I cannot use virtual functions (this runs inside a N^5 loop... with 10 million+ executions / second). I want the compiler to inline this, hard.
I have thought of collecting A's, B's, etc together and computing them in a data oriented fashion, but unfortunately I can't do that (due to algorithm concerns)
I want the compiler to inline this, hard.
That's not going to happen.
You're using runtime polymorphism. By definition, the compiler cannot know which function will be called at call time. You are going to pay for virtual dispatch, whether you do it manually or let the compiler do it.
The absolute most inlining you will get is in the calls to the member functions. It still has to do a conditional branch based on a memory access (fetching the "type") to get to the "inline" part. And every new "state" you add will add another condition to that branch. At best, this will become a state table... which is no different from just a virtual function pointer: it fetches from a memory address, and uses that to branch to a particular piece of code.
Just like a vtable pointer, only you wasted your time implementing something the compiler could do for you.
I strongly advise you to profile this instead of simply assuming that your hand-written method can beat the compiler.
If you've decided to abandon language-level polymorphism, then you should use a boost.variant and appropriate visitors instead. Your code would look like this:
typedef boost::variant<A, B> Wrapper;
struct FooVisitor : public boost::static_visitor<>
{
template <typename T> void operator()(T &t) const { t.foo(); }
};
You will have to make a FooVisitor for every function you want to call. To call it, you do this:
Wrapper a = existingA;
boost::apply_visitor(FooVisitor(), a);
Obviously, you can wrap that in a simple function:
void CallFoo(Wrapper &a) {boost::apply_visitor(FooVisitor(), a);}
Indeed, you can make a whole template family of these:
template<typename Visitor>
void Call(Wrapper &a) {boost::apply_visitor(Visitor(), a);}
Note that parameter passing is not allowed (you have to store the parameters in the visitor itself), but they can have return values (you have to put the return type in the boost::static_visitor<Typename_Here> declaration of your visitor).
Also note that boost::variant objects have value semantics, so copies will copy the internal object. You can also use the boost::get() syntax to get the actual type, but I would not suggest it unless you really need it. Just use visitors.
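If C++17 is available, the same idea can be written with std::variant and std::visit from the standard library instead of Boost. This is a swapped-in equivalent rather than the Boost code above; the bodies of `A` and `B` are hypothetical stand-ins that return a string so the result is observable.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical stand-ins for the A and B classes from the question.
struct A { std::string foo() const { return "A::foo"; } };
struct B { std::string foo() const { return "B::foo"; } };

using Wrapper = std::variant<A, B>;

// One generic lambda replaces the hand-written visitor struct;
// std::visit calls it with whichever alternative is currently held.
inline std::string CallFoo(const Wrapper& w) {
    return std::visit([](const auto& held) { return held.foo(); }, w);
}
```

Extra arguments can simply be captured by the lambda. As with the Boost version, the visit is still a runtime branch on the stored type; only the body of each alternative's `foo()` can be inlined.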
You have two choices. You can do the function selection at compile time, or you can do it at run time. If it's run time you're not going to do better than the existing virtual mechanism. If it's compile time you need different code for each type you're going to use, but you can use templates to automate the process.
template<typename T>
struct Wrapper
{
void foo()
{
t.foo();
}
T t;
};
Of course this example is highly abstracted and I can't see any difference between using the Wrapper class and the template type directly. You'll have to flesh out your example a little more to get a better answer.
When implementing polymorphic behavior in C++ one can either use a pure virtual method or one can use function pointers (or functors). For example an asynchronous callback can be implemented by:
Approach 1
class Callback
{
public:
Callback();
~Callback();
void go();
protected:
virtual void doGo() = 0;
};
//Constructor and Destructor
void Callback::go()
{
doGo();
}
So to use the callback here, you would need to override the doGo() method to call whatever function you want
Approach 2
typedef void CallbackFunction(void*); // function type; CallbackFunction* is a pointer to it
class Callback
{
public:
Callback(CallbackFunction* func, void* param);
~Callback();
void go();
private:
CallbackFunction* iFunc;
void* iParam;
};
Callback::Callback(CallbackFunction* func, void* param) :
iFunc(func),
iParam(param)
{}
//Destructor
void Callback::go()
{
(*iFunc)(iParam);
}
To use the callback method here you will need to create a function pointer to be called by the Callback object.
Approach 3
[This was added to the question by me (Andreas); it wasn't written by the original poster]
#include <iostream>
template <typename T>
class Callback
{
public:
Callback() {}
~Callback() {}
void go() {
T t; t();
}
};
class CallbackTest
{
public:
void operator()() { std::cout << "Test"; }
};
int main()
{
Callback<CallbackTest> test;
test.go();
}
What are the advantages and disadvantages of each implementation?
Approach 1 (Virtual Function)
"+" The "correct" way to do it in C++
"-" A new class must be created per callback
"-" Performance-wise an additional dereference through VF-Table compared to Function Pointer. Two indirect references compared to Functor solution.
Approach 2 (Class with Function Pointer)
"+" Can wrap a C-style function for C++ Callback Class
"+" Callback function can be changed after callback object is created
"-" Requires an indirect call. May be slower than functor method for callbacks that can be statically computed at compile-time.
Approach 3 (Class calling T functor)
"+" Possibly the fastest way to do it. No indirect call overhead and may be inlined completely.
"-" Requires an additional Functor class to be defined.
"-" Requires that callback is statically declared at compile-time.
FWIW, Function Pointers are not the same as Functors. Functors (in C++) are classes that are used to provide a function call which is typically operator().
Here is an example functor as well as a template function which utilizes a functor argument:
#include <cstdio>
class TFunctor
{
public:
void operator()(const char *charstring)
{
printf("%s", charstring); // never pass caller data as the format string
}
};
template<class T> void CallFunctor(T& functor_arg,const char *charstring)
{
functor_arg(charstring);
}
int main()
{
TFunctor foo;
CallFunctor(foo,"hello world\n");
}
From a performance perspective, virtual functions and function pointers both result in an indirect function call (i.e. through a register), although virtual functions require an additional load of the VFTABLE pointer prior to loading the function pointer. Passing a functor (with a non-virtual call operator) as an argument to a template function is the highest-performing way to supply a callback, because the call can be inlined and, even if it is not inlined, it does not generate an indirect call.
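The difference can be seen side by side in a small sketch; `SumWithPointer`, `SumWithFunctor`, `Square` and `SquareFn` are hypothetical names. Both functions compute the same sum, but only the template version makes the callee part of the function's type.

```cpp
#include <cassert>

// Indirect call: 'fn' is a runtime value, so in general the call goes
// through the pointer (an optimizer may still resolve it if it can
// prove which function is passed).
inline int SumWithPointer(int (*fn)(int), int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += fn(i);
    return total;
}

// Direct call: F is a compile-time type, so the compiler sees the body
// of operator() and is free to inline it into the loop.
template <class F>
int SumWithFunctor(F f, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += f(i);
    return total;
}

struct Square {
    int operator()(int x) const { return x * x; }
};

inline int SquareFn(int x) { return x * x; }
```

`SumWithFunctor` also accepts lambdas, which is essentially how the standard algorithms take their callbacks.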
Approach 1
Easier to read and understand
Less possibility of errors (iFunc cannot be NULL, you're not using a void *iParam, etc.)
C++ programmers will tell you that this is the "right" way to do it in C++
Approach 2
Slightly less typing to do
VERY slightly faster (calling a virtual method has some overhead, usually about the same as two simple arithmetic operations, so it most likely won't matter)
That's how you would do it in C
Approach 3
Probably the best way to do it when possible. It will have the best performance, it will be type safe, and it's easy to understand (it's the method used by the STL).
The primary problem with Approach 2 is that it simply doesn't scale. Consider the equivalent for 100 functions:
class MahClass {
// 100 pointers of various types
public:
MahClass() { /* set all 100 pointers */ }
MahClass(const MahClass& other) {
// copy all 100 function pointers
}
};
The size of MahClass has ballooned, and the time to construct it has also increased significantly. Virtual functions, by contrast, add only O(1) to the size of the class and the time to construct it. On top of that, with function pointers you, the user, must manually write all the callbacks for all the derived classes, each adjusting the pointer to become a pointer to derived, and you must spell out the function pointer types; what a mess. You might also forget one, or set it to NULL, or do something equally stupid but totally going to happen, because you're writing 30 classes this way and violating DRY like a parasitic wasp violates a caterpillar.
Approach 3 is only usable when the desired callback is statically knowable.
This leaves Approach 1 as the only usable approach when dynamic method invocation is required.
It's not clear from your example if you're creating a utility class or not. Is your Callback class intended to implement a closure or a more substantial object that you just didn't flesh out?
The first form:
Is easier to read and understand,
Is far easier to extend: try adding methods pause, resume and stop.
Is better at handling encapsulation (presuming doGo is defined in the class).
Is probably a better abstraction, so easier to maintain.
The second form:
Can be used with different methods for doGo, so it's more than just polymorphic.
Could allow (with additional methods) changing the doGo method at run-time, allowing the instances of the object to mutate their functionality after creation.
Ultimately, IMO, the first form is better for all normal cases. The second has some interesting capabilities, though -- but not ones you'll need often.
One major advantage of the first method is it has more type safety. The second method uses a void * for iParam so the compiler will not be able to diagnose type problems.
A minor advantage of the second method is that it would be less work to integrate with C. But if your code base is only C++, this advantage is moot.
Function pointers are more C-style I would say. Mainly because in order to use them you usually must define a flat function with the same exact signature as your pointer definition.
When I write C++ the only flat function I write is int main(). Everything else is a class object. Out of the two choices I would choose to define a class and override your virtual, but if all you want is to notify some code that some action happened in your class, neither of these choices would be the best solution.
I am unaware of your exact situation but you might want to peruse design patterns
I would suggest the observer pattern. It is what I use when I need to monitor a class or wait for some sort of notification.
For example, let us look at an interface for adding read functionality to a class:
struct Read_Via_Inheritance
{
virtual void read_members(void) = 0;
};
Any time I want to add another source of reading, I have to inherit from the class and add a specific method:
struct Read_Inherited_From_Cin
: public Read_Via_Inheritance
{
void read_members(void)
{
cin >> member;
}
};
If I want to read from a file, database, or USB, this requires 3 more separate classes. The combinations start to become very ugly with multiple objects and multiple sources.
If I use a functor, which happens to resemble the Visitor design pattern:
struct Reader_Visitor_Interface
{
virtual void read(unsigned int& member) = 0;
virtual void read(std::string& member) = 0;
};
struct Read_Client
{
void read_members(Reader_Visitor_Interface & reader)
{
reader.read(x);
reader.read(text);
return;
}
unsigned int x;
std::string text; // owned by value; a reference member would have nothing to bind to
};
With the above foundation, objects can read from different sources just by supplying different readers to the read_members method:
struct Read_From_Cin
: Reader_Visitor_Interface
{
void read(unsigned int& value)
{
cin>>value;
}
void read(std::string& value)
{
getline(cin, value);
}
};
I don't have to change any of the object's code (a good thing because it is already working). I can also apply the reader to other objects.
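As a sketch of how a further source can be added without touching Read_Client, here is a reader that pulls from any std::istream (a file stream, a string stream, and so on). `Read_From_Stream` is a hypothetical name; the interface and client are repeated, with a virtual destructor and a default value added, so that the block is self-contained.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

struct Reader_Visitor_Interface {
    virtual ~Reader_Visitor_Interface() = default;
    virtual void read(unsigned int& member) = 0;
    virtual void read(std::string& member) = 0;
};

struct Read_Client {
    void read_members(Reader_Visitor_Interface& reader) {
        reader.read(x);
        reader.read(text);
    }
    unsigned int x = 0;
    std::string text;
};

// Hypothetical reader: takes its values from any std::istream.
struct Read_From_Stream : Reader_Visitor_Interface {
    explicit Read_From_Stream(std::istream& in) : in_(in) {}
    void read(unsigned int& value) override { in_ >> value; }
    void read(std::string& value) override {
        in_ >> std::ws;             // skip whitespace left after the number
        std::getline(in_, value);   // then take the rest of the line
    }

private:
    std::istream& in_;
};
```

Passing a `Read_From_Stream` built on a `std::ifstream` or `std::istringstream` to `read_members` fills the same members that the cin-based reader would.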
Generally, I use inheritance when I am performing generic programming. For example, if I have a Field class, then I can create Field_Boolean, Field_Text and Field_Integer. I can put pointers to their instances into a vector<Field *> and call it a record. The record can perform generic operations on the fields, and doesn't care or know what kind of a field is processed.
Change to pure virtual, first off. Then inline it. That should negate any method overhead call at all, so long as inlining doesn't fail (and it won't if you force it).
May as well use C, because this is the only really useful major feature of C++ compared to C. You will always be calling a method that can't be inlined, so it will be less efficient.