I am trying to analyze the trade-offs between various methods of achieving polymorphism. I need a list of objects with some similarities and some differences in their member functions. The options I see are as follows:
1. Have a flag in each object, and a switch statement in each function. The value of the flag directs each object to its specific section of each function.
2. Have an array of member function pointers in the object, which are assigned upon construction. Then, I call that function pointer to get the correct member function.
3. Have a virtual base class with several derived classes. One drawback to this is that my list will now have to contain pointers, and not the objects themselves.
My understanding is that the pointer lookups from the list in option 3 will take longer than the member function lookups of option 2 because of the guaranteed proximity of member functions.
What are some of the benefits/drawbacks of these options? My priority is performance over readability.
Is there any other method for polymorphism?
have a flag in each object, and a switch statement in each function. The value of the flag directs each object to its specific section of each function
OK, so this could make sense if very little code varies based on the flag.
This minimises the amount of (duplicated) code which has to fit in cache, and avoids any function call indirection. Under some circumstances these benefits could outweigh the extra cost of the switch statement.
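For concreteness, here is a minimal sketch of option 1 on a made-up toy example (the Shape/Kind names are purely illustrative): one concrete type, a flag set at construction, and a switch in each member function. A list can hold these objects by value.

enum class Kind { circle, square };

struct Shape {
    Kind   kind;   // the flag, set at construction
    double size;

    double area() const {
        switch (kind) {                 // every member function branches on the flag
        case Kind::circle: return 3.14159 * size * size;
        case Kind::square: return size * size;
        }
        return 0.0;
    }
};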
have an array of member function pointers in the object, which are assigned upon construction. Then, I call that function pointer to get the correct member function
You save one indirection (to the vtable), but also make your objects bigger so fewer fit in cache. It's impossible to say which will dominate, so you'll just have to profile, but it isn't an obvious win.
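A sketch of option 2 on the same toy Shape example (again, the names are made up): the object carries a pointer to the member function chosen at construction, which enlarges every object but avoids the vtable.

struct Shape {
    using AreaFn = double (Shape::*)() const;  // pointer-to-member-function type

    AreaFn area_fn;   // assigned at construction; makes each object bigger
    double size;

    Shape(bool circle, double s)
        : area_fn(circle ? &Shape::circle_area : &Shape::square_area), size(s) {}

    double area() const { return (this->*area_fn)(); }  // one indirect call, no vtable

    double circle_area() const { return 3.14159 * size * size; }
    double square_area() const { return size * size; }
};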
have a virtual base class with several derived classes. One drawback to this is that my list will now have to contain pointers, and not the objects themselves
If your code paths are different enough that separating them completely is reasonable, this is the cleanest solution. If you need to optimise it, you can either use a specialised allocator to ensure they're sequential in memory (even if not sequential in your container), or move the objects directly into your container using a clever wrapper similar to Boost.Any. You'll still get the vtable indirection, but I'd prefer this to #2 unless profiling shows it's really a problem.
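For completeness, a baseline sketch of option 3 before any allocator tricks (same toy example, illustrative names): an abstract base, derived classes, and a container of pointers rather than objects.

#include <memory>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Circle : Shape {
    double r;
    explicit Circle(double r) : r(r) {}
    double area() const override { return 3.14159 * r * r; }
};

struct Square : Shape {
    double s;
    explicit Square(double s) : s(s) {}
    double area() const override { return s * s; }
};

// The list must now hold (smart) pointers, not Shape objects themselves.
std::vector<std::unique_ptr<Shape>> shapes;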
So, there are several questions you should answer before you can decide:
how much code is shared, and how much varies?
how big are the objects, and will a table of inline function pointers materially affect your cache miss stats?
and, after you've answered those, you should just profile anyway.
One way to achieve faster polymorphism is through the CRTP idiom and static polymorphism:
#include <iostream>

template<typename T>
struct base
{
void f()
{
static_cast<T*>( this )->f_impl();
}
};
struct foo : public base<foo>
{
void f_impl()
{
std::cout << "foo!" << std::endl;
}
};
struct bar : public base<bar>
{
void f_impl()
{
std::cout << "bar!" << std::endl;
}
};
struct quux : public base<quux>
{
void f_impl()
{
std::cout << "quux!" << std::endl;
}
};
template<typename T>
void call_f( base<T>& something )
{
something.f();
}
int main()
{
foo my_foo;
bar my_bar;
quux my_quux;
call_f( my_foo );
call_f( my_bar );
call_f( my_quux );
}
This outputs:
foo!
bar!
quux!
Static polymorphism generally performs better than virtual dispatch, because the compiler knows at compile time which function will be called and can inline everything.
Although it gives you polymorphic behaviour, it does not provide dynamic binding: you cannot use it in the common heterogeneous-container way, because every instantiation of the base class is a different type.
However, that could be achieved with something like boost::any.
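As an aside (not part of the original answer): if the set of types is closed and C++17 is available, std::variant is one alternative to boost::any for holding the CRTP types above in a single container. A hedged sketch, assuming the foo, bar and quux types from the example:

#include <variant>
#include <vector>

using any_obj = std::variant<foo, bar, quux>;

void call_all()
{
    std::vector<any_obj> objects{ foo{}, bar{}, quux{} };
    for (auto& o : objects)
        std::visit([](auto& obj) { obj.f(); }, o);  // prints foo! bar! quux!
}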
With a switch statement, if you want to add a new class then you need to modify every place where the type is switched on, which may be scattered across your code base. There may also be places outside your code base that need to be modified, but perhaps you know this isn't the case in this scenario.
With an array of member function pointers within each object, the only downside is that you duplicate that memory for every object. If you know there are only one or two "virtual" functions, though, then it's a good option.
As for virtual functions, you are right in that you have to heap-allocate the objects (or manually manage the memory), but it is the most extensible option.
If you aren't after extensibility, then (1) or (2) may be your best option. As always, the only way to tell is to measure. I know that many compilers will in some cases implement a switch statement as a jump table, which essentially comes out the same as a virtual function table. For small numbers of cases they may just use binary-search branching.
Measure!
Related
We can overload functions by giving them a different number of parameters. For example, functions someFunc() and someFunc(int i) can do completely different things.
Is it possible to achieve the same effect with classes? For example, having one class name but getting one implementation if a certain function is never called and a different implementation if that function is called. For example, if I have a dataStorage class, I want the internal implementation to be a list if only add is called, but a heap if both add and pop are called.
I am trying to implement this in C++, but I am curious if this is even possible. Examples in other languages would also help. Thanks!
The type of an object must be completely known at the point of definition. The type cannot depend on what is done with the object later.
For the dataStorage example, you could define dataStorage as an abstract class. For example:
struct dataStorage {
virtual ~dataStorage() = default;
virtual void add(dataType data) = 0;
// And anything else necessarily common to all implementations.
};
There could be a "default" implementation that uses a list.
struct dataList : public dataStorage {
void add(dataType data) override;
// And whatever else is needed.
};
There could be another implementation that uses a heap.
struct dataHeap : public dataStorage {
void add(dataType data) override;
void pop(); // Maybe return `dataType`, if desired
// And whatever else is needed.
};
Functions that need only to add data would work on references to dataStorage. Functions that need to pop data would work on references to dataHeap. When you define an object, you would choose dataList if the compiler allows it, dataHeap otherwise. (The compiler would not allow passing a dataList object to a function that requires a dataHeap&.) This is similar to what you asked for, except it does require manual intervention. On the bright side, you can use the compiler to tell you which decision to make.
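A hedged usage sketch of that idea (dataType is the placeholder element type from the answer, assumed default-constructible here): callers that only add take the base, callers that must pop take the derived type, and the compiler enforces the distinction.

void fill(dataStorage& storage)
{
    storage.add(dataType{});
}

void drain(dataHeap& heap)
{
    heap.pop();
}

int main()
{
    dataList list;
    dataHeap heap;
    fill(list);      // fine: dataList supports add()
    fill(heap);      // fine: dataHeap also supports add()
    drain(heap);     // fine
    // drain(list); // error: a dataList cannot bind to dataHeap&
}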
A downside of this approach is that changes can get messy. There is additional maintenance and runtime overhead compared to simply always using a heap (one class, no inheritance). You should do some performance measurements to ensure that the cost is worth it. Sometimes simplicity is the best design, even if it is not optimal in all cases.
I'm in a situation where I have a class, let's call it Generic. This class has members and attributes, and I plan to use it in a std::vector<Generic> or similar, processing several instances of this class.
Also, I want to specialize this class; the only difference between the generic and specialized objects would be a private method, which does not access any member of the class (but is called by other methods). My first idea was to simply declare it virtual and override it in specialized classes like this:
class Generic
{
// all other members and attributes
private:
virtual float specialFunc(float x) const =0;
};
class Specialized_one : public Generic
{
private:
virtual float specialFunc(float x) const{ return x;}
};
class Specialized_two : public Generic
{
private:
virtual float specialFunc(float x) const{ return 2*x; }
};
And thus I guess I would have to use a std::vector<Generic*>, and create and destroy the objects dynamically.
A friend suggested me using a std::function<> attribute for my Generic class, and give the specialFunc as an argument to the constructor but I am not sure how to do it properly.
What would be the advantages and drawbacks of these two approaches, and are there other (better ?) ways to do the same thing ? I'm quite curious about it.
For the details, the specialization of each object I instantiate would be determined at runtime, depending on user input. And I might end up with a lot of these objects (not yet sure how many), so I would like to avoid any unnecessary overhead.
Virtual functions and overriding model an is-a relationship, while std::function models a has-a relationship.
Which one to use depends on your specific use case.
Using std::function is perhaps more flexible as you can easily modify the functionality without introducing new types.
Performance should not be the main decision point here unless this code is provably (i.e. you measured it) the tight loop bottleneck in your program.
First of all, let's throw performance out the window.
If you use virtual functions, as you stated, you may end up with a lot of classes with the same interface:
class generic {
public:
    virtual void f(float x);
};
class spec1 : public generic {
public:
    void f(float x) override;
};
class spec2 : public generic {
public:
    void f(float x) override;
};
Using std::function<void(float)> as a member would allow you to avoid all the specializations:
class meaningful_class_name {
std::function<void(float)> f;
public:
meaningful_class_name(std::function<void(float)> const& p_f) : f(p_f) {}
};
In fact, if this is the ONLY thing you're using the class for, you might as well just remove it, and use a std::function<void(float)> at the level of the caller.
Advantages of std::function:
1) Less code (1 class for N functions, whereas the virtual method requires N classes for N functions. I'm making the assumption that this function is the only thing that's going to differ between classes).
2) Much more flexibility (You can pass in capturing lambdas that hold state if you want to).
3) If you write the class as a template, you could use it for all kinds of function signatures if needed.
Using std::function solves whatever problem you're attempting to tackle with virtual functions, and it seems to do it better. However, I'm not going to assert that std::function will always be better than a bunch of virtual functions in several classes. Sometimes, these functions have to be private and virtual because their implementation has nothing to do with any outside callers, so flexibility is NOT an advantage.
Disadvantages of std::function:
1) I was about to write that you can't access the private members of the generic class, but then I realized that you can modify the std::function in the class itself with a capturing lambda that holds this. Given the way you outlined the class however, this shouldn't be a problem since it seems to be oblivious to any sort of internal state.
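To illustrate that last point, a small hedged sketch (illustrative names, not from the question): a member std::function initialised with a lambda that captures this, so the stored callable can reach the private state of its enclosing class.

#include <functional>

class widget {
    float scale = 2.0f;                        // private state
    std::function<float(float)> special;
public:
    // The lambda captures `this`, so the stored callable can use private members.
    widget() : special([this](float x) { return scale * x; }) {}
    float run(float x) const { return special(x); }
};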
What would be the advantages and drawbacks of these two approaches, and are there other (better ?) ways to do the same thing ?
The issue I can see is "how do you want your class defined?" (as in, what is the public interface?)
Consider creating an API like this:
class Generic
{
// all other members and attributes
explicit Generic(std::function<float(float)> specialFunc);
};
Now, you can create any instance of Generic, without care. If you have no idea what you will place in specialFunc, this is the best alternative ("you have no idea" means that clients of your code may decide in one month to place a function from another library there, an identical function ("receive x, return x"), accessing some database for the value, passing a stateful functor into your function, or whatever else).
Also, if the specialFunc can change for an existing instance (i.e. create instance with specialFunc, use it, change specialFunc, use it again, etc) you should use this variant.
This variant may also be imposed on your code base by other constraints (for example, if you want to avoid making Generic virtual, or if you need it to be final for other reasons).
If (on the other hand) your specialFunc can only be a choice from a limited number of implementations, and client code cannot decide later that it wants something else - i.e. you only have the identity function and doubling the value, like in your example - then you should rely on specializations, like the code in your question.
TLDR: Decide based on the usage scenarios of your class.
Edit: regarding better (or at least alternative) ways to do this ... you could inject the specialFunc into your class on an as-needed basis:
That is, instead of this:
class Generic
{
public:
    Generic(std::function<float(float)> f) : specialFunc{f} {}
    float fancy_computation2() { return 2 * specialFunc(2.f); }
    float fancy_computation4() { return 4 * specialFunc(4.f); }
private:
    std::function<float(float)> specialFunc;
};
You could write this:
class Generic
{
public:
Generic() {}
    float fancy_computation2(std::function<float(float)> f) { return 2 * f(2.f); }
    float fancy_computation4(std::function<float(float)> f) { return 4 * f(4.f); }
private:
};
This offers you more flexibility (you can use different special functions with single instance), at the cost of more complicated client code. This may also be a level of flexibility that you do not want (too much).
I have a class that is a core component of a performance sensitive code path, so I am trying to optimize it as much as possible. The class used to be:
class Widget
{
public:
    Widget(int n) : N(n) {}
    // ... member functions that use the constant value N ...
private:
    const int N; // just initialized, will never change
};
The arguments to the constructor are known at compile time, so I have changed this class to a template, so that N can be compiled into the functions:
template<int N>
class Widget
{
    // ... member functions that use N ...
};
I have another class with a method:
Widget & GetWidget(int index);
However, after templating Widget, each widget has a different type so I cannot define the function like this anymore. I considered different inheritance options, but I'm not sure that the performance gain from the template would outweigh the cost of inherited function invocations.
SO, my question is this:
I am pretty sure I want the best of both worlds (compile-time / run-time), and it may not be possible. But, is there a way to gain the performance of knowing N at compile time, but still being able to return Widgets as the same type?
Thanks!
The issue here is that if you store the widgets as the same type, then the code that retrieves the widgets from that store (by calling GetWidget) doesn't know N at compile time[*]. The code that calls the constructor knows N, but the code that uses the object has to cope with multiple possibilities.
Since the performance hit (if any) is likely to be in the code that uses the widgets, rather than the code that creates them, you can't avoid doing something in the critical code that depends on runtime information.
It may be that a virtual call to a function implemented in your class template, is faster than a non-virtual call to a function that uses N without knowing the value:
class Widget {
public:
virtual ~Widget() {}
virtual void function() = 0;
};
template <int N>
class WidgetImpl : public Widget {
public:
virtual void function() override { /* code that uses N */ }
};
The optimizer can probably do its best job when N is known, since it can optimally unroll loops, transform arithmetic, and so on. But with the virtual call you're looking at one big disadvantage to start with, which is that none of the calls can be inlined (and I would guess a virtual call is less likely to be predicted than a non-virtual call when not inlined). The gain from inlining with unknown N could be more than the gain of knowing N, or it could be less. Try them both and see.
For a more far-fetched effort, if there are a reasonably small number of common cases you might even see an improvement by implementing your critical widget function as something like:
switch(n) {
case 1: /* do something using 1 */; break;
case 2: /* do the same thing using 2 */; break;
default: /* do the same thing using n */; break;
};
"do something" for all cases but the default could be a call to a function templated on the constant, then the default is the same code with a function parameter instead of a template parameter. Or it could all be calls to the same function (with a function parameter), but relying on the compiler to inline the call before optimization in the cases where the parameter is constant, for the same result as if it was templated.
Not massively maintainable, and it's usually a bad idea to second-guess the optimizer like this, but maybe you know what the common cases are, and the compiler doesn't.
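A hedged sketch of that dispatch pattern (the function names are illustrative): the common cases call a function templated on the constant, the default falls back to a runtime parameter.

// critical<N> is the hot code specialised on the constant; critical_rt is the
// same logic with a runtime parameter, used for uncommon values.
template <int N>
void critical() { /* loops and arithmetic specialised for N */ }

void critical_rt(int n) { /* same logic, n known only at runtime */ (void)n; }

void run(int n)
{
    switch (n) {
    case 1:  critical<1>(); break;
    case 2:  critical<2>(); break;
    default: critical_rt(n); break;
    }
}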
[*] If the calling code does know the value of N at compile time, then you could replace GetWidget with a function template like this:
template <int N>
Widget<N> &getWidget(int index) {
return static_cast<Widget<N> &>(whatever you have already);
}
But I assume the caller doesn't know, because if it did then you probably wouldn't be asking...
You need to declare a non-templated type from which the templated type inherits, and then store the widgets as pointers to the non-templated base class. That is the only (type-safe) way to accomplish what you are looking for.
However, it is probably cleaner to keep the non-templated version. Have you profiled your code to see that the loops on the runtime-configured version are actually a bottleneck?
I guess the following is not an option?
template <int N>
Widget<N> & GetWidget();
Anyway, as soon as you’re managing several widget types together you cannot make them templated anymore since you can’t store objects of different type in one container.
The non-templated base class proposed by Michael is a solution but since it will incur virtual function call costs I’m guessing that making the class templated hasn’t got any benefits.
If your types are finite and known, you could use a boost::variant as an argument to your constructor.
The variant class template is a safe, generic, stack-based discriminated union container, offering a simple solution for manipulating an object from a heterogeneous set of types in a uniform manner. Whereas standard containers such as std::vector may be thought of as "multi-value, single type," variant is "multi-type, single value."
Here is some pseudocode:
typedef boost::variant< int, double, std::string > variant_t;

const variant_t foo( 1 );
const variant_t bar( 3.14 );
const variant_t baz( "hello world" );
const Widget foo_widget( foo );
const Widget bar_widget( bar );
const Widget baz_widget( baz );
Alternatively, you could use a boost::any for more flexibility.
You could write a templated GetWidget function. That would require you to know the type when you call GetWidget:
w = GetWidget<Box>(index);
When implementing polymorphic behavior in C++ one can either use a pure virtual method or one can use function pointers (or functors). For example an asynchronous callback can be implemented by:
Approach 1
class Callback
{
public:
Callback();
~Callback();
void go();
protected:
virtual void doGo() = 0;
};
//Constructor and Destructor
void Callback::go()
{
doGo();
}
So to use the callback here, you would need to override the doGo() method to call whatever function you want
Approach 2
typedef void CallbackFunction(void*);
class Callback
{
public:
Callback(CallbackFunction* func, void* param);
~Callback();
void go();
private:
CallbackFunction* iFunc;
void* iParam;
};
Callback::Callback(CallbackFunction* func, void* param) :
iFunc(func),
iParam(param)
{}
//Destructor
void Callback::go()
{
(*iFunc)(iParam);
}
To use the callback method here you will need to create a function pointer to be called by the Callback object.
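For example, a hedged usage sketch of Approach 2 (assuming the CallbackFunction typedef above; the free function and its message are illustrative):

#include <cstdio>

// A free function matching the CallbackFunction signature.
void printMessage(void* param)
{
    std::printf("%s\n", static_cast<const char*>(param));
}

int main()
{
    char text[] = "hello";
    Callback cb(&printMessage, text);  // bind the function and its context
    cb.go();                           // prints "hello"
}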
Approach 3
[This was added to the question by me (Andreas); it wasn't written by the original poster]
#include <iostream>

template <typename T>
class Callback
{
public:
Callback() {}
~Callback() {}
void go() {
T t; t();
}
};
class CallbackTest
{
public:
void operator()() { std::cout << "Test"; }
};
int main()
{
Callback<CallbackTest> test;
test.go();
}
What are the advantages and disadvantages of each implementation?
Approach 1 (Virtual Function)
"+" The "correct way to do it in C++
"-" A new class must be created per callback
"-" Performance-wise an additional dereference through VF-Table compared to Function Pointer. Two indirect references compared to Functor solution.
Approach 2 (Class with Function Pointer)
"+" Can wrap a C-style function for C++ Callback Class
"+" Callback function can be changed after callback object is created
"-" Requires an indirect call. May be slower than functor method for callbacks that can be statically computed at compile-time.
Approach 3 (Class calling T functor)
"+" Possibly the fastest way to do it. No indirect call overhead and may be inlined completely.
"-" Requires an additional Functor class to be defined.
"-" Requires that callback is statically declared at compile-time.
FWIW, function pointers are not the same as functors. Functors (in C++) are classes that provide a function call by overloading operator().
Here is an example functor as well as a template function which utilizes a functor argument:
#include <cstdio>

class TFunctor
{
public:
void operator()(const char *charstring)
{
std::printf("%s", charstring);
}
};
template<class T> void CallFunctor(T& functor_arg,const char *charstring)
{
functor_arg(charstring);
}
int main()
{
TFunctor foo;
CallFunctor(foo,"hello world\n");
}
From a performance perspective, virtual functions and function pointers both result in an indirect function call (i.e. through a register), although virtual functions require an additional load of the vtable pointer before the function pointer itself can be loaded. Using functors (with a non-virtual call) passed as a parameter to a template function is the highest-performing option, because the calls can be inlined, and even when not inlined they do not generate an indirect call.
Approach 1
Easier to read and understand
Less possibility of errors (iFunc cannot be NULL, you're not using a void *iParam, etc.)
C++ programmers will tell you that this is the "right" way to do it in C++
Approach 2
Slightly less typing to do
VERY slightly faster (calling a virtual method has some overhead, usually about the same as two simple arithmetic operations, so it most likely won't matter)
That's how you would do it in C
Approach 3
Probably the best way to do it when possible. It will have the best performance, it will be type safe, and it's easy to understand (it's the method used by the STL).
The primary problem with Approach 2 is that it simply doesn't scale. Consider the equivalent for 100 functions:
class MahClass {
// 100 pointers of various types
public:
MahClass() { /* set all 100 pointers */ }
MahClass(const MahClass& other) {
// copy all 100 function pointers
}
};
The size of MahClass has ballooned, and the time to construct it has also significantly increased. Virtual functions, however, are O(1) increase in the size of the class and the time to construct it- not to mention that you, the user, must write all the callbacks for all the derived classes manually which adjust the pointer to become a pointer to derived, and must specify function pointer types and what a mess. Not to mention the idea that you might forget one, or set it to NULL or something equally stupid but totally going to happen because you're writing 30 classes this way and violating DRY like a parasitic wasp violates a caterpillar.
Approach 3 is only usable when the desired callback is statically knowable.
This leaves Approach 1 as the only usable approach when dynamic method invocation is required.
It's not clear from your example if you're creating a utility class or not. Is your Callback class intended to implement a closure or a more substantial object that you just didn't flesh out?
The first form:
Is easier to read and understand,
Is far easier to extend: try adding methods pause, resume and stop.
Is better at handling encapsulation (presuming doGo is defined in the class).
Is probably a better abstraction, so easier to maintain.
The second form:
Can be used with different methods for doGo, so it's more than just polymorphic.
Could allow (with additional methods) changing the doGo method at run-time, allowing the instances of the object to mutate their functionality after creation.
Ultimately, IMO, the first form is better for all normal cases. The second has some interesting capabilities, though -- but not ones you'll need often.
One major advantage of the first method is it has more type safety. The second method uses a void * for iParam so the compiler will not be able to diagnose type problems.
A minor advantage of the second method is that it would be less work to integrate with C. But if your code base is only C++, this advantage is moot.
Function pointers are more C-style I would say. Mainly because in order to use them you usually must define a flat function with the same exact signature as your pointer definition.
When I write C++ the only flat function I write is int main(). Everything else is a class object. Out of the two choices I would choose to define a class and override your virtual, but if all you want is to notify some code that some action happened in your class, neither of these choices would be the best solution.
I am unaware of your exact situation but you might want to peruse design patterns
I would suggest the observer pattern. It is what I use when I need to monitor a class or wait for some sort of notification.
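A minimal hedged sketch of the observer pattern (the names are illustrative, not from the question):

#include <vector>

struct Observer {
    virtual ~Observer() = default;
    virtual void notify() = 0;          // called when the subject changes
};

class Subject {
    std::vector<Observer*> observers;
public:
    void attach(Observer* o) { observers.push_back(o); }
    void somethingHappened() {
        for (Observer* o : observers)   // inform every registered observer
            o->notify();
    }
};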
For example, let us look at an interface for adding read functionality to a class:
struct Read_Via_Inheritance
{
virtual void read_members(void) = 0;
};
Any time I want to add another source of reading, I have to inherit from the class and add a specific method:
struct Read_Inherited_From_Cin
: public Read_Via_Inheritance
{
void read_members(void)
{
cin >> member;
}
};
If I want to read from a file, database, or USB, this requires 3 more separate classes. The combinations start to become very ugly with multiple objects and multiple sources.
If I use a functor, which happens to resemble the Visitor design pattern:
struct Reader_Visitor_Interface
{
virtual void read(unsigned int& member) = 0;
virtual void read(std::string& member) = 0;
};
struct Read_Client
{
void read_members(Reader_Visitor_Interface& reader)
{
reader.read(x);
reader.read(text);
return;
}
unsigned int x;
std::string text;
};
With the above foundation, objects can read from different sources just by supplying different readers to the read_members method:
struct Read_From_Cin
: Reader_Visitor_Interface
{
void read(unsigned int& value)
{
cin>>value;
}
void read(std::string& value)
{
getline(cin, value);
}
};
I don't have to change any of the object's code (a good thing because it is already working). I can also apply the reader to other objects.
Generally, I use inheritance when I am performing generic programming. For example, if I have a Field class, then I can create Field_Boolean, Field_Text and Field_Integer. I can put pointers to their instances into a vector<Field *> and call it a record. The record can perform generic operations on the fields, and doesn't care or know what kind of field is being processed.
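A hedged sketch of that Field/record idea (the member details are made up; only the class names come from the answer):

#include <memory>
#include <string>
#include <vector>

struct Field {
    virtual ~Field() = default;
    virtual void print() const = 0;     // a generic operation the record can apply
};

struct Field_Boolean : Field { bool value = false;   void print() const override { /* ... */ } };
struct Field_Text    : Field { std::string value;    void print() const override { /* ... */ } };
struct Field_Integer : Field { int value = 0;        void print() const override { /* ... */ } };

using Record = std::vector<std::unique_ptr<Field>>;

void print_record(const Record& record)
{
    for (const auto& field : record)
        field->print();                 // the record neither knows nor cares which field type it holds
}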
Change to pure virtual, first off. Then inline it. That should remove any method call overhead, so long as inlining doesn't fail (and it won't if you force it).
May as well use C, because this is the only really useful major feature of C++ compared to C. You will always call the method through a pointer and it can't be inlined, so it will be less efficient.
Suppose you have the following code:
int main(int argc, char** argv) {
Foo f;
while (true) {
f.doSomething();
}
}
Which of the following two implementations of Foo are preferred?
Solution 1:
class Foo {
private:
void doIt(Bar& data);
public:
void doSomething() {
Bar _data;
doIt(_data);
}
};
Solution 2:
class Foo {
private:
Bar _data;
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
In plain english: if I have a class with a method that gets called very often, and this method defines a considerable amount of temporary data (either one object of a complex class, or a large number of simple objects), should I declare this data as private members of the class?
On the one hand, this would save the time spent on constructing, initializing and destructing the data on each call, improving performance. On the other hand, it tramples on the "private member = state of the object" principle, and may make the code harder to understand.
Does the answer depend on the size/complexity of class Bar? What about the number of objects declared? At what point would the benefits outweigh the drawbacks?
From a design point of view, using temporaries is cleaner if that data is not part of the object state, and should be preferred.
Never make design choices on performance grounds before actually profiling the application. You might just discover that you end up with a worse design that is actually not any better than the original design performance wise.
To all the answers that recommend to reuse objects if construction/destruction cost is high, it is important to remark that if you must reuse the object from one invocation to another, in many cases the object must be reset to a valid state between method invocations and that also has a cost. In many such cases, the cost of resetting can be comparable to construction/destruction.
If you do not reset the object state between invocations, the two solutions could yield different results, as in the first call, the argument would be initialized and the state would probably be different between method invocations.
Thread safety has a great impact on this decision also. Auto variables inside a function are created in the stack of each of the threads, and as such are inherently thread safe. Any optimization that pushes those local variable so that it can be reused between different invocations will complicate thread safety and could even end up with a performance penalty due to contention that can worsen the overall performance.
Finally, if you want to keep the object between method invocations I would still not make it a private member of the class (it is not part of the class), but rather an implementation detail: a static function variable, a global in an unnamed namespace in the compilation unit where doOperation is implemented, or a member of a PIMPL (the first two share the data across all objects, while the latter shares it only across invocations on the same object). Users of your class do not care how you solve it, as long as you do it safely and document that the class is not thread safe.
// foo.h
class Foo {
public:
void doOperation();
private:
void doIt( Bar& data );
};
// foo.cpp
void Foo::doOperation()
{
static Bar reusable_data;
doIt( reusable_data );
}
// alternatively, foo.cpp with an unnamed-namespace global
namespace {
Bar reusable_global_data;
}
void Foo::doOperation()
{
doIt( reusable_global_data );
}
// pimpl foo.h
class Foo {
public:
void doOperation();
private:
class impl_t;
boost::scoped_ptr<impl_t> impl;
};
// foo.cpp
class Foo::impl_t {
private:
Bar reusable;
public:
void doIt(); // uses this->reusable instead of argument
};
void Foo::doOperation() {
impl->doIt();
}
First of all it depends on the problem being solved. If you need to persist the values of temporary objects between calls you need a member variable. If you need to reinitialize them on each invocation, use local temporary variables. It's a question of the task at hand, not of being right or wrong.
Temporary variables construction and destruction will take some extra time (compared to just persisting a member variable) depending on how complex the temporary variables classes are and what their constructors and destructors have to do. Deciding whether the cost is significant should only be done after profiling, don't try to optimize it "just in case".
I'd declare _data as a temporary variable in most cases. The only drawback is performance, but you'll get far more benefits. You may want to try the Prototype pattern if construction and destruction are really performance killers.
If it is semantically correct to preserve a value of Bar inside Foo, then there is nothing wrong with making it a member - it is then the case that every Foo has-a Bar.
There are multiple scenarios where it might not be correct, e.g.
if you have multiple threads performing doSomething, would they need all separate Bar instances, or could they accept a single one?
would it be bad if state from one computation carried over to the next computation?
Most of the time, issue 2 is the reason to create local variables: you want to be sure to start from a clean state.
Like a lot of coding answers it depends.
Solution 1 is a lot more thread-safe. So if doSomething were being called by many threads I'd go for Solution 1.
If you're working in a single threaded environment and the cost of creating the Bar object is high, then I'd go for Solution 2.
In a single-threaded environment, if the cost of creating Bar is low, then I think I'd go for Solution 1.
You have already considered "private member=state of the object" principle, so there is no point in repeating that, however, look at it in another way.
A bunch of methods, say a, b, and c take the data "d" and work on it again and again. No other methods of the class care about this data. In this case, are you sure a, b and c are in the right class?
Would it be better to create another smaller class and delegate, where d can be a member variable? Such abstractions are difficult to think of, but often lead to great code.
Just my 2 cents.
Is that an extremely simplified example? If not, what's wrong with doing it this
void doSomething(Bar data);
int main() {
while (true) {
doSomething(Bar());
}
}
way? If doSomething() is a pure algorithm that needs some data (Bar) to work with, why would you need to wrap it in a class? A class is for wrapping a state (data) and the ways (member functions) to change it.
If you just need a piece of data then use just that: a piece of data. If you just need an algorithm, then use a function. Only if you need to keep a state (data values) between invocations of several algorithms (functions) working on them, a class might be the right choice.
I admit that the borderlines between these are blurred, but IME they make a good rule of thumb.
If it's really that temporary that costs you the time, then I would say there is nothing wrong with including it in your class as a member. But note that this will possibly make your function thread-unsafe if used without proper synchronization - once again, this depends on the use of _data.
I would, however, mark such a variable as mutable. If you read a class definition and see a mutable member, you can immediately assume that it does not contribute to the observable state of its parent object.
class Foo {
private:
mutable Bar _data;
private:
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
This will also make it possible to use _data as a mutable entity inside a const function - just like you could use it as a mutable entity if it was a local variable inside such a function.
If you want Bar to be initialised only once (due to cost in this case), then I'd move it to a singleton pattern.
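A hedged sketch of that suggestion, reusing the Foo/Bar names from the question (Meyers-style singleton):

// Bar is constructed once, on first use; initialisation is thread-safe since C++11.
Bar& sharedBar()
{
    static Bar instance;
    return instance;
}

void Foo::doSomething()
{
    doIt(sharedBar());   // every call reuses the single Bar
}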