Private members vs temporary variables in C++

Private members vs temporary variables in C++ - c++

Suppose you have the following code:
int main(int argc, char** argv) {
Foo f;
while (true) {
f.doSomething();
}
}
Which of the following two implementations of Foo are preferred?
Solution 1:
class Foo {
private:
void doIt(Bar& data);
public:
void doSomething() {
Bar _data;
doIt(_data);
}
};
Solution 2:
class Foo {
private:
Bar _data;
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
In plain english: if I have a class with a method that gets called very often, and this method defines a considerable amount of temporary data (either one object of a complex class, or a large number of simple objects), should I declare this data as private members of the class?
On the one hand, this would save the time spent on constructing, initializing and destructing the data on each call, improving performance. On the other hand, it tramples on the "private member = state of the object" principle, and may make the code harder to understand.
Does the answer depend on the size/complexity of class Bar? What about the number of objects declared? At what point would the benefits outweigh the drawbacks?

From a design point of view, using temporaries is cleaner if that data is not part of the object state, and should be preferred.
Never make design choices on performance grounds before actually profiling the application. You might just discover that you end up with a worse design that is actually not any better than the original design performance wise.
To all the answers that recommend to reuse objects if construction/destruction cost is high, it is important to remark that if you must reuse the object from one invocation to another, in many cases the object must be reset to a valid state between method invocations and that also has a cost. In many such cases, the cost of resetting can be comparable to construction/destruction.
If you do not reset the object state between invocations, the two solutions could yield different results, as in the first call, the argument would be initialized and the state would probably be different between method invocations.
Thread safety has a great impact on this decision also. Auto variables inside a function are created in the stack of each of the threads, and as such are inherently thread safe. Any optimization that pushes those local variable so that it can be reused between different invocations will complicate thread safety and could even end up with a performance penalty due to contention that can worsen the overall performance.
Finally, if you want to keep the object between method invocations I would still not make it a private member of the class (it is not part of the class) but rather an implementation detail (static function variable, global in an unnamed namespace in the compilation unit where doOperation is implemented, member of a PIMPL...[the first 2 sharing the data for all objects, while the latter only for all invocations in the same object]) users of your class do not care about how you solve things (as long as you do it safely, and document that the class is not thread safe).
// foo.h
class Foo {
public:
void doOperation();
private:
void doIt( Bar& data );
};
// foo.cpp
void Foo::doOperation()
{
static Bar reusable_data;
doIt( reusable_data );
}
// else foo.cpp
namespace {
Bar reusable_global_data;
}
void Foo::doOperation()
{
doIt( reusable_global_data );
}
// pimpl foo.h
class Foo {
public:
void doOperation();
private:
class impl_t;
boost::scoped_ptr<impl_t> impl;
};
// foo.cpp
class Foo::impl_t {
private:
Bar reusable;
public:
void doIt(); // uses this->reusable instead of argument
};
void Foo::doOperation() {
impl->doIt();
}

First of all it depends on the problem being solved. If you need to persist the values of temporary objects between calls you need a member variable. If you need to reinitialize them on each invokation - use local temporary variables. It a question of the task at hand, not of being right or wrong.
Temporary variables construction and destruction will take some extra time (compared to just persisting a member variable) depending on how complex the temporary variables classes are and what their constructors and destructors have to do. Deciding whether the cost is significant should only be done after profiling, don't try to optimize it "just in case".

I'd declare _data as temporary variable in most cases. The only drawback is performance, but you'll get way more benefits. You may want to try Prototype pattern if constructing and destructing are really performance killers.

If it is semantically correct to preserve a value of Bar inside Foo, then there is nothing wrong with making it a member - it is then that every Foo has-a bar.
There are multiple scenarios where it might not be correct, e.g.
if you have multiple threads performing doSomething, would they need all separate Bar instances, or could they accept a single one?
would it be bad if state from one computation carries over to the next computation.
Most of the time, issue 2 is the reason to create local variables: you want to be sure to start from a clean state.

Like a lot of coding answers it depends.
Solution 1 is a lot more thread-safe. So if doSomething were being called by many threads I'd go for Solution 1.
If you're working in a single threaded environment and the cost of creating the Bar object is high, then I'd go for Solution 2.
In a single threaded env and if the cost of creating Bar is low, then I think i'd go for Solution 1.

You have already considered "private member=state of the object" principle, so there is no point in repeating that, however, look at it in another way.
A bunch of methods, say a, b, and c take the data "d" and work on it again and again. No other methods of the class care about this data. In this case, are you sure a, b and c are in the right class?
Would it be better to create another smaller class and delegate, where d can be a member variable? Such abstractions are difficult to think of, but often lead to great code.
Just my 2 cents.

Is that an extremely simplified example? If not, what's wrong with doing it this
void doSomething(Bar data);
int main() {
while (true) {
doSomething();
}
}
way? If doSomething() is a pure algorithm that needs some data (Bar) to work with, why would you need to wrap it in a class? A class is for wrapping a state (data) and the ways (member functions) to change it.
If you just need a piece of data then use just that: a piece of data. If you just need an algorithm, then use a function. Only if you need to keep a state (data values) between invocations of several algorithms (functions) working on them, a class might be the right choice.
I admit that the borderlines between these are blurred, but IME they make a good rule of thumb.

If it's really that temporary that costs you the time, then i would say there is nothing wrong with including it into your class as a member. But note that this will possibly make your function thread-unsafe if used without proper synchronization - once again, this depends on the use of _data.
I would, however, mark such a variable as mutable. If you read a class definition with a member being mutable, you can immediately assume that it doesn't account for the value of its parent object.
class Foo {
private:
mutable Bar _data;
private:
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
This will also make it possible to use _data as a mutable entity inside a const function - just like you could use it as a mutable entity if it was a local variable inside such a function.

If you want Bar to be initialised only once (due to cost in this case). Then I'd move it to a singleton pattern.

Related

Is there a way to overload classes in a way similar to function overloading?

We can overload functions by giving them a different number of parameters. For example, functions someFunc() and someFunc(int i) can do completely different things.
Is it possible to achieve the same effect on classes? For example, having one class name but creating one class if a function is not called and a different class if that function is not called. For example, If I have a dataStorage class, I want the internal implementation to be a list if only add is called, but want it to be a heap if both add and pop are called.
I am trying to implement this in C++, but I am curious if this is even possible. Examples in other languages would also help. Thanks!

The type of an object must be completely known at the point of definition. The type cannot depend on what is done with the object later.
For the dataStorage example, you could define dataStorage as an abstract class. For example:
struct dataStorage {
virtual ~dataStorage() = default;
virtual void add(dataType data) = 0;
// And anything else necessarily common to all implementations.
};
There could be a "default" implementation that uses a list.
struct dataList : public dataStorage {
void add(dataType data) override;
// And whatever else is needed.
};
There could be another implementation that uses a heap.
struct dataHeap : public dataStorage {
void add(dataType data) override;
void pop(); // Maybe return `dataType`, if desired
// And whatever else is needed.
};
Functions that need only to add data would work on references to dataStorage. Functions that need to pop data would work on references to dataHeap. When you define an object, you would choose dataList if the compiler allows it, dataHeap otherwise. (The compiler would not allow passing a dataList object to a function that requires a dataHeap&.) This is similar to what you asked for, except it does require manual intervention. On the bright side, you can use the compiler to tell you which decision to make.
A downside of this approach is that changes can get messy. There is additional maintenance and runtime overhead compared to simply always using a heap (one class, no inheritance). You should do some performance measurements to ensure that the cost is worth it. Sometimes simplicity is the best design, even if it is not optimal in all cases.

static object vs singleton class object in c++?

I just want to know when should I use singleton class over static object and vice versa. Because same can be used to have only one instance.
One reason I can guess about the storage difference in memory.
Is there any other reason to choose one over the other.

The main difference between a singleton and a static object is that the singleton guarantees that there can only be one instance of a particular class type, whereas a static object is only a single instance of a particular type. This means that the fundamental difference is whether it makes sense for the class to have multiple instances or not. For example, let's say that you have a factory that builds widgets.
class WidgetFactory {
public:
int numberOfWidgetsBuiltInTheWholeApplication();
// returns number_of_widgets_built
std::shared_ptr<Widget> build();
// increments number_of_widgets_built
private:
size_t number_of_widgets_built;
};
Let's also assume that we want to guarantee that we can determine the number of Widgets that were created in the entire application, and enforcing that there is only one WidgetFactory is an easy way to do this. This is where we would use a singleton. If we were to just use a static object, like this:
WidgetFactory& theWidgetFactory()
{
static WidgetFactory widget_factory;
return widget_factory;
}
we would have no guarantee that theWidgetFactory().numberOfWidgetsBuiltInTheWholeApplication() was actually the real total, since someone else could come along and make their own WidgetFactory instance.
However, if we make WidgetFactory be a singleton, we now have that guarantee. We could do that perhaps by making the constructors private:
class WidgetFactory {
public:
int numberOfWidgetsBuiltInTheWholeApplication();
std::shared_ptr<Widget> build();
private:
WidgetFactory();
WidgetFactory(const WidgetFactory &); // not implemented
friend WidgetFactory& theWidgetFactory();
};
Now only theWidgetFactory() function is allowed to create the instance. There are other ways that we could make numberOfWidgetsBuiltInTheWholeApplication() function properly without using a singleton, but perhaps in some situations that solution is more complicated or has other downsides.

where only static initialization is involved, a global constant is ok. where dynamic initialization is involved, you want to avoid the static initialization fiasco. hence if you must, singleton should be preferred.
however, mutable singletons are generally just global variables in disguise.
therefore, to the extent practically possible, avoid them.

c++ switch vs. member function pointer vs. virtual inheritance

I am trying to analyze the trade offs between various methods of achieving polymorphism. I need a list of objects with some similarities and some differences in member functions. The options I see are as follows:
have a flag in each object, and a switch statement in each function.
The value of the flag directs each object to its specific section of
each function.
have an array of member function pointers in the object, which are
assigned upon construction. Then, I call that function pointer to
get the correct member function.
have an virtual base class with several derived classes. One
drawback to this is that my list will now have to contain pointers,
and not the objects themselves.
My understanding is that the pointer lookups from the list in option 3 will take longer than the member function lookups of option 2 because of the guaranteed proximity of member functions.
What are some of the benefits/drawbacks of these options? My priority is performance over readability.
Is there any other method for polymorphism?

have a flag in each object, and a switch statement in each function. The value of the flag directs each object to its specific section of each function
OK, so this could make sense if very little code varies based on the flag.
This minimises the amount of (duplicated) code which has to fit in cache, and avoids any function call indirection. Under some circumstances these benefits could outweigh the extra cost of the switch statement.
have an array of member function pointers in the object, which are assigned upon construction. Then, I call that function pointer to get the correct member function
You save one indirection (to the vtable), but also make your objects bigger so fewer fit in cache. It's impossible to say which will dominate, so you'll just have to profile, but it isn't an obvious win
have an virtual base class with several derived classes. One drawback to this is that my list will now have to contain pointers, and not the objects themselves
If the your code paths are different enough that separating them completely is reasonable, this is the cleanest solution. If you need to optimise it, you can either use a specialised allocator to ensure they're sequential (even if not sequential in your container), or move the objects directly into your container using a clever wrapper similar to Boost.Any. You'll still get the vtable indirection, but I'd prefer this to #2 unless profiling shows it's really a problem.
So, there are several questions you should answer before you can decide:
how much code is shared, and how much varies?
how big are the objects, and will a table of inline function pointers materially affect your cache miss stats?
and, after you've answered those, you should just profile anyway.

One way to achieve faster polymorphism is through the CRTP idiom and static polymorphism:
template<typename T>
struct base
{
void f()
{
static_cast<T*>( this )->f_impl();
}
};
struct foo : public base<foo>
{
void f_impl()
{
std::cout << "foo!" << std::endl;
}
};
struct bar : public base<bar>
{
void f_impl()
{
std::cout << "bar!" << std::endl;
}
};
struct quux : public base<quux>
{
void f_impl()
{
std::cout << "quux!" << std::endl;
}
};
template<typename T>
void call_f( const base<T>& something )
{
something.f();
}
int main()
{
foo my_foo;
bar my_bar;
quux my_quux;
call_f( my_foo );
call_f( my_bar );
call_f( my_quux );
}
This outputs:
foo!
bar!
quux!
Static-polymorphism performs far better than virtual dispatch, because the compiler knows which function will be called at compile-time, and it could inline everything.
Even if it provides dynamic binding, it cannot perform polymorphism in the common heterogeneous-container way, because every instance of the base class is a different type.
However, that could be achieved with something like boost::any.

With a switch statement, if you want to add a new class then you need to modify everywhere where the class is switched on, which may be in various places in your code base. There may also be places outside your code base that need to be modified, but perhaps you know this isn't the case in this scenario.
With an array of member function pointers within each member, the only downside is that you duplicate that memory for every object. If you know there's only one or two "virtual" functions though then it's a good option.
As for virtual functions, you are right in that you have to heap allocate them (or manual manage the memory), but it is the most extensible option.
If you aren't after extensible, then (1) or (2) may be your best option. As always, the only way to tell is to measure. I know that many compilers will implement a switch statement in some cases by a jump table, which essentially comes out the same as a virtual function table. For small numbers of case statement they may just use binary search branching.
Measure!

C++ static , extern uses with global data

I'm new to C++ and OOP in general and have been trying to learn efficient or 'correct' ways to do things, but am still having trouble.
I'm creating a DataStore class that holds data for other classes/objects. There will only ever be one instance/object of this class; however, there doesn't really need to be an object/instance since it's global data, right. In this case I feel like it's just a way to provide scope. So, I want to directly change the class members instead of passing around the object. I have read about static and _extern, but I can't decide if either would be viable, or if anything else would be better.
Right now I'm passing the one created object around to change it's data, but I would rather the class be accessed as 'itself' instead of by 'an instance of itself' while still retaining the idea of it being an object.

Typically, this sort of problem (where you need one, but only ever one - and you are SURE you never ever need more), is solved by using a "singleton" pattern.
class Singleton
{
public:
static Singleton* getInstance()
{
if (!instance) instance = new Singleton();
return instance;
}
int getStuff() { return stuff; }
private:
Singleton() { stuff = 42; }
static Singleton *instance;
int stuff;
};
then in some suitiable .cpp file>
static Singleton *instance;
Or use a global variable directly:
class Something
{
public:
Something() { stuff = 42; }
int getStuff() { return stuff; }
private:
int stuff;
}
extern Something global_something; // Make sure everyone can find it.
In ONE .cpp file:
Something global_something;
Since BOTH of these are essentially a global variable solution, I expect someone disliking global variables will downvote it, but if you don't want to pass around your class object everywhere, a global variable is not a terrible idea. You just have to be aware that global variables are not necessarily a great idea as a solution in general. It can be hard to follow what is going on, and it certainly gets messy if you suddenly need more than one (because you decided to change the code to support two different storages, or whatever) - but this applies to a singleton too.

EDIT: In a comment OP explained the data store will be read by code running in multiple threads, and updated by code in one thread. My previous answer no longer applies. Here's a better answer.
Don't use a global variable to hold the store's instance. This will open the door for many subtle bugs that can haunt you for a long while. You should give your reading threads read-only access to the store. Your writing thread should get read-write access.
Make sure your read methods in the data store are properly marked as const. Then create a single instance of the data store, and put a pointer to it in a const global variable. Your writing thread should have another mechanism of getting a non-const pointer (add a GetInstance public static method, as suggested by #Mats).
My previous answer:
If you're certain there will always be just one data store instance, don't pass it around.
Global variables are frowned upon, and some languages (Java and C#) outlawed them altogether. So in C# and Java you use static class members instead, which are practically the same thing (with exactly the same problems).
If you can put your single instance in a a const global variable, you should be fine.
If you're doing any kind of multithreading, you'll need to make sure your store is thread-safe, or else really bad things will happen.

I do this for object that have 1 instance most of time during execution of program.
class Object {
private:
Object();
friend Object & GetObject();
public:
...
};
inline Object & GetObject() {
static Object O;
return O;
}
1) this is less verbose than singleton.
2) this avoid pitfall of global object, such as undefined initialization order.

you can use a controversial Singleton pattern or you can use one of PARAMETERISE FROM ABOVE approaches described in Mark Radford (Overload Journal #57 – Oct 2003) SINGLETON - the anti-pattern! article.
PARAMETERISE FROM ABOVE approach (in his opinion) strengthen encapsulation and ease initialisation difficulties.
The classic lazy evaluated and correctly destroyed singleton:
class S
{
public:
static S& getInstance()
{
static S instance; // Guaranteed to be destroyed.
// Instantiated on first use.
return instance;
}
private:
S() {}; // Constructor? (the {} brackets) are needed here.
// Dont forget to declare these two. You want to make sure they
// are unaccessable otherwise you may accidently get copies of
// your singleton appearing.
S(S const&); // Don't Implement
void operator=(S const&); // Don't implement
};
But note: this is not thread-safe.
see here for good StackOverflow post about Singletons

Do you consider multiple initialization steps "poor form"?

I'm writing a physics simulation (Ising model) in C++ that operates on square lattices. The heart of my program is my Ising class with a constructor that calls for the row and column dimensions of the lattice. I have two other methods to set other parameters of the system (temperature & initial state) that must get called before evolving the system! So, for instance, a sample program might look like this
int main() {
Ising system(30, 30);
system.set_state(up);
system.set_temperature(2);
for(int t = 0; t < 1000; t++) {
system.step();
}
return 0;
}
If the system.set_*() methods aren't called prior to system.step(), system.step() throws an exception alerting the user to the problem. I implemented it this way to simplify my constructor; is this bad practice?

It is recommended to put all mandatory parameters in the constructor whenever possible (there are exceptions of course, but these should be rare - I have seen one real-world example so far). This way you make your class both easier and safer to use.
Note also that by simplifying your constructor you make the client code more complicated instead, which IMO is a bad tradeoff. The constructor is written only once, but caller code may potentially need to be written many times more (increasing both the sheer amount of code to be written and the chance of errors).

Not at all, IMO. I face the same thing when loading data from external files. When the objects are created (ie, their respective ctors are called), the data is still unavailable and can only be retrieved at a later stage. So I split the initialisation in different stages:
constructor
initialisation (called by the framework engine when an object is activated for the first time)
activation (called each time an object is activated).
This is very specific to the framework I'm developing, but there is no way to deal with everything using just the constructor.
However, if you know the variables at the moment the ctor is called, it's better not to complicate the code. It's a possible source of headaches for anyone using your code.

IMO this is poor form if all of these initialization steps must be invoked every time. One of the goals of well-designed software is to minimize the opportunities to screw up, and having multiple methods which must be invoked before an object is "usable" simply makes it harder to get right. If these calls were optional then having them as separate methods would be fine.
Share and enjoy.

The entire point in a class is to present some kind of abstraction. As a user of a class, I should be able to assume that it behaves like the abstraction it models.
And part of that is that the class must always be valid. Once the object has been created (by calling the constructor), the class must be in a meaningful, valid state. It should be ready to use. If it isn't, then it is no longer a good abstraction.

If the initialization methods must be called in a specific order then I would wrap the call to them in their own method as this indicates that the methods are not atomic on their own so the 'knowledge' of how they should be called should be held in one place.
Well that's my opinion, anyway!

I'd say that setting the initial conditions should be separate from the constructor if you plan to initialize and run more than one transient on the same lattice.
If you run a transient and stop, then it's possible to move setting the initial conditions inside the constructor, but it means that you have to pass in the parameter values in order to do this.
I fully agree with the idea that an object should be 100% ready to be used after its constructor is called, but I think that's separate from the physics of setting the initial temperature field. The object could be fully usable, yet have every node in the problem at the same temperature of absolute zero. A uniform temperature field in an insulated body isn't of much interest from a heat transfer point of view.

As another commentator pointed out, having to call a bunch of initialisation functions is poor form. I would wrap this up in a class:
class SimulationInfo
{
private:
int x;
int y;
int state;
int temperature;
public:
SimulationArgs() : x(30), y (30), state(up), temperature(2) { }; // default ctor
// custom constructors here!
// properties
int x() const { return x; };
int y() const { return y; };
int state() const { return state; };
int temperature() const { return temperature; };
}; // eo class SimulationInfo
class Simulation
{
private:
Ising m_system;
public:
Simulation(const SimulationInfo& _info) : m_system(_info.x(), _info.y())
{
m_system.set_state(_info.state());
m_system.set_temperature(_info.temperature());
} // eo ctor
void simulate(int _steps)
{
for(int step(0); step < _steps; ++steps)
m_system.step();
} // eo simulate
}; // eo class Simulation
There are otherways, but this makes things infinitely more usable from a default setup:
SimulationInfo si; // accept all defaults
Simulation sim(si);
sim.simulate(1000);

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Private members vs temporary variables in C++ - c++

I'd declare _data as temporary variable in most cases. The only drawback is performance, but you'll get way more benefits. You may want to try Prototype pattern if constructing and destructing are really performance killers.

If you want Bar to be initialised only once (due to cost in this case). Then I'd move it to a singleton pattern.

Related

Is there a way to overload classes in a way similar to function overloading?

static object vs singleton class object in c++?

c++ switch vs. member function pointer vs. virtual inheritance

C++ static , extern uses with global data

Do you consider multiple initialization steps "poor form"?

Categories

Resources