Ensuring there's only one of every state - c++

I'm writing a game engine, and right now I was thinking about how I could ensure that every state in the game (be it an entity state, a game state, etc.) has only one instance. Singletons come to mind, but that seems like overkill. One other thing I thought about is nameless classes:
class EntityState
{
public:
virtual void foo() = 0;
};
class : public EntityState
{
public:
void foo() {}
} walkLeftState;
// Because it inherits from EntityState, a named class, I can also
// pass it as a parameter:
void Entity::changeState(EntityState* state) {}
But the problem is I want to add Python scripting, and I don't think Python has nameless classes. What other options are there?
Edit: Why I need only one of every state.
I need a way to identify what state an entity has. I could do it with magic values (i.e. an ID string), but that's a terribly ugly solution, imo. I'd much rather do it by comparing pointers, which are guaranteed to be unique. But I can't identify by pointers unless there's a single instance of every state...
Edit 2: A solution...
I went for function objects in the end. Seeing as the State classes would only contain a function and nothing else, it didn't really make sense having the class in the first place, now that I think about it. In the end, I think I'll go for something like this:
typedef void (*EntityStateFunc)(Entity* entity, unsigned long currentTime);
namespace entitystate
{
void walkLeft(Entity* entity, unsigned long currentTime);
void stand(Entity* entity, unsigned long currentTime);
}
This should pretty much solve everything: no singletons, no complaints that it might make testing harder, it's simple... Only possible disadvantage is that I'll pretty much have my hands tied if a state ever needs to be more than a function; can anyone think of a scenario where a state needs to be something more complex than this?
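For what it's worth, here is a minimal sketch of how identifying a state by function pointer could look. The Entity members (x, state, changeState, isInState, update) are assumptions made up for the example, not part of the engine described above; only the typedef and the two function names come from the question.

```cpp
#include <cassert>

struct Entity;  // forward declaration so the function type can mention it

// Same signature as the typedef in the question.
typedef void (*EntityStateFunc)(Entity* entity, unsigned long currentTime);

namespace entitystate
{
void walkLeft(Entity* entity, unsigned long currentTime);
void stand(Entity* entity, unsigned long currentTime);
}

// Hypothetical minimal Entity: a position plus the current state function.
struct Entity
{
    int x;
    EntityStateFunc state;

    Entity() : x(0), state(entitystate::stand) {}

    void changeState(EntityStateFunc newState) { state = newState; }
    // Identity check: every function has exactly one address.
    bool isInState(EntityStateFunc candidate) const { return state == candidate; }
    void update(unsigned long currentTime) { state(this, currentTime); }
};

namespace entitystate
{
void walkLeft(Entity* entity, unsigned long /*currentTime*/) { --entity->x; }
void stand(Entity* /*entity*/, unsigned long /*currentTime*/) {}
}

// Usage sketch.
inline void stateIdentityExample()
{
    Entity e;
    assert(e.isInState(entitystate::stand));
    e.changeState(entitystate::walkLeft);
    e.update(0);
    assert(e.x == -1);
    assert(!e.isInState(entitystate::stand));
}
```

The comparison needs no IDs and no singleton machinery, because the function itself is the unique thing being compared.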

Just create one of every state.
There is no reason to make your code so complicated by attempting to enforce this.
Do you write a function like this:
int foo() {
int x = 5;
x++;
return x;
}
And then go, oh my god, I only need one integer variable inside this function... I must enforce this? No.

You could avoid creating a hard constraint, and instead ensure that you're notified if and when you ever create more than you expected:
template <typename T>
struct expected_unique {
static int &getcount() {
static int count = 0;
return count;
}
static void object_created() {
int &lcount = getcount();
++lcount;
if (lcount > 1) {
std::cerr << "More than one " << typeid(T).name() << " " << lcount << "\n";
}
}
expected_unique() {
object_created();
}
expected_unique(const expected_unique&) {
object_created();
}
// optionally, if you only want to check no more than one at a time
// rather than no more than one ever.
~expected_unique() {
--getcount();
}
};
class WalkLeftState : public EntityState, private expected_unique<WalkLeftState> {
};
Obviously it's not thread-safe, you could make it so with a lock or with atomic int operations.
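As a rough sketch of the atomic variant mentioned above (assuming C++11 and std::atomic; the names expected_unique_atomic and WalkLeftTag are made up for this example), the counter could look like this:

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <typeinfo>

template <typename T>
struct expected_unique_atomic {
    static std::atomic<int>& count() {
        // Function-local statics are initialized thread-safely since C++11.
        static std::atomic<int> value(0);
        return value;
    }
    static void object_created() {
        // fetch_add returns the previous value, so add one to get the new count.
        int now = count().fetch_add(1) + 1;
        if (now > 1) {
            std::cerr << "More than one " << typeid(T).name() << ": " << now << "\n";
        }
    }
    expected_unique_atomic() { object_created(); }
    expected_unique_atomic(const expected_unique_atomic&) { object_created(); }
    // Counts objects alive at the same time rather than objects ever created.
    ~expected_unique_atomic() { count().fetch_sub(1); }
};

// Hypothetical type used only to exercise the counter.
struct WalkLeftTag : private expected_unique_atomic<WalkLeftTag> {};

// Usage sketch: the second object triggers the warning on std::cerr.
inline void twoObjectsExample()
{
    WalkLeftTag first;
    WalkLeftTag second;
    assert(expected_unique_atomic<WalkLeftTag>::count().load() == 2);
}
```

The increment and the read happen in one atomic operation, so two threads constructing objects at the same moment cannot both see a count of 1.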
If there really is only one function, another alternative is:
class EntityState
{
void (*foo_func)();
public:
EntityState(void(*f)()) : foo_func(f) {}
bool operator==(const EntityState &rhs) const {
return foo_func == rhs.foo_func;
}
bool operator!=(const EntityState &rhs) const {
return !(*this == rhs);
}
void foo() { foo_func(); }
};
void walk_left_foo() {
}
EntityState walkLeftState(walk_left_foo);
Now, it doesn't matter whether or not there are multiple instances of EntityState using the same function, because comparison is performed according to whether the two states involved execute the same routine. So just switch your existing pointer comparisons to object comparisons.
However, if there's more than one virtual function in EntityState in real life, then this would be pretty unwieldy.

Singletons probably are overkill here. It's laudable to be defensive in
your programming, and to prevent misuse, but in this case, the simplest
solution is just to define each of the state classes in an unnamed
namespace in the source file. You're not going to create multiple
instances in this one file, and no one else can even name them, much
less define an instance of one. (You can also leave them unnamed, but
that means no user defined destructor.)
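A minimal single-file sketch of that layout follows. The accessor functions walkLeftState() and standState() and the dx() member are illustrative assumptions; in a real project the first part would live in a header and the second part in one source file.

```cpp
#include <cassert>

// --- what a header would expose ---
class EntityState
{
public:
    virtual ~EntityState() {}
    virtual int dx() const = 0;  // hypothetical: horizontal movement per step
};
EntityState& walkLeftState();
EntityState& standState();

// --- what the single source file would contain ---
namespace
{
// These class names are invisible outside this translation unit,
// so no other file can create further instances.
class WalkLeft : public EntityState
{
public:
    int dx() const { return -1; }
};
class Stand : public EntityState
{
public:
    int dx() const { return 0; }
};
WalkLeft walkLeftInstance;
Stand standInstance;
}

EntityState& walkLeftState() { return walkLeftInstance; }
EntityState& standState() { return standInstance; }

// Usage sketch: states can be told apart by address.
inline void identityExample()
{
    assert(&walkLeftState() == &walkLeftState());
    assert(&walkLeftState() != &standState());
}
```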

I need a way to identify what state an entity has.
Why? To be able to switch on the state? That's exactly what the State pattern should prevent you from doing in the first place. It replaces "switching on states" with polymorphism. That is, instead of this:
switch (state.getState())
{
case WALK_LEFT:
--x;
break;
case WALK_RIGHT:
++x;
break;
case WALK_UP:
--y;
break;
case WALK_DOWN:
++y;
break;
}
you just say:
state.step();
and let the concrete step member functions do the right thing.
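A small sketch of what those concrete step functions might look like, mirroring the four cases of the switch. The Position struct and the class names are assumptions for the example.

```cpp
#include <cassert>

struct Position { int x; int y; };

class MoveState
{
public:
    virtual ~MoveState() {}
    virtual void step(Position& p) const = 0;
};

// One class per former case label.
class WalkLeft : public MoveState { public: void step(Position& p) const { --p.x; } };
class WalkRight : public MoveState { public: void step(Position& p) const { ++p.x; } };
class WalkUp : public MoveState { public: void step(Position& p) const { --p.y; } };
class WalkDown : public MoveState { public: void step(Position& p) const { ++p.y; } };

// Usage sketch: the caller never asks which state it is holding.
inline void stepExample()
{
    Position p = {0, 0};
    WalkLeft left;
    WalkDown down;
    const MoveState* state = &left;
    state->step(p);
    state = &down;
    state->step(p);
    assert(p.x == -1 && p.y == 1);
}
```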

As I "hinted at" in the comments, enforcing a "only one instance may exist" constraint is nearly always the wrong thing to do. Actually, I'm willing to go out on a limb and say that it is always the wrong thing to do.
Instead, make it impossible to accidentally create new instances, but allow the programmer (you) to do so when you really want to.
In C++, there are two common ways in which instances might "accidentally" get created:
EntityState st = otherState;
EntityState st2;
Copy constructors are probably the #1 offender. It's easy to forget the &, so suddenly instead of creating a reference to the object, you create a copy. So prevent this by making the class noncopyable.
The default constructor is a less serious issue, but you might, for example, create a class member of type EntityState, and forget to initialize it. And voilà, the default constructor gives you a new instance of the class.
So prevent that too. Declare a constructor taking one or more parameters (which you don't accidentally call), ensure that no default constructor exists, and that the copy constructor is private.
Then you have to consciously think about it and want to create an instance before it happens. And so, as long as the programmer is sane, you'll only have one instance whenever you want just one instance to exist.
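A sketch of such a class, assuming C++11 so that the copy operations can be deleted rather than declared private. The name parameter is a made-up example of "a constructor taking one or more parameters".

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class EntityState
{
public:
    // Taking an argument means no default constructor is generated.
    explicit EntityState(const std::string& name) : name_(name) {}

    // Copying is disabled, so forgetting an & becomes a compile error.
    EntityState(const EntityState&) = delete;
    EntityState& operator=(const EntityState&) = delete;

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

static_assert(!std::is_default_constructible<EntityState>::value,
              "no accidental default construction");
static_assert(!std::is_copy_constructible<EntityState>::value,
              "no accidental copies");

// Usage sketch: references work, copies do not compile.
inline void noncopyableExample()
{
    EntityState walkLeft("walkLeft");
    const EntityState& ref = walkLeft;  // fine: no new instance
    assert(&ref == &walkLeft);
    // EntityState copy = walkLeft;     // would not compile
}
```

A second instance is still possible, but only by deliberately writing another constructor call.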

Related

C++ Get/Set accessors - how do I avoid typing repetitive code?

I'm writing a pretty large library, and I find myself writing almost identical accessors all the time. I already have several dozen accessors such as the one below.
Question: How can I declare/implement accessors to save typing all this repetitive code? (No #defines please; I'm looking for C++ constructs.)
Update: Yes, I do need accessor functions, because I need to take pointers to these accessors for something called Property Descriptors, which enable huge savings in my GUI code (non-library).
.h file
private:
bool _visible;
public:
bool GetVisible() const { return _visible; }
void SetVisible (bool value);
// Repeat for Get/SetFlashing, Get/SetColor, Get/SetLineWidth, etc.
.cpp file
void Element::SetVisible (bool value)
{
_visible = value;
this->InvalidateSelf(); // Call method in base class
// ...
// A bit more code here, identical in 90% of my setters.
// ...
}
// Repeat for Get/SetFlashing, Get/SetColor, Get/SetLineWidth, etc.
I find myself writing almost identical accessors all the time. I already have several dozen accessors such as the one below.
This is a sure design smell that you are writing accessors "for the sake of it". Do you really need them all? Do you really need a low-level public "get" and "set" operation for each one? It's unlikely.
After all, if all you're doing is writing a getter and a setter for each private data member, and each one has the same logic, you may as well have just made the data members public.
Rather your class should have meaningful and semantic operations that, in the course of their duties, may or may not make use of private data members. You will find that each of these meaningful operations is quite different from the rest, and so your problem with repetitive code is vanquished.
As n.m. said:
Easy: avoid accessors. Program your classes to do something, rather than have something.
Even for those operations which have nothing more to them, like controlling visibility, you should have a bool isVisible() const, and a void show(), and a void hide(). You'll find that when you start coding like this it will promote a move away from boilerplate "for the sake of it" getters & setters.
Whilst I think Lightness Races in Orbit makes a very good point, there are also a few techniques for cutting down repeated code, assuming we do indeed have a class with many similar things that need to be controlled individually. Continuing in that vein, say we have a couple of methods like this:
void Element::Show()
{
visible = true;
Invalidate();
// More code goes here.
}
void Element::Hide()
{
visible = false;
Invalidate();
// More code goes here.
}
Now, to my view, this breaks the DRY (Do not Repeat Yourself) principle, so we should probably do something like this:
void Element::UpdateProperty(bool &property, bool newValue)
{
property = newValue;
Invalidate();
// More code goes here.
}
Now, we can implement Show and Hide, Flash, Unflash, Shaded etc by doing this, avoiding repetition inside each function.
void Element::Show()
{
UpdateProperty(visible, true);
}
If the type isn't always bool, e.g. there is a position, we can do:
template<typename T>void Element::UpdateProperty(T &property, T newValue)
{
property = newValue;
Invalidate();
// More code goes here.
}
and the MoveTo becomes:
void Element::MoveTo(Point p)
{
UpdateProperty(position, p);
}
Edit based on previously undisclosed information added to question:
Obviously the above technique can equally be applied to any form of function that does this sort of work:
void Element::SetVisible(bool value)
{
UpdateProperty(visible, value);
}
will work just as well as for Show described above. It doesn't mean you can get away from declaring the functions, but it reduces the need for code inside the function.
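Putting the pieces together, a compilable sketch might look like the following. The Point struct, the invalidation counter, and the trailing-underscore member names are assumptions added so the example can be checked; Invalidate here only counts calls instead of repainting anything.

```cpp
#include <cassert>

struct Point { int x; int y; };

class Element
{
public:
    Element() : visible_(false), position_(), invalidations_(0) {}

    void SetVisible(bool value) { UpdateProperty(visible_, value); }
    void MoveTo(Point p) { UpdateProperty(position_, p); }

    bool GetVisible() const { return visible_; }
    Point GetPosition() const { return position_; }
    int invalidations() const { return invalidations_; }

private:
    // The shared part of every setter lives in exactly one place.
    template <typename T>
    void UpdateProperty(T& property, const T& newValue)
    {
        property = newValue;
        Invalidate();
    }
    void Invalidate() { ++invalidations_; }

    bool visible_;
    Point position_;
    int invalidations_;
};

// Usage sketch.
inline void updatePropertyExample()
{
    Element e;
    e.SetVisible(true);
    Point p = {3, 4};
    e.MoveTo(p);
    assert(e.GetVisible());
    assert(e.invalidations() == 2);
}
```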
I agree with Lightness. You should design your classes for the task at hand, and if you need so many getters and setters, you may be doing something wrong.
That said, most good IDEs allow you to generate simple getters and setters, and some might even allow you to customize them. You might save the repetitive code as a template and select the code fragment whenever needed.
You may also use a customizable editor like emacs and Vim (with Ultisnips) and create some custom helping functions to make your job easy. The task is ripe for automation.
The only time you should ever write a get/set set of functions in any language is if it does something other than just read or write to a simple variable; don't bother wrapping up access to data if all you're doing is making it harder for people to read. If that's what you're doing, stop doing anything.
If you ever do want a set of get/set functions, don't call them get and set -- use assignment and type casting (and do it cleverly). That way you can make your code more readable instead of less.
This is very inelegant:
class get_set {
int value;
public:
int get() { return value; }
void set(int v) { value = v; }
};
This is a bit better
class get_set_2 {
value_type value;
bool needs_updating;
public:
operator value_type const & () {
if (needs_updating) update(); // details to be found elsewhere
return value;
}
get_set_2& operator = (value_type t) {
update(t); // details to be found elsewhere
return *this;
}
};
If you're not doing the second pattern, don't do anything.
I'm a tad late again, but I wanted to answer because I don't totally agree with some others here, and think there are additional points to lay out.
It's difficult to say for sure whether your access methods are code smells without seeing a larger codebase or having more information about intent. Everyone here is right about one thing: access methods are generally to be avoided unless they do some 'significant work', or they expose data for the purpose of generic-ism (particularly in libraries).
So, we can go ahead and call methods like the idiomatic data() from STL containers, 'trivial access method'.
Why not use trivial access methods?
First, as others have noted, this can lead to an over-exposure of implementation details. At its best such exposure makes for tedious code, and at its worst it can lead to obfuscation of ownership semantics, resource leaks, or fatal exceptions. Exposure is fundamentally opposite of object orientation, because each object ought to manage its own data, and operations.
Secondly, code tends to become long, hard to test, and hard to maintain, as you have noted.
When to use trivial access methods?
Usually when their intent is specific, and non-trivial. For example, the STL containers' data() function exists to intentionally expose implementation details for the purposes of genericism for the standard library.
Procedural style-structs
Breaking away from strictly object-oriented style, as implementations sometimes do, you may want to consider a simple struct (or class, if you prefer) that acts as a data carrier; that is, one with all, or mostly, public properties. I would advise using a struct only for simple holders. A class, by contrast, ought to be used to establish some invariant in the constructor. In addition to private methods, static methods are a good way to express invariants in a class; a validation method, for example. Establishing the invariant on public data also works very well for immutable data.
An example:
// just holds some fields
struct simple_point {
int x, y;
};
// holds some fields, but asserts the invariant that coordinates
// must be in [0, 10].
class small_point {
public:
int x, y;
small_point() noexcept : x{}, y{} {}
small_point(int u, int v)
{
if (!small_point::valid(u) || !small_point::valid(v)) {
throw std::invalid_argument("small_point: Invalid coordinate.");
}
x = u;
y = v;
}
static bool valid(int v) noexcept { return 0 <= v && v <= 10; }
};

When do I need anonymous class in C++?

There's a feature called anonymous class in C++. It's similar to anonymous struct in C. I think this feature was invented because of some need, but I can't figure out what that is.
Can I have some example which really needs anonymous class?
The feature is there because struct and class are the same thing - anything you can do with one, you can do with the other. It serves exactly the same purpose as an anonymous struct in C; when you want to group some stuff together and declare one or more instances of it, but don't need to refer to that type by name.
It's less commonly used in C++, partly because C++ designs tend to be more type-oriented, and partly because you can't declare constructors or destructors for anonymous classes.
It is not really needed in a strict sense and never was. I.e. you could always assign a name, for example anonymous1, anonymous2 etc. But keeping track of more names than necessary is always a hassle.
Where it is helpful is at any place where one wants to group data without giving a name to that group. I could come up with several examples:
class foo {
class {
public:
void validate( int x ) { m_x = x; }
bool valid() { return m_exists; }
private:
int m_x;
bool m_exists;
} maybe_x;
};
In this case the int and the bool logically belong together, so it makes sense to group them. However for this concrete example it probably makes sense to create an actual optional type or use one of the available ones, because this pattern is most likely used at other places as well. In other cases this pattern of grouping might be so special, that it deserves to stay in that class only.
I really do assume though, that anonymous classes are rarely used (I have probably only used them a couple of times in my life). Often when one wants to group data, the grouping is not class or scope specific but also makes sense at other places.
Maybe it was sometimes helpful to make nested functions like:
void foo() {
class {
public:
void operator()(){
}
} bar;
bar();
}
But now we have lambdas and anonymous classes are left only for compatibility reasons.
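For comparison, a sketch of the lambda equivalent (C++11 or later; the function sumUpTo is a made-up example). Unlike the local anonymous class, the lambda can capture local variables directly.

```cpp
#include <cassert>

inline int sumUpTo(int n)
{
    int total = 0;
    // Plays the role of the local functor, but captures total by reference.
    auto add = [&total](int value) { total += value; };
    for (int i = 1; i <= n; ++i) {
        add(i);
    }
    return total;
}

// Usage sketch.
inline void lambdaExample()
{
    assert(sumUpTo(4) == 10);
}
```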
The use of anonymous classes is for preserving compatibility with existing C code.
Example:
In some C code, the use of typedef in conjunction with anonymous structures is prevalent.
There is an example of anonymous structs that can be used with Qt 5's Signal/Slot system with ANY class and without the QObject derivative requirement:
void WorkspaceWidget::wwShowEvent()
{
//Show event: query a reload of the saved state and geometry
gcmessage("wwShowEvent "+ this->title());
struct{void* t; void operator()(){ static_cast<WorkspaceWidget*>(t)->wwReloadWindowState(); }}f;
f.t=this;
QObject::connect( &reloadStateTimer, &QTimer::timeout, f);
reloadStateTimer.start();
}
void WorkspaceWidget::wwReloadWindowState()
{
gcmessage( dynamic_cast<QObject*>(this)->metaObject()->className());
}
Basically, I need to connect a timer signal to a non-QObject derived class, but want to pass my "this" properly.
QObject::connect can be connected to ordinary function in Qt 5, so this anonymous class is actually a functor that keeps the this pointer in itself, still passing the slot connection.
Also you can do things with auto in anonymous (vs2015)
struct {
auto* operator->() {return this;}
//do other functions
} mystruct;

Conventions for accessor methods (getters and setters) in C++

Several questions about accessor methods in C++ have been asked on SO, but none was able to satisfy my curiosity on the issue.
I try to avoid accessors whenever possible, because, like Stroustrup and other famous programmers, I consider a class with many of them a sign of bad OO. In C++, I can in most cases add more responsibility to a class or use the friend keyword to avoid them. Yet in some cases, you really need access to specific class members.
There are several possibilities:
1. Don't use accessors at all
We can just make the respective member variables public. This is a no-go in Java, but seems to be OK with the C++ community. However, I'm a bit worried about cases where an explicit copy or a read-only (const) reference to an object should be returned, is that exaggerated?
2. Use Java-style get/set methods
I'm not sure if it's from Java at all, but I mean this:
int getAmount(); // Returns the amount
void setAmount(int amount); // Sets the amount
3. Use objective C-style get/set methods
This is a bit weird, but apparently increasingly common:
int amount(); // Returns the amount
void amount(int amount); // Sets the amount
In order for that to work, you will have to find a different name for your member variable. Some people append an underscore, others prepend "m_". I don't like either.
Which style do you use and why?
From my perspective, maintaining 4 million lines of C++ code (and that's just one project), I would say:
It's ok to not use getters/setters if members are immutable (i.e. const) or simple with no dependencies (like a point class with members X and Y).
If member is private only it's also ok to skip getters/setters. I also count members of internal pimpl-classes as private if the .cpp unit is smallish.
If member is public or protected (protected is just as bad as public) and non-const, non-simple or has dependencies then use getters/setters.
As a maintenance guy my main reason for wanting to have getters/setters is because then I have a place to put break points / logging / something else.
I prefer the style of alternative 2. as that's more searchable (a key component in writing maintainable code).
2) is the best IMO, because it makes your intentions clearest. set_amount(10) is more meaningful than amount(10), and as a nice side effect allows a member named amount.
Public variables are usually a bad idea, because there's no encapsulation. Suppose you need to update a cache or refresh a window when a variable is updated? Too bad if your variables are public. If you have a set method, you can add it there.
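To illustrate the cache case, here is a small sketch. The Account class and its cached label are invented for the example; the point is only that the setter gives the update a place to live.

```cpp
#include <cassert>
#include <string>

class Account
{
public:
    Account() : amount_(0), label_("amount: 0") {}

    int get_amount() const { return amount_; }
    const std::string& label() const { return label_; }

    void set_amount(int amount)
    {
        amount_ = amount;
        // The cached text is refreshed here, in one place. With a public
        // member, every writer would have to remember to do this.
        label_ = "amount: " + std::to_string(amount);
    }

private:
    int amount_;
    std::string label_;  // hypothetical cache derived from amount_
};

// Usage sketch.
inline void setterExample()
{
    Account a;
    a.set_amount(10);
    assert(a.get_amount() == 10);
    assert(a.label() == "amount: 10");
}
```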
I never use this style, because it can limit the future of your class design, and explicit getters or setters are just as efficient with a good compiler.
Of course, in reality inline explicit getters or setters create just as much underlying dependency on the class implementation. They just reduce semantic dependency. You still have to recompile everything if you change them.
This is my default style when I use accessor methods.
This style seems too 'clever' to me. I do use it on rare occasions, but only in cases where I really want the accessor to feel as much as possible like a variable.
I do think there is a case for simple bags of variables with possibly a constructor to make sure they're all initialized to something sane. When I do this, I simply make it a struct and leave it all public.
That is a good style if we just want to represent pure data.
I don't like it :) because get_/set_ is really unnecessary when we can overload them in C++.
STL uses this style, such as std::stringstream::str and std::ios_base::flags, except when it should be avoided. When? When the method's name conflicts with another type's name, the get_/set_ style is used, such as std::string::get_allocator because of std::allocator.
In general, I feel that it is not a good idea to have too many getters and setters being used by too many entities in the system. It is just an indication of a bad design or wrong encapsulation.
Having said that, if such a design needs to be refactored, and the source code is available, I would prefer to use the Visitor Design pattern. The reason is:
a. It gives a class an opportunity to decide whom to allow access to its private state
b. It gives a class an opportunity to decide what access to allow to each of the entities who are interested in its private state
c. It clearly documents such external access via a clear class interface
Basic idea is:
a) Redesign if possible, else
b) Refactor such that
All access to class state is via a well known individualistic interface
It should be possible to configure some kind of do's and don'ts to each such interface, e.g. all access from external entity GOOD should be allowed, all access from external entity BAD should be disallowed, and external entity OK should be allowed to get but not set (for example)
I would not exclude accessors from use. Maybe for some POD structures, but I consider them a good thing (some accessors might have additional logic, too).
The naming convention doesn't really matter, as long as you are consistent in your code. If you are using several third party libraries, they might use different naming conventions anyway. So it is a matter of taste.
I've seen the idealization of classes instead of integral types to refer to meaningful data.
Something like this below is generally not making good use of C++ properties:
struct particle {
float mass;
float acceleration;
float velocity;
} p;
Why? Because the result of p.mass*p.acceleration is a float and not force as expected.
The definition of classes to designate a purpose (even if it's a value, like amount mentioned earlier) makes more sense, and allow us to do something like:
struct amount
{
int value;
amount() : value( 0 ) {}
amount( int value0 ) : value( value0 ) {}
operator int()& { return value; }
operator int()const& { return value; }
amount& operator = ( int const newvalue )
{
value = newvalue;
return *this;
}
};
You can access the value in amount implicitly by the operator int. Furthermore:
struct wage
{
amount balance;
operator amount()& { return balance; }
operator amount()const& { return balance; }
wage& operator = ( amount const& newbalance )
{
balance = newbalance;
return *this;
}
};
Getter/Setter usage:
void wage_test()
{
wage worker;
(amount&)worker = 100; // if you like this, can remove = operator
worker = amount(105); // an alternative if the first one is too weird
int value = (amount)worker; // getting amount is more clear
}
This is a different approach, doesn't mean it's good or bad, but different.
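Coming back to the particle example above, a sketch of how distinct types could make mass times acceleration yield a force rather than a bare float. The type names, the unit-suffixed members, and the operator* overloads are assumptions for illustration (C++11 brace initialization).

```cpp
#include <cassert>

struct Mass { float kg; };
struct Acceleration { float mps2; };
struct Force { float newtons; };

// Only this combination of operands produces a Force.
inline Force operator*(Mass m, Acceleration a)
{
    return Force{ m.kg * a.mps2 };
}
inline Force operator*(Acceleration a, Mass m)
{
    return m * a;
}

// Usage sketch.
inline void forceExample()
{
    Mass m{2.0f};
    Acceleration a{3.0f};
    Force f = m * a;
    assert(f.newtons == 6.0f);
    // Mass wrong = m * m;  // would not compile: no such operator
}
```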
An additional possibility could be :
int& amount();
I'm not sure I would recommend it, but it has the advantage that the unusual notation can discourage users from modifying data.
str.length() = 5; // Ok string is a very bad example :)
Sometimes it is maybe just the good choice to make:
image(point) = 255;
Another possibility again, use functional notation to modify the object.
edit::change_amount(obj, val)
This way dangerous/editing functions can be pulled away into a separate namespace with its own documentation. This one seems to come naturally with generic programming.
Let me tell you about one additional possibility, which seems the most concise.
Need to read & modify
Simply declare that variable public:
class Worker {
public:
int wage = 5000;
};
worker.wage = 8000;
cout << worker.wage << endl;
Need just to read
class Worker {
int _wage = 5000;
public:
inline int wage() {
return _wage;
}
};
worker.wage = 8000; // error !!
cout << worker.wage() << endl;
The downside of this approach is that you need to change all the calling code (add parentheses, that is) when you want to change the access pattern.
variation on #3, i'm told this could be 'fluent' style
class foo {
private: int bar_; // a data member cannot share its name with a member function
private: int narf_;
public: foo & bar(int);
public: int bar();
public: foo & narf(int);
public: int narf();
};
//multi set (get is as expected)
foo f; f.bar(2).narf(3);
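A compilable sketch of that fluent style. A data member cannot share a name with a member function in the same class, so this version assumes underscore-suffixed member names; the bodies are my guess at the obvious implementation.

```cpp
#include <cassert>

class foo {
public:
    foo() : bar_(0), narf_(0) {}

    // Setters return *this so that calls can be chained.
    foo& bar(int value) { bar_ = value; return *this; }
    int bar() const { return bar_; }

    foo& narf(int value) { narf_ = value; return *this; }
    int narf() const { return narf_; }

private:
    int bar_;
    int narf_;
};

// Usage sketch.
inline void fluentExample()
{
    foo f;
    f.bar(2).narf(3);
    assert(f.bar() == 2);
    assert(f.narf() == 3);
}
```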

Private members vs temporary variables in C++

Suppose you have the following code:
int main(int argc, char** argv) {
Foo f;
while (true) {
f.doSomething();
}
}
Which of the following two implementations of Foo are preferred?
Solution 1:
class Foo {
private:
void doIt(Bar& data);
public:
void doSomething() {
Bar _data;
doIt(_data);
}
};
Solution 2:
class Foo {
private:
Bar _data;
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
In plain English: if I have a class with a method that gets called very often, and this method defines a considerable amount of temporary data (either one object of a complex class, or a large number of simple objects), should I declare this data as private members of the class?
On the one hand, this would save the time spent on constructing, initializing and destructing the data on each call, improving performance. On the other hand, it tramples on the "private member = state of the object" principle, and may make the code harder to understand.
Does the answer depend on the size/complexity of class Bar? What about the number of objects declared? At what point would the benefits outweigh the drawbacks?
From a design point of view, using temporaries is cleaner if that data is not part of the object state, and should be preferred.
Never make design choices on performance grounds before actually profiling the application. You might just discover that you end up with a worse design that is actually not any better than the original design performance wise.
To all the answers that recommend to reuse objects if construction/destruction cost is high, it is important to remark that if you must reuse the object from one invocation to another, in many cases the object must be reset to a valid state between method invocations and that also has a cost. In many such cases, the cost of resetting can be comparable to construction/destruction.
If you do not reset the object state between invocations, the two solutions could yield different results: only on the first call is the argument freshly initialized, and on later calls it still carries whatever state the previous invocation left behind.
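A sketch of the reset issue, using a std::vector as a stand-in for Bar. The class EvenCollector and its member names are invented for the example. Without the clear() call, the reused version would return a growing count on every call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class EvenCollector
{
public:
    // Solution 1 style: a fresh temporary on every call.
    std::size_t countLocal(const std::vector<int>& input) const
    {
        std::vector<int> scratch;
        collect(scratch, input);
        return scratch.size();
    }

    // Solution 2 style: a reused member that must be reset first.
    std::size_t countReused(const std::vector<int>& input)
    {
        scratch_.clear();  // drops stale elements; this reset is not free either
        collect(scratch_, input);
        return scratch_.size();
    }

private:
    static void collect(std::vector<int>& out, const std::vector<int>& input)
    {
        for (std::size_t i = 0; i < input.size(); ++i) {
            if (input[i] % 2 == 0) {
                out.push_back(input[i]);
            }
        }
    }

    std::vector<int> scratch_;
};

// Usage sketch: both styles agree only because of the reset.
inline void resetExample()
{
    std::vector<int> values;
    values.push_back(1);
    values.push_back(2);
    values.push_back(4);
    EvenCollector c;
    assert(c.countLocal(values) == 2u);
    assert(c.countReused(values) == 2u);
    assert(c.countReused(values) == 2u);  // second call still 2
}
```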
Thread safety also has a great impact on this decision. Auto variables inside a function are created on the stack of each thread, and as such are inherently thread safe. Any optimization that moves those local variables out so that they can be reused between invocations will complicate thread safety, and could even end up with a performance penalty due to contention that worsens overall performance.
Finally, if you want to keep the object between method invocations, I would still not make it a private member of the class (it is not part of the class) but rather an implementation detail: a static function variable, a global in an unnamed namespace in the compilation unit where doOperation is implemented, or a member of a PIMPL. The first two share the data among all objects, while the last shares it only among invocations on the same object. Users of your class do not care how you solve things, as long as you do it safely and document that the class is not thread safe.
// foo.h
class Foo {
public:
void doOperation();
private:
void doIt( Bar& data );
};
// foo.cpp
void Foo::doOperation()
{
static Bar reusable_data;
doIt( reusable_data );
}
// alternatively, in foo.cpp
namespace {
Bar reusable_global_data;
}
void Foo::doOperation()
{
doIt( reusable_global_data );
}
// pimpl foo.h
class Foo {
public:
void doOperation();
private:
class impl_t;
boost::scoped_ptr<impl_t> impl;
};
// foo.cpp
class Foo::impl_t {
private:
Bar reusable;
public:
void doIt(); // uses this->reusable instead of argument
};
void Foo::doOperation() {
impl->doIt();
}
First of all it depends on the problem being solved. If you need to persist the values of temporary objects between calls you need a member variable. If you need to reinitialize them on each invocation, use local temporary variables. It is a question of the task at hand, not of being right or wrong.
Temporary variables construction and destruction will take some extra time (compared to just persisting a member variable) depending on how complex the temporary variables classes are and what their constructors and destructors have to do. Deciding whether the cost is significant should only be done after profiling, don't try to optimize it "just in case".
I'd declare _data as temporary variable in most cases. The only drawback is performance, but you'll get way more benefits. You may want to try Prototype pattern if constructing and destructing are really performance killers.
If it is semantically correct to preserve a value of Bar inside Foo, then there is nothing wrong with making it a member: every Foo then has-a Bar.
There are multiple scenarios where it might not be correct, e.g.
1. If you have multiple threads performing doSomething, would they all need separate Bar instances, or could they accept a single one?
2. Would it be bad if state from one computation carried over to the next computation?
Most of the time, issue 2 is the reason to create local variables: you want to be sure to start from a clean state.
Like a lot of coding answers it depends.
Solution 1 is a lot more thread-safe. So if doSomething were being called by many threads I'd go for Solution 1.
If you're working in a single threaded environment and the cost of creating the Bar object is high, then I'd go for Solution 2.
In a single threaded environment, if the cost of creating Bar is low, then I think I'd go for Solution 1.
You have already considered "private member=state of the object" principle, so there is no point in repeating that, however, look at it in another way.
A bunch of methods, say a, b, and c take the data "d" and work on it again and again. No other methods of the class care about this data. In this case, are you sure a, b and c are in the right class?
Would it be better to create another smaller class and delegate, where d can be a member variable? Such abstractions are difficult to think of, but often lead to great code.
Just my 2 cents.
Is that an extremely simplified example? If not, what's wrong with doing it this
void doSomething(Bar data);
int main() {
while (true) {
doSomething();
}
}
way? If doSomething() is a pure algorithm that needs some data (Bar) to work with, why would you need to wrap it in a class? A class is for wrapping a state (data) and the ways (member functions) to change it.
If you just need a piece of data then use just that: a piece of data. If you just need an algorithm, then use a function. Only if you need to keep a state (data values) between invocations of several algorithms (functions) working on them, a class might be the right choice.
I admit that the borderlines between these are blurred, but IME they make a good rule of thumb.
If it's really that temporary that costs you the time, then I would say there is nothing wrong with including it in your class as a member. But note that this may make your function thread-unsafe if it is used without proper synchronization. Once again, this depends on the use of _data.
I would, however, mark such a variable as mutable. If you read a class definition with a member being mutable, you can immediately assume that it doesn't account for the value of its parent object.
class Foo {
private:
mutable Bar _data;
private:
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
This will also make it possible to use _data as a mutable entity inside a const function - just like you could use it as a mutable entity if it was a local variable inside such a function.
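A minimal sketch of that last point, under the assumption that the cached member is a plain buffer (scratch_ and sumOfSquares are made-up names, not part of the original question): a const member function can still reuse a mutable member as working storage.

```cpp
#include <cassert>
#include <vector>

class Foo {
public:
    // const member function: writing to scratch_ is allowed because the
    // member is declared mutable.
    int sumOfSquares(const std::vector<int>& input) const {
        scratch_.clear();  // start from a clean state, keep the capacity
        for (int x : input) scratch_.push_back(x * x);
        int sum = 0;
        for (int x : scratch_) sum += x;
        return sum;
    }
private:
    mutable std::vector<int> scratch_;  // working storage, not part of Foo's value
};

void usage_example() {
    const Foo foo{};  // even a const object can use the buffer
    assert(foo.sumOfSquares({1, 2, 3}) == 14);
    assert(foo.sumOfSquares({4}) == 16);  // no carry-over from the previous call
}
```

The thread-safety caveat still applies: two threads calling sumOfSquares on the same object would race on scratch_ unless access is synchronized.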
If you want Bar to be initialised only once (because of its cost, in this case), then I'd move it to a singleton pattern.
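One common way to get once-only construction is a function-local static. This is only a sketch: sharedBar and the constructions counter are hypothetical, and the Bar here is a stand-in for the real, expensive type.

```cpp
#include <cassert>

struct Bar {
    Bar() { ++constructions; }
    static int constructions;  // counts how often the (assumed costly) setup ran
    int value = 42;
};
int Bar::constructions = 0;

// Constructed on the first call only; since C++11 this initialization
// is guaranteed to be thread-safe.
Bar& sharedBar() {
    static Bar instance;
    return instance;
}

void doSomething() {
    Bar& data = sharedBar();
    data.value += 1;  // note: state persists across calls
}

void usage_example() {
    doSomething();
    doSomething();
    assert(Bar::constructions == 1);  // built once
    assert(sharedBar().value == 44);  // both calls modified the same object
}
```

Only the initialization is thread-safe here. Concurrent calls to doSomething would still need their own synchronization, and state carries over from one call to the next.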

Optional function parameters: Use default arguments (NULL) or overload the function?

I have a function that processes a given vector, but may also create such a vector itself if it is not given.
I see two design choices for such a case, where a function parameter is optional:
Make it a pointer and make it NULL by default:
void foo(int i, std::vector<int>* optional = NULL) {
if(optional == NULL){
optional = new std::vector<int>();
// fill vector with data
}
// process vector
}
Or have two functions with an overloaded name, one of which leaves out the argument:
void foo(int i, const std::vector<int>& optional) {
// process vector
}
void foo(int i) {
std::vector<int> vec;
// fill vec with data
foo(i, vec); // calls the two-argument overload declared above
}
Are there reasons to prefer one solution over the other?
I slightly prefer the second one because I can make the vector a const reference, since it is, when provided, only read, not written. Also, the interface looks cleaner (isn't NULL just a hack?). And the performance difference resulting from the indirect function call is probably optimized away.
Yet, I often see the first solution in code. Are there compelling reasons to prefer it, apart from programmer laziness?
I would not use either approach.
In this context, the purpose of foo() seems to be to process a vector. That is, foo()'s job is to process the vector.
But in the second version of foo(), it is implicitly given a second job: to create the vector. The semantics between foo() version 1 and foo() version 2 are not the same.
Instead of doing this, I would consider having just one foo() function to process a vector, and another function which creates the vector, if you need such a thing.
For example:
void foo(int i, const std::vector<int>& optional) {
// process vector
}
std::vector<int>* makeVector() {
return new std::vector<int>;
}
Obviously these functions are trivial, and if all makeVector() needs to do to get its job done is literally just call new, then there may be no point in having the makeVector() function. But I'm sure that in your actual situation these functions do much more than what is being shown here, and my code above illustrates a fundamental approach to semantic design: give one function one job to do.
The design I have above for the foo() function also illustrates another fundamental approach that I personally use in my code when it comes to designing interfaces, which includes function signatures, classes, etc. That is this: I believe that a good interface is 1) easy and intuitive to use correctly, and 2) difficult or impossible to use incorrectly. In the case of the foo() function we are implicitly saying that, with my design, the vector is required to already exist and be 'ready'. By designing foo() to take a reference instead of a pointer, it is both intuitive that the caller must already have a vector, and they are going to have a hard time passing in something that isn't a ready-to-go vector.
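Here is a small sketch of the one-job-per-function split. It differs from the snippet above in ways that are my own assumptions: makeVector takes a size and returns the vector by value (so there is no ownership to manage), and foo returns a placeholder result so that the example can be checked.

```cpp
#include <cassert>
#include <vector>

// One job: build the vector. The contents are an assumed placeholder.
std::vector<int> makeVector(int n) {
    std::vector<int> v;
    for (int k = 0; k < n; ++k) v.push_back(k);
    return v;
}

// One job: process a vector that already exists (here: a scaled sum).
int foo(int i, const std::vector<int>& values) {
    int total = 0;
    for (int x : values) total += i * x;
    return total;
}

void usage_example() {
    const std::vector<int> v = makeVector(4);  // {0, 1, 2, 3}
    assert(foo(2, v) == 12);
    assert(foo(1, makeVector(3)) == 3);  // a temporary binds to the const reference
}
```

Because foo takes a const reference, a caller cannot pass "no vector" by accident, and the result of makeVector can be passed directly.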
I would definitely favour the 2nd approach of overloaded methods.
The first approach (optional parameters) blurs the definition of the method as it no longer has a single well-defined purpose. This in turn increases the complexity of the code, making it more difficult for someone not familiar with it to understand it.
With the second approach (overloaded methods), each method has a clear purpose. Each method is well-structured and cohesive. Some additional notes:
If there's code which needs to be duplicated into both methods, this can be extracted out into a separate method and each overloaded method could call this external method.
I would go a step further and name each method differently to indicate the differences between the methods. This will make the code more self-documenting.
While I do understand the complaints of many people regarding default parameters and overloads, there seems to be a lack of understanding of the benefits that these features provide.
Default Parameter Values:
First I want to point out that in the initial design of a project there should be little to no use for defaults if it is well designed. Where defaults' greatest asset comes into play is with existing projects and well-established APIs. I work on projects that consist of millions of existing lines of code and do not have the luxury of re-coding them all. So when you wish to add a new feature that requires an extra parameter, a default is needed for the new parameter. Otherwise you will break everyone who uses your project. That would be fine with me personally, but I doubt your company or the users of your product/API would appreciate having to re-code their projects on every update. Simply put, defaults are great for backwards compatibility! This is usually the reason you will see defaults in big APIs or existing projects.
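To illustrate the backwards-compatibility argument with a made-up API (greet and its parameters are hypothetical): suppose version 1 took only a name, and version 2 adds a greeting parameter with a default.

```cpp
#include <cassert>
#include <string>

// Version 1 (hypothetical): std::string greet(const std::string& name);
// Version 2 adds a parameter; the default keeps old call sites compiling.
std::string greet(const std::string& name,
                  const std::string& greeting = "Hello") {
    return greeting + ", " + name + "!";
}

void usage_example() {
    assert(greet("Ada") == "Hello, Ada!");               // old call site, unchanged
    assert(greet("Ada", "Welcome") == "Welcome, Ada!");  // new call site
}
```

This keeps existing callers source-compatible. The function's signature has still changed, so those callers need to be recompiled against the new declaration.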
Function Overloads:
The benefit of function overloads is that they allow for the sharing of a functionality concept, but with different options/parameters. However, many times I see function overloads lazily used to provide starkly different functionality with just slightly different parameters. In this case they should each be separately named functions, pertaining to their specific functionality (as with the OP's example).
These features of C/C++ are good and work well when used properly, which can be said of almost any programming feature. It is when they are abused or misused that they cause problems.
Disclaimer:
I know that this question is a few years old, but since these answers came up in my search results today (2012), I felt this needed further addressing for future readers.
I agree, I would use two functions. Basically, you have two different use cases, so it makes sense to have two different implementations.
I find that the more C++ code I write, the fewer parameter defaults I have - I wouldn't really shed any tears if the feature was deprecated, though I would have to re-write a shed load of old code!
A reference can't be NULL in C++, so a good solution would be to use a Nullable template.
This would let you check whether a value was passed, e.g. with ref.GetSet() in the class below.
Here you can use this:
template<class T>
class Nullable {
public:
Nullable() {
m_set = false;
}
explicit
Nullable(T value) {
m_value = value;
m_set = true;
}
Nullable(const Nullable &src) {
m_set = src.m_set;
if(m_set)
m_value = src.m_value;
}
Nullable & operator =(const Nullable &RHS) {
m_set = RHS.m_set;
if(m_set)
m_value = RHS.m_value;
return *this;
}
bool operator ==(const Nullable &RHS) const {
if(!m_set && !RHS.m_set)
return true;
if(m_set != RHS.m_set)
return false;
return m_value == RHS.m_value;
}
bool operator !=(const Nullable &RHS) const {
return !operator==(RHS);
}
bool GetSet() const {
return m_set;
}
const T &GetValue() const {
return m_value;
}
T GetValueDefault(const T &defaultValue) const {
if(m_set)
return m_value;
return defaultValue;
}
void SetValue(const T &value) {
m_value = value;
m_set = true;
}
void Clear()
{
m_set = false;
}
private:
T m_value;
bool m_set;
};
Now you can have
void foo(int i, const Nullable<AnyClass> &optional = Nullable<AnyClass>()) {
// a default-constructed Nullable has no value set
if(!optional.GetSet()) {
// no value was passed
}
}
I usually avoid the first case. Note that those two functions differ in what they do: one fills a vector with some data, the other doesn't (it just accepts the data from the caller). I tend to give different names to functions that actually do different things. In fact, even as you write them, they are two functions:
foo_default (or just foo)
foo_with_values
At least I find this distinction cleaner in the long term, and for the occasional user of the library or functions.
I, too, prefer the second one. While there is not much difference between the two, you are basically using the functionality of the primary method in the foo(int i) overload, and the primary overload works perfectly well without caring about the existence or absence of the other one, so there is more separation of concerns in the overload version.
In C++ you should avoid allowing valid NULL parameters whenever possible. The reason is that it substantially reduces callsite documentation. I know this sounds extreme, but I work with APIs that take upwards of 10-20 parameters, half of which can validly be NULL. The resulting code is almost unreadable:
SomeFunction(NULL, pName, NULL, pDestination);
If you were to switch it to require const references, the code is simply forced to be more readable:
SomeFunction(
Location::Hidden(),
pName,
SomeOtherValue::Empty(),
pDestination);
I'm squarely in the "overload" camp. Others have added specifics about your actual code example but I wanted to add what I feel are the benefits of using overloads versus defaults for the general case.
Any parameter can be "defaulted"
No gotcha if an overriding function uses a different value for its default.
It's not necessary to add "hacky" constructors to existing types in order to allow them to have defaults.
Output parameters can be defaulted without needing to use pointers or hacky global objects.
To put some code examples on each:
Any parameter can be defaulted:
class A {}; class B {}; class C {};
void foo (A const &, B const &, C const &);
inline void foo (A const & a, C const & c)
{
foo (a, B (), c); // 'B' defaulted
}
No danger of overriding functions having different values for the default:
class A {
public:
virtual void foo (int i = 0);
};
class B : public A {
public:
virtual void foo (int i = 100);
};
void bar (A & a)
{
a.foo (); // Always uses '0', regardless of the dynamic type of 'a'
}
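The gotcha can be made visible with a runnable variant of the same classes. The return values are my own addition so that the behaviour can be checked: the default argument is chosen from the static type, while the function body is chosen from the dynamic type.

```cpp
#include <cassert>

class A {
public:
    virtual ~A() = default;
    virtual int foo(int i = 0) { return i; }
};

class B : public A {
public:
    int foo(int i = 100) override { return i + 1; }  // +1 marks B's body
};

int bar(A& a) {
    return a.foo();  // default from A (static type), body from the dynamic type
}

void usage_example() {
    A a;
    B b;
    assert(bar(a) == 0);     // A's default, A's body
    assert(bar(b) == 1);     // A's default (0), but B's body (+1)
    assert(b.foo() == 101);  // static type B: B's default, B's body
}
```

With overloads instead of defaults, the "default" value lives inside a function body, so it follows normal virtual dispatch.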
It's not necessary to add "hacky" constructors to existing types in order to allow them to be defaulted:
struct POD {
int i;
int j;
};
void foo (POD p); // Adding default (other than {0, 0})
// would require constructor to be added
inline void foo ()
{
POD p = { 1, 2 };
foo (p);
}
Output parameters can be defaulted without needing to use pointers or hacky global objects:
void foo (int i, int & j); // Default requires global "dummy"
// or 'j' should be pointer.
inline void foo (int i)
{
int j;
foo (i, j);
}
The only exception to the rule re overloading versus defaults is for constructors where it's currently not possible for a constructor to forward to another. (I believe C++ 0x will solve that though).
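For reference, the feature alluded to here did arrive in C++11 as delegating constructors. A brief sketch with a hypothetical Widget class:

```cpp
#include <cassert>

class Widget {
public:
    Widget(int width, int height) : width_(width), height_(height) {}
    // Delegating constructors (C++11): forward to the two-argument one,
    // which plays the role a default argument would otherwise play.
    Widget() : Widget(640, 480) {}
    explicit Widget(int side) : Widget(side, side) {}
    int area() const { return width_ * height_; }
private:
    int width_;
    int height_;
};

void usage_example() {
    assert(Widget().area() == 307200);  // 640 * 480
    assert(Widget(3).area() == 9);
    assert(Widget(2, 5).area() == 10);
}
```

With a C++11 or later compiler, constructors can therefore follow the same overload-instead-of-default approach as ordinary functions.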
I would favour a third option:
Separate into two functions, but do not overload.
Overloads, by nature, are less usable. They require the user to become aware of two options and figure out what the difference between them is, and if they're so inclined, to also check the documentation or the code to ensure which is which.
I would have one function that takes the parameter,
and one that is called "createVectorAndFoo" or something like that (obviously naming becomes easier with real problems).
While this violates the "two responsibilities for function" rule (and gives it a long name), I believe this is preferable when your function really does do two things (create vector and foo it).
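A sketch of what that could look like. The name createVectorAndFoo comes from the suggestion above, while the default contents and the counting logic in foo are assumed placeholders:

```cpp
#include <cassert>
#include <vector>

// Processes a vector supplied by the caller (placeholder: counts entries > i).
int foo(int i, const std::vector<int>& values) {
    int count = 0;
    for (int x : values) {
        if (x > i) ++count;
    }
    return count;
}

// The distinct name states both jobs: build the default data, then foo it.
int createVectorAndFoo(int i) {
    std::vector<int> values;
    for (int k = 0; k < 10; ++k) values.push_back(k);  // assumed default: 0..9
    return foo(i, values);
}

void usage_example() {
    assert(foo(0, {1, 2, 0}) == 2);
    assert(createVectorAndFoo(6) == 3);  // 7, 8 and 9 are greater than 6
}
```

A caller reading either call site can tell which behaviour is being requested without looking up an overload set.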
Generally I agree with others' suggestion to use a two-function approach. However, if the vector created when the 1-parameter form is used is always the same, you could simplify things by instead making it static and using a default const& parameter instead:
// Either at global scope, or (better) inside a class
static std::vector<int> default_vector = populate_default_vector();
void foo(int i, std::vector<int> const& optional = default_vector) {
...
}
The first way is poorer because you cannot tell if you accidentally passed in NULL or if it was done on purpose... if it was an accident then you have likely caused a bug.
With the second one you can test (assert, whatever) for NULL and handle it appropriately.