"State pattern" vs "one member function per state"? - c++

My class has 3 states. In each state it does some work, then either moves to another state or remains in the same one (in 95% or more of cases it will stay in the same state). I could implement the State pattern (I assume you know it). The alternative, which I rather like, is this:
I have one member function per state, plus a pointer-to-member-function that points to the current state's function. When, in a state, I want to go to another state, I just point that function pointer at another state function. (Maybe this isn't completely equivalent to the State pattern, but in my case it works fine; a sketch of this approach follows the questions below.)
Those two ways are almost identical, I think.
So, my questions are:
Which solution is better (and on what does that depend)?
Is it worth declaring a class per state (each of which would have only one function)? I think that would be artificial.
What about performance? Doesn't creating a new state object (in the case of the State pattern) bring a slight overhead? (Sure, state classes shouldn't have data members, but it should still cost something.)
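For concreteness, here is a minimal sketch of the pointer-to-member-function approach described above; the state names (idle, running, stopped) are invented for illustration:

class Machine {
public:
    Machine() : current_(&Machine::idle) {}

    // Called once per tick; dispatches to whatever state we are in.
    void update() { (this->*current_)(); }

private:
    void idle()    { current_ = &Machine::running; }  // do work, then transition
    void running() { /* in ~95% of ticks we simply stay here */ }
    void stopped() { current_ = &Machine::idle; }

    void (Machine::*current_)();  // points at the current state's function
};

int main() {
    Machine m;
    m.update();  // runs idle(), which repoints current_ at running()
    m.update();  // runs running()
}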

You don't really mention the constraints under which your program will run, so it's hard to comment specifically on the overheads of one implementation versus the other; I'll just make a comment about code maintainability.
Personally I think that unless your state machine is extremely simple and will stay simple, declaring a class per state is far more maintainable, extensible and readable. A good rule of thumb might be that if you can't look at the code in your class and keep the entire picture in your head, then your class is probably doing too much. The small overhead you pay in declaring a class per state is likely to be well worth the productivity gains you (or anyone else who ends up maintaining the code) will get from modular code. I've come across far too many 'uber' classes that are essentially one big (very hard to maintain) state machine, and that probably started out as a simple state machine, to recommend otherwise.
The 'S' and 'O' portions of the SOLID acronym (https://en.wikipedia.org/wiki/SOLID_(object-oriented_design)) are always good things to keep in mind.
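To make the class-per-state suggestion concrete, here is a minimal sketch (state names invented for illustration). Stateless state objects can be shared statics, so switching states allocates nothing:

struct Context;

struct State {
    virtual ~State() = default;
    virtual State& update(Context& ctx) = 0;  // do this state's work, return the next state
};

struct Running : State {
    State& update(Context&) override { return *this; }  // usually stays put
};

struct Idle : State {
    State& update(Context&) override;
};

// Shared instances: no allocation per transition.
Running runningState;
Idle idleState;

State& Idle::update(Context&) { return runningState; }

struct Context {
    void update() { current = &current->update(*this); }
    State* current = &idleState;
};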

It depends on whether you need access to private members of your object or not. If not, an out-of-class implementation breaks your code into smaller fragments and may be preferable for that reason (but this is subjective: the two solutions have pros and cons).
It's not necessary, but it adds a layer of abstraction and loosens the coupling. Using an interface, you can change each implementation without affecting the others (e.g. adding class fields...).
It doesn't matter much: allocating a new empty class and calling a function have the same order of magnitude of overhead.

Related

C++: When is method redefinition preferred over virtual method override? [duplicate]

I know that virtual functions have the overhead of an indirection when calling a method. But I guess that with the speed of modern architectures it is almost negligible.
Is there any particular reason why all functions in C++ are not virtual as in Java?
From my knowledge, declaring a function virtual in the base class is sufficient and necessary to allow overriding. Now, when I write a parent class, I might not know which methods will get overridden. So does that mean that someone writing a child class would have to edit the parent class? That sounds inconvenient, and sometimes impossible.
Update:
Summarizing from Jon Skeet's answer below:
It's a trade-off: you explicitly make someone realize that they are inheriting functionality (which has potential risks in itself; check Jon's response) and gain potentially small performance wins, in exchange for less flexibility, more code changes, and a steeper learning curve.
Other reasons from different answers:
Virtual functions generally cannot be inlined, because the call target is only resolved at runtime. This has performance impacts when you expect your functions to benefit from inlining.
There might be potentially other reasons, and I would love to know and summarize them.
There are good reasons for controlling which methods are virtual beyond performance. While I don't actually make most of my methods final in Java, I probably should... unless a method is designed to be overridden, it probably shouldn't be virtual IMO.
Designing for inheritance can be tricky - in particular it means you need to document far more about what might call it and what it might call. Imagine if you have two virtual methods, and one calls the other - that must be documented, otherwise someone could override the "called" method with an implementation which calls the "calling" method, unwittingly creating a stack overflow (or infinite loop if there's tail call optimization). At that point you've then got less flexibility in your implementation - you can't switch it round at a later date.
Note that C# is a similar language to Java in various ways, but chose to make methods non-virtual by default. Some other people aren't keen on this, but I certainly welcome it - and I'd actually prefer that classes were uninheritable by default too.
Basically, it comes down to this advice from Josh Bloch: design for inheritance or prohibit it.
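To illustrate the documentation hazard Jon describes, here is a minimal sketch (class and method names invented for illustration) where an innocent-looking override recurses forever:

#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string name() { return fullName(); }  // undocumented: delegates to fullName()
    virtual std::string fullName() { return "base"; }
};

class Derived : public Base {
public:
    // The author never documented that name() calls fullName(), so this
    // override innocently calls name() back: infinite recursion at runtime.
    std::string fullName() override { return name() + " (derived)"; }
};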
One of the main C++ principles is: you only pay for what you use ("zero overhead principle"). If you don't need the dynamic dispatch mechanism, you shouldn't pay for its overhead.
As the author of the base class, you should decide which methods should be allowed to be overridden. If you're writing both, go ahead and refactor what you need. But it works this way, because there has to be a way for the author of the base class to control its use.
But I guess with modern architectural speed it is almost negligible.
This assumption is wrong and is, I guess, the main reason for this decision.
Consider the case of inlining. C++'s std::sort performs much faster than C's otherwise similar qsort in some scenarios because it can inline its comparator argument, while C cannot (due to the use of function pointers). In extreme cases, this can mean a performance difference of as much as 700% (Scott Meyers, Effective STL).
The same would be true for virtual functions. We’ve had similar discussions before; for instance, Is there any reason to use C++ instead of C, Perl, Python, etc?
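A minimal sketch of that inlining difference (the helper names are mine): qsort must call the comparator through an opaque function pointer on every comparison, while std::sort's comparison can typically be inlined into the generated sorting code:

#include <algorithm>
#include <cstdlib>
#include <vector>

int cmp(const void* a, const void* b) {
    int x = *static_cast<const int*>(a), y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // avoids the overflow of x - y
}

void sort_both(std::vector<int>& v) {
    std::qsort(v.data(), v.size(), sizeof(int), cmp);  // indirect call per comparison
    std::sort(v.begin(), v.end(),
              [](int a, int b) { return a < b; });     // comparison typically inlined
}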
Most answers deal with the overhead of virtual functions, but there are other reasons not to make every function in a class virtual, such as the fact that it changes the class from standard-layout to, well, non-standard-layout, and that can be a problem if you need to serialize binary data. This is solved differently in C#, for example, by making structs a different family of types than classes.
From the design point of view, every public function establishes a contract between your type and the users of the type, and every virtual function (public or not) establishes a different contract with the classes that extend your type. The greater the number of such contracts that you sign, the less room for change you have. As a matter of fact, there are quite a few people, including some well-known writers, who argue that the public interface should never contain virtual functions, as the commitments you make to your clients might differ from the commitments you require from your extensions. That is, the public interface shows what you do for your clients, while the virtual interface shows how others might help you do it.
Another effect of virtual functions is that they always get dispatched to the final overrider (unless you explicitly qualify the call), and that means that any function that is needed to maintain your invariants (think of the state of the private variables) should not be virtual: if a class extends it, it will have to either make an explicit qualified call back to the parent or else break the invariants at your level.
This is similar to the infinite loop/stack overflow example that @Jon Skeet mentioned, just in a different way: you have to document in each function whether it accesses any private attributes, so that extensions can ensure the function is called at the right time. And that in turn means that you are breaking encapsulation and have a leaky abstraction: your internal details are now part of the interface (documentation + requirements on your extensions), and you cannot modify them as you wish.
Then there is performance... There will be an impact on performance, but in most cases it is overrated, and it could be argued that only in the few cases where performance is critical would you fall back and declare the functions non-virtual. Then again, that might not be simple in a shipped product, since the two interfaces (public + extensions) are already bound.
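A minimal sketch of the "no virtual functions in the public interface" idea mentioned above (the Non-Virtual Interface idiom); all names here are invented for illustration:

class Widget {
public:
    // Contract with clients: non-virtual, checks invariants, never changes meaning.
    void draw() {
        // ... validate state, set up resources, etc. ...
        doDraw();  // the single, well-defined customisation point
    }
    virtual ~Widget() = default;

private:
    // Contract with extenders: how derived classes help, not what clients call.
    virtual void doDraw() = 0;
};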
You forget one thing: the overhead is also in memory. You add a virtual table per class and a pointer to that table in each object. Now, if you have a class with a significant number of expected instances, this is not negligible: for example, a million instances equals 4 megabytes (with 4-byte pointers). I agree that for a simple application this is not much, but for real-time devices such as routers it counts.
I'm rather late to the party here, so I'll add one thing that I haven't noticed covered in other answers, and summarise quickly...
Usability in shared memory: a typical implementation of virtual dispatch has a pointer to a class-specific virtual dispatch table in each object. The addresses in these pointers are specific to the process creating them, which means multi-process systems accessing objects in shared memory can't dispatch using another process's object! That's an unacceptable limitation given shared memory's importance in high-performance multi-process systems.
Encapsulation: the ability of a class designer to control the members accessed by client code, ensuring class semantics and invariants are maintained. For example, if you derive from std::string (I may get a few comments for daring to suggest that ;-P) then you can use all the normal insert / erase / append operations and be sure that - provided you don't do anything that's always undefined behaviour for std::string like pass bad position values to functions - the std::string data will be sound. Someone checking or maintaining your code doesn't have to check if you've changed the meaning of those operations. For a class, encapsulation ensures freedom to later modify the implementation without breaking client code. Another perspective on the same statement: client code can use the class any way it likes without being sensitive to the implementation details. If any function can be changed in a derived class, that whole encapsulation mechanism is simply blown away.
Hidden dependencies: when you know neither what other functions are dependent on the one you're overriding, nor that the function was designed to be overridden, then you can't reason about the impact of your change. For example, you think "I've always wanted this", and change std::string::operator[]() and at() to consider negative values (after a type-cast to signed) to be offsets backwards from the end of the string. But, perhaps some other function was using at() as a kind of assertion that an index was valid - knowing it'll throw otherwise - before attempting an insertion or deletion... that code might go from throwing in a Standard-specified way to having undefined (but likely lethal) behaviour.
Documentation: by making a function virtual, you're documenting that it is an intended point of customisation, and part of the API for client code to use.
Inlining - code side & CPU usage: virtual dispatch complicates the compiler's job of working out when to inline function calls, and could therefore provide worse code in terms of both space/bloat and CPU usage.
Indirection during calls: even if an out-of-line call is being made either way, there's a small performance cost for virtual dispatch that may be significant when calling trivially simple functions repeatedly in performance critical systems. (You have to read the per-object pointer to the virtual dispatch table, then the virtual dispatch table entry itself - means the VDT pages are consuming cache too.)
Memory usage: the per-object pointers to virtual dispatch tables may represent significant wasted memory, especially for arrays of small objects (see the sketch after this list). This means fewer objects fit in cache, and can have a significant performance impact.
Memory layout: it's essential for performance, and highly convenient for interoperability, that C++ can define classes with the exact memory layout of member data specified by network or data standards of various libraries and protocols. That data often comes from outside your C++ program, and may be generated in another language. Such communications and storage protocols won't have "gaps" for pointers to virtual dispatch tables, and as discussed earlier - even if they did, and the compiler somehow let you efficiently inject the correct pointers for your process over incoming data, that would frustrate multi-process access to the data. Crude-but-practical pointer/size based serialisation/deserialisation/comms code would also be made more complicated and potentially slower.
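To put a number on the memory-usage point, here is a tiny sketch; the exact sizes are implementation-defined, but on a typical 64-bit ABI the vptr adds 8 bytes per object:

#include <cstdio>

struct Plain   { int x, y; };                        // no vtable
struct Virtual { int x, y; virtual ~Virtual() {} };  // vptr stored in every object

int main() {
    std::printf("%zu %zu\n", sizeof(Plain), sizeof(Virtual));  // e.g. prints "8 16"
}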
Pay per use (in Bjarne Stroustrup's words).
Seems like this question might have some answers: Virtual functions should not be used excessively - Why?. In my opinion the one thing that stands out is that virtuality just adds more complexity in terms of knowing what can be done with inheritance.
Yes, it's because of performance overhead. Virtual methods are called using virtual tables and indirection.
In Java all methods are virtual and the overhead is also present. But, contrary to C++, the JIT compiler profiles the code at run time and can inline those methods which don't use this property. So the JVM knows where inlining is really needed and where it is not, freeing you from making the decision on your own.
The issue is that while Java compiles to code that runs on a virtual machine, that same guarantee can't be made for C++. It is common to use C++ as a more organized replacement for C, and C has a nearly 1:1 translation to assembly.
If you consider that 9 out of 10 microprocessors in the world are not in a personal computer or a smartphone, you'll see the issue when you further consider that there are a lot of processors that need this low-level access.
C++ was designed to avoid that hidden dereferencing if you don't need it, thus keeping that near-1:1 nature. Some of the first C++ compilers actually translated to C as an intermediate step before running the result through a C-to-assembly compiler.
Java method calls are far more efficient than C++ due to runtime optimization.
What we need is to compile C++ into bytecode and run it on JVM.

A question about encapsulation and inheritance practices

I've heard people say that having protected members kind of defeats the point of encapsulation and is not best practice; one should design the program such that derived classes will not need access to private base class members.
An example situation
Now, imagine the following scenario: a simple 8-bit game with a bunch of different objects, such as regular boxes that act as obstacles, spikes, coins, moving platforms, etc. The list can go on.
All of them have x and y coordinates, a rectangle that specifies the size of the object, a collision box, and a texture. They can also share functions like setting position, rendering, loading the texture, checking for collision, etc.
But some of them also need to modify base members; e.g. boxes can be pushed around, so they might need a move function; some objects may move by themselves; and maybe some blocks change texture in-game.
Therefore a base class like Object could really come in handy, but that would require either a ton of getters and setters or making the private members protected instead. Either way, it compromises encapsulation.
Given the anecdotal context, which would be a better practice:
1. Have a common base class with shared functions and members, declared as protected. Be able to use common functions and pass a reference to the base class to non-member functions which only need access to shared properties, but compromise encapsulation.
2. Have a separate class for each, declare the member variables as private and don't compromise encapsulation.
3. A better way that I couldn't have thought.
I don't think encapsulation is vitally important here, and the way to go for this particular case would probably be just having protected members, but my goal with this question is writing well-practised, standard code rather than solving that specific problem.
Thanks in advance.
First off, I'm going to start by saying there is no one-size-fits-all answer to design. Different problems require different solutions; however, there are design patterns that are often more maintainable over time than others.
Indeed, a lot of design suggestions pay off most in a team environment -- but good practices are useful for solo projects as well, so that the code is easier to understand and change in the future.
Sometimes the person who needs to understand your code will be you, a year from now -- so keep that in mind😊
I've heard people saying that having protected members kind of breaks the point of encapsulation
Like any tool, it can be misused; but there is nothing about protected access that inherently breaks encapsulation.
What defines the encapsulation of your object is the intended projected API surface area. Sometimes, that protected member is logically part of the surface-area -- and this is perfectly valid.
If misused, protected members can give derived classes access to mutable members that may break the class's intended invariants -- which would be bad. An example of this would be if you could derive from a class exposing a rectangle and set the width/height to a negative value. Functions in the base class, such as compute_area, could suddenly yield wrong values -- causing cascading failures that better encapsulation would otherwise have guarded against.
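A tiny sketch of that failure mode (names invented for illustration):

class Shape {
protected:
    int width_  = 0;  // intended invariant: never negative
    int height_ = 0;
public:
    int compute_area() const { return width_ * height_; }
};

class Sprite : public Shape {
public:
    void shrink() {
        width_ -= 10;  // nothing stops this from going negative:
    }                  // compute_area() now silently returns garbage
};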
As for the design of your example in question:
Base classes are not necessarily a bad thing, but can easily be overused and can lead to "god" classes that unintentionally expose too much functionality in an effort to share logic. Over time this can become a maintenance burden and just an overall confusing mess.
Your example sounds better suited to composition, with some smaller interfaces (a sketch follows this list):
Things like a point and a vector type would be base types used to produce higher-order compositions like rectangle.
These could then be composed together to create a model which handles general (logical) objects in 2D space that have collision.
Intersection/collision logic can be handled by an outside utility class.
Rendering can be handled by a renderable interface, where any class that needs to render extends this interface.
Intersection handling logic can be handled by an intersectable interface, which determines an object's behavior on intersection (this effectively abstracts each of the game objects into raw behaviors).
etc
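One possible shape of that composition, as a sketch; all names here are hypothetical:

struct Point { int x = 0, y = 0; };
struct Rect  { Point pos; int w = 0, h = 0; };

struct Renderable {
    virtual ~Renderable() = default;
    virtual void render() const = 0;
};

struct Intersectable {
    virtual ~Intersectable() = default;
    virtual Rect bounds() const = 0;
    virtual void onIntersect(Intersectable& other) = 0;
};

// Free utility: collision logic lives outside the game objects.
bool intersects(const Rect& a, const Rect& b) {
    return a.pos.x < b.pos.x + b.w && b.pos.x < a.pos.x + a.w &&
           a.pos.y < b.pos.y + b.h && b.pos.y < a.pos.y + a.h;
}

// A coin composes only the small pieces it needs.
class Coin : public Renderable, public Intersectable {
    Rect box_;
public:
    void render() const override { /* draw the coin's texture */ }
    Rect bounds() const override { return box_; }
    void onIntersect(Intersectable&) override { /* collected! */ }
};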
Encapsulation is not a security thing, it's a neatness thing (and hence a supportability and readability thing). You have to assume that people deriving classes are basically sensible: they are, after all, either writing programs of their own using your base classes (so who cares), or writing in a team with you.
The primary purpose of "encapsulation" in object-oriented programming is to limit direct access to data in order to minimize dependencies, and where dependencies must exist, to express those in terms of functions not data.
This ties in with Design by Contract, where you allow "public" access to certain functions and reserve the right to modify others arbitrarily, at any time, for any reason, even to the point of removing them, by marking them "protected".
That is, you could have a game object like:
class Enemy {
public:
    int getHealth() const;
};
Where the getHealth() function returns an int value expressing the health. How does it derive this value? It's not for the caller to know or care. Maybe it's byte 9 of a binary packet you just received. Maybe it's a string from a JSON object. It doesn't matter.
Most importantly because it doesn't matter you're free to change how getHealth() works internally without breaking any code that's dependent on it.
However, if you expose a public int health member, that opens up a whole world of problems. What if it is manipulated incorrectly? What if it's set to an invalid value? How do you trap access to that member being manipulated?
It's much easier when you have setHealth(const int health) where you can do things like:
clamp it to a particular range
trigger an event when it exceeds certain bounds
update a saved game state
transmit an update over the network
hook in other "observers" which might need to know when that value is manipulated
None of those things are easily implemented without encapsulation.
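As a sketch, a setHealth along those lines might look like this (assuming C++17 for std::clamp; onDeath and the observer list are hypothetical):

#include <algorithm>
#include <functional>
#include <vector>

class Enemy {
public:
    int getHealth() const { return health_; }

    void setHealth(int health) {
        health_ = std::clamp(health, 0, maxHealth_);  // clamp to a valid range
        if (health_ == 0) onDeath();                  // trigger an event on a bound
        for (auto& observer : observers_)             // notify hooked-in observers
            observer(health_);
        // saving game state or a network update would hook in here too
    }

    void addObserver(std::function<void(int)> fn) { observers_.push_back(std::move(fn)); }

private:
    void onDeath() { /* ... */ }
    int health_ = 100;
    int maxHealth_ = 100;
    std::vector<std::function<void(int)>> observers_;
};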
protected is not just a "get off my lawn" thing, it's an important tool to ensure that your implementation is used correctly and as intended.

Long delegation chains in C++

This is definitely subjective, but I'd like to try to avoid it becoming argumentative. I think it could be an interesting question if people treat it appropriately.
In several of my recent projects I implemented architectures where long delegation chains are a common thing.
Two-level delegation chains are encountered very often:
bool Exists = Env->FileSystem->FileExists( "foo.txt" );
And triple delegation is not rare at all:
Env->Renderer->GetCanvas()->TextStr( ... );
Delegation chains of higher order exist but are really scarce.
In the above examples no NULL run-time checks are performed, since the objects used are always there, are vital to the functioning of the program, and are explicitly constructed when execution starts. Basically, I split a delegation chain in these cases:
1) I reuse the object obtained through a delegation chain:
{ // make C invisible to the parent scope
clCanvas* C = Env->Renderer->GetCanvas();
C->TextStr( ... );
C->TextStr( ... );
C->TextStr( ... );
}
2) An intermediate object somewhere in the middle of the delegation chain should be checked for NULL before use, e.g.:
clCanvas* C = Env->Renderer->GetCanvas();
if ( C ) C->TextStr( ... );
I used to fight case (2) by providing proxy objects, so that the method is always invoked on a non-NULL object and just produces an empty result.
My questions are:
Is either of cases (1) or (2) a pattern or an antipattern?
Is there a better way to deal with long delegation chains in C++?
Here are some pros and cons I considered while making my choice:
Pros:
it is very descriptive: it is clear from one line of code where the object came from
long delegation chains look nice
Cons:
interactive debugging is laborious, since it is hard to inspect more than one temporary object in the delegation chain
I would like to know other pros and cons of long delegation chains. Please present your reasoning, and vote based on how well-argued an opinion is, not on how much you agree with it.
I wouldn't go so far as to call either an anti-pattern. However, the first has the disadvantage that your variable C is visible even after it's logically relevant (its scope is wider than it needs to be).
You can get around this by using this syntax:
if (clCanvas* C = Env->Renderer->GetCanvas()) {
C->TextStr( ... );
/* some more things with C */
}
This is allowed in C++ (while it's not in C) and allows you to keep proper scope (C is scoped as if it were inside the conditional's block) and check for NULL.
Asserting that something is not NULL is by all means better than being killed by a segfault. So I wouldn't recommend simply skipping these checks, unless you're 100% sure that the pointer can never, ever be NULL.
Additionally, you could encapsulate your checks in an extra free function, if you feel particularly dandy:
#include <cassert>

template <typename T>
T notNULL(T value) {
    assert(value);  // fires in debug builds if any link of the chain is NULL
    return value;
}
// e.g.
notNULL(notNULL(Env)->Renderer->GetCanvas())->TextStr();
In my experience, chains like that often contain getters that are less than trivial, leading to inefficiencies. I think that (1) is a reasonable approach. Using proxy objects seems like overkill; I would rather see a crash on a NULL pointer than use proxy objects.
Such long delegation chains should not happen if you follow the Law of Demeter. I've often argued with some of its proponents that they were holding themselves to it too conscientiously, but if you have come to the point of wondering how best to handle long delegation chains, you should probably be a little more compliant with its recommendations.
Interesting question, I think this is open to interpretation, but:
My Two Cents
Design patterns are just reusable solutions to common problems which are generic enough to be widely applied in object oriented (usually) programming. Many common patterns will start you out with interfaces, inheritance chains, and/or containment relationships that will result in you using chaining to call things to some extent. The patterns are not trying to solve a programming issue like this though - chaining is just a side effect of them solving the functional problems at hand. So, I wouldn't really consider it a pattern.
Equally, anti-patterns are approaches that (in my mind) counter-act the purpose of design patterns. For example, design patterns are all about structure and the adaptability of your code. People consider a singleton an anti-pattern because it (often, not always) results in spider-web like code due to the fact that it inherently creates a global, and when you have many, your design deteriorates fast.
So, again, your chaining problem doesn't necessarily indicate good or bad design - it's not related to the functional objectives of patterns or the drawbacks of anti-patterns. Some designs just have a lot of nested objects even when designed well.
What to do about it:
Long delegation chains can definitely be a pain in the butt after a while, and as long as your design dictates that the pointers in those chains won't be reassigned, I think saving a temporary pointer to the point in the chain you're interested in is completely fine (at function scope or narrower, preferably).
Personally though, I'm against saving a permanent pointer to a part of the chain as a class member as I've seen that end up in people having 30 pointers to sub objects permanently stored, and you lose all conception of how the objects are laid out in the pattern or architecture you're working with.
One other thought - I'm not sure if I like this or not, but I've seen some people create a private (for your sanity) function that navigates the chain, so you can call that and not deal with questions about whether your pointer changes under the covers, or whether you have NULLs. It can be nice to wrap all that logic up once, put a nice comment at the top of the function stating which part of the chain it gets the pointer from, and then just use the function result directly in your code instead of spelling out your delegation chain each time.
Performance
My last note would be that this wrap-in-function approach, as well as your delegation-chain approach, both suffer from performance drawbacks. Saving a temporary pointer lets you avoid the extra two dereferences, potentially many times, if you're using these objects in a loop. Equally, storing the pointer from the function call avoids the overhead of an extra function call every loop cycle.
For bool Exists = Env->FileSystem->FileExists( "foo.txt" ); I'd rather go for an even more detailed breakdown of your chain, so in my ideal world, there are the following lines of code:
Environment* env = GetEnv();
FileSystem* fs = env->FileSystem;
bool exists = fs->FileExists( "foo.txt" );
and why? Some reasons:
readability: my attention gets lost by the time I reach the end of a line like bool Exists = Env->FileSystem->FileExists( "foo.txt" );. It's just too long for me.
validity: regardless of your having mentioned that the objects are always there, if your company hires a new programmer tomorrow and he starts writing code, the day after tomorrow the objects might not be there. These long lines are pretty unfriendly; new people might get scared of them and do something interesting such as "optimising" them... which will take a more experienced programmer extra time to fix.
debugging: if by any chance (and after you have hired that new programmer) the application throws a segmentation fault somewhere in the long chain, it is pretty difficult to find out which object was the guilty one. The more detailed the breakdown, the easier it is to find the location of the bug.
speed: if you need to make lots of calls to the same chain elements, it might be faster to "pull out" a local variable from the chain instead of calling a "proper" getter function each time. I don't know if your code is production code or not, but it seems to miss the "proper" getter functions; instead it seems to use only the attributes.
Long delegation chains are a bit of a design smell to me.
What a delegation chain tells me is that one piece of code has deep access to an unrelated piece of code, which makes me think of high coupling, which goes against the SOLID design principles.
The main problem I have with this is maintainability. If you're reaching two levels deep, that is two independent pieces of code that could evolve on their own and break under you. This quickly compounds when you have functions inside the chain, because they can contain chains of their own - for example, Renderer->GetCanvas() could be choosing the canvas based on information from another hierarchy of objects and it is difficult to enforce a code path that does not end up reaching deep into objects over the life time of the code base.
The better way would be to create an architecture that obeyed the SOLID principles and used techniques like Dependency Injection and Inversion Of Control to guarantee your objects always have access to what they need to perform their duties. Such an approach also lends itself well to automated and unit testing.
Just my 2 cents.
If possible, I would use references instead of pointers, so that delegates are guaranteed to return valid objects or throw an exception.
clCanvas & C = Env.Renderer().GetCanvas();
For objects which may not exist, I would provide additional methods such as Has, Is, etc.
if ( Env.HasRenderer() ) clCanvas* C = Env.Renderer().GetCanvas();
If you can guarantee that all the objects exist, I don't really see a problem in what you're doing. As others have mentioned, even if you think that NULL will never happen, it may just happen anyway.
This being said, I see that you use bare pointers everywhere. What I would suggest is that you start using smart pointers instead. A checked smart pointer can be written so that its -> operator throws when the pointer is NULL, so you avoid a segfault. Not only that: if you use smart pointers, you can keep copies and the objects don't just disappear under your feet; you have to explicitly reset each smart pointer before the pointer goes to NULL.
This being said, it wouldn't prevent the -> operator from throwing once in a while.
Otherwise I would rather use the approach proposed by AProgrammer. If object A needs a pointer to object C, which is pointed to by object B, then the work that object A is doing is probably something that object B should actually be doing. So A can guarantee that it has a pointer to B at all times (because it holds a shared pointer to B, which thus cannot go NULL), and it can always call a function on B to perform action Z on object C. Inside function Z, B knows whether it always has a pointer to C or not; that's part of B's implementation.
Note that with C++11 you have std::shared_ptr<> and std::unique_ptr<>, so use them!

Pointer dereferencing overhead vs branching / conditional statements

In heavy loops, such as the ones found in game applications, there can be many factors that decide which part of the loop body is executed (for example, a character object will be updated differently depending on its current state), so instead of doing:
void my_loop_function(int dt) {
if (conditionX && conditionY)
doFoo();
else
doBar();
...
}
I am used to using a function pointer that points to a certain logic function corresponding to the character's current state, as in:
void (*updater)(int);
void something_happens() {
updater = &doFoo;
}
void something_else_happens() {
updater = &doBar;
}
void my_loop_function(int dt) {
(*updater)(dt);
...
}
And in the case where I don't want to do anything, I define a dummy function and point to it when I need to:
void do_nothing(int dt) { }
Now what I'm really wondering is: am I obsessing over this needlessly? The example given above is of course simple; sometimes I need to check many variables to figure out which pieces of code to execute, so I figured that using these "state" function pointers would be more optimal, and to me more natural, but a few people I'm dealing with strongly disagree.
So, is the gain from using a (virtual) function pointer worth it, instead of filling my loops with conditional statements to direct the flow of logic?
Edit: to clarify how the pointer is being set, it's done through event handling on a per-object basis. When an event occurs and, say, that character has custom logic attached to it, it sets the updater pointer in that event handler until another event occurs which will change the flow once again.
Thank you
The function pointer approach lets you make the transitions asynchronous. Rather than just passing dt to the updater, pass the object as well. Now the updater can itself be responsible for the state transitions. This localizes the state transition logic instead of globalizing it in one big ugly if ... else if ... else function.
As far as the cost of this indirection, do you care? You might care if your updaters are so extremely small that the cost of a dereference plus a function call overwhelms the cost of executing the updater code. If the updaters are of any complexity, that complexity is going to overwhelm the cost of this added flexibility.
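A minimal sketch of that suggestion, with all names hypothetical: the object carries its updater, and each updater performs its own transitions right where the state's logic lives:

struct Character;
using Updater = void (*)(Character&, int);

void idle(Character&, int);
void chasing(Character&, int);

struct Character {
    Updater updater = idle;
    void update(int dt) { updater(*this, dt); }
};

void idle(Character& c, int /*dt*/) {
    // The transition logic lives next to the state's own code.
    if (/* player spotted */ true) c.updater = chasing;
}

void chasing(Character& /*c*/, int /*dt*/) { /* pursue the player */ }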
I think I'll agree with the non-believers here. The money question in this case is: how is the pointer value going to be set?
If you can somehow index into a map and produce a pointer, then this approach might justify itself through reducing code complexity. However, what you have here is rather more like a state machine spread across several functions.
Consider that something_else_happens will, in practice, have to examine the previous value of the pointer before setting it to another value. The same goes for something_different_happens, etc. In effect you've scattered the logic of your state machine all over the place and made it difficult to follow.
Now what I'm really wondering is: am I obsessing about this needlessly?
If you haven't actually run your code, and found that it actually runs too slowly, then yes, I think you probably are worrying about performance too soon.
Herb Sutter and Andrei Alexandrescu in
C++ Coding Standards: 101 Rules, Guidelines, and Best Practices devote chapter 8 to this, called "Don’t optimize prematurely", and they summarise it well:
Spur not a willing horse (Latin proverb): Premature optimization is as addictive as it is unproductive. The first rule of optimization is: Don’t do it. The second rule of optimization (for experts only) is: Don’t do it yet. Measure twice, optimize once.
It's also worth reading chapter 9: "Don’t pessimize prematurely"
Testing a condition is:
fetch a value
compare (subtract)
jump if zero (or non-zero)
Performing an indirection is:
fetch an address
jump.
It may even be more performant!
In fact you do the "compare" beforehand, in another place, to decide what to call. The result will be identical.
You did nothing more than build a dispatch system identical to the one the compiler produces when calling virtual functions.
It has been shown that avoiding virtual functions by implementing dispatch through switches doesn't improve performance on modern compilers.
The "don't use indirection / don't use virtual / don't use function pointers / don't dynamic_cast, etc." advice is in most cases just a myth based on historical limitations of early compilers and hardware architectures.
The performance difference will depend on the hardware and the compiler optimizer. Indirect calls can be very expensive on some machines, and very cheap on others. And really good compilers may be able to optimize even indirect calls, based on profiler output. Until you've actually benchmarked both variants, on your actual target hardware and with the compiler and compiler options you use in your final release code, it's impossible to say.
If the indirect calls do end up being too expensive, you can still hoist the tests out of the loop, by either setting an enum and using a switch in the loop, or by implementing the loop for each combination of settings and selecting once at the beginning. (If the functions you point to implement the complete loop, this will almost certainly be faster than testing the condition each time through the loop, even if indirection is expensive.)

C++: Performance impact of BIG classes (with a lot of code)

I wonder if and how writing "almighty" classes in C++ actually impacts performance.
Say I have a class Point, with only uint x; uint y; as data, and I have defined virtually everything that math can do to a point as methods. Some of those methods might be huge. The (copy-)constructors do nothing more than initialize the two data members.
class Point
{
public:
    Point(int x, int y) : mx(x), my(y) {}
    Point(const Point& other) : mx(other.mx), my(other.my) {}
    // .... HUGE number of methods....
private:
    int mx;
    int my;
};
Now I load a big image, create a Point for every pixel, stuff them into a vector, and use them (say, all methods get called once).
This is only meant as a stupid example!
Would it be any slower than the same class without the methods but with a lot of utility functions? I am not talking about virtual functions in any way!
My motivation for this: I often find myself writing nice and relatively powerful classes, but when I have to initialize/use a ton of them, as in the example above, I get nervous.
I think I shouldn't.
What I think I know is:
Methods exist only once in memory (optimizations aside).
Allocation only takes place for the data members, and they are the only thing copied.
So it shouldn't matter. Am I missing something?
You are right: methods exist only once in memory; they're just like normal functions with an extra hidden this parameter.
And of course only data members are taken into account for allocation. Well, inheritance may introduce some extra pointers (vptrs) into the object size, but that's not a big deal.
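A one-line sanity check of that claim (on any mainstream ABI, non-virtual methods add nothing to the per-object size):

struct Bare { int x, y; };
struct Rich { int x, y;
              int  length2() const { return x * x + y * y; }
              void move(int dx, int dy) { x += dx; y += dy; } };

static_assert(sizeof(Bare) == sizeof(Rich), "methods cost no per-object memory");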
You have already got some pretty good technical advice. I want to throw in something non-technical: as the STL showed us all, doing it all in member functions might not be the best way to do this. Rather than piling up arguments, I refer to Scott Meyers' classic article on the subject: How Non-Member Functions Improve Encapsulation.
Although technically there should be no problem, you still might want to review your design from a design POV.
I suppose this is more of an answer than you're looking for, but here goes...
SO is filled with questions where people are worried about the performance of X, Y, or Z, and that worry is a form of guessing.
If you're worried about the performance of something, don't worry, find out.
Here's what to do:
Write the program
Performance tune it
Learn from the experience
What this has taught me, and I've seen it over and over, is this:
Best practice says Don't optimize prematurely.
Best practice says Do use lots of data structure classes, with multiple layers of abstraction, and the best big-O algorithms, "information hiding", with event-driven and notification-style architecture.
Performance tuning reveals where the time is going, which is: Galloping generality, making mountains out of molehills, calling functions & properties with no realization of how long they take, and doing this over multiple layers using exponential time.
Then the question is asked: What is the reason behind the best practice for the big-O algorithms, the event- and notification-driven architecture, etc. The answer comes: Well, among other things, performance.
So in a way, best practice is telling us: optimize prematurely. Get the point? It says "don't worry about performance", and it says "worry about performance", and it causes the very thing we're trying unsuccessfully not to worry about. And the more we worry about it, against our better judgement, the worse it gets.
My constructive suggestion is this: Follow steps 1, 2, and 3 above. That will teach you how to use best practice in moderation, and that will give you the best all-around design.
If you are truly worried, you can tell your compiler to inline the constructors. This optimization step should leave you with clean code and clean execution.
These 2 bits of code are identical:
Point x;
int l=x.getLength();
int l=GetLength(x);
given that the class Point has a non-virtual method getLength(). The first invocation actually calls int getLength(Point& this), a signature identical to the one we wrote in our second example. (*)
This of course wouldn't apply if the methods you're calling are virtual, since everything would go through an extra level of indirection (something akin to the C-style int l=x->lpvtbl->getLength(x)), not to mention that instead of 2 ints for every pixel you'd actually have 3, the extra one being the pointer to the virtual table.
(*) This isn't exactly true: the "this" pointer is passed through one of the CPU registers instead of on the stack, but the mechanism could easily have worked either way.
First: do not optimize prematurely.
Second: clean code is easier to maintain than optimized code.
Methods of classes have the hidden this pointer, but you should not worry about it; most of the time the compiler tries to pass it via a register.
Inheritance and virtual functions introduce indirection into the appropriate calls (inheritance = constructor/destructor calls; virtual functions = every call of such a function).
Short:
Objects you don't create/destroy often can have virtual methods, inheritance and so on as long as it benefits the design.
Objects you create/destroy often should be small (few data members) and should not have many virtual methods (best would be none at all - performance wise).
Try to inline small methods/constructors. This will reduce the overhead.
Go for a clean design and refactor if you don't reach the desired performance.
There is a separate discussion about classes having large or small interfaces (for example, in one of Scott Meyers' (More) Effective C++ books, he opts for a minimal interface). But this has nothing to do with performance.
I agree with the above comments w.r.t. performance and class layout, and would like to add a comment about design that hasn't been stated yet.
It feels to me like you're over-using your Point class beyond its real design scope. Sure, it can be used that way, but should it be?
In past work on computer games I've often been faced with similar situations, and usually the best end result has been that, when doing specialized processing (e.g. image processing), a specialized code set working on differently laid-out buffers has been more efficient.
This also allows you to performance optimize for the case that matters, in a more clean way, without making base code less maintainable.
In theory, I'm sure that there is a crafty way of using a complex combination of template code, concrete class design, etc., and getting nearly the same run-time efficiency ... but I am usually unwilling to make the complexity-of-implementation trade.
Member functions are not copied along with the object. Only data fields contribute to the size of the object.
I have created the same Point class as you, except that it is a template class and all functions are inline. I expect a performance increase, not a decrease, from this. However, an image of size 800x600 will have 480k pixels, and its memory footprint will be close to 4 MB without any color information. And it's not just memory: initializing 480k objects will also take too much time. Therefore I don't think it's a good idea in that case. However, if you use this class to transform the position of an image, or use it for graphic primitives (lines, curves, circles, etc.), it can still serve you well.