Getters and setters: is there a performance overhead? (C++)

I have a Particle System Engine in my C++ project and the particles themselves are just structs of variables with no functions. Currently, each particle (Particle) is updated from its parent class (ParticleSystem) by having its variables accessed directly. E.g.
particle.x += particle.vx;
I am, however, debating using getters and setters like this:
particle.setX( particle.getX()+particle.getVX() );
My question is: Is there any performance overhead from calling getters and setters as opposed to just straight up data access?
After all, I do have many many particles to update through...

Setters and getters have a performance overhead when they are not optimized out. They are almost always optimized out by compilers that do link-time optimization, and even on compilers that do not, the call will be optimized out as long as the compiler can see the function body (i.e. not just a prototype). Look up inline optimization.
However, getters and setters exist because you might want getting or setting a variable to have additional side effects, such as changing the position of an object also changing the position of nearby objects in a physics simulation.
Finally, the overhead of a getter or setter operation in optimized code is very small and not worth worrying about unless the code is hot. And if it's hot, it's SO EASY to just move the getter or setter to the header file and let it be inlined.
However, in long-running scientific simulations the overhead can add up - but that requires millions and millions of calls.
So to sum up: getters and setters are well worth the minor or non-existent overhead, as they let you specify precisely what can and cannot happen to your object, and they also let you marshal any changes.
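For illustration, here is a minimal sketch of that last point: accessors defined in the class body are implicitly inline, so an optimizing compiler will typically reduce both updates below to the same machine code.

struct Particle {
    float x = 0.0f, vx = 0.0f;

    // Defined in the class body, hence implicitly inline.
    float getX() const { return x; }
    float getVX() const { return vx; }
    void  setX(float newX) { x = newX; }
};

void update(Particle& p) {
    p.x += p.vx;                      // direct access
    p.setX(p.getX() + p.getVX());     // typically compiles to the same code
}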

My opinion on this differs from the previous answers.
Getters and setters are a sign that your class isn't designed in a useful way: if you don't abstract the outer behaviour away from the internal implementation, there's no point in using an abstract interface in the first place; you might as well use a plain old struct.
Think about what operations you really need. That's almost certainly not direct access to the x-, y- and z-coordinates of position and momentum; rather, you want to treat these as vector quantities (at least in most calculations, which is all that's relevant for optimisation). So you want to implement a vector-based interface*, where the basic operations are vector addition, scaling and inner product. Not component-wise access; you probably do need that sometimes as well, but it can be done with a single std::array<double,3> to_posarray() member or something like that.
When the internal components x, y ... vz aren't accessible from the outside, you can safely change the internal implementation without breaking any code outside your module. That's pretty much the whole point of getters/setters; however, when using those there's only so much optimisation you can do: any real change of implementation will inevitably make the getters much slower.
On the other hand, you can optimise the hell out of a vector-based interface, with SIMD operations, external library calls (possibly on accelerated hardware like CUDA) and suchlike. A "batch getter" like to_posarray can still be implemented reasonably efficiently; single-variable setters can't.
*I mean vector in the mathematical sense here, not like std::vector.
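Such a vector-based interface might look roughly like this (a sketch only; Vec3 and the member names vel and advance are assumptions, not from the answer):

#include <array>

struct Vec3 {
    double v[3];
    Vec3& operator+=(const Vec3& o) {
        v[0] += o.v[0]; v[1] += o.v[1]; v[2] += o.v[2];
        return *this;
    }
};

class Particle {
    Vec3 pos{}, vel{};   // internal layout is free to change later
public:
    void advance() { pos += vel; }               // whole-vector operation
    std::array<double, 3> to_posarray() const {  // the "batch getter"
        return { pos.v[0], pos.v[1], pos.v[2] };
    }
};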

Getters and setters allow your code to evolve more easily in the future, if getting and setting turn out to be slightly more complicated tasks. Most C++ compilers are smart enough to inline those simple methods and eliminate the overhead of the function call.

There could be various answers to this question, but I put my thoughts here.
For performance, the cost can mostly be ignored for a simple POD type.
But there's still a cost, and it depends on what type you return. For a particle there can't be much data, but if this were an image class (like OpenCV's cv::Mat) or 3D data (like PolyData in VTK), it would be better for the setter/getter to deal in pointers/iterators rather than the actual data, to avoid memory-allocation problems.
When you want to template things, setters/getters can be very useful for avoiding implicit type conversions. A setter/getter is also a way to access a private/protected member, which lets you avoid using x as the common name of the variable. Moreover, a getter can return an lvalue reference, which allows you to write particle.getX() = 10.0;
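As a sketch of that lvalue-reference idea (a common pattern, though not spelled out in the answer):

class Particle {
    double x = 0.0;
public:
    double  getX() const { return x; }  // read-only overload
    double& getX()       { return x; }  // lvalue overload: allows writes
};

// Usage:
//   Particle particle;
//   particle.getX() = 10.0;   // assigns through the returned reference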

In this case, the functional meaning of your calculation could be:
void Particle::updatePosition() { x += vx; y += vy; }
or:
void Particle::updatePositionX() { x += vx; }
then :
particle.updatePositionX();
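Put together, a minimal version of that Particle might look like this (the member types are assumed):

struct Particle {
    float x = 0, y = 0;    // position
    float vx = 0, vy = 0;  // velocity

    void updatePosition()  { x += vx; y += vy; }
    void updatePositionX() { x += vx; }
};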

Related

A question about encapsulation and inheritance practices

I've heard people saying that having protected members kind of breaks the point of encapsulation and is not the best practice, one should design the program such that derived classes will not need to have access to private base class members.
An example situation
Now, imagine the following scenario: a simple 8-bit game with a bunch of different objects, such as regular boxes that act as obstacles, spikes, coins, moving platforms, etc. The list can go on.
All of them have x and y coordinates, a rectangle that specifies the size of the object, a collision box, and a texture. They also share functions like setting position, rendering, loading the texture, checking for collision, etc.
But some of them also need to modify base members: boxes can be pushed around, so they might need a move function; some objects may move by themselves; maybe some blocks change texture in-game.
Therefore a base class like object can really come in handy, but that would require either a ton of getters and setters or making the private members protected instead. Either way compromises encapsulation.
Given the anecdotal context, which would be a better practice:
1. Have a common base class with shared functions and members, declared as protected. Be able to use common functions, pass the reference of base class to non-member functions which only needs to access shared properties. But compromise encapsulation.
2. Have a separate class for each, declare the member variables as private and don't compromise encapsulation.
3. A better way that I couldn't have thought.
I don't think encapsulation is vitally important here, and the way to go for that anecdote would probably be just having protected members, but my goal with this question is to write well-practiced, standard code rather than to solve that specific problem.
Thanks in advance.
First off, I'm going to start by saying there is not a one-size fits all answer to design. Different problems require different solutions; however there are design patterns that often may be more maintainable over time than others.
Indeed, a lot of design suggestions are aimed at making code better in a team environment, but good practices are useful for solo projects as well: they make code easier to understand and change in the future.
Sometimes the person who needs to understand your code will be you, a year from now -- so keep that in mind😊
I've heard people saying that having protected members kind of breaks the point of encapsulation
Like any tool, it can be misused; but there is nothing about protected access that inherently breaks encapsulation.
What defines the encapsulation of your object is its intended API surface area. Sometimes a protected member is logically part of that surface area, and this is perfectly valid.
If misused, protected members can give clients access to mutable state that breaks the class's intended invariants, which would be bad. An example would be if you could derive from a class exposing a rectangle and set its width/height to a negative value: functions in the base class, such as compute_area, could suddenly yield wrong values and cause cascading failures that should have been guarded against by better encapsulation.
As for the design of your example in question:
Base classes are not necessarily a bad thing, but can easily be overused and can lead to "god" classes that unintentionally expose too much functionality in an effort to share logic. Over time this can become a maintenance burden and just an overall confusing mess.
Your example sounds better suited to composition, with some smaller interfaces (a rough sketch follows this list):
Things like a point and a vector type would be base types used to produce higher-order compositions like rectangle.
These could then be composed together to create a model that handles general (logical) objects in 2D space that have collision.
Intersection/collision logic can be handled by an outside utility class.
Rendering can be handled through a renderable interface, where any class that needs to render extends this interface.
Intersection handling can be handled by an intersectable interface, which determines the behavior of an object on intersection (this effectively abstracts each of the game objects into raw behaviors).
Etc.
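The sketch, then (all names here are illustrative, not from the question):

struct Point { float x = 0, y = 0; };
struct Rect  { Point origin; float w = 0, h = 0; };

struct Renderable {            // interface: anything that can draw itself
    virtual void render() = 0;
    virtual ~Renderable() = default;
};

struct Intersectable {         // interface: behavior on intersection
    virtual void onIntersect(Intersectable& other) = 0;
    virtual ~Intersectable() = default;
};

class Box : public Renderable, public Intersectable {
    Rect bounds;               // composed, not inherited
public:
    void render() override { /* draw using bounds */ }
    void onIntersect(Intersectable& /*other*/) override { /* get pushed around */ }
};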
Encapsulation is not a security thing, it's a neatness thing (and hence a supportability and readability thing). You have to assume that people deriving classes are basically sensible: after all, they are either writing programs of their own using your base classes (so who cares), or they are writing in a team with you.
The primary purpose of "encapsulation" in object-oriented programming is to limit direct access to data in order to minimize dependencies, and where dependencies must exist, to express those in terms of functions not data.
This ties in with Design by Contract, where you allow "public" access to certain functions and reserve the right to modify others arbitrarily, at any time, for any reason, even to the point of removing them, by marking those as "protected".
That is, you could have a game object like:
class Enemy {
public:
int getHealth() const;
};
Where the getHealth() function returns an int value expressing the health. How does it derive this value? It's not for the caller to know or care. Maybe it's byte 9 of a binary packet you just received. Maybe it's a string from a JSON object. It doesn't matter.
Most importantly because it doesn't matter you're free to change how getHealth() works internally without breaking any code that's dependent on it.
However, if you're exposing a public int health property that opens up a whole world of problems. What if that is manipulated incorrectly? What if it's set to an invalid value? How do you trap access to that property being manipulated?
It's much easier when you have setHealth(const int health) where you can do things like:
clamp it to a particular range
trigger an event when it exceeds certain bounds
update a saved game state
transmit an update over the network
hook in other "observers" which might need to know when that value is manipulated
None of those things are easily implemented without encapsulation.
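A hedged sketch of such a setter (the clamp bounds and the observer mechanism are illustrative assumptions, not from the answer):

#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

class Enemy {
    int health = 100;
    std::vector<std::function<void(int)>> observers;
public:
    int getHealth() const { return health; }

    void setHealth(const int newHealth) {
        health = std::clamp(newHealth, 0, 100);  // keep the invariant
        for (auto& notify : observers)           // tell interested parties
            notify(health);
    }

    void onHealthChanged(std::function<void(int)> callback) {
        observers.push_back(std::move(callback));
    }
};

None of this touches callers: they still just call setHealth, and the clamping, events, saving and networking stay hidden behind it.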
protected is not just a "get off my lawn" thing, it's an important tool to ensure that your implementation is used correctly and as intended.

Why should methods return a new instance, rather than modify the instance itself?

Suppose I have a Vector3 class, that contains a normalize() method. Should that method return a new Vector3, or modify the Vector3 instance it is called on (therefore returning a reference to itself (Vector3&)?) What are some instances where one would be preferred over the other? What about performance?
The answer depends on the design of your class.
For mutable classes, rotate should rotate the vector itself. This is viewed as somewhat more efficient, and in the case of large objects it avoids copying large volumes of data when vectors have many items in them.
Immutable classes, on the other hand, must return only new objects, because they cannot be mutated themselves. This adds some overhead, but it has a lot of pluses, especially when objects must be used concurrently.
A common naming convention is to use a verb for mutating operations, as in
myVector.rotate(angle);
myVector.scale(factor);
while operations that return new objects should be named with past participles, as in
auto newVector = myVector.rotated(angle).scaled(factor);
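A sketch of how the two flavours typically relate; the non-mutating version can simply reuse the mutating one on a copy (the implementation details are assumed):

class Vector3 {
    double x = 0, y = 0, z = 0;
public:
    // Mutating: verb, modifies *this and returns it for chaining.
    Vector3& scale(double factor) {
        x *= factor; y *= factor; z *= factor;
        return *this;
    }

    // Non-mutating: past participle, leaves *this alone, returns a new value.
    Vector3 scaled(double factor) const {
        Vector3 copy = *this;
        copy.scale(factor);
        return copy;
    }
};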
Competing goals: Correctness vs Performance (sometimes).
If you use immutable types, it is easier to write correct (parallel) programs. If you use mutable types, you sometimes get a certain performance benefit, which might well be lost once you try to go parallel. Then there is the 80/20 rule: 80% of the code need not be optimized. So why use mutable types by default?
First go immutable; then see whether it has enough performance; optimize only if it doesn't.
Vector3 rotate(const Angle& angle)
is probably fine, but that depends on Vector3 being implemented correctly, especially regarding std::move() behavior.
Side effects. If I pass you an object that I retrieved from a map, for example, and you do something to the object I passed, you have changed the thing that is inside my map, and the next time I ask for it I get something that doesn't look the same as what I got last time. Keeping objects immutable prevents accidents like this, especially when multiple devs are working on the same app.

Will you create a private class member to eliminate multi-level function call?

Although I wrote this example in C++, this code refactoring question also applies to any language that endorses OO, such as Java.
Basically I have a class A
#include <cassert>

class A
{
public:
    void f1();
    void f2();
    //..
private:
    B* m_a;   // the pointer's type isn't given in the question; B is a stand-in
};
void A::f1()
{
assert(m_a);
m_a->h1()->h2()->GetData();
//..
}
void A::f2()
{
assert(m_a);
m_a->h1()->h2()->GetData();
//..
}
Would you create a new private data member m_f holding the pointer m_a->h1()->h2()? The benefit I can see is that it effectively eliminates the multi-level function calls, which simplifies the code a lot.
But from another point of view, it creates an "unnecessary" data member that can be deduced from the existing data member m_a, which is kind of redundant.
I've come to a dilemma here. So far, I cannot convince myself to use one over the other.
Which do you guys prefer, and why?
The fancy word for this technique is caching: you calculate a two-away reference once, and cache it in the object. In general, caching lets you "pay" with computer memory for speed-up of your computations.
If a profiler tells you that your code spends a significant amount of time in the repeated call of m_a->h1()->h2(), this may be a legitimate optimization, provided that the return values of h1 and h2 never change. However, doing an optimization like that without profiling first is nearly always a sign of premature optimization.
If performance is not the issue, a good rule is to stay away from storing members that can be calculated from other members already stored in your object. If you would like to improve clarity, you can introduce a nicely named method (a member function) that calculates the two-away reference without storing it. Storing makes sense only in the rare cases where it is critical for performance.
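For instance, the "nicely named method" alternative might look like this (DataSource, Mid and B are invented stand-ins for the types the question doesn't show):

#include <cassert>

struct DataSource { void GetData() { /* ... */ } };
struct Mid { DataSource ds; DataSource* h2() { return &ds; } };
struct B   { Mid mid;       Mid* h1() { return &mid; } };

class A {
public:
    void f1() {
        assert(m_a);
        dataSource()->GetData();   // reads as one named step
        //..
    }
private:
    // Named helper: computed on demand, no redundant cached member.
    DataSource* dataSource() const { return m_a->h1()->h2(); }
    B* m_a = nullptr;              // set elsewhere, as in the question
};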
I would not. I agree it would simplify things in your contrived example, but that's because m_a->h1()->h2() has no inherent meaning. In a well-designed application, the method names used should tell you something qualitative about the calls being made, and that should be part of self-documenting code. I would argue that in properly designed code, m_a->h1()->h2() should be simpler to read and understand than redirecting to a private method which calls it for you.
Now, if m_a->h1()->h2() is an expensive call which takes a significant time to compute the result, then you might have an argument for caching as #dasblinkenlight suggests. But throwing away the descriptiveness of the method call for the sake of a few keypresses is bad.
Whenever I have something like this, I usually store m_a->h1() in a variable with a meaningful name at function scope, since it's likely to be used again later in the function body.
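That is, something along these lines (a fragment reusing the question's names; "subsystem" is an invented illustrative name):

void A::f1()
{
    assert(m_a);
    auto* subsystem = m_a->h1();   // meaningful name, function scope only
    subsystem->h2()->GetData();
    // ... subsystem is available for reuse later in this body ...
}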

C++: Performance impact of BIG classes (with a lot of code)

I wonder if and how writing "almighty" classes in C++ actually impacts performance.
If I have, for example, a class Point with only uint x; uint y; as data, and have defined virtually everything that math can do to a point as methods. Some of those methods might be huge. The (copy-)constructors do nothing more than initialize the two data members.
class Point
{
    int mx, my;
public:
    Point(int x, int y) : mx(x), my(y) {}
    Point(const Point& other) : mx(other.mx), my(other.my) {}
    // .... HUGE number of methods ....
};
Now I load a big image and create a Point for every pixel, stuff 'em into a vector and use them. (Say all methods get called once.)
This is only meant as a stupid example!
Would it be any slower than the same class without the methods but with a lot of utility functions? I am not talking about virtual functions in any way!
My Motivation for this: I often find myself writing nice and relatively powerful classes, but when I have to initialize/use a ton of them like in the example above, I get nervous.
I think I shouldn't.
What I think I know is:
Methods exist only once in memory (optimizations aside).
Allocation only takes place for the data members, and they are the only thing copied.
So it shouldn't matter. Am I missing something?
You are right, methods only exist once in memory, they're just like normal functions with an extra hidden this parameter.
And of course, only data members are taken into account for allocation. Well, inheritance may introduce some extra pointers (vptrs) into the object size, but that's not a big deal.
You have already got some pretty good technical advice. I want to throw in something non-technical: as the STL showed us all, doing it all in member functions might not be the best way. Rather than piling up arguments, I refer to Scott Meyers' classic article on the subject: How Non-Member Functions Improve Encapsulation.
Although technically there should be no problem, you still might want to review your design from a design POV.
I suppose this is more of an answer than you're looking for, but here goes...
SO is filled with questions where people are worried about the performance of X, Y, or Z, and that worry is a form of guessing.
If you're worried about the performance of something, don't worry, find out.
Here's what to do:
Write the program
Performance tune it
Learn from the experience
What this has taught me, and I've seen it over and over, is this:
Best practice says Don't optimize prematurely.
Best practice says Do use lots of data structure classes, with multiple layers of abstraction, and the best big-O algorithms, "information hiding", with event-driven and notification-style architecture.
Performance tuning reveals where the time is going, which is: Galloping generality, making mountains out of molehills, calling functions & properties with no realization of how long they take, and doing this over multiple layers using exponential time.
Then the question is asked: What is the reason behind the best practice for the big-O algorithms, the event- and notification-driven architecture, etc. The answer comes: Well, among other things, performance.
So in a way, best practice is telling us: optimize prematurely. Get the point? It says "don't worry about performance", and it says "worry about performance", and it causes the very thing we're trying unsuccessfully not to worry about. And the more we worry about it, against our better judgement, the worse it gets.
My constructive suggestion is this: Follow steps 1, 2, and 3 above. That will teach you how to use best practice in moderation, and that will give you the best all-around design.
If you are truly worried, you can tell your compiler to inline the constructors. This optimization step should leave you with clean code and clean execution.
These 2 bits of code are identical:
Point x;
int l=x.getLength();
int l=GetLength(x);
given that the class Point has a non-virtual method getLength(). The first invocation actually calls int getLength(Point &this), which has a signature identical to the one we wrote in the second example. (*)
This of course wouldn't apply if the methods you're calling are virtual, since everything would go through an extra level of indirection (something akin to the C-style int l=x->lpvtbl->getLength(x)), not to mention that instead of 2 ints for every pixel you'd actually have 3, the extra one being the pointer to the virtual table.
(*) This isn't exactly true: the this pointer is usually passed in a CPU register rather than on the stack, but the mechanism could easily have worked either way.
First: do not optimize prematurely.
Second: clean code is easier to maintain than optimized code.
Methods of classes have the hidden this pointer, but you should not worry about it; most of the time the compiler will try to pass it via a register.
Inheritance and virtual functions introduce indirections in the appropriate calls (inheritance: constructor/destructor calls; virtual functions: every call of that function).
Short:
Objects you don't create/destroy often can have virtual methods, inheritance and so on, as long as it benefits the design.
Objects you create/destroy often should be small (few data members) and should not have many virtual methods (best would be none at all, performance-wise).
Try to inline small methods/constructors; this will reduce the overhead.
Go for a clean design and refactor if you don't reach the desired performance.
There is a separate discussion about classes having large or small interfaces (for example in one of Scott Meyers' (More) Effective C++ books - he opts for a minimal interface), but that has nothing to do with performance.
I agree with the above comments regarding performance and class layout, and would like to add a design comment not yet stated.
It feels to me like you're over-using your Point class beyond its real design scope. Sure, it can be used that way, but should it?
In past work on computer games I've often faced similar situations, and usually the best end result has been that when doing specialized processing (e.g. image processing), a specialized code set working on differently laid-out buffers is more efficient.
This also lets you optimize performance for the case that matters, in a cleaner way, without making the base code less maintainable.
In theory, I'm sure there is a crafty way of using a complex combination of template code, concrete class design, etc. to get nearly the same run-time efficiency... but I am usually unwilling to make the complexity-of-implementation trade.
Member functions are not copied along with the object. Only data fields contribute to the size of the object.
I have created the same Point class as yours, except that it is a template class and all functions are inline; I expect a performance increase, not a decrease, from this. However, an image of size 800x600 has 480k pixels, so its memory footprint will be close to 4 MB without any color information. And it's not just memory: initializing 480k objects will also take too much time. Therefore, I don't think it's a good idea in that case. However, if you use this class to transform the position of an image, or for graphic primitives (lines, curves, circles, etc.), it should serve you well.

using accessors in same class

I have heard that in C++, using an accessor (get...()) in a member function of the same class where the accessor is defined is good programming practice. Is that true, and should it be done?
For example, is this preferred:
void display() {
cout << getData();
}
over something like this:
void display() {
cout << data;
}
data is a data member of the same class where the accessor was defined... same with the display() method.
I'm thinking of the overhead for doing that especially if you need to invoke the accessor lots of times inside the same class rather than just using the data member directly.
The reason for this is that if you change the implementation of getData(), you won't have to change the rest of the code that directly accesses data.
And also, a smart compiler will inline it anyway (it always knows the implementation, since it's inside the class), so there is no performance penalty.
It depends. Using an accessor function provides a layer of abstraction, which could make future changes to 'data' less painful. For example, if you wanted to lazily compute the value of 'data', you could hide that computation in the accessor function.
As for the overhead - If you are referring to performance overhead, it will likely be insignificant - your accessors will almost certainly be inlined. If you are referring to coding overhead, then yes, it is a tradeoff, and you'll have to decide whether it is worth the extra effort to provide accessors.
Personally, I don't think the accessors are worth it in most cases.
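As a sketch of that lazy-computation case (the class name and caching details here are assumptions, reusing the question's data/getData/display names):

#include <iostream>

class Widget {
    mutable bool computed = false;
    mutable int data = 0;

    int computeData() const { return 42; /* expensive in real code */ }

public:
    int getData() const {
        if (!computed) {            // compute at most once, on first use
            data = computeData();
            computed = true;
        }
        return data;
    }

    void display() const {
        std::cout << getData() << "\n";   // same-class use of the accessor
    }
};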
Yes, I think it should be done more or less unconditionally. If the state variable is in some base class, it should more or less always be private. If you make it protected or public, all inheriting classes will use it directly. Those classes in turn might be classes your coworkers have written in some other project. If you suddenly decide to muck about in the base class and refactor, e.g. rename the variable to something more suitable, every user of that state must be rewritten.
This is probably not an issue if you are the only programmer, or if you're developing code that no one will ever use. But as soon as the number of subclasses starts to grow, it can get really hairy. Gotta love transparency!
However, I'm not god's best child on this planet. Sometimes I cheat ;) When you're in the owning class, I think it's OK to access private data directly. It might even be beneficial, since you automatically know that you are modifying the actual class you're in, given that you have some naming convention that tells you so, e.g. a variable name with an underscore at the end: "someVariable_".
Cheers!
Well, Mr. Khunt, the overhead really is insignificant for accessors in most cases. The question is whether or not the accessor logic needs to be invoked, or whether you need direct access to the field. That is a question for each individual implementation, but in many cases it won't make much of a difference.
The real reason for accessors is to provide encapsulation of your fields to other classes; it is less about the containing class.
Personally, I prefer not to have dozens of extra functions (a get and a set for every member variable). I would just use data, and switch to getData() only when required to do something differently. Since we are talking about changing the code in only one class, that shouldn't be too difficult.
It depends on what you might ultimately do with your data member I suppose.
By wrapping it in an accessor you can do things like lazily retrieving the data, if retrieval is an expensive process that you don't want to run unless someone asks for it. On the other hand, you might know that it will always be a dumb built-in type, in which case I can't see any advantage to going through an accessor. As I say, it depends on the member.
To my mind, the most important aspect of this question is does it make the code more readable and therefore maintainable? Personally I don't think it does so I wouldn't do this.
Certainly you should never add a private accessor just to do this; that would be nuts.