I have recently learned that if you have a reference to a class as a function parameter, it is better practice and more efficient to store certain needed pieces of information as local variables rather than accessing the class's members every time you need them in the function.
so...
void function(const Sphere& s)
{
//lots of calls to s.centre and s.radius
}
or
void function(const Sphere& s)
{
Vector3 centre = s.centre; float radius = s.radius;
//an equal amount of calls to centre and radius
}
I am told the second is better, but why? And also, where is a good place to start researching this more fully? (for dummies, please!)
Whoever told you this probably thought that the second version was likely to be faster.
I think this is bad advice, for two reasons:
It may or may not actually be faster. This depends on the compiler, on what exactly the code is doing, etc.
Even if it is faster, this screams premature optimization. One should micro-optimize only after profiling the code and establishing which part of the code is the overall bottleneck.
The concept is intuitive, but wrong. The idea is that accessing members takes more work than accessing local variables, so by copying members into local variables you save performance.
But this is wrong. Your compiler will optimize this in ways you could never imagine. Don't attempt to be smarter than the compiler.
This could actively be dangerous. If s.centre or s.radius changes during the execution of this function (say, due to a function you call, or another thread), you will end up with two inconsistent, stale values in your local variables -- causing bugs. Even if you're doing this safely, why take the chance of introducing bugs when you can just refer back to the canonical variables themselves?
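A minimal sketch of that hazard (the global sphere and the somethingThatResizes() call are hypothetical stand-ins, just to show the member changing mid-function):
#include <iostream>

struct Vector3 { float x, y, z; };
struct Sphere  { Vector3 centre; float radius; };

Sphere g_sphere{{0.0f, 0.0f, 0.0f}, 1.0f};

// Stand-in for "another thread or a callee mutates the sphere".
void somethingThatResizes() { g_sphere.radius = 2.0f; }

float radiusSquared(const Sphere& s)
{
    float radius = s.radius;   // snapshot taken here
    somethingThatResizes();    // the sphere changes under our feet...
    // ...so 'radius' is now stale: it no longer matches s.radius
    return radius * radius;
}

int main()
{
    std::cout << radiusSquared(g_sphere) << " vs "
              << g_sphere.radius * g_sphere.radius << '\n';   // prints 1 vs 4
}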
You were lied to. Moreover, you should not be worrying about such minutiae unless you have profiled your code and found this to be a main source of inefficiency in your code. Don't try to outsmart your compiler's optimizer. It is better at optimizing your code than you are.
Here is a general look at your code:
void function(Sphere s)
{
Vector3 centre = s.centre; float radius = s.radius;
//an equal amount of calls to centre and radius
}
First off, you'd gain much more efficiency by passing Sphere as a const reference. This way no copy is created; copying the object is probably more expensive than the member accesses. So the way to go is:
void function(const Sphere& s)
{
Vector3 centre = s.centre; float radius = s.radius;
//an equal amount of calls to centre and radius
}
Secondly, you shouldn't access members of classes directly. It may be easy now, but in a large project it's really hard to debug. You should use inline getters and setters. That way, the generated code is the same, but you have a single entry point (see the sketch at the end of this answer).
Thirdly, this version isn't thread safe. What if a different thread changes s? s.centre and s.radius would change, but you'd still be operating on the old values.
Lastly, compilers do a better job at optimizing than you can, so it's better to leave this one up to the compiler.
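For the getter/setter point above, a minimal sketch (assuming a simple Vector3 type and that Sphere stores its centre and radius directly) could look like this:
#include <iostream>

struct Vector3 { float x, y, z; };

class Sphere
{
public:
    Sphere(const Vector3& c, float r) : m_centre(c), m_radius(r) {}

    // Inline getters: a single entry point per member; the generated code
    // is typically the same as direct member access.
    const Vector3& centre() const { return m_centre; }
    float radius() const { return m_radius; }

private:
    Vector3 m_centre;
    float m_radius;
};

void function(const Sphere& s)
{
    // Refer to the canonical values through the accessors every time.
    std::cout << s.radius() * s.radius() << '\n';
}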
Whoever told you that is wrong. You can improve performance by using inline functions. Also, in your example, use
void function(const Sphere& s)
This saves a call to the copy constructor.
Assume that your first objective is execution speed, then code cleanliness and finally usage of resources.
If at a certain point of an algorithm a variable (for instance a double) is not going to change any more (but you are still going to read it many times), would you copy it into a constant value?
If you want to make your code clearer, by all means, copy your values into a const double const_value = calculated_value;, but compilers are very good at tracking dependencies, and it's highly unlikely (assuming you are using a modern, reasonably competent compiler) that the code will be any faster or otherwise "better" because you do this. There is a small chance that the compiler takes your word for the fact that you want a second variable, and thus makes a copy, and makes the code slower because of that.
As always, if performance is important to your application, make a "before & after" comparative benchmark for your particular code, as reading a page on the internet or asking on SO is not the same as benchmarking your code.
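As a sketch of what such a "before & after" benchmark might look like (the workloads and iteration count here are placeholders, not from the question; compile with optimizations on):
#include <chrono>
#include <iostream>

// Placeholder workloads; substitute your real "before" and "after" versions.
double work_before(double x) { return x * x + 1.0; }
double work_after(double x)  { const double c = x; return c * c + 1.0; }

template <typename F>
double seconds(F f)
{
    const auto start = std::chrono::steady_clock::now();
    double sink = 0.0;                    // accumulate so the optimizer
    for (int i = 0; i < 10000000; ++i)    // cannot discard the loop
        sink += f(static_cast<double>(i));
    const auto stop = std::chrono::steady_clock::now();
    std::cout << "(checksum " << sink << ") ";
    return std::chrono::duration<double>(stop - start).count();
}

int main()
{
    std::cout << "before: " << seconds(work_before) << " s\n";
    std::cout << "after:  " << seconds(work_after)  << " s\n";
}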
Just copying a non-constant variable into a constant one does not make the code cleaner, because instead of one variable you now have two. Much more interesting would be to move the non-constant one out of scope. That way only the constant version of the variable is visible, and the compiler prevents us from changing its value by mistake.
Herb Sutter describes how to do this using C++11 lambdas: Complex initialization for a const variable.
const int i = [&] {
    int i = some_default_value;
    if (someConditionIsTrue)
    {
        // Do some operations and calculate the value of i
        i = some_calculated_value;
    }
    return i;
}();
(I don't discuss the execution-speed objective, since Mats Petersson has already covered it.)
For better code readability, you can create a const reference to the variable at the point where it isn't changed anymore, and use the const reference from that point on.
double value_init;
// Some code that generates value_init...
const double& value = value_init;
// Use value from now on in your algorithm
Although I wrote this example in C++, this code-refactoring question also applies to any OO language, such as Java.
Basically I have a class A
#include <cassert>

class A
{
public:
void f1();
void f2();
//..
private:
SomeImpl* m_a; // placeholder type -- the original question omitted the actual pointer type
};
void A::f1()
{
assert(m_a);
m_a->h1()->h2()->GetData();
//..
}
void A::f2()
{
assert(m_a);
m_a->h1()->h2()->GetData();
//..
}
Would you guys create a new private data member m_f holding the pointer m_a->h1()->h2()? The benefit I can see is that it effectively eliminates the multi-level function calls, which simplifies the code a lot.
But from another point of view, it creates an "unnecessary" data member which can be deduced from another existing data member m_a, which is kinda redundant?
I've come to a dilemma here. So far, I cannot convince myself to use one over the other.
Which do you guys prefer, any reason?
The fancy word for this technique is caching: you calculate a two-away reference once, and cache it in the object. In general, caching lets you "pay" with computer memory for speed-up of your computations.
If a profiler tells you that your code is spending a significant amount of time in the repeated call of m_a->h1()->h2(), this may be a legitimate optimization, provided that the return values of h1 and h2 never change. However, doing an optimization like that without profiling first is nearly always a bad sign of a premature optimization.
If performance is not the issue, a good rule is to stay away from storing members that can be calculated from other members stored in your object. If you would like to improve clarity, you can introduce a nicely named method (a member function) to calculate the two-away reference without storing it. Storing makes sense only in the rare cases when it is critical for the performance.
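To make the trade-off concrete, here is a hedged sketch; the SomeImpl/B/C types are invented stand-ins for whatever h1()/h2() actually return:
// Hypothetical types standing in for whatever h1()/h2() return in the real code.
struct C        { int GetData() const { return 42; } };
struct B        { C* h2() { return &m_c; } C m_c; };
struct SomeImpl { B* h1() { return &m_b; } B m_b; };

class A
{
public:
    explicit A(SomeImpl* a) : m_a(a) {}

    void f1() { data()->GetData(); /* ... */ }
    void f2() { data()->GetData(); /* ... */ }

private:
    // Named accessor: no extra stored state, recomputed on every call.
    C* data() const { return m_a->h1()->h2(); }

    // Caching alternative -- only if profiling justifies it and the result
    // of m_a->h1()->h2() never changes during the object's lifetime:
    // C* m_cached = nullptr;  // fill in the constructor, use in f1()/f2()

    SomeImpl* m_a;
};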
I would not. I agree it would simplify things in your contrived example, but that's because m_a->h1()->h2() has no inherent meaning. In a well-designed application, the method names used should tell you something qualitative about the calls being made, and that should be a part of self-documenting code. I would argue that in properly designed code, m_a->h1()->h2() should be simpler to read and understand than redirecting to a private method which calls it for you.
Now, if m_a->h1()->h2() is an expensive call which takes significant time to compute its result, then you might have an argument for caching, as @dasblinkenlight suggests. But throwing away the descriptiveness of the method call for the sake of a few keypresses is bad.
Whenever I have something like this, I usually store m_a->h1() in a variable with a meaningful name at function scope, since it's likely to be used again later in the function's body.
I have a Particle System Engine in my C++ project and the particles themselves are just structs of variables with no functions. Currently, each particle (Particle) is updated from its parent class (ParticleSystem) by having its variables accessed directly. E.g.
particle.x += particle.vx;
I am however, debating using getters and setters like this:
particle.setX( particle.getX()+particle.getVX() );
My question is: Is there any performance overhead from calling getters and setters as opposed to just straight up data access?
After all, I do have many many particles to update through...
Setters and getters have a performance overhead when they are not optimized out. They are almost always optimized out by compilers that do link-time optimization. And on compilers that do not, they will be optimized out if the compiler has knowledge of the function body (i.e. not just a prototype). Just look up inline optimization.
However, getters and setters are there because you might want getting or setting that variable to have additional side effects, like changing the position of an object also changing the position of nearby objects in a physics simulation, or some such.
Finally, the overhead of a getter or setter operation, within the context of optimized code, is very small and not worth worrying about unless the code is hot. And if it's hot, it's SO EASY to just move the getter or setter into the header file and inline it (see the sketch at the end of this answer).
However, in long running scientific simulations it can add up - but it requires millions and millions of calls.
So to sum up, getters and setters are well worth the minor or non-existent overhead, as it allows you to specify very specifically what can and cannot happen with your object, and it also allows you to marshal any changes.
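For the "move it to the header and inline it" point above, a minimal sketch (member names assumed from the question) might be:
// particle.h -- a minimal sketch; the member names are assumptions based on the question.
struct Particle
{
    float getX()  const { return x; }
    float getVX() const { return vx; }
    void  setX(float newX) { x = newX; }   // room to add side effects later

private:
    float x  = 0.0f;
    float vx = 0.0f;
};

// With the bodies visible in the header like this,
//   p.setX(p.getX() + p.getVX());
// typically compiles to exactly the same code as
//   p.x += p.vx;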
I have a different opinion to this than the previous answers.
Getters and setters are a sign that your class isn't designed in a useful way: if you don't abstract the outer behaviour from the internal implementation, there's no point in using an abstract interface in the first place; you might as well use a plain old struct.
Think about what operations you really need. That's almost certainly not direct access to the x- y- and z-coordinates of position and momentum, you rather want to treat these as vector quantities (at least in most calculations, which is all that's relevant for optimisation). So you want to implement a vector-based interface*, where the basic operations are vector addition, scaling and inner product. Not component-wise access; you probably do need to do that sometimes as well, but this can be done with a single std::array<double,3> to_posarray() member or something like that.
When the internal components x, y ... vz aren't accessible from the outside, you can safely change the internal implementation without breaking any code outside your module. That's pretty much the whole point of getters/setters; however, when using those there's only so much optimisation you can do: any real change of implementation will inevitably make the getters much slower.
On the other hand, you can optimise the hell out of a vector-based interface, with SIMD operations, external library calls (possibly on accelerated hardware like CUDA) and suchlike. A "batch getter" like to_posarray can still be implemented reasonably efficiently; single-variable setters can't. (A sketch of such an interface follows after the footnote.)
*I mean vector in the mathematical sense here, not like std::vector.
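A sketch of such a vector-based interface (names and operations are illustrative, not a definitive design):
#include <array>

struct Vec3
{
    double x, y, z;
    Vec3 operator+(const Vec3& o) const { return {x + o.x, y + o.y, z + o.z}; }
    Vec3 operator*(double s) const      { return {x * s, y * s, z * s}; }
};

class Particle
{
public:
    Particle(const Vec3& pos, const Vec3& vel) : m_pos(pos), m_vel(vel) {}

    // Operations expressed on whole vectors, not individual components,
    // so the implementation is free to change (SIMD, SoA layout, ...).
    void advance(double dt) { m_pos = m_pos + m_vel * dt; }

    // "Batch getter" for the occasional component-wise access.
    std::array<double, 3> to_posarray() const { return {m_pos.x, m_pos.y, m_pos.z}; }

private:
    Vec3 m_pos;
    Vec3 m_vel;
};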
Getters and setters allow your code to evolve more easily in the future, if getting and setting turn out to be slightly more complicated tasks. Most C++ compilers are smart enough to inline those simple methods and eliminate the overhead of the function call.
There could be various answers to this question, but here are my thoughts.
As for performance, the cost can mostly be ignored for simple POD types.
But there's still a cost, and it depends on what type you return. For a particle, there can't be much data. If this were an image class (like OpenCV's cv::Mat) or 3D data (like PolyData in VTK), it would be better for the setter/getter to deal with pointers/iterators rather than the actual data, to avoid memory-allocation problems.
When you want to template things, a setter/getter can be very useful to avoid unintended implicit type conversions. A setter/getter is also a way to access a private/protected member, which lets you avoid using x as the common name of a variable. Moreover, a getter can return an lvalue reference, which allows you to write particle.getX() = 10.0; (see the sketch at the end of this answer).
In this case, the functional meaning of your calculation could be:
void Particle::updatePosition() { x += vx; y += vy; }
or:
void Particle::updatePositionX() { x += vx; }
then :
particle.updatePositionX();
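For the lvalue-reference getter mentioned earlier in this answer, a minimal sketch (names assumed):
class Particle
{
public:
    float  getX() const { return x; }   // read-only access on const objects
    float& getX()       { return x; }   // lvalue reference: particle.getX() = 10.0f; works

private:
    float x = 0.0f;
};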
I'm writing something performance-critical and wanted to know if it could make a difference if I use:
int test( int a, int b, int c )
{
// Do millions of calculations with a, b, c
}
or
class myStorage
{
public:
int a, b, c;
};
int test( myStorage values )
{
// Do millions of calculations with values.a, values.b, values.c
}
Does this basically result in similar code? Is there an extra overhead of accessing the class members?
I'm sure that this is clear to an expert in C++ so I won't try and write an unrealistic benchmark for it right now
The compiler will probably equalize them. If it has any brains at all, it will copy values.a, values.b, and values.c into local variables or registers, which is also what happens in the simple case.
The relevant maxims:
Premature optimization is the root of much evil.
Write it so you can read it at 1am six months from now and still understand what you were trying to do.
Most of the time significant optimization comes from restructuring your algorithm, not small changes in how variables are accessed. Yes, I know there are exceptions, but this probably isn't one of them.
This sounds like premature optimization.
That being said, there are some differences and opportunities but they will affect multiple calls to the function rather than performance in the function.
First of all, in the second option you may want to pass myStorage as a constant reference.
As a result, your compiled code will likely push a single value onto the stack (so you can access the container) rather than pushing three separate values. If you have additional fields (in addition to a-c), passing myStorage by value might actually cost you more, because you will invoke the copy constructor and essentially copy all the additional fields. All of this is a per-call cost, not a cost within the function (see the sketch at the end of this answer).
If you are doing tons of calculations with a b and c within the function, then it really doesn't matter how you transfer or access them. If you passed by reference, the initial cost might be slightly more (since your object, if passed by reference, could be on the heap rather than the stack), but once accessed for the first time, caching and registers on your machine will probably mean low-cost access. If you have passed your object by value, then it really doesn't matter, since even initially, the values will be nearby on the stack.
For the code you provided, if these are the only fields, there will likely not be a difference: the values.variable access is merely interpreted as an offset into the stack, not as "look up one object, then access another address".
Of course, if you don't buy these arguments, just define local variables as the first step in your function, copy the values from the object, and then use those variables. If you really use them multiple times, the initial cost of this copy won't matter :)
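Putting the const-reference point and the copy-into-locals option into one sketch (based on the question's myStorage):
class myStorage
{
public:
    int a, b, c;
};

// Pass by const reference: a single address is passed instead of copying the object.
int test(const myStorage& values)
{
    // Optional: copy into locals once. With optimizations on, the compiler
    // usually does the equivalent of this for you anyway.
    const int a = values.a;
    const int b = values.b;
    const int c = values.c;

    // Do millions of calculations with a, b, c ...
    return a + b + c;
}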
No, your CPU will cache the variables you use over and over again.
I think there is some overhead, but it may not be much: the address of the object is stored on the stack and points to the object in heap memory, and then you access the instance variable through it.
If you store the int variable on the stack, it will be really fast, because the value is already on the stack and the machine just goes to the stack to fetch it for the calculation :).
It also depends on whether you store the class's instance variable values on the stack or not. If, inside test(), you do something like:
int a = objA.a;
int b = objA.b;
int c = objA.c;
I think the performance would be almost the same.
If you're really writing performance-critical code and you think one version should be faster than the other, write both versions and time them (with the code compiled with the right optimization switches). You may even want to look at the generated assembly code. A lot of quite subtle things can affect the speed of a code snippet, like register spilling, etc.
you can also start your function with
int & a = values.a;
int & b = values.b;
although the compiler should be smart enough to do that for you behind the scenes. In general I prefer to pass around structures or classes; this often makes it clearer what the function is meant to do, plus you don't have to change the signature every time you want to take another parameter into account.
As with your previous, similar question: it depends on the compiler and platform. If there is any difference at all, it will be very small.
Both values on the stack and values in an object are commonly accessed using a pointer (the stack pointer, or the this pointer) and some offset (the location in the function's stack frame, or the location inside the class).
Here are some cases where it might make a difference:
Depending on your platform, the stack pointer might be held in a CPU register, whereas the this pointer might not. If this is the case, accessing this (which is presumably on the stack) would require an extra memory lookup.
Memory locality might be different. If the object in memory is larger than one cache line, the fields are spread out over multiple cache lines. Bringing only the relevant values together in a stack frame might improve cache efficiency.
Do note, however, how often I used the word "might" here. The only way to be sure is to measure it.
If you can't profile the program, print out the assembly language for the code fragments.
In general, less assembly code means fewer instructions to execute, which speeds up performance. This is a technique for getting a rough estimate of performance when a profiler is not available.
An assembly language listing will allow you to see differences, if any, between implementations.
I wonder if and how writing "almighty" classes in C++ actually impacts performance.
If I have, for example, a class Point with only uint x; uint y; as data, and have defined virtually everything that math can do to a point as methods. Some of those methods might be huge. The (copy-)constructors do nothing more than initialize the two data members.
class Point
{
    int mx; int my;
public:
    Point(int x, int y) : mx(x), my(y) {}
    Point(const Point& other) : mx(other.mx), my(other.my) {}
    // .... HUGE number of methods....
};
Now. I load a big image and create a Point for every pixel, stuff em into a vector and use them. (say, all methods get called once)
This is only meant as a stupid example!
Would it be any slower than the same class without the methods but with a lot of utility functions? I am not talking about virtual functions in any way!
My Motivation for this: I often find myself writing nice and relatively powerful classes, but when I have to initialize/use a ton of them like in the example above, I get nervous.
I think I shouldn't.
What I think I know is:
Methods exist only once in memory (optimizations aside).
Allocation only takes place for the data members, and they are the only thing copied.
So it shouldn't matter. Am I missing something?
You are right, methods only exist once in memory, they're just like normal functions with an extra hidden this parameter.
And of course, only data members are taken into account for allocation. Well, virtual functions or virtual inheritance may add some extra pointers (vptrs) to the object size, but that's not a big deal.
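A quick way to convince yourself of this (a sketch; the exact number can still vary with padding, and virtual functions would add a vptr):
#include <cstdio>

class Point
{
    int mx; int my;
public:
    Point(int x, int y) : mx(x), my(y) {}
    int lengthSquared() const { return mx * mx + my * my; }
    // ... adding many more non-virtual methods would not change sizeof(Point) ...
};

int main()
{
    // Methods live once in the code segment, not inside each object.
    std::printf("sizeof(Point) = %zu, 2 * sizeof(int) = %zu\n",
                sizeof(Point), 2 * sizeof(int));
}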
You have already got some pretty good technical advice. I want to throw in something non-technical: as the STL showed us all, doing it all in member functions might not be the best way to do this. Rather than piling up arguments, I refer to Scott Meyers' classic article on the subject: How Non-Member Functions Improve Encapsulation.
Although technically there should be no problem, you still might want to review your design from a design POV.
I suppose this is more of an answer than you're looking for, but here goes...
SO is filled with questions where people are worried about the performance of X, Y, or Z, and that worry is a form of guessing.
If you're worried about the performance of something, don't worry, find out.
Here's what to do:
Write the program
Performance tune it
Learn from the experience
What this has taught me, and I've seen it over and over, is this:
Best practice says Don't optimize prematurely.
Best practice says Do use lots of data structure classes, with multiple layers of abstraction, and the best big-O algorithms, "information hiding", with event-driven and notification-style architecture.
Performance tuning reveals where the time is going, which is: Galloping generality, making mountains out of molehills, calling functions & properties with no realization of how long they take, and doing this over multiple layers using exponential time.
Then the question is asked: What is the reason behind the best practice for the big-O algorithms, the event- and notification-driven architecture, etc. The answer comes: Well, among other things, performance.
So in a way, best practice is telling us: optimize prematurely. Get the point? It says "don't worry about performance", and it says "worry about performance", and it causes the very thing we're trying unsuccessfully not to worry about. And the more we worry about it, against our better judgement, the worse it gets.
My constructive suggestion is this: Follow steps 1, 2, and 3 above. That will teach you how to use best practice in moderation, and that will give you the best all-around design.
If you are truly worried, you can tell your compiler to inline the constructors. This optimization step should leave you with clean code and clean execution.
These 2 bits of code are identical:
Point x;
int l=x.getLength();
int l=GetLength(x);
given that the class Point has a non-virtual method getLength(). The first invocation actually calls int getLength(Point& this), a signature identical to the one we wrote in our second example. (*)
This of course wouldn't apply if the methods you're calling are virtual, since everything would go through an extra level of indirection (something akin to the C-style int l = x->lpvtbl->getLength(x)), not to mention that instead of 2 ints for every pixel you'd actually have 3, the extra one being the pointer to the virtual table.
(*) this isn't exactly true, the "this" pointer is passed through one of the cpu registers instead of through the stack, but the mechanism could have easily worked either way.
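As a sketch of that equivalence (the Point here is a trivial stand-in and the method body is a placeholder):
struct Point
{
    int x = 3, y = 4;
    int getLength() const { return x * x + y * y; }   // member: hidden 'this' parameter (placeholder body)
};

int GetLength(const Point& p) { return p.x * p.x + p.y * p.y; }  // free function: explicit parameter

// With a non-virtual method and inlining, x.getLength() and GetLength(x)
// typically compile down to the same code.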
First: do not optimize prematurely.
Second: clean code is easier to maintain than optimized code.
Methods for classes have the hidden this pointer, but you should not worry about it. Most of the time the compiler tries to pass it via register.
Inheritance and virtual functions introduce indirections in the appropriate calls (inheritance: constructor/destructor calls; virtual functions: every call to such a function).
In short:
Objects you don't create/destroy often can have virtual methods, inheritance and so on as long as it benefits the design.
Objects you create/destroy often should be small (few data members) and should not have many virtual methods (best would be none at all - performance wise).
Try to inline small methods/constructors. This will reduce the overhead.
Go for a clean design and refactor if you don't reach the desired performance.
There is a separate discussion about classes having large or small interfaces (for example in one of Scott Meyers' (More) Effective C++ books - he opts for a minimal interface). But this has nothing to do with performance.
I agree with the above comments wrt performance and class layout, and would like to add a comment not yet stated, about design.
It feels to me like you're over-using your Point class beyond its real design scope. Sure, it can be used that way, but should it?
In past work in computer games I've often been faced with similar situations, and usually the best end result has been that when doing specialized processing (e.g. image processing), having a specialized code set for it that works on differently laid-out buffers has been more efficient.
This also allows you to performance optimize for the case that matters, in a more clean way, without making base code less maintainable.
In theory, I'm sure that there is a crafty way of using a complex combination of template code, concrete class design, etc., and getting nearly the same run-time efficiency ... but I am usually unwilling to make the complexity-of-implementation trade.
Member functions are not copied along with the object. Only data fields contribute to the size of the object.
I have created the same Point class as you, except it is a template class and all functions are inline. I expect to see a performance increase, not a decrease, from this. However, an image of size 800x600 has 480k pixels, and its memory footprint will be close to 4M without any color information. It's not just memory: initializing 480k objects will also take too much time. Therefore, I don't think it is a good idea in that case. However, if you use this class to transform the position of an image, or use it for graphic primitives (lines, curves, circles, etc.), then it makes sense.