Quick question and I apologize if it sounds naive.
Which is faster in C++? Code like this:
ProgramsManager::CurrentProgram->Uniforms->Set(n1);
ProgramsManager::CurrentProgram->Uniforms->Set(n2);
ProgramsManager::CurrentProgram->Uniforms->Set(n3);
ProgramsManager::CurrentProgram->Uniforms->Set(...);
Or this one?
Uniforms* u = ProgramsManager::CurrentProgram->Uniforms;
u->Set(n1);
u->Set(n2);
u->Set(n3);
u->Set(...);
I know the second piece of code is faster in interpreted languages, but I feel like it makes no difference in compiled languages. Am I right?
Thank you in advance
The second might be faster, but it won't be faster by a lot.
The reason it might be faster is if the compiler cannot prove to itself that ProgramsManager::CurrentProgram->Uniforms is not changed by the calls to ...->Set. If it can't prove this, it has to re-evaluate the expression ProgramsManager::CurrentProgram->Uniforms for each line.
However, modern CPUs are usually fairly quick at this kind of thing, and compilers are getting better.
There are 3 choices here, not 2.
Call a single-parameter function several times.
Call one function with many parameters.
Call a single function with a container, like a struct or vector.
Fundamental Overhead
When calling a function there is an overhead of instructions. Usually this involves placing values in registers or on the stack.
At a lower level, the processor may also have to reload its instruction cache or pipeline.
Optimizing The Function Call
For optimizing function calls, the best method is to avoid the call by pasting the code (a.k.a. inlining). This removes the overhead.
The next best is to reduce the number of function calls. For example, passing more parameters per call means fewer function calls and less overhead.
Many Parameters versus One Container
The optimal function call passes values in registers. Parameters beyond the available registers end up in stack memory, which means the function needs extra code to retrieve those values from the stack.
Passing many parameters via the stack therefore incurs an overhead. Also, the function signature needs to change whenever parameters are added or removed.
Placing variables into a container reduces the overhead. Only a pointer (or reference) to the container needs to be passed. This usually involves only a register since pointers usually fit into a register (many compilers pass structures by reference using pointers).
Another benefit to the container is that the container can change without having to change the function signature.
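For illustration, here is a minimal sketch of "many parameters" versus "one container"; all names here are hypothetical:

#include <cstdio>

struct Params {                       // the container groups the values
    int n1, n2, n3;
};

// Parameters beyond the available registers may spill to the stack.
void set_many(int n1, int n2, int n3) {
    std::printf("%d %d %d\n", n1, n2, n3);
}

// Only a reference is passed, which typically fits in one register,
// and the signature survives adding a member to Params.
void set_container(const Params& p) {
    std::printf("%d %d %d\n", p.n1, p.n2, p.n3);
}

int main() {
    Params p = {1, 2, 3};
    set_many(p.n1, p.n2, p.n3);
    set_container(p);
    return 0;
}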
I read about function pointers in C.
And everyone said that they will make my program run slow.
Is it true?
I made a program to check it.
And I got the same results in both cases (measuring the time).
So, is it bad to use function pointers?
Thanks in advance.
To respond to some of the comments: by 'run slow' I meant the time I compared in a loop, like this:
int end = 1000;
int i = 0;
void (*fp)(void);      /* the function pointer (declaration was missing) */
while (i < end) {
    fp = func;
    fp();
    i++;               /* increment was missing; the loop never ended */
}
When I execute this, I get the same time as when I execute this:
i = 0;
while (i < end) {
    func();
    i++;
}
So I think that function pointers make no difference in time, and they don't make a program run slow as many people said.
You see, in situations that actually matter from the performance point of view, like calling the function repeatedly many times in a cycle, the performance might not be different at all.
This might sound strange to people, who are used to thinking about C code as something executed by an abstract C machine whose "machine language" closely mirrors the C language itself. In such context, "by default" an indirect call to a function is indeed slower than a direct one, because it formally involves an extra memory access in order to determine the target of the call.
However, in real life the code is executed by a real machine and compiled by an optimizing compiler that has a pretty good knowledge of the underlying machine architecture, which helps it to generate the most optimal code for that specific machine. And on many platforms it might turn out that the most efficient way to perform a function call from a cycle actually results in identical code for both direct and indirect call, leading to the identical performance of the two.
Consider, for example, the x86 platform. If we "literally" translate a direct and indirect call into machine code, we might end up with something like this
// Direct call
do-it-many-times
call 0x12345678
// Indirect call
do-it-many-times
call dword ptr [0x67890ABC]
The former uses an immediate operand in the machine instruction and is indeed normally faster than the latter, which has to read the data from some independent memory location.
At this point let's remember that x86 architecture actually has one more way to supply an operand to the call instruction. It is supplying the target address in a register. And a very important thing about this format is that it is normally faster than both of the above. What does this mean for us? This means that a good optimizing compiler must and will take advantage of that fact. In order to implement the above cycle, the compiler will try to use a call through a register in both cases. If it succeeds, the final code might look as follows
// Direct call
mov eax, 0x12345678
do-it-many-times
call eax
// Indirect call
mov eax, dword ptr [0x67890ABC]
do-it-many-times
call eax
Note that now the part that matters - the actual call in the cycle body - is exactly the same in both cases. Needless to say, the performance is going to be virtually identical.
One might even say, however strange it might sound, that on this platform a direct call (a call with an immediate operand in call) is slower than an indirect call as long as the operand of the indirect call is supplied in a register (as opposed to being stored in memory).
Of course, the whole thing is not as easy in the general case. The compiler has to deal with limited availability of registers, aliasing issues, etc. But in simplistic cases like the one in your example (and even in much more complicated ones) the above optimization will be carried out by a good compiler and will completely eliminate any difference in performance between a cyclic direct call and a cyclic indirect call. This optimization works especially well in C++ when calling a virtual function, since in a typical implementation the pointers involved are fully controlled by the compiler, giving it full knowledge of the aliasing picture and other relevant details.
Of course, there's always a question of whether your compiler is smart enough to optimize things like that...
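If you want to check this on your own compiler, a minimal sketch along the lines of the question (function names assumed) is:

#include <cstdio>

void func() { std::puts("hi"); }
void (*fp)() = func;                  // indirect target stored in a global

int main() {
    // Build with optimizations (e.g. g++ -O2 -S) and compare the code
    // generated for this loop against a direct call to func().
    for (int i = 0; i < 1000; ++i)
        fp();
    return 0;
}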
I think when people say this they're referring to the fact that using function pointers may prevent compiler optimizations (inlining) and processor optimizations (branch prediction). However, if function pointers are an effective way to accomplish something that you're trying to do, chances are that any other method of doing it would have the same drawbacks.
And unless your function pointers are being used in tight loops in a performance critical application or on a very slow embedded system, chances are the difference is negligible anyway.
And everyone said that they will make my program run slow. Is it true?
Most likely this claim is false. For one, if the alternative to using function pointers is something like
if (condition1) {
    func1();
} else if (condition2) {
    func2();
} else if (condition3) {
    func3();
} else {
    func4();
}
this is most likely relatively much slower than just using a single function pointer. While calling a function through a pointer does have some (typically negligible) overhead, it is normally not the direct-function-call versus through-pointer-call difference that is relevant to compare.
And secondly, never optimize for performance without measurements. Knowing where the bottlenecks are without measuring is very difficult (read: impossible), and the results can be quite non-intuitive (for instance, the Linux kernel developers started removing the inline keyword from functions because it actually hurt performance).
A lot of people have put in some good answers, but I still think there's a point being missed. Function pointers do add an extra dereference which makes them several cycles slower, that number can increase based on poor branch prediction (which incidentally has almost nothing to do with the function pointer itself). Additionally functions called via a pointer cannot be inlined. But what people are missing is that most people use function pointers as an optimization.
The most common place you will find function pointers in c/c++ APIs is as callback functions. The reason so many APIs do this is because writing a system that invokes a function pointer whenever events occur is much more efficient than other methods like message passing. Personally I've also used function pointers as part of a more-complex input processing system, where each key on the keyboard has a function pointer mapped to it via a jump table. This allowed me to remove any branching or logic from the input system and merely handle the key press coming in.
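A minimal sketch of that jump-table idea; the key codes and handler names are illustrative, not the original system:

#include <cstdio>

typedef void (*KeyHandler)();

void on_jump()  { std::puts("jump");  }
void on_shoot() { std::puts("shoot"); }
void on_noop()  { }                       // default: do nothing

KeyHandler handlers[256];                 // one slot per key code

int main() {
    for (int i = 0; i < 256; ++i)         // fill the table with the no-op
        handlers[i] = on_noop;
    handlers['w'] = on_jump;              // bind specific keys
    handlers[' '] = on_shoot;

    unsigned char key = 'w';              // pretend this key arrived
    handlers[key]();                      // dispatch without any branching
    return 0;
}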
Calling a function via a function pointer is somewhat slower than a static function call, since the former call includes an extra pointer dereferencing. But AFAIK this difference is negligible on most modern machines (except maybe some special platforms with very limited resources).
Function pointers are used because they can make the program much simpler, cleaner and easier to maintain (when used properly, of course). This more than makes up for the possible very minor speed difference.
A lot of good points in earlier replies.
However, take a look at the C qsort comparison function. Because the comparison function cannot be inlined and needs to follow standard stack-based calling conventions, the total running time for the sort can be several times (roughly 3-10x) slower for integer keys than otherwise-identical code with a direct, inlineable call.
A typical inlined comparison would be a sequence of simple CMP and possibly CMOV/SETcc instructions. A function call also incurs the overhead of a CALL, setting up the stack frame, doing the comparison, tearing down the stack frame, and returning the result. Note that the stack operations can cause pipeline stalls due to the CPU pipeline length and register dependencies: for example, if the value of eax is needed before the instruction that last modified eax has finished executing (which typically takes about 12 clock cycles on the newest processors). Unless the CPU can execute other instructions out of order while waiting, a pipeline stall will occur.
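For concreteness, here is a sketch of the two shapes being compared: C's qsort must invoke the comparison through a function pointer, while C++'s std::sort can inline the comparator.

#include <algorithm>
#include <cstddef>
#include <cstdlib>

// qsort's comparison: always called indirectly, never inlined.
int cmp_int(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

void sort_both(int* data, std::size_t n) {
    std::qsort(data, n, sizeof(int), cmp_int);
    std::sort(data, data + n,                  // comparator can be inlined
              [](int x, int y) { return x < y; });
}

int main() {
    int data[] = {3, 1, 2};
    sort_both(data, 3);
    return 0;
}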
Using a function pointer is slower than just calling a function, as it is another layer of indirection (the pointer needs to be dereferenced to get the address of the function). While it is slower, compared to everything else your program may do (read a file, write to the console) it is negligible.
If you need to use function pointers, use them, because anything that tries to do the same thing while avoiding them will be slower and less maintainable than using function pointers.
Possibly.
The answer depends on what the function pointer is being used for and hence what the alternatives are. Comparing function pointer calls to direct function calls is misleading if a function pointer is being used to implement a choice that's part of our program logic and which can't simply be removed. I'll go ahead and nonetheless show that comparison and come back to this thought afterwards.
Function pointer calls have the most opportunity to degrade performance compared to direct function calls when they inhibit inlining. Because inlining is a gateway optimization, we can craft wildly pathological cases where function pointers are made arbitrarily slower than the equivalent direct function call:
void foo(int* x) {
*x = 0;
}
void (*foo_ptr)(int*) = foo;
int call_foo(int *p, int size) {
int r = 0;
for (int i = 0; i != size; ++i)
r += p[i];
foo(&r);
return r;
}
int call_foo_ptr(int *p, int size) {
int r = 0;
for (int i = 0; i != size; ++i)
r += p[i];
foo_ptr(&r);
return r;
}
Code generated for call_foo():
call_foo(int*, int):
xor eax, eax
ret
Nice. foo() has not only been inlined, but doing so has allowed the compiler to eliminate the entire preceding loop! The generated code simply zeroes out the return register by XORing the register with itself and then returns. On the other hand, compilers will have to generate code for the loop in call_foo_ptr() (100+ lines with gcc 7.3) and most of that code effectively does nothing (so long as foo_ptr still points to foo()). (In more typical scenarios, you can expect that inlining a small function into a hot inner loop might reduce execution time by up to about an order of magnitude.)
So in a worst case scenario, a function pointer call is arbitrarily slower than a direct function call, but this is misleading. It turns out that if foo_ptr had been const, then call_foo() and call_foo_ptr() would have generated the same code. However, this would require us to give up the opportunity for indirection provided by foo_ptr. Is it "fair" for foo_ptr to be const? If we're interested in the indirection provided by foo_ptr, then no, but if that's the case, then a direct function call is not a valid option either.
If a function pointer is being used to provide useful indirection, then we can move the indirection around or in some cases swap out function pointers for conditionals or even macros, but we can't simply remove it. If we've decided that function pointers are a good approach but performance is a concern, then we typically want to pull indirection up the call stack so that we pay the cost of indirection in an outer loop. For example, in the common case where a function takes a callback and calls it in a loop, we might try moving the innermost loop into the callback (and changing the responsibility of each callback invocation accordingly).
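A sketch of what pulling the indirection up the call stack can look like (all names hypothetical): instead of one indirect call per element, the callback receives the whole range and owns the inner loop.

#include <cstddef>

typedef void (*PerItem)(int&);
typedef void (*PerRange)(int*, std::size_t);

// One indirect call per element: the pointer sits in the hot loop.
void process_per_item(int* p, std::size_t n, PerItem f) {
    for (std::size_t i = 0; i < n; ++i)
        f(p[i]);
}

// One indirect call total: the callback owns the inner loop.
void process_per_range(int* p, std::size_t n, PerRange f) {
    f(p, n);
}

void increment(int& x) { x += 1; }

void increment_all(int* p, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        p[i] += 1;
}

int main() {
    int data[4] = {1, 2, 3, 4};
    process_per_item(data, 4, increment);
    process_per_range(data, 4, increment_all);
    return 0;
}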
In heavy loops, such as ones found in game applications, there could be many factors that decide what part of the loop body is executed (for example, a character object will be updated differently depending on its current state) and so instead of doing:
void my_loop_function(int dt) {
if (conditionX && conditionY)
doFoo();
else
doBar();
...
}
I am used to using a function pointer that points to a certain logic function corresponding to the character's current state, as in:
void (*updater)(int);
void something_happens() {
updater = &doFoo;
}
void something_else_happens() {
updater = &doBar;
}
void my_loop_function(int dt) {
(*updater)(dt);
...
}
And in the case where I don't want to do anything, I define a dummy function and point to it when I need to:
void do_nothing(int dt) { }
Now what I'm really wondering is: am I obsessing about this needlessly? The example given above of course is simple; sometimes I need to check many variables to figure out which pieces of code I'll need to execute, and so I figured out using these "state" function pointers would indeed be more optimal, and to me, natural, but a few people I'm dealing with are heavily disagreeing.
So, is the gain from using a (virtual)function pointer worth it instead of filling my loops with conditional statements to flow the logic?
Edit: to clarify how the pointer is being set, it's done through event handling on a per-object basis. When an event occurs and, say, that character has custom logic attached to it, it sets the updater pointer in that event handler until another event occurs which will change the flow once again.
Thank you
The function pointer approach lets you make the transitions asynchronous. Rather than just passing dt to the updater, pass the object as well. Now the updater can itself be responsible for the state transitions. This localizes the state transition logic instead of globalizing it in one big ugly if ... else if ... else if ... function.
As far as the cost of this indirection, do you care? You might care if your updaters are so extremely small that the cost of a dereference plus a function call overwhelms the cost of executing the updater code. If the updaters are of any complexity, that complexity is going to overwhelm the cost of this added flexibility.
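Here is a minimal sketch of that suggestion; idle and running are assumed state names, not the questioner's actual functions:

struct Character;                         // forward declaration
typedef void (*Updater)(Character&, int);

struct Character {
    Updater update;                       // the current state's logic
};

void idle(Character& c, int dt);
void running(Character& c, int dt);

// Each state function installs the next state itself, so the transition
// logic lives next to the state it belongs to (conditions hypothetical).
void idle(Character& c, int dt) {
    if (dt > 0) c.update = running;
}

void running(Character& c, int dt) {
    if (dt == 0) c.update = idle;
}

void my_loop_function(Character& c, int dt) {
    c.update(c, dt);
}

int main() {
    Character c;
    c.update = idle;
    my_loop_function(c, 16);              // one frame's worth of dt
    return 0;
}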
I think I'll agree with the non-believers here. The money question in this case is: how is the pointer value going to be set?
If you can somehow index into a map and produce a pointer, then this approach might justify itself through reducing code complexity. However, what you have here is rather more like a state machine spread across several functions.
Consider that something_else_happens in practice will have to examine the previous value of the pointer before setting it to another value. The same goes for something_different_happens, etc. In effect you've scattered the logic for your state machine all over the place and made it difficult to follow.
Now what I'm really wondering is: am I obsessing about this needlessly?
If you haven't actually run your code, and found that it actually runs too slowly, then yes, I think you probably are worrying about performance too soon.
Herb Sutter and Andrei Alexandrescu in
C++ Coding Standards: 101 Rules, Guidelines, and Best Practices devote chapter 8 to this, called "Don’t optimize prematurely", and they summarise it well:
Spur not a willing horse (Latin proverb): Premature optimization is as addictive as it is unproductive. The first rule of optimization is: Don’t do it. The second rule of optimization (for experts only) is: Don’t do it yet. Measure twice, optimize once.
It's also worth reading chapter 9: "Don’t pessimize prematurely"
Testing a condition is:
fetch a value
compare (subtract)
jump if zero (or non-zero)
Performing an indirection is:
fetch an address
jump.
It may be even more performant!
In fact, you do the "compare" earlier, in another place, to decide what to call; the result is identical.
What you have done is nothing more than a dispatch system identical to the one the compiler generates when calling virtual functions.
It has been shown that avoiding virtual functions by implementing dispatch through switches doesn't improve performance on modern compilers.
The "don't use indirection / don't use virtual / don't use function pointers / don't dynamic_cast" advice is in most cases just a myth based on the historical limitations of early compilers and hardware architectures.
The performance difference will depend on the hardware and the compiler optimizer. Indirect calls can be very expensive on some machines, and very cheap on others. And really good compilers may be able to optimize even indirect calls, based on profiler output. Until you've actually benchmarked both variants, on your actual target hardware and with the compiler and compiler options you use in your final release code, it's impossible to say.
If the indirect calls do end up being too expensive, you can still hoist the tests out of the loop, by either setting an enum and using a switch in the loop, or by implementing the loop for each combination of settings and selecting once at the beginning, as in the sketch below. (If the functions you point to implement the complete loop, this will almost certainly be faster than testing the condition each time through the loop, even if indirection is expensive.)
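A minimal sketch of that hoisting, with assumed helper names:

enum Mode { Fast, Accurate };

void step_fast(double& x)     { x *= 2.0; }   // stand-in bodies
void step_accurate(double& x) { x += 1.0; }

void run(double* data, int n, Mode mode) {
    if (mode == Fast) {                       // test once, outside the loop
        for (int i = 0; i < n; ++i) step_fast(data[i]);
    } else {
        for (int i = 0; i < n; ++i) step_accurate(data[i]);
    }
}

int main() {
    double data[3] = {1, 2, 3};
    run(data, 3, Fast);
    return 0;
}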
Lets say I know a guy who is new to C++. He does not pass around pointers (rightly so) but he refuses to pass by reference. He uses pass by value always. Reason being that he feels that "passing objects by reference is a sign of a broken design".
The program is a small graphics program and most of the passing in question is mathematical Vector(3-tuple) objects. There are some big controller objects but nothing more complicated than that.
I'm finding it hard to find a killer argument against only using the stack.
I would argue that pass by value is fine for small objects such as vectors but even then there is a lot of unnecessary copying occurring in the code. Passing large objects by value is obviously wasteful and most likely not what you want functionally.
On the pro side, I believe the stack is faster at allocating/deallocating memory and has a constant allocation time.
The only major argument I can think of is that the stack could possibly overflow, but I'm guessing that it is improbable that this will occur? Are there any other arguments against using only the stack/pass by value as opposed to pass by reference?
Subtyping-polymorphism is a case where passing by value wouldn't work because you would slice the derived class to its base class. Maybe to some, using subtyping-polymorphism is bad design?
Your friend's problem is not his idea as much as his religion. Given any function, always consider the pros and cons of passing by value, reference, const reference, pointer or smart pointer. Then decide.
The only sign of broken design I see here is your friend's blind religion.
That said, there are a few signatures that don't bring much to the table. Taking a const by value might be silly: if you promise not to change the object, you might as well not make your own copy of it. Unless it's a primitive, of course, in which case the compiler can be smart enough to take a reference anyway. Or, sometimes it's clumsy to take a pointer to a pointer as an argument; this adds complexity, and instead you might be able to get away with taking a reference to a pointer, to the same effect.
But don't take these guidelines as set in stone; always consider your options because there is no formal proof that eliminates any alternative's usefulness.
If you need to change the argument for your own needs, but don't want to affect the client, then take the argument by value.
If you want to provide a service to the client, and the client is not closely related to the service, then consider taking an argument by reference.
If the client is closely related to the service then consider taking no arguments but write a member function.
If you wish to write a service function for a family of clients that are closely related to the service but very distinct from each other then consider taking a reference argument, and perhaps make the function a friend of the clients that need this friendship.
If you don't need to change the client at all then consider taking a const-reference.
There are all sorts of things that cannot be done without using references - starting with a copy constructor. References (or pointers) are fundamental and whether he likes it or not, he is using references. (One advantage, or maybe disadvantage, of references is that you do not have to alter the code, in general, to pass a (const) reference.) And there is no reason not to use references most of the time.
And yes, passing by value is OK for smallish objects without requirements for dynamic allocation, but it is still silly to hobble oneself by saying "no references" without concrete measurements that the so-called overhead is (a) perceptible and (b) significant. "Premature optimization is the root of all evil"1.
1. Various attributions, including C. A. R. Hoare (although apparently he disclaims it).
I think there is a huge misunderstanding in the question itself.
There is no relationship between stack- or heap-allocated objects on the one hand, and pass-by-value, pass-by-reference, or pass-by-pointer on the other.
Stack vs Heap allocation
Always prefer the stack when possible; the object's lifetime is then managed for you, which is much easier to deal with.
It might not be possible in a couple of situations though:
Virtual construction (think of a Factory)
Shared Ownership (though you should always try to avoid it)
And I might be missing some, but in these cases you should use SBRM (Scope-Bound Resource Management) to leverage the stack's lifetime-management abilities, for example by using smart pointers.
Pass by: value, reference, pointer
First of all, there is a difference of semantics:
value, const reference: the passed object will not be modified by the method
reference: the passed object might be modified by the method
pointer/const pointer: same as reference (for the behavior), but might be null
Note that some languages (the functional kind, like Haskell) do not offer references/pointers by default. Values are immutable once created. Apart from some workarounds for dealing with the external environment, they are not that restricted by this, and it somehow makes debugging easier.
Your friend should learn that there is absolutely nothing wrong with pass-by-reference or pass-by-pointer: for example, think of swap; it cannot be implemented with pass-by-value.
Finally, polymorphism does not allow pass-by-value semantics, as the slicing sketch below shows.
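A minimal sketch of the slicing problem (hypothetical types):

#include <iostream>

struct Base {
    virtual ~Base() {}
    virtual void who() const { std::cout << "Base\n"; }
};

struct Derived : Base {
    void who() const { std::cout << "Derived\n"; }
};

void by_value(Base b)      { b.who(); }   // prints "Base": the copy is sliced
void by_ref(const Base& b) { b.who(); }   // prints "Derived"

int main() {
    Derived d;
    by_value(d);
    by_ref(d);
    return 0;
}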
Now, let's talk about performance.
It's usually well accepted that built-ins should be passed by value (to avoid an indirection) and big user-defined classes should be passed by reference/pointer (to avoid copying). "Big" here generally means that the copy constructor is not trivial.
There is, however, an open question regarding small user-defined classes. Some recently published articles suggest that in some cases pass-by-value might allow better optimization by the compiler, for example in this case:
struct Object { void bar() {} };   // minimal stand-in so the snippet compiles

Object foo(Object d) { d.bar(); return d; }

int main()
{
    Object o;
    o = foo(o);
    return 0;
}
Here a smart compiler is able to determine that o can be modified in place without any copying! (It is necessary that the function definition be visible, I think; I don't know if Link-Time Optimization would figure it out.)
Therefore, there is only one answer to the performance question, as always: measure.
Reason being that he feels that "passing objects by reference is a sign of a broken design".
Although this is wrong in C++ for purely technical reasons, always using pass-by-value is a good enough approximation for beginners – it’s certainly much better than passing everything by pointers (or perhaps even than passing everything by reference). It will make some code inefficient but, hey! As long as this doesn’t bother your friend, don’t be unduly disturbed by this practice. Just remind him that someday he might want to reconsider.
On the other hand, this:
There are some big controller objects but nothing more complicated than that.
is a problem. Your friend is talking about broken design, and then all the code uses is a few 3D vectors and large controller objects? That is a broken design. Good code achieves modularity through the use of data structures. It doesn't seem as though this were the case.
… And once you use such data structures, code without pass-by-reference may indeed become quite inefficient.
First things first: the stack rarely overflows outside this website, except in cases of runaway recursion.
About his reasoning, I think he might be wrong because he overgeneralizes, but what he has done might be correct... or not?
For example, the Windows Forms library uses a Rectangle struct that has 4 members, and Apple's QuartzCore also has a CGRect struct; those structs are always passed by value. I think we can compare them to a Vector with 3 floating-point members.
However, as I have not seen the code, I feel I should not judge what he has done, though I have a feeling he might have done the right thing despite his overgeneralized idea.
I would argue that pass by value is fine for small objects such as vectors but even then there is a lot of unnecessary copying occurring in the code. Passing large objects by value is obviously wasteful and most likely not what you want functionally.
It's not quite as obvious as you might think. C++ compilers perform copy elision very aggressively, so you can often pass by value without incurring the cost of a copy operation. And in some cases, passing by value might even be faster.
Before condemning the issue for performance reasons, you should at the very least produce the benchmarks to back it up. And they might be hard to create because the compiler typically eliminates the performance difference.
So the real issue should be one of semantics. How do you want your code to behave? Sometimes, reference semantics are what you want, and then you should pass by reference. If you specifically want/need value semantics then you pass by value.
There is one point in favor of passing by value. It's helpful in achieving a more functional style of code, with fewer side effects and where immutability is the default. That makes a lot of code easier to reason about, and it may make it easier to parallelize the code as well.
But in truth, both have their place. And never using pass-by-reference is definitely a big warning sign.
For the last 6 months or so, I've been experimenting with making pass-by-value the default. If I don't explicitly need reference semantics, then I try to assume that the compiler will perform copy elision for me, so I can pass by value without losing any efficiency.
So far, the compiler hasn't really let me down. I'm sure I'll run into cases where I have to go back and change some calls to passing by reference, but I'll do that when I know that
performance is a problem, and
the compiler failed to apply copy elision
I would say that not using pointers in C is a sign of a newbie programmer.
It sounds like your friend is scared of pointers.
Remember, C++ pointers were actually inherited from the C language, and C was developed when computers were much less powerful. Nevertheless, speed and efficiency continue to be vital until this day.
So, why use pointers? They allow the developer to optimize a program to run faster or use less memory than it otherwise would! Referring to the memory location of data is much more efficient than copying all the data around.
Pointers are usually a difficult concept to grasp for those beginning to program, because all the experiments done involve small arrays and maybe a few structs; basically they consist of working with a couple of megabytes (if you're lucky) when you have 1GB of memory lying around the house. At that scale, a couple of MB is nothing and usually too little to have a significant impact on the performance of your program.
So let's exaggerate that a little bit. Think of a char array with 2147483648 elements - 2GB of data - that you need to pass to a function that will write all the data to disk. Now, which technique do you think is going to be more efficient/faster?
Pass by value, which is going to have to re-copy those 2GB of data to another location in memory before the program can write the data to the disk, or
Pass by reference, which will just refer to that memory location.
What happens when you just don't have 4GB of RAM? Will you spend money on more RAM chips just because you are afraid of using pointers?
Re-copying the data in memory when you don't have to is redundant and a waste of computer resources.
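A sketch of the two options, with hypothetical function names (a small buffer stands in for the 2GB case):

#include <cstdio>
#include <vector>

void write_by_value(std::vector<char> buf) {       // copies every byte first
    std::fwrite(buf.data(), 1, buf.size(), stdout);
}

void write_by_ref(const std::vector<char>& buf) {  // no copy, just a reference
    std::fwrite(buf.data(), 1, buf.size(), stdout);
}

int main() {
    std::vector<char> data(1024, 'x');
    write_by_value(data);
    write_by_ref(data);
    return 0;
}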
Anyway, be patient with your friend. If he would like to become a serious/professional programmer at some point in his life he will eventually have to take the time to really understand pointers.
Good Luck.
As already mentioned the big difference between a reference and a pointer is that a pointer can be null. If a class requires data a reference declaration will make it required. Adding const will make it 'read only' if that is what is desired by the caller.
The pass-by-value 'flaw' mentioned is simply not true. Passing everything by value completely changes the performance of an application. It is not so bad when primitive types (i.e. int, double, etc.) are passed by value, but when a class instance is passed by value, temporary objects are created, which requires constructors and later destructors to be called for the class and for all of the member variables in the class. This is exacerbated when large class hierarchies are used, because parent class constructors/destructors must be called as well.
Also, just because the vector is passed by value does not mean that it only uses stack memory. The heap may be used for each element as it is created in the temporary vector that is passed to the method/function. The vector itself may also have to reallocate from the heap if it reaches its capacity.
If pass-by-value is being used so that the caller's values are not modified, just use a const reference.
The answers that I've seen so far have all focused on performance: cases where pass-by-reference is faster than pass-by-value. You may have more success in your argument if you focus on cases that are impossible with pass-by-value.
Small tuples or vectors are a very simple type of data-structure. More complex data-structures share information, and that sharing can't be represented directly as values. You either need to use references/pointers or something that simulates them such as arrays and indices.
Lots of problems boil down to data that forms a Graph, or a Directed-Graph. In both cases you have a mixture of edges and nodes that need to be stored within the data-structure. Now you have the problem that the same data needs to be in multiple places. If you avoid references then firstly the data needs to be duplicated, and then every change needs to be carefully replicated in each of the other copies.
Your friend's argument boils down to saying: tackling any problem complex enough to be represented by a Graph is a bad-design....
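A minimal sketch of such shared structure:

#include <vector>

struct Node {
    int value;
    std::vector<Node*> edges;   // references to other nodes
};

int main() {
    Node shared = {42, {}};
    Node a = {1, {&shared}};    // a and b both point at the same node,
    Node b = {2, {&shared}};    // so the data exists in exactly one place
    shared.value = 7;           // one update, visible from both a and b
    return 0;
}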
The only major argument I can think of is that the stack could possibly overflow, but I'm guessing that it is improbable that this will occur? Are there any other arguments against using only the stack/pass by value as opposed to pass by reference?
Well, gosh, where to start...
As you mention, "there is a lot of unnecessary copying occurring in the code". Let's say you've got a loop where you call a function on these objects. Using a pointer instead of duplicating the objects can accelerate execution by one or more orders of magnitude.
You can't pass variable-sized data structures, arrays, etc. around on the stack. You have to allocate them dynamically and pass a pointer or reference to the beginning. If your friend hasn't run into this, then yes, he's "new to C++."
As you mention, the program in question is simple and mostly uses quite small objects like graphics 3-tuples, which if the elements are doubles would be 24 bytes apiece. But in graphics, it's common to deal with 4x4 arrays, which handle both rotation and translation. Those would be 128 bytes apiece, so a program that had to deal with those would be roughly five times slower per function call with pass-by-value due to the increased copying. With pass-by-reference, passing a 3-tuple or a 4x4 array in a 32-bit executable would just involve duplicating a single 4-byte pointer.
On register-rich CPU architectures like ARM, PowerPC, 64-bit x86, 680x0 - but not 32-bit x86 - pointers (and references, which are secretly pointers wearing fancy syntactic clothing) are commonly passed or returned in a register, which is really freaking fast compared to the memory access involved in a stack operation.
You mention the improbability of running out of stack space. And yes, that's so on a small program one might write for a class assignment. But a couple of months ago, I was debugging commercial code that was probably 80 function calls below main(). If they'd used pass-by-value instead of pass-by-reference, the stack would have been ginormous. And lest your friend think this was a "broken design", this was actually a WebKit-based browser implemented on Linux using GTK+, all of which is very state-of-the-art, and the function call depth is normal for professional code.
Some executable architectures limit the size of an individual stack frame, so even though you might not run out of stack space per se, you could exceed that and wind up with perfectly valid C++ code that wouldn't build on such a platform.
I could go on and on.
If your friend is interested in graphics, he should take a look at some of the common APIs used in graphics: OpenGL and XWindows on Linux, Quartz on Mac OS X, DirectX on Windows. And he should look at the internals of large C/C++ systems like the WebKit or Gecko HTML rendering engines, or any of the Mozilla browsers, or the GTK+ or Qt GUI toolkits. They all pass anything much larger than a single integer or float by reference, and often fill in results by reference rather than as a function return value.
Nobody with any serious real world C/C++ chops - and I mean nobody - passes data structures by value. There's a reason for this: it's just flipping inefficient and problem-prone.
Wow, there are already 13 answers… I didn't read all in detail but I think this is quite different from the others…
He has a point. The advantage of pass-by-value as a rule is that subroutines cannot subtly modify their arguments. Passing non-const references would suggest that every function has ugly side effects, indicating poor design.
Simply explain to him the difference between vector3 & and vector3 const&, and demonstrate how the latter may be initialized by a constant as in vec_function( vector3(1,2,3) );, but not the former. Pass by const reference is a simple optimization of pass by value.
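A sketch of that demonstration, with vector3 standing in for the questioner's 3-tuple type:

struct vector3 {
    float x, y, z;
    vector3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}
};

void vec_function(vector3 const&) {}       // pass by const reference
void vec_function_mut(vector3&) {}         // pass by non-const reference

int main() {
    vec_function(vector3(1, 2, 3));        // fine: const& binds a temporary
    // vec_function_mut(vector3(1, 2, 3)); // error: non-const& cannot
    return 0;
}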
Buy your friend a good C++ book. Passing non-trivial objects by reference is good practice and saves you a lot of unnecessary constructor/destructor calls. This also has nothing to do with allocating on the free store vs. using the stack. You can (and should) pass objects allocated on the program stack by reference without any free-store usage. You can also ignore the free store completely, but that throws you back to the old Fortran days, which your friend probably didn't have in mind - otherwise he would have picked an ancient f77 compiler for your project, wouldn't he...?
I have the following situation:
class A
{
public:
A(int whichFoo);
int foo1();
int foo2();
int foo3();
int callFoo(); // calls one of the foo's depending on the value of whichFoo
};
In my current implementation I save the value of whichFoo in a data member in the constructor and use a switch in callFoo() to decide which of the foo's to call. Alternatively, I can use a switch in the constructor to save a pointer to the right fooN() to be called in callFoo().
My question is which way is more efficient if an object of class A is only constructed once, while callFoo() is called a very large number of times. So in the first case we have multiple executions of a switch statement, while in the second there is only one switch, and multiple calls of a member function using the pointer to it. I know that calling a member function using a pointer is slower than just calling it directly. Does anybody know if this overhead is more or less than the cost of a switch?
Clarification: I realize that you never really know which approach gives better performance until you try it and time it. However, in this case I already have approach 1 implemented, and I wanted to find out if approach 2 can be more efficient at least in principle. It appears that it can be, and now it makes sense for me to bother to implement it and try it.
Oh, and I also like approach 2 better for aesthetic reasons. I guess I am looking for a justification to implement it. :)
How sure are you that calling a member function via a pointer is slower than just calling it directly? Can you measure the difference?
In general, you should not rely on your intuition when making performance evaluations. Sit down with your compiler and a timing function, and actually measure the different choices. You may be surprised!
More info: There is an excellent article Member Function Pointers and the Fastest Possible C++ Delegates which goes into very deep detail about the implementation of member function pointers.
You can write this:
#include <cassert>
#include <iostream>
using std::cout;

class Foo;
typedef void (Foo::*FooCall)(int);   // pointer-to-member-function type

class Foo {
public:
    Foo() {
        calls[0] = &Foo::call0;
        calls[1] = &Foo::call1;
        calls[2] = &Foo::call2;
        calls[3] = &Foo::call3;
    }
    void call(int number, int arg) {
        assert(number < 4);
        (this->*(calls[number]))(arg);
    }
    void call0(int arg) {
        cout<<"call0("<<arg<<")\n";
    }
    void call1(int arg) {
        cout<<"call1("<<arg<<")\n";
    }
    void call2(int arg) {
        cout<<"call2("<<arg<<")\n";
    }
    void call3(int arg) {
        cout<<"call3("<<arg<<")\n";
    }
private:
    FooCall calls[4];
};
The computation of the actual function pointer is simple and fast:
(this->*(calls[number]))(arg);
004142E7 mov esi,esp
004142E9 mov eax,dword ptr [arg]
004142EC push eax
004142ED mov edx,dword ptr [number]
004142F0 mov eax,dword ptr [this]
004142F3 mov ecx,dword ptr [this]
004142F6 mov edx,dword ptr [eax+edx*4]
004142F9 call edx
Note that you don't even have to fix the actual function number in the constructor.
I've compared this code to the asm generated by a switch. The switch version doesn't provide any performance increase.
To answer the asked question: at the finest-grained level, the pointer to the member function will perform better.
To address the unasked question: what does "better" mean here? In most cases I would expect the difference to be negligible. Depending on what the class is doing, however, the difference may be significant. Performance testing before worrying about the difference is obviously the right first step.
If you are going to keep using a switch, which is perfectly fine, then you probably should put the logic in a helper method and call it from the constructor. Alternatively, this is a classic case of the Strategy pattern. You could create an interface (or abstract class) named IFoo which has one method with foo's signature. The constructor would take in an instance of IFoo (constructor dependency injection) that implements the foo method you want. You would have a private IFoo member set by this constructor, and every time you wanted to call foo you would call your IFoo's version.
Note: I haven't worked with C++ since college, so my lingo might be off here, but the general ideas hold for most OO languages.
If your example is real code, then I think you should revisit your class design. Passing in a value to the constructor, and using that to change behaviour is really equivalent to creating a subclass. Consider refactoring to make it more explicit. The effect of doing so is that your code will end up using a function pointer (all virtual methods are, really, are function pointers in jump tables).
If, however your code was just a simplified example to ask whether, in general, jump tables are faster than switch statements, then my intuition would say that jump tables are quicker, but you are dependent on the compiler's optimisation step. But if performance is really such a concern, never rely on intuition - knock up a test program and test it, or look at the generated assembler.
One thing is certain: a switch statement will never be faster than a jump table. The reason is that the best a compiler's optimizer can do is turn a series of conditional tests (i.e. a switch) into a jump table. So if you really want to be certain, take the compiler out of the decision process and use a jump table.
Sounds like you should make callFoo a pure virtual function and create some subclasses of A.
Unless you really need the speed, have done extensive profiling and instrumenting, and determined that the calls to callFoo are really the bottleneck. Have you?
Function pointers are almost always better than chained ifs. They make cleaner code, and are nearly always faster (except perhaps in a case where it's only a choice between two functions and is always correctly predicted).
I should think that the pointer would be faster.
Modern CPUs prefetch instructions; mis-predicted branches flush the pipeline, which means the CPU stalls while it refills. A call through a pointer doesn't do that.
Of course, you should measure both.
Optimize only when needed
First: Most of the time you most likely do not care, the difference will be very small. Make sure optimizing this call really makes sense first. Only if your measurements show there is really significant time spent in the call overhead, proceed to optimizing it (shameless plug - Cf. How to optimize an application to make it faster?) If the optimization is not significant, prefer the more readable code.
Indirect call cost depends on target platform
Once you have determined it is worth to apply low-level optimization, then it is a time to understand your target platform. The cost you can avoid here is the branch misprediction penalty. On modern x86/x64 CPU this misprediction is likely to be very small (they can predict indirect calls quite well most of the time), but when targeting PowerPC or other RISC platforms, the indirect calls/jumps are often not predicted at all and avoiding them can cause significant performance gain. See also Virtual call cost depends on platform.
Compiler can implement switch using jump table as well
One gotcha: a switch can sometimes be implemented as an indirect call (using a table) as well, especially when switching between many possible values. Such a switch exhibits the same misprediction as a virtual function call. To make this optimization reliable, one would probably prefer using if instead of switch for the most common case, as sketched below.
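A sketch of that "if for the common case" pattern:

void handle(int msg) {
    if (msg == 0) {            // hot case: a single, well-predicted branch
        /* fast path */
        return;
    }
    switch (msg) {             // rare cases: may compile to a jump table
        case 1: /* ... */ break;
        case 2: /* ... */ break;
        default: break;
    }
}

int main() {
    handle(0);                 // common case handled by the if
    handle(2);                 // rare case falls through to the switch
    return 0;
}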
Use timers to see which is quicker. Although unless this code is going to run over and over, it's unlikely that you'll notice any difference.
Be sure that if you are running code from the constructor, you won't leak memory if construction fails.
This technique is used heavily with Symbian OS:
http://www.titu.jyu.fi/modpa/Patterns/pattern-TwoPhaseConstruction.html
If you are only calling callFoo() once, then most likely the function pointer will be slower by an insignificant amount. If you are calling it many times, then most likely the function pointer will be faster by an insignificant amount (because it doesn't need to keep going through the switch).
Either way, look at the generated assembly to be sure it is doing what you think it is doing.
One often-overlooked advantage of switch (even over sorting and indexing) is that if you know a particular value is used in the vast majority of cases, it's easy to order the switch so that the most common values are checked first.
ps. To reinforce greg's answer, if you care about speed - measure.
Looking at assembler doesn't help when CPUs have prefetching, predictive branching, pipeline stalls, etc.