Lookup table to Function Pointer Array C++ performance

I have the following code to emulate a basic system on my PC (x86):
typedef void (*op_fn)();

extern const op_fn opcodes[256];   // forward declaration so handlers can chain into the table
extern int opcode;                 // index of the next opcode to execute

void add()
{
    // ADD opcode
    // fetch next opcode
    opcodes[opcode]();
}

void nop()
{
    // NOP opcode
    // fetch next opcode
    opcodes[opcode]();
}

const op_fn opcodes[256] =
{
    add,
    nop,
    // etc...
};
and I call into this "table" via opcodes[opcode]().
I am trying to improve the performance of my interpreter.
What about inlining every function, like
inline void add()
inline void nop()
Are there any benefits to doing that?
Is there any way to make it go faster?
Thanks

Just because you flag a function as inline doesn't mean the compiler has to inline it - it's more of a hint than an order.
Given that you are storing the opcode handlers in an array, the compiler needs to place the address of each function into the array, so it can't inline the calls made through it.
There's actually nothing wrong with your approach. If you really think you've got performance issues then get some metrics; otherwise don't worry (at this point!). The concept of a table of pointers to functions is nothing new - it's actually how C++ implements virtual functions (i.e. the vtable).

"Inline" means "don't emit a function call; instead, substitute the function body at compile time."
Calling through a function pointer means "do a function call, the details of which won't be known until runtime."
The two features are fundamentally opposed. (The best you could hope for is that a sufficiently advanced compiler could statically determine which function is being called through a function pointer in very limited circumstances and inline those.)
switch blocks are typically implemented as jump tables, which could have less overhead than function calls, so replacing your function pointer array with a switch block and using inline might make a difference.
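For illustration, here is a minimal sketch of what that switch-based dispatch could look like (the Cpu struct, the opcode encodings, and the handler bodies are made up for the example, not taken from the question). Because every handler body is visible at the call site, the compiler is free to inline them, and the switch itself typically compiles to a jump table; the loop also replaces the handlers calling each other, as they do in the original sketch.
#include <cstddef>
#include <cstdint>

// Hypothetical machine state, just enough for the sketch.
struct Cpu {
    const std::uint8_t* program;   // opcode stream
    std::size_t pc = 0;            // program counter
    int acc = 0;                   // an example register
};

void run(Cpu& cpu)
{
    for (;;)
    {
        switch (cpu.program[cpu.pc++])   // fetch, then dispatch via a jump table
        {
        case 0x00:        // ADD (placeholder encoding)
            ++cpu.acc;
            break;
        case 0x01:        // NOP (placeholder encoding)
            break;
        default:          // treat anything else as HALT
            return;
        }
    }
}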

inline is just a hint to your compiler; it does not guarantee that any inlining is done. You should read up on inlining (maybe at the ISO C++ FAQ), as too much inlining can actually make your code slower (through code bloat and the associated virtual memory thrashing).

Related

Is there a Macro for a Function Pointer? UE4 C++

I made a function pointer in UE4 C++ and don't know which macro I can/should use for the pointer in the header file (things like UPROPERTY() or UFUNCTION()).
Why do I even want to use a macro? Because of garbage collection: as far as I know, the GC only works when the variable/function has a macro.
By the way, is a call through a function pointer as performant as a normal function call?
header.h Code:
typedef void (AWeaponGun::*FireTypeFunctionPtr)(void);
FireTypeFunctionPtr PtrFireType;
UFUNCTION()
void FireLineTrace();
file.cpp Code:
void AWeaponGun::BeginPlay()
{
    PtrFireType = &AWeaponGun::FireLineTrace;
}

void AWeaponGun::Tick(float DeltaTime)
{
    (this->*PtrFireType)();
}
Functions and your function pointer are not allocated and deallocated so you don't need the garbage collector in this case, and thus you don't need any UE4 macro (...if you only consider the garbage collector).
As for performance, the function pointer is most probably slightly less performant as it adds one more variable and needs additional instructions to reach the function. But the compiler might be able to optimize some of the differences in some cases.
You would need to check the assembly code created in both cases to be sure.
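As a plain C++ illustration (stripped of the UE4 types, so the names here are placeholders), the two call forms being compared look like this; compiling it with optimizations and inspecting the assembly shows whether the indirect call was optimized away:
#include <cstdio>

struct Gun {
    void FireLineTrace() { std::puts("fire"); }
};

typedef void (Gun::*FireFn)();     // pointer-to-member-function type

int main() {
    Gun gun;
    FireFn fn = &Gun::FireLineTrace;

    gun.FireLineTrace();           // direct call: target known at compile time, inlinable
    (gun.*fn)();                   // indirect call: target read from fn at runtime
}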

Automatically wrap C/C++ function at compile-time with annotation

In my C/C++ code I want to annotate different functions and methods so that additional code gets added at compile-time (or link-time). The added wrapping code should be able to inspect context (function being called, thread information, etc.), read/write input variables and modify return values and output variables.
How can I do that with GCC and/or Clang?
Take a look at instrumentation functions in GCC. From man gcc:
-finstrument-functions
Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit, the following profiling functions will be called with the address of the current function and its call site. (On some platforms, "__builtin_return_address" does not work beyond the current function, so the call site information may not be available to the profiling functions otherwise.)
void __cyg_profile_func_enter (void *this_fn, void *call_site);
void __cyg_profile_func_exit (void *this_fn, void *call_site);
The first argument is the address of the start of the current function, which may be looked up exactly in the symbol table.
This instrumentation is also done for functions expanded inline in other functions. The profiling calls will indicate where, conceptually, the inline function is entered and exited. This means that addressable versions of such functions must be available. If all your uses of a function are expanded inline, this may mean an additional expansion of code size. If you use extern inline in your C code, an addressable version of such functions must be provided. (This is normally the case anyways, but if you get lucky and the optimizer always expands the functions inline, you might have gotten away without providing static copies.)
A function may be given the attribute "no_instrument_function", in which case this instrumentation will not be done. This can be used, for example, for the profiling functions listed above, high-priority interrupt routines, and any functions from which the profiling functions cannot safely be called (perhaps signal handlers, if the profiling routines generate output or allocate memory).
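For a concrete starting point, here is a minimal sketch of the two hooks one might provide (the fprintf output format is arbitrary; in practice the raw addresses are usually resolved to names afterwards, e.g. with addr2line). The hooks themselves are marked no_instrument_function so they are not instrumented if this file is also compiled with the flag:
#include <cstdio>

extern "C" {

__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site)
{
    std::fprintf(stderr, "enter %p (called from %p)\n", this_fn, call_site);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site)
{
    std::fprintf(stderr, "exit  %p (called from %p)\n", this_fn, call_site);
}

} // extern "C"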

Breaking down function Logic into sub function

Can splitting functions up into smaller sub-functions affect the efficiency of the program?
While reducing the cyclomatic complexity of my functions I have broken a function down into smaller parts, and have used helper functions and inline functions for it.
int functionParent(arguments)
{
    initialCheckFunction(arguments);
    functionOne();
    functionTwo();
    functionThree();
    functionFour();
    return STATUS;
}

void functionOne()
{
    /* follows unary principle */
}
My concern is regarding the stack pointer: does frequent switching of the SP reduce the efficiency of the program drastically, or is it negligible?
The above functionOne, functionTwo, ... contain unary logic in them.
Kindly reply in both contexts, C as well as C++.
You should split off logic into its own function whenever you think that it would aid readability: the cost of a function call itself is negligible.
Although it is generally true that calling a function consumes some space and CPU cycles, you shouldn't be worrying about it at all: the instructions involved are optimized beyond belief, and the compiler can inline your code when it sees fit.
EDIT (in response to comment by Potatoswatter)
One thing you need to be careful about is passing parameters, especially in C++, where user code can participate in the process of copying parameters passed to the function. Passing large structs by value can take more than a few cycles in C too, so you should pass them by reference or by pointer whenever you can.
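A small sketch of the difference (the Report struct is made up for the example): in C++ the by-value version also runs the members' copy constructors, which is the user code participating in parameter passing mentioned above.
#include <numeric>
#include <string>
#include <vector>

struct Report {                      // hypothetical "large struct"
    std::string title;
    std::vector<double> samples;
};

// By value: the string and the vector are copied on every call.
double averageByValue(Report r)
{
    return r.samples.empty() ? 0.0
        : std::accumulate(r.samples.begin(), r.samples.end(), 0.0) / r.samples.size();
}

// By const reference: no copy; the callee reads the caller's object directly.
double averageByRef(const Report& r)
{
    return r.samples.empty() ? 0.0
        : std::accumulate(r.samples.begin(), r.samples.end(), 0.0) / r.samples.size();
}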
Generally I prefer to break a function up into smaller functions so that they can be reused in other places; it is often required during refactoring. If you are so concerned about the switch, and if the function is really small, you can mark it as inline. However, I don't think having too many functions makes such a huge difference in the performance of your program.
If you are using C++ you can declare inline functions.
Then no call overhead occurs, since the code is substituted in line.
inline bool check(args)
{
    if (some_condition(args)) {
        return true;
    }
    return false;
}

inline void functionOne() { ... }
inline void functionTwo() { ... }
inline void functionThree() { ... }
inline void functionFour() { ... }

int functionParent(arguments)
{
    if (check(arguments) == false)
        return FAIL;
    functionOne();
    functionTwo();
    functionThree();
    functionFour();
    return STATUS;
}
Current processor architectures execute more than one instruction at a time, let's say 5. At a given moment they may be in intermediate stages of completion, say 90%, 80%, 70%, 60%, 50%. If the first instruction is a function call, all the effort spent evaluating the next 4 instructions may be wasted, which can noticeably reduce the program's execution speed.
You don't need to care too much about these details unless you are creating a critical application. Usually the compiler is smart enough to inline the needed functions when optimization flags are used.

Do repetitive calls to member functions hurt?

I have programmed in both Java and C, and now I am trying to get my hands dirty with C++.
Given this code:
class Booth {
private:
    int tickets_sold;
public:
    int get_tickets_sold();
    void set_tickets_sold();
};
In Java, wherever I needed the value of tickets_sold, I would call the getter repeatedly.
For example:
if (obj.get_tickets_sold() > 50 && obj.get_tickets_sold() < 75){
//do something
}
In C I would just get the value of the particular variable in the structure:
if( obj_t->tickets_sold > 50 && obj_t->tickets_sold < 75){
//do something
}
So when using structures in C I save the two calls that I would otherwise make in Java (the two getters); I am not even sure whether those are actual calls or whether Java somehow inlines them.
My point is: if I use the same technique in C++ that I used in Java, will those two calls to the getter member function cost me anything, or will the compiler know to inline the code (thus removing the function-call overhead altogether)?
Alternatively, am I better off using:
int num_tickets = 0;
if ((num_tickets = obj.get_tickets_sold()) > 50 && num_tickets < 75) {
//do something
}
I want to write tight code and avoid unnecessary function calls; I would care about this in Java because, well, we all know why. But I want my code to be readable and to use the private and public keywords to correctly reflect what is to be done.
Unless your program is too slow, it doesn't really matter. In 99.9999% of code, the overhead of a function call is insignificant. Write the clearest, easiest to maintain, easiest to understand code that you can and only start tweaking for performance after you know where your performance hot spots are, if you have any at all.
That said, modern C++ compilers (and some linkers) can and will inline functions, especially simple functions like this one.
If you're just learning the language, you really shouldn't worry about this. Consider it fast enough until proven otherwise. That said, there are a lot of misleading or incomplete answers here, so for the record I'll flesh out a few of the subtler implications. Consider your class:
class Booth
{
public:
int get_tickets_sold();
void set_tickets_sold();
private:
int tickets_sold;
};
The implementation (known as a definition) of the get and set functions is not yet specified. If you'd specified function bodies inside the class declaration then the compiler would consider you to have implicitly requested they be inlined (but it may ignore that if they're excessively large). If you specify them later using the inline keyword, that has exactly the same effect. Summarily...
class Booth
{
public:
int get_tickets_sold() { return tickets_sold; }
...
...and...
class Booth
{
public:
int get_tickets_sold();
...
};
inline int Booth::get_tickets_sold() { return tickets_sold; }
...are equivalent (at least in terms of what the Standard encourages us to expect, but individual compiler heuristics may vary - inlining is a request that the compiler's free to ignore).
If the function bodies are specified later without the inline keyword, then the compiler is under no obligation to inline them, but may still choose to do so. It's much more likely to do so if they appear in the same translation unit (i.e. in the .cc/.cpp/.c++/etc. "implementation" file you're compiling or some header directly or indirectly included by it). If the implementation is only available at link time then the functions may not be inlined at all, but it depends on the way your particular compiler and linker interact and cooperate. It is not simply a matter of enabling optimisation and expecting magic. To prove this, consider the following code:
// inline.h:
void f();
// inline.cc:
#include <cstdio>
void f() { printf("f()\n"); }
// inline_app.cc:
#include "inline.h"
int main() { f(); }
Building this:
g++ -O4 -c inline.cc
g++ -O4 -o inline_app inline_app.cc inline.o
Investigating the inlining:
$ gdb inline_app
...
(gdb) break main
Breakpoint 1 at 0x80483f3
(gdb) break f
Breakpoint 2 at 0x8048416
(gdb) run
Starting program: /home/delroton/dev/inline_app
Breakpoint 1, 0x080483f3 in main ()
(gdb) next
Single stepping until exit from function main,
which has no line number information.
Breakpoint 2, 0x08048416 in f ()
(gdb) step
Single stepping until exit from function _Z1fv,
which has no line number information.
f()
0x080483fb in main ()
(gdb)
Notice the execution went from 0x080483f3 in main() to 0x08048416 in f() then back to 0x080483fb in main()... clearly not inlined. This illustrates that inlining can't be expected just because a function's implementation is trivial.
Notice that this example is with static linking of object files. Clearly, if you use library files you may actually want to avoid inlining of the functions specifically so that you can update the library without having to recompile the client code. It's even more useful for shared libraries where the linking is done implicitly at load time anyway.
Very often, classes providing trivial functions use the two forms of expected-inlined function definitions (i.e. inside class or with inline keyword) if those functions can be expected to be called inside any performance-critical loops, but the countering consideration is that by inlining a function you force client code to be recompiled (relatively slow, possibly no automated trigger) and relinked (fast, for shared libraries happens on next execution), rather than just relinked, in order to pick up changes to the function implementation.
These kind of considerations are annoying, but deliberate management of these tradeoffs is what allows enterprise use of C and C++ to scale to tens and hundreds of millions of lines and thousands of individual projects, all sharing various libraries over decades.
One other small detail: as a ballpark figure, an out-of-line get/set function is typically about an order of magnitude (10x) slower than the equivalent inlined code. That will obviously vary with CPU, compiler, optimisation level, variable type, cache hits/misses etc..
No, repetitive calls to member functions will not hurt.
If it's just a getter function, it will almost certainly be inlined by the C++ compiler (at least with release/optimized builds) and the Java Virtual Machine may "figure out" that a certain function is being called frequently and optimize for that. So there's pretty much no performance penalty for using functions in general.
You should always code for readability first. Of course, that's not to say that you should completely ignore performance outright, but if performance is unacceptable then you can always profile your code and see where the slowest parts are.
Also, by restricting access to the tickets_sold variable behind getter functions, you can pretty much guarantee that the only code that can modify tickets_sold is the member functions of Booth. This allows you to enforce invariants in program behavior.
For example, tickets_sold is obviously not going to be a negative value. That is an invariant of the structure. You can enforce that invariant by making tickets_sold private and making sure your member functions do not violate that invariant. The Booth class makes tickets_sold available as a "read-only data member" via a getter function to everyone else and still preserves the invariant.
Making it a public variable means that anybody can go and trample over the data in tickets_sold, which basically completely destroys your ability to enforce any invariants on tickets_sold. Which makes it possible for someone to write a negative number into tickets_sold, which is of course nonsensical.
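For example, here is a sketch of how the setter can guard that invariant (the int parameter and the check are illustrative additions; the question's set_tickets_sold takes no arguments):
class Booth {
public:
    int get_tickets_sold() const { return tickets_sold; }

    void set_tickets_sold(int n)
    {
        if (n >= 0)             // reject values that would break the invariant
            tickets_sold = n;
    }

private:
    int tickets_sold = 0;       // starts valid; only member functions can change it
};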
The compiler is very likely to inline function calls like this.
class Booth {
public:
int get_tickets_sold() const { return tickets_sold; }
private:
int tickets_sold;
};
Your compiler should inline get_tickets_sold, I would be very surprised if it didn't. If not, you either need to use a new compiler or turn on optimizations.
Any compiler worth its salt will easily optimize the getters into direct member access. The only times that won't happen are when you have optimization explicitly disabled (e.g. for a debug build) or if you're using a brain-dead compiler (in which case, you should seriously consider ditching it for a real compiler).
The compiler will very likely do the work for you, but in general, for things like this I would approach it more from the C perspective than from the Java perspective, unless you want the member access to return a const reference. However, when dealing with integers there's usually little value in a const reference over a copy (at least in 32-bit environments, since both are 4 bytes), so your example isn't really a good one here... Perhaps this may illustrate why you would use a getter/setter in C++:
class StringHolder
{
public:
    const std::string& get_string() { return my_string; }
    void set_string(const std::string& val) { if (!val.empty()) { my_string = val; } }

private:
    std::string my_string;
};
That prevents modification except through the setter, which would then allow you to perform extra logic. However, in a simple class such as this the value of this model is nil: you've just made the calling coder type more without really adding any value. For such a class, I wouldn't use a getter/setter model.

Execution time differences, are there any?

Consider this piece of code:
class A {
    void methodX() {
        // snip (one-liner function)
    }
};

class B {
    void methodX() {
        // same code
    }
};
Now the other way I can go is to use a class (AppManager) most of whose members are static (it's legacy code, don't suggest a singleton ;))
class AppManager {
public:
    static void methodX() {
        // same code
    }
};
Which one should be preferred?
As both are inlined, there shouldn't be a runtime difference, right?
Which form is more cleaner?
Now first of all, this is a concern so minuscule that you would never have to worry about it unless the functions are called thousands of times per frame (and you're doing something where "frames" matter).
Second, IF they are inlined, the code will be (hopefully) optimized so much that there is no sign whatsoever of the function being non-static. It would be identical.
Even if they were not inlined, the difference would be minor. The ABI would put the "this" pointer into a register (or onto the stack), which it wouldn't do for a static function, but again, the net result would be barely measurable.
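For reference, a minimal sketch of the two forms being compared (the bodies are placeholders); when the calls are not inlined, the only ABI-level difference is the hidden this argument:
class A {
public:
    void methodX() { ++calls; }          // non-static: receives an implicit `this`
private:
    int calls = 0;
};

class AppManager {
public:
    static void methodX() { ++calls; }   // static: no `this` is passed
private:
    static int calls;
};

int AppManager::calls = 0;

int main()
{
    A a;
    a.methodX();             // the compiler passes &a as the hidden `this` argument
    AppManager::methodX();   // plain call, no hidden argument
}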
Bottom line - write your code in the cleanest possible way. Performance is not a concern at this point.
In my opinion the inline way would be faster, because inline functions are substituted into the code at compile time, so there is no need to save registers, make a function call and then return. When you call a static function it is still a function call, which has more overhead than the inlined one.
I think this is the most common optimisation mistake. In the first stage, while writing the code, you try every single trick that might help the compiler, so that if the compiler cannot optimise the code well, you already have. This is wrong. What you should be looking for in the first stage of writing code is just clean and understandable code, design and structure. That will produce far better code than "optimising" by hand.
The rule is:
If you do not have the resources to benchmark the code, rewrite it, and spend a lot of time on optimisation, then you do not need optimised code. In most cases it is hard to gain any speed boost with any kind of optimisation if you have structured your code well.