I need to use small classes that consist essentially of just an integer "handle", and I want to treat each handle as a class so that I can attach methods to it.
At the same time, I want to avoid passing only the address of the handle (the "this" pointer) from one function to another, because then the callee has to read a memory location to obtain a value that could simply have been handed over directly.
So I essentially need the handle to be passed by value, ideally in registers (depending on the calling convention).
Some clarifying code:
struct F{
int aa,bb,cc;};
F A[0x100];
struct handle{
int hhh;
void elaborateHandle(){ /* ... operations ... */ }
};
int main(){
handle h;h.hhh=3;
h.elaborateHandle();
// I need that call to pass essentially the number 3 itself (on the stack or in a register), not the address of where the number 3 was saved on the stack.
}
I don't think you should worry about this: the performance loss is very small, because dereferencing one pointer is a cheap operation.
If you use an optimizing compiler, there is a good chance that the method call will be inlined into the caller.
In any case, if you are trying to optimize performance, you should probably look elsewhere.
But if you really think it causes trouble, there is a way:
Declare the function outside your class (not as a member), taking the handle by value, and if it needs access to private data, declare it as a friend.
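A minimal sketch of that suggestion, reusing the F, A and handle names from the question. The body of elaborateHandle is hypothetical (the question elides the real operations), and whether the value ends up in a register still depends on the ABI and calling convention. On common ABIs such as x86-64 System V, a trivially copyable class this small is normally passed in a register.

```cpp
#include <cassert>

struct F { int aa, bb, cc; };
F A[0x100];

class handle {
public:
    explicit handle(int h) : hhh(h) { assert(h >= 0 && h < 0x100); }
    // Non-member friend: it receives a copy of the handle, so no `this`
    // pointer (address of the handle) is involved in the call.
    friend int elaborateHandle(handle h);
private:
    int hhh;
};

// Hypothetical body, for illustration only: sums the fields of the
// table entry that the handle refers to.
int elaborateHandle(handle h) {
    return A[h.hhh].aa + A[h.hhh].bb + A[h.hhh].cc;
}
```

A call then reads elaborateHandle(h) instead of h.elaborateHandle(). Checking the generated assembly is still the only way to confirm what a particular compiler actually does.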
First, print the assembly language of the code that calls your function and the first part of your function.
The assembly language will show how the registers are used. Normally, compilers try to make best use of registers when passing values to functions.
To help the compiler better use registers:
Limit parameter quantities in functions.
The compiler reserves a limited number of registers for passing arguments to functions. The more parameters a function has, the lower the probability that all of them will be passed in registers.
Also, the compiler may need to save registers before calling a function in order to pass more parameters to the function.
Pass values that fit inside registers.
If the compiler can't fit a data type in one register, it may use two registers (such as passing 64 bit values on a 32-bit processor).
If the compiler can't fit the data type in two registers, it may push the data on the stack rather than passing by register. This means that the receiving function will have to copy the values from the stack.
Pass large items by reference or pointer. On most platforms, the compiler can store a pointer in a register and pass the register to the receiving function. That is a lot faster than pushing and popping values on the stack. Also, compilers may use pointers to implement references.
Suggest to the compiler to place values in registers.
The register keyword suggests to the compiler that you would like to keep a variable in a register. It is only a suggestion, and the compiler can ignore it (and you). Note that register was deprecated in C++11 and removed as a storage class specifier in C++17, so it may not be available in more recent language versions.
Define variables as close to their point of usage. This allows the compiler to allocate registers when needed rather than reserving them for a while.
Create scope blocks. Using { and } to create new scope blocks will help the compiler allocate and deallocate registers that are used only in a limited area. So if variables are only used in a limited area of a function, place that area in a new scope block. You can even tag those local variables with the register keyword.
Compile with high optimization levels.
Set your compiler's optimization levels high, then check the assembly language.
The compiler may use memory for variable storage when optimization is at the lowest setting (debugging). At higher optimization levels, the compiler starts using registers more effectively.
Remember, print the assembly language of the functions and the calling code before and after playing with optimization levels.
Using the g++ compiler on the x86 platform, I found that the flag "-freg-struct-return" has a different effect than the one described in the documentation. According to my tests, that flag makes the compiler pass structures by value (I have not checked this, but it probably holds only for structures below a certain size; I checked up to 64 bits, compiling with -m32, and it works).
Contrary to what the documentation says, structs aren't passed in registers unless a register-passing convention is used.
This behaviour also applies to methods of structures (or classes) that are declared const or that the compiler recognizes as const.
So if a method doesn't change the structure, then the structure is passed by value (in stack-allocated space or in registers, depending e.g. on whether the function is declared with __attribute__((regparm(3)))).
If, instead, a method modifies the structure, then the address is passed to the method rather than the value of the struct, as it should be.
The documentation of that flag is misleading because it says: "Return struct and union values in registers when possible. This is more efficient for small structures than -fpcc-struct-return.
If you specify neither -fpcc-struct-return nor -freg-struct-return, GCC defaults to whichever convention is standard for the target. If there is no standard convention, GCC defaults to -fpcc-struct-return, except on targets where GCC is the principal compiler. In those cases, we can choose the standard, and we chose the more efficient register return alternative."
The testing code I used is below; the effects can be seen in the disassembler output.
#include <stdio.h>
int a;
int aa[100];
struct Token{
short int ind; short int ind1; short int ind2;
int v() const{return aa[ind];}
__attribute((noinline)) void setind(int i){ind=i;}
__attribute((noinline)) int tok() {return ind;}
};
__attribute__ ((noinline)) void showIt(Token t){
t.ind+=t.ind;
a+=t.ind;
t.ind=8;
}
Token t0 = {.ind=15};
Token t1 = {.ind=99};
int main(int argc, char **argv)
{
t0.setind(10);
int x=19;
x=t0.tok();
showIt(t0);
t1.setind(20+x);
showIt(t1);
printf("%i\n",a);
return 0;
}
I have a following code to emulate basic system on my pc (x86):
typedef void (*op_fn) ();
extern const op_fn opcodes[256]; // forward declaration: the handlers below call through the table
void add()
{
//add Opcode
//fetch next opcode
opcodes[opcode]();
}
void nop()
{
//NOP opcode
//fetch next opcode
opcodes[opcode]();
}
const op_fn opcodes[256] =
{
add,
nop,
etc...
};
and I call this "table" via opcodes[opcode]()
I am trying to improve the performance of my interpreter.
What about inlining every function, like
inline void add()
inline void nop()
Are there any benefits to doing so?
Is there any way to make it go faster?
Thanks
Flagging a function as inline doesn't require the compiler to inline it; it's more of a hint than an order.
Given that you are storing the opcode handlers in an array, the compiler needs to place the address of each function into the array, so calls made through the array can't be inlined.
There's actually nothing wrong with your approach. If you really think you've got performance issues then get some metrics, otherwise don't worry (at this point!). The concept of a table of pointers to functions is nothing new; it's how C++ compilers typically implement virtual functions (i.e. the vtable).
"Inline" means "don't emit a function call; instead, substitute the function body at compile time."
Calling through a function pointer means "do a function call, the details of which won't be known until runtime."
The two features are fundamentally opposed. (The best you could hope for is that a sufficiently advanced compiler could statically determine which function is being called through a function pointer in very limited circumstances and inline those.)
switch blocks are typically implemented as jump tables, which could have less overhead than function calls, so replacing your function pointer array with a switch block and using inline might make a difference.
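As a rough sketch of the switch-based alternative: the opcode numbering, the one-byte operand for add, and the halt opcode below are all assumptions made for illustration, since the question only names add and nop. Whether this beats the function pointer table on a given compiler is something only a measurement can tell.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical opcode values; the original table only names add and nop.
constexpr std::uint8_t OP_NOP  = 0;
constexpr std::uint8_t OP_ADD  = 1;  // assumed here to carry one operand byte
constexpr std::uint8_t OP_HALT = 2;

// Runs the program until OP_HALT, an unknown opcode, or the end of the
// buffer, and returns the accumulator.
int run(const std::uint8_t* code, std::size_t size) {
    int acc = 0;
    std::size_t pc = 0;
    while (pc < size) {
        // The handlers live inside the switch, so the compiler sees them
        // all at once and is free to build a jump table.
        switch (code[pc++]) {
        case OP_NOP:
            break;
        case OP_ADD:
            assert(pc < size);  // the operand byte must be present
            acc += code[pc++];
            break;
        case OP_HALT:
            return acc;
        default:
            return acc;         // unknown opcode: stop interpreting
        }
    }
    return acc;
}
```

The fetch-and-dispatch loop replaces the tail call at the end of every handler, so the call stack no longer grows with each executed opcode.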
inline is just a hint to your compiler; it does not guarantee that any inlining is done. You should read up on inlining (maybe at the ISO C++ FAQ), as too much inlining can actually make your code slower (through code bloat and the associated virtual memory thrashing).
Dearest stack exchange,
I'm programming an MRI scanner. I won't go into too much background, but I'm fairly constrained in how much code I've got access to, and the way things have been set up is...suboptimal. I have a situation as follows:
There is a big library, written in C++. It ultimately does "transcoding" (in the worst possible way), writing out FPGA assembly that DoesThings. It provides a set of functions to "userland" that are translated into (through a mix of preprocessor macros and black magic) long strings of 16 bit and 32 bit words. The way this is done is prone to buffer overflows, and generally to falling over.*
The FPGA assembly is then strung out over a glorified serial link to the relevant electronics, which execute it (doing the scan) and return the data for processing.
Programmers are expected to use the functions provided by the library to do their thing, in C (not C++) functions that are linked against the standard library. Unfortunately, in my case, I need to extend the library.
There's a fairly complicated chain of preprocessor substitution and tokenization, calling, and (in general) stuff happening between you writing doSomething() in your code, and the relevant library function actually executing it. I think I've got it figured out to some extent, but it basically means that I've got no real idea about the scope of anything...
In short, my problem is:
In the middle of a method, in a deep dark corner of many thousands of lines of code in a big blob I have little control over, with god-knows-what variable scoping going on, I need to:
Extend this method to take a function pointer (to a userland function) as an argument, but
Let this userland function, written after the library has been compiled, have access to variables that are local to both the scope of the method where it appears, as well as variables in the (C) function where it is called.
This seems like an absolute mire of memory management, and I thought I'd ask here for the "best practice" in these situations, as it's likely that there are lots of subtle issues I might run into -- and that others might have lots of relevant wisdom to impart. Debugging the system is a nightmare, and I've not really got any support from the scanner's manufacturer on this.
A brief sketch of how I plan to proceed is as follows:
In the .cpp library:
/* In something::something() */
/* declare a pointer to a function */
void (*fp)(int*, int, int, ...);
/* by default, the pointer points to a placeholder at compile time*/
fp = &doNothing(...);
...
/* At the appropriate time, point the pointer to the userland function, whose address is supplied as an argument to something(): */
fp= userFuncPtr;
/* Declare memory for the user function to plonk data into */
i_arr_coefficients = (int) malloc(SOMETHING_SENSIBLE);
/* Create a pointer to that array for the userland function */
i_ptr_array=&i_arr_coefficients[0];
/* define a struct of pointers to local variables for the userland function to use*/
ptrStrct=createPtrStruct();
/* Call the user's function: */
fp(i_ptr_array,ptrStrct, ...);
CarryOnWithSomethingElse();
The point of the placeholder function is to keep things ticking over if the user function isn't linked in. I get that this could be replaced with a #define, but the compiler's cleverness or stupidity might result in odd (to my ignorant mind, at least) behaviour.
In the userland function, we'd have something like:
void doUsefulThings(i_ptr_array, ptrStrct, localVariableAddresses, ...) {
double a=*ptrStrct.a;
double b=*ptrStrct.b;
double c=*localVariableAddresses.c;
double d=doMaths(a, b, c);
/* I.e. do maths using all of these numbers we've got from the different sources */
storeData(i_ptr_array, d);
/* And put the results of that maths where the C++ method can see it */
}
...
something(&doUsefulThings(i_ptr_array, ptrStrct, localVariableAddresses, ...), ...);
...
If this is as clear as mud please tell me! Thank you very much for your help. And, by the way, I sincerely wish someone would make an open hardware/source MRI system.
*As an aside, this is the primary justification the manufacturer uses to discourage us from modifying the big library in the first place!
You have full access to the C code and limited access to the C++ library code. The C code defines the doUsefulThings function. From the C code you call something (a C++ class/function) with a function pointer to doUsefulThings as the argument. Control then passes to the C++ library, where the various arguments are allocated and initialized. Then doUsefulThings is called with those arguments, and control transfers back to the C code. In short, the main program (C) calls the library (C++), and the library calls the C function.
One of the requirements is that the "userland function should have access to local variables from the C code where it is called". When you call something, you are only giving it the address of doUsefulThings. No parameter of something captures the addresses of the local variables, so doUsefulThings does not have access to them.
malloc returns a pointer, and this has not been handled properly (probably because you were only trying to give us an overview). You must also take care to free the memory somewhere.
Since this is a mixture of C and C++ code, it is difficult to use RAII (to take care of allocated memory), perfect forwarding (to avoid copying variables), lambda functions (to access local variables), etc. Under the circumstances, your approach seems to be the way to go.
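One common way to close the gap described above is to give the library function an extra opaque context pointer that it hands back to the callback untouched. The sketch below is only an illustration under assumptions: LibLocals, UserLocals, the user_fn signature and the body of something are hypothetical stand-ins, not the scanner library's real interface. For a real C caller, these declarations would also need to sit in a header wrapped in extern "C".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: pointers to the library-side locals the callback may read.
struct LibLocals { const double* a; const double* b; };

// Hypothetical callback type: output buffer, its length, library locals,
// and an opaque pointer that belongs to the caller.
typedef void (*user_fn)(int* out, std::size_t n, const LibLocals* lib, void* user_ctx);

// Placeholder used when no user function is supplied.
static void doNothing(int*, std::size_t, const LibLocals*, void*) {}

// Stand-in for the library method; returns the first coefficient so the
// effect of the callback is visible.
int something(user_fn fp, void* user_ctx) {
    if (fp == nullptr) fp = doNothing;
    double a = 1.5, b = 2.5;               // library-side local variables
    LibLocals lib = { &a, &b };
    int coefficients[4] = { 0, 0, 0, 0 };  // stack buffer, nothing to free
    fp(coefficients, 4, &lib, user_ctx);   // user_ctx is passed through untouched
    return coefficients[0];
}

// Caller-side locals, bundled so that one pointer can carry them.
struct UserLocals { double c; };

void doUsefulThings(int* out, std::size_t n, const LibLocals* lib, void* user_ctx) {
    const UserLocals* mine = static_cast<const UserLocals*>(user_ctx);
    assert(n > 0);
    out[0] = static_cast<int>(*lib->a + *lib->b + mine->c);
}
```

The caller fills a UserLocals on its own stack and passes its address, as in something(doUsefulThings, &locals). The pointers are only valid while both stack frames are alive, which holds here because the callback runs before something returns.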
The situation is as follows. There is a system that is written entirely in C. This C program spawns a new thread to start a data processing engine written in C++, so the system runs two threads (the main thread and the data processing engine thread). I have written a function in C that takes a C struct and passes it to the data processing thread so that a C++ function can access it. When I do this, I observe that the values of certain fields (such as an unsigned int) in the C struct change when they are accessed on the C++ side, and I am not sure why. If I pass a primitive data type such as an int instead, the value does not change. It would be great if someone could explain to me why it behaves like this. The following is the code that I wrote.
`
/* C++ Function */
void DataProcessor::HandleDataRecv(custom_struct* cs)
{
/*Accesses the fields in the structure cs - an unsigned int field. The value of
field here is different from the value when accessed through the C function below.
*/
}
/*C Function */
void forwardData(custom_struct* cs)
{
dataProcessor->HandleDataRecv(cs); //Here dataProcessor is a reference to the object
//of the C++ class.
}
`
Also, both these functions are in different source files(one with .c ext and other with .cc ext)
I'd check that both sides lay out the struct in the same way:
Print sizeof(custom_struct) in both languages.
Create an instance of custom_struct in both languages and print the offset of each member variable.
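A sketch of that check, written so that the same file compiles as both C and C++. The members of custom_struct here are hypothetical stand-ins, since the real declaration was not posted; in practice you would include the real header instead and list its members.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the real struct. */
typedef struct custom_struct {
    char tag;
    unsigned int value;
    short flags;
} custom_struct;

/* Prints the size of the struct and the offset of each member. Build this
   once with the C compiler and once with the C++ compiler, using the same
   flags as the real project, and compare the two outputs. */
void print_layout(void) {
    /* The first member of a plain struct always sits at offset 0. */
    assert(offsetof(custom_struct, tag) == 0);
    printf("sizeof(custom_struct) = %lu\n", (unsigned long)sizeof(custom_struct));
    printf("offsetof(tag)   = %lu\n", (unsigned long)offsetof(custom_struct, tag));
    printf("offsetof(value) = %lu\n", (unsigned long)offsetof(custom_struct, value));
    printf("offsetof(flags) = %lu\n", (unsigned long)offsetof(custom_struct, flags));
}
```

If any of the printed numbers differ between the two builds, the two sides disagree about the layout, which would explain fields reading back with different values.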
My wild guess would be that Michael Andresson is right: structure alignment might be the issue.
Try to compile both the C and C++ files with
-fpack-struct=4
(or some other number in place of 4). This way, the struct is aligned the same way in every case.
If we could see the struct declaration, it would probably be clearer. The struct does not contain any #ifdef with C++-specific code like a constructor, does it? Also, check for #pragma pack directives, which manipulate data alignment.
Maybe on one side the struct has 'empty bytes' added to make the variables align on 32 bit boundaries for speed (so a CPU register can point to the variable directly).
And on the other side the struct may be packed to conserve space.
(CORRECTION) With minor exceptions, C++ is a superset of C (meaning C89), so I'm confused about what is going on. I can only assume it has something to do with how you are passing or typing your variables, and/or the systems they are running on. Unless I am very mistaken, it should technically have nothing to do with C/C++ interoperability.
Some more details would help.
I have programmed in both Java and C, and now I am trying to get my hands dirty with C++.
Given this code:
class Booth {
private :
int tickets_sold;
public :
int get_tickets_sold();
void set_tickets_sold();
};
In Java, wherever I needed the value of tickets_sold, I would call the getter repeatedly.
For example:
if (obj.get_tickets_sold() > 50 && obj.get_tickets_sold() < 75){
//do something
}
In C I would just get the value of the particular variable in the structure:
if( obj_t->tickets_sold > 50 && obj_t->tickets_sold < 75){
//do something
}
So with structures in C, I save the two calls that I would otherwise make in Java (the two getters, that is); I am not even sure whether those are actual calls or whether Java somehow inlines them.
My point is: if I use the same technique in C++ that I used in Java, will those two calls to getter member functions cost me, or will the compiler somehow know to inline the code (thus avoiding the overhead of the function call altogether)?
Alternatively, am I better off using:
int num_tickets = 0;
if ( (num_tickets = obj.get_tickets_sold()) > 50 && num_tickets < 75){
//do something
}
I want to write tight code and avoid unnecessary function calls, I would care about this in Java, because, well, we all know why. But, I want my code to be readable and to use the private and public keywords to correctly reflect what is to be done.
Unless your program is too slow, it doesn't really matter. In 99.9999% of code, the overhead of a function call is insignificant. Write the clearest, easiest to maintain, easiest to understand code that you can and only start tweaking for performance after you know where your performance hot spots are, if you have any at all.
That said, modern C++ compilers (and some linkers) can and will inline functions, especially simple functions like this one.
If you're just learning the language, you really shouldn't worry about this. Consider it fast enough until proven otherwise. That said, there are a lot of misleading or incomplete answers here, so for the record I'll flesh out a few of the subtler implications. Consider your class:
class Booth
{
public:
int get_tickets_sold();
void set_tickets_sold();
private:
int tickets_sold;
};
The implementation (known as a definition) of the get and set functions is not yet specified. If you'd specified function bodies inside the class declaration then the compiler would consider you to have implicitly requested they be inlined (but may ignore that if they're excessively large). If you specify them later using the inline keyword, that has exactly the same effect. Summarily...
class Booth
{
public:
int get_tickets_sold() { return tickets_sold; }
...
...and...
class Booth
{
public:
int get_tickets_sold();
...
};
inline int Booth::get_tickets_sold() { return tickets_sold; }
...are equivalent (at least in terms of what the Standard encourages us to expect, but individual compiler heuristics may vary - inlining is a request that the compiler's free to ignore).
If the function bodies are specified later without the inline keyword, then the compiler is under no obligation to inline them, but may still choose to do so. It's much more likely to do so if they appear in the same translation unit (i.e. in the .cc/.cpp/.c++/etc. "implementation" file you're compiling or some header directly or indirectly included by it). If the implementation is only available at link time then the functions may not be inlined at all, but it depends on the way your particular compiler and linker interact and cooperate. It is not simply a matter of enabling optimisation and expecting magic. To prove this, consider the following code:
// inline.h:
void f();
// inline.cc:
#include <cstdio>
void f() { printf("f()\n"); }
// inline_app.cc:
#include "inline.h"
int main() { f(); }
Building this:
g++ -O4 -c inline.cc
g++ -O4 -o inline_app inline_app.cc inline.o
Investigating the inlining:
$ gdb inline_app
...
(gdb) break main
Breakpoint 1 at 0x80483f3
(gdb) break f
Breakpoint 2 at 0x8048416
(gdb) run
Starting program: /home/delroton/dev/inline_app
Breakpoint 1, 0x080483f3 in main ()
(gdb) next
Single stepping until exit from function main,
which has no line number information.
Breakpoint 2, 0x08048416 in f ()
(gdb) step
Single stepping until exit from function _Z1fv,
which has no line number information.
f()
0x080483fb in main ()
(gdb)
Notice the execution went from 0x080483f3 in main() to 0x08048416 in f() then back to 0x080483fb in main()... clearly not inlined. This illustrates that inlining can't be expected just because a function's implementation is trivial.
Notice that this example is with static linking of object files. Clearly, if you use library files you may actually want to avoid inlining of the functions specifically so that you can update the library without having to recompile the client code. It's even more useful for shared libraries where the linking is done implicitly at load time anyway.
Very often, classes providing trivial functions use the two forms of expected-inlined function definitions (i.e. inside class or with inline keyword) if those functions can be expected to be called inside any performance-critical loops, but the countering consideration is that by inlining a function you force client code to be recompiled (relatively slow, possibly no automated trigger) and relinked (fast, for shared libraries happens on next execution), rather than just relinked, in order to pick up changes to the function implementation.
These kinds of considerations are annoying, but deliberate management of these tradeoffs is what allows enterprise use of C and C++ to scale to tens and hundreds of millions of lines and thousands of individual projects, all sharing various libraries over decades.
One other small detail: as a ballpark figure, an out-of-line get/set function is typically about an order of magnitude (10x) slower than the equivalent inlined code. That will obviously vary with CPU, compiler, optimisation level, variable type, cache hits/misses, etc.
No, repetitive calls to member functions will not hurt.
If it's just a getter function, it will almost certainly be inlined by the C++ compiler (at least with release/optimized builds) and the Java Virtual Machine may "figure out" that a certain function is being called frequently and optimize for that. So there's pretty much no performance penalty for using functions in general.
You should always code for readability first. Of course, that's not to say that you should completely ignore performance outright, but if performance is unacceptable then you can always profile your code and see where the slowest parts are.
Also, by restricting access to the tickets_sold variable behind getter functions, you can pretty much guarantee that the only code that can modify the tickets_sold variable is the member functions of Booth. This allows you to enforce invariants in program behavior.
For example, tickets_sold is obviously not going to be a negative value. That is an invariant of the structure. You can enforce that invariant by making tickets_sold private and making sure your member functions do not violate that invariant. The Booth class makes tickets_sold available as a "read-only data member" via a getter function to everyone else and still preserves the invariant.
Making it a public variable means that anybody can go and trample over the data in tickets_sold, which basically completely destroys your ability to enforce any invariants on tickets_sold. Which makes it possible for someone to write a negative number into tickets_sold, which is of course nonsensical.
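To make the invariant argument concrete, here is a small sketch. The setter's int parameter and bool return value are assumptions added for illustration; the question's set_tickets_sold takes no arguments.

```cpp
#include <cassert>

class Booth {
public:
    // Trivial getter defined in the class body, so it is implicitly inline.
    int get_tickets_sold() const { return tickets_sold; }

    // Hypothetical setter: rejects negative counts and leaves the stored
    // value unchanged in that case. Returns whether the value was accepted.
    bool set_tickets_sold(int n) {
        if (n < 0) return false;
        tickets_sold = n;
        assert(tickets_sold >= 0);  // the invariant holds after every update
        return true;
    }

private:
    int tickets_sold = 0;
};
```

Because tickets_sold is private, the only write path is the setter, so no caller can store a negative count.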
The compiler is very likely to inline function calls like this.
class Booth {
public:
int get_tickets_sold() const { return tickets_sold; }
private:
int tickets_sold;
};
Your compiler should inline get_tickets_sold, I would be very surprised if it didn't. If not, you either need to use a new compiler or turn on optimizations.
Any compiler worth its salt will easily optimize the getters into direct member access. The only times that won't happen are when you have optimization explicitly disabled (e.g. for a debug build) or if you're using a brain-dead compiler (in which case, you should seriously consider ditching it for a real compiler).
The compiler will very likely do the work for you, but in general, for things like this I would approach it more from the C perspective rather than the Java perspective unless you want to make the member access a const reference. However, when dealing with integers, there's usually little value in using a const reference over a copy (at least in 32 bit environments since both are 4 bytes), so your example isn't really a good one here... Perhaps this may illustrate why you would use a getter/setter in C++:
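#include <string>

class StringHolder
{
public:
// const member function: reading the string does not modify the holder
const std::string& get_string() const { return my_string; }
// ignores empty values, so my_string is only ever replaced by non-empty text
void set_string(const std::string& val) { if(!val.empty()) { my_string = val; } }
private:
std::string my_string;
};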
That prevents modification except through the setter which would then allow you to perform extra logic. However, in a simple class such as this, the value of this model is nil, you've just made the coder who is calling it type more and haven't really added any value. For such a class, I wouldn't have a getter/setter model.
I came across the following weird chunk of code. Imagine you have the following typedef:
typedef int (*MyFunctionPointer)(int param_1, int param_2);
And then, in a function, we are trying to run a function from a DLL in the following way:
LPCWSTR DllFileName; //Path to the dll stored here
LPCSTR _FunctionName; // (mangled) name of the function I want to test
MyFunctionPointer functionPointer;
HINSTANCE hInstLibrary = LoadLibrary( DllFileName );
FARPROC functionAddress = GetProcAddress( hInstLibrary, _FunctionName );
functionPointer = (MyFunctionPointer) functionAddress;
//The values are arbitrary
int a = 5;
int b = 10;
int result = 0;
result = functionPointer( a, b ); //Possible error?
The problem is that there isn't any way of knowing whether the function whose address we got from GetProcAddress takes two integer arguments. The DLL name is provided by the user at runtime, then the names of the exported functions are listed and the user selects the one to test (again, at runtime).
So, by doing the function call in the last line, aren't we opening the door to possible stack corruption? I know that this compiles, but what sort of run-time error is going to occur in the case that we are passing wrong arguments to the function we are pointing to?
There are three errors I can think of if the expected and used number or type of parameters and calling convention differ:
if the calling convention is different, wrong parameter values will be read
if the function actually expects more parameters than given, random values will be used as parameters (I'll let you imagine the consequences if pointers are involved)
in any case, the return address will be complete garbage, so random code with random data will be run as soon as the function returns.
In two words: Undefined behavior
I'm afraid there is no way to know - the programmer is required to know the prototype beforehand when getting the function pointer and using it.
If you don't know the prototype beforehand then I guess you need to implement some sort of protocol with the DLL where you can enumerate any function names and their parameters by calling known functions in the DLL. Of course, the DLL needs to be written to comply with this protocol.
If it's a __stdcall function and they've left the name decoration intact (both big ifs, but certainly possible nonetheless), the name will have @nn at the end, where nn is a number. That number is the number of bytes the function expects as arguments, and will clear off the stack before it returns.
So, if it's a major concern, you can look at the raw name of the function and check that the amount of data you're putting onto the stack matches the amount of data it's going to clear off the stack.
Note that this is still only a protection against Murphy, not Machiavelli. When you're creating a DLL, you can use an export file to change the names of functions. This is frequently used to strip off the name decoration, but I'm pretty sure it would also let you rename a function from xxx@12 to xxx@16 (or whatever) to mislead the reader about the parameters it expects.
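A rough sketch of that check, assuming the export name is still in the _name@nn form that 32-bit __stdcall decoration uses. stdcallArgBytes is a hypothetical helper, not a Windows API, and it only inspects the string; it proves nothing about what the function really expects.

```cpp
#include <cassert>
#include <cstring>

// Returns the argument byte count encoded in a __stdcall-decorated name
// such as "_Add@8", or -1 if the name has no all-digit "@nn" suffix.
long stdcallArgBytes(const char* decorated) {
    assert(decorated != nullptr);
    const char* at = std::strrchr(decorated, '@');
    if (at == nullptr || at[1] == '\0') return -1;
    long bytes = 0;
    for (const char* p = at + 1; *p != '\0'; ++p) {
        if (*p < '0' || *p > '9') return -1;  // not a plain decimal suffix
        bytes = bytes * 10 + (*p - '0');
    }
    return bytes;
}
```

For the MyFunctionPointer typedef from the question, the caller pushes two ints, so on 32-bit x86 the value to compare against would be 2 * sizeof(int), i.e. 8 when int is 4 bytes. A return value of -1 means the name carries no usable information and the call should not be trusted on this basis.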
Edit: (primarily in reply to msalters's comment): it's true that you can't apply __stdcall to something like a member function, but you can certainly use it on things like global functions, whether they're written in C or C++.
For things like member functions, the exported name of the function will be mangled. In that case, you can use UnDecorateSymbolName to get its full signature. Using that is somewhat nontrivial, but not outrageously complex either.
I do not think so. It is a good question; the only provision is that you MUST know what the parameters are for the function pointer to work. If you don't, and you blindly stuff in the parameters and call it, it will crash or jump off into the woods, never to be seen again... It is up to the programmer to convey what the function expects and the types of its parameters. Luckily, you could disassemble it and work out the types of the parameters by looking at how they are addressed relative to the stack pointer (sp).
Using PE Explorer for instance, you can find out what functions are used and examine the disassembly dump...
Hope this helps,
Best regards,
Tom.
It will either crash in the DLL code (since it got passed corrupt data), or: I think Visual C++ adds code in debug builds to detect this type of problem. It will say something like: "The value of ESP was not saved across a function call", and will point to code near the call. It helps but isn't totally robust - I don't think it'll stop you passing in the wrong but same-sized argument (eg. int instead of a char* parameter on x86). As other answers say, you just have to know, really.
There is no general answer. The Standard mandates that certain exceptions be thrown in certain circumstances, but aside from that describes how a conforming program will be executed, and sometimes says that certain violations must result in a diagnostic. (There may be something more specific here or there, but I certainly don't remember one.)
What the code is doing there isn't according to the Standard, and since there is a cast the compiler is entitled to go ahead and do whatever stupid thing the programmer wants without complaint. This would therefore be an implementation issue.
You could check your implementation documentation, but it's probably not there either. You could experiment, or study how function calls are done on your implementation.
Unfortunately, the answer is very likely to be that it'll screw something up without being immediately obvious.
Generally, if you are calling LoadLibrary and GetProcAddress, you have documentation that tells you the prototype. Even more commonly, as with all of the Windows DLLs, you are provided with a header file. While calling with the wrong prototype will cause an error, it is usually very easy to observe and not the kind of error that will sneak into production.
Most C/C++ compilers have the caller set up the stack before the call, and readjust the stack pointer afterwards. If the called function does not use pointer or reference arguments, there will be no memory corruption, although the results will be worthless. And as rerun says, pointer/reference mistakes almost always show up with a modicum of testing.