C++ - Global variable performance when it is likely in the cache - c++

I'm trying to understand if my global variable usage which is being done for convenience and ease of assembly generation has a positive side-effect or not (I guess I'm looking to rid myself of the guilt of having these globals).
Program Details:
Broken up into "operations". Each operation reads I/O then does heavy mathematical compute, lots of special casing of code paths via hand-written assembly.
Single-threaded, will never be multi-threaded
One global variable, a fixed-size pre-allocated array (128K)
One global variable, an integer that acts as a pointer
My justification for using global variables here is primarily that I can then just generate call instructions without having to pass parameters, setting up the stack, etc.
The calls will be to functions like this:
DoSomething1()
{
access global1's memory ...
increment global2 ...
reset code;
}
I can of course generate code for parameters, but then I thought the global variables might have a perf benefit as well, since the compiler is going to use a constant address for the access. Of course, my global is extremely likely to be in the cache as well.
Am I thinking about this right? Is it possible that using the global the way I describe will make the compiler do a load/store on each access as opposed to enregistering it? In fact, can the compiler enregister a global variable at all?
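As a rough illustration of the pattern described above, here is a minimal C++ sketch. The names g_buffer, g_index, DoSomething1 and DoSomethingMany are hypothetical, and the bodies are placeholders for the real work. It contrasts a parameterless operation that touches the globals directly with a variant that copies the global index into a local for the duration of a loop, which gives the compiler an easy candidate for a register.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the two globals in the question.
std::uint8_t g_buffer[128 * 1024];  // fixed-size, pre-allocated array
std::size_t g_index = 0;            // integer used as a position into the array

// Parameterless operation: easy to target with a bare call instruction.
// Each access to g_index may become a load/store if the compiler cannot
// prove that nothing else touches it in between.
void DoSomething1() {
    g_buffer[g_index] = 1;
    ++g_index;
}

// Same kind of work, but the index is held in a local during the loop.
// The local never has its address taken, so it can live in a register.
// Assumes the caller keeps g_index + count within the array bounds.
void DoSomethingMany(std::size_t count) {
    std::size_t idx = g_index;  // one read of the global
    for (std::size_t i = 0; i < count; ++i) {
        g_buffer[idx] = 2;
        ++idx;
    }
    g_index = idx;              // one write back to the global
}
```

Whether the two forms actually differ in the generated code depends on the compiler and flags, so the disassembly is the thing to check.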

Related

Declare var outside loop is bad?

I wrote this basic code for a DSP/audio application I'm making:
double input = 0.0;
for (int i = 0; i < nChannels; i++) {
input = inputs[i];
and a DSP engineering expert told me: "you should not declare it outside the loop, otherwise it creates a dependency and the compiler can't deal with it as efficiently as possible."
He's talking about the variable input, I think. Why is that? Isn't it better to declare it once and overwrite it?
Maybe it has something to do with the memory location used, i.e. a register instead of the stack?
Good old K&R C compilers in the early eighties used to produce code as close as possible to what the programmer wrote, and programmers used to do their best to produce optimized source code. Modern optimizing compilers can rework things provided the resulting code has the same observable effects as the original code. So here, assuming the input variable is not used outside the loop, an optimizing compiler could optimize out the line double input = 0.0; because there are no observable effects until the next assignment: input = inputs[i];. In the same way, it could move the variable out of the loop (whether or not it is declared inside the loop in the C++ source), for the same reason.
Short story: unless you want to produce code for one specific compiler with one specific set of parameters, in which case you should thoroughly examine the generated assembly code, you should never worry about those low-level optimizations. Some people say the compiler is smarter than you; others say the compiler will produce its own code whatever way you wrote yours.
What matters is just readability and variable scoping. Here input is functionally local to the loop, so it should be declared inside the loop. Full stop. Any other optimization consideration is just useless, unless you do have special requirements for low-level optimization (profiling showing that these lines require special processing).
Many people think that declaring a variable allocates some memory for you to use. It does not work like that. It does not allocate a register either.
It only creates for you a name (and an associated type) that you can use to link consumers of values with their producers.
On a 50-year-old compiler (or one written by students in their 3rd year Compiler Construction course), that may indeed be implemented by allocating some memory for the variable on the stack, and using that every time the variable is referenced. It's simple, it works, and it's horribly inefficient. A good step up is putting local variables in registers when possible, but that uses registers inefficiently, and it's not where we're at currently (and haven't been for some time).
Linking consumers with producers creates a data flow graph. In most modern compilers, it's the edges in that graph that receive registers. This is completely removed from any variables as you declared them. They no longer exist. You can see this in action if you use -emit-llvm in clang.
So variables aren't real, they're just labels. Use them as you want.
It is better to declare the variable inside the loop, but the reason given is wrong.
There is a rule of thumb: declare variables in the smallest scope possible. Your code is more readable and less error prone this way.
As for the performance question, it doesn't matter at all to any modern compiler where exactly you declare your variables. For example, clang eliminates the variable entirely from its own IR at -O1: https://godbolt.org/g/yjs4dA
One corner case, however: if you ever take the address of input, the variable can't be eliminated (easily), and you should declare it inside the loop if you care about performance.
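To make the comparison concrete, here is a small sketch of the two declaration styles. The original loop body is not shown in the question, so summing the inputs is purely an assumption for illustration, as are the function names sum_outside and sum_inside. Both functions have the same observable behaviour, which is what lets an optimizer treat them alike.

```cpp
#include <cassert>
#include <cstddef>

// Variable declared outside the loop, as in the question.
double sum_outside(const double* inputs, std::size_t nChannels) {
    double total = 0.0;
    double input = 0.0;  // wider scope than needed; this initial value is never read
    for (std::size_t i = 0; i < nChannels; ++i) {
        input = inputs[i];
        total += input;
    }
    return total;
}

// Variable declared in the smallest scope possible.
double sum_inside(const double* inputs, std::size_t nChannels) {
    double total = 0.0;
    for (std::size_t i = 0; i < nChannels; ++i) {
        const double input = inputs[i];  // scoped to one iteration
        total += input;
    }
    return total;
}
```

Comparing the output of the two in a compiler explorer at -O1 or higher is a quick way to confirm this for a given toolchain.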

Performance: should I use a global variable in a function which gets called often?

First off, let me get off my chest the fact that I'm a greenhorn trying to do things the right way, which means I run into a contradiction about what the right way is every now and then.
I am modifying a driver for a peripheral which contains a function - let's call it Send(). In the function I have a timestamp variable so the function loops for a specified amount of time.
So, should I declare the variable global (that way it is always in memory and no time is lost for declaring it each time the function runs) or do I leave the variable local to the function context (and avoid a bad design pattern with global variables)?
Please bear in mind that the function can be called multiple times per millisecond.
Speed of execution shouldn't be significantly different for a local vs. a global variable. The only real difference is where the variable lives. Local variables are allocated on the stack, global variables are in a different memory segment. It is true that local variables are allocated every time you enter a routine, but allocating memory is a single instruction to move the stack pointer.
There are much more important considerations when deciding if a variable should be global or local.
When implementing a driver, try to avoid global variables as much as possible, because:
They are thread-unsafe, and you have no idea about the scheduling scheme of the user application (in fact, even without threads, using multiple instances of the same driver is a potential problem).
They automatically cause a data section to be created as part of the executable image of any application that links to your driver (which is something that the application programmer might want to avoid).
Did you profile a fully-optimized, release build of your code and identify the bottleneck to be small allocations in this function?
The change you are proposing is a micro-optimization: a change to a small part of your code with the intent to make it more efficient. If the answer to the above question is "no", as I'd expect, you shouldn't even be thinking of such things.
Select the correct algorithm for your code. Write your code using idiomatic techniques. Do not write in micro-optimizations. You might be surprised how good your compiler is at optimizing your code for you. It will often be able to optimize away these small allocations, but even if it can't you still don't know if the performance penalty imposed by them is even noticeable or significant.
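For reference, the plain local-variable version of such a function might look like the sketch below. The real Send() is not shown in the question, so the signature, the return value and the use of std::chrono::steady_clock are all assumptions made for illustration; an actual driver would use whatever timer its platform provides.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical Send(): loops until a time budget has elapsed.
// Returns the number of loop iterations, purely for illustration.
unsigned long Send(std::chrono::microseconds budget) {
    // Local timestamp: creating it costs at most a stack-pointer
    // adjustment, and an optimizer may keep it in registers.
    const auto start = std::chrono::steady_clock::now();
    unsigned long iterations = 0;
    while (std::chrono::steady_clock::now() - start < budget) {
        ++iterations;  // placeholder for the real peripheral work
    }
    return iterations;
}
```

The cost of reading the clock on every iteration is likely to dwarf anything related to where the timestamp variable lives.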
For drivers, which are usually position-independent, global variables are accessed indirectly through the GOT (global offset table) unless IP-relative addressing is available (e.g. x86_64, ARM, etc.).
In the GOT case, you can think of it as one extra pointer indirection.
However, even with an extra pointer it won't make any observable difference if the function is "only" called at millisecond frequency.

Best practice: Variables, Functions and Arduino?

I am using Sublime and Arduino to program a barometer (MS5611). But what is the best practice for storing variables that are only used as temporary storage inside a specific function:
1) Create private variables in my header file for all variables used?
2) Create the variables inside the functions where they are used?
What takes most processing power and memory usage - (1) create them once as private variables and change the content with the functions, or (2) create the variables each time I call a function?
Always declare them inside the function. This improves readability as it shows the intent behind the declaration. Also it lowers the chance for mistakes.
Wherever possible as "const", e.g.
uint16_t sample_it() {
const uint16_t sample = analogRead(...);
const uint16_t result = do_something(sample);
return result;
}
Almost for the same reasons but this also gives the compiler more optimization options.
If and how variables are allocated is up to the compiler and its optimizer. Unless you have very tight performance constraints chances are that the compiler will optimize much better than you would. Actually using global variables instead will sometimes slow down your code. Of course you might avoid allocation. However you will pay by additional storage instructions. On the other hand the "allocation" might get optimized away and then your global variables code becomes slower than the local variables code.
It depends on your sample rate, that is, on how many times the function that saves the data gets called.
In any case, it is also important to take into account how you free the memory once you've collected and processed the data. If you do not have a lot of variables, but several functions need to use them, it is best to declare them globally.
At least, I do so in my projects, and I have never had a problem.
You should avoid using global variables, as they are statically allocated from the available RAM and exist (take up space) for the duration of the program (forever in embedded systems). Globals also make for less maintainable and more fragile programs.
If you only need the data inside a function, declare it there. There is almost no penalty (initialization only) and the used space is automatically returned when the function returns as local variables are placed on the stack as are passed parameters.

When are global variables actually considered good/recommended practice?

I've been reading a lot about why global variables are bad and why they should not be used. And yet most of the commonly used programming languages support globals in some way.
So my question is what is the reason global variables are still needed, do they offer some unique and irreplaceable advantage that cannot be implemented alternatively? Are there any benefits to global addressing compared to user specified custom indirection to retrieve an object out of its local scope?
As far as I understand, in modern programming languages, global addressing comes with the same performance penalty as calculating every offset from a memory address, whether it is an offset from the beginning of the "global" user memory or an offset from a this or any other pointer. So in terms of performance, the user can fake globals in the narrow cases they are needed using common pointer indirection without losing performance to real global variables. So what else? Are global variables really needed?
Global variables aren't generally bad because of their performance, they're bad because in significantly sized programs, they make it hard to encapsulate everything - there's information "leakage" which can often make it very difficult to figure out what's going on.
Basically the scope of your variables should be only what's required for your code to both work and be relatively easy to understand, and no more. Having global variables in a program which prints out the twelve-times tables is manageable, having them in a multi-million line accounting program is not so good.
I think this is another subject similar to goto - it's a "religious thing".
There is a lot of ways to "work around" globals, but if you are still accessing the same bit of memory in various places in the code you may have a problem.
Global variables are useful for some things, but should definitely be used "with care" (more so than goto, because the scope of misuse is greater).
There are two things that make global variables a problem:
1. It's hard to understand what is being done to the variable.
2. In a multithreaded environment, if a global is written from one thread and read by any other thread, you need synchronisation of some sort.
But there are times when globals are very useful. Having a config variable that holds all your configuration values that came from the config file of the application, for example. The alternative is to store it in some object that gets passed from one function to another, and it's just extra work that doesn't give any benefit. In particular if the config variables are read-only.
As a whole, however, I would suggest avoiding globals.
Global variables imply global state. This makes it impossible to store overlapping state that is local to a given part or function in your program.
For example, let's say we store the credentials of a given user in global variables which are used throughout our program. It will now be a lot more difficult to upgrade our program to allow multiple users at the same time. Had we just passed a user's state as a parameter to our functions, we would have had far fewer problems upgrading to multiple users.
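A minimal sketch of the parameter-passing alternative, assuming a hypothetical UserSession type and two made-up helper functions, could look like this. Because each function only touches the session it is handed, two users can be tracked side by side without any change to the functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-user state, passed explicitly instead of living in globals.
struct UserSession {
    std::string name;
    int failed_logins = 0;
};

// Operates only on the session it is given, so several sessions can coexist.
void record_failed_login(UserSession& session) {
    ++session.failed_logins;
}

// Reports whether this particular session has reached the failure limit.
bool is_locked_out(const UserSession& session, int limit) {
    return session.failed_logins >= limit;
}
```

With globals, the equivalent code would need a second copy of every variable (or a rewrite) before a second user could exist.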
my question is what is the reason global variables are still needed,
Sometimes you need to access the same data from a lot of different functions. This is when you need globals.
For instance, I am working on a piece of code right now, that looks like this:
static runtime_thread *t0;
void
queue_thread (runtime_thread *newt)
{
t0 = newt;
do_something_else ();
}
void
kill_and_replace_thread (runtime_thread *newt)
{
t0->status = dead;
t0 = newt;
t0->status = runnable;
do_something_else ();
}
Note: Take the above as some sort of mixed C and pseudocode, to give you an idea of where a global is actually useful.
Static globals are almost mandatory when writing any cross-platform library. These global variables are static so that they stay within the translation unit. There are few if any cross-platform libraries that do not use static global variables, because they have to hide their platform-specific implementation from the user. These platform-specific implementations are held in static global variables. Of course, if they use an opaque pointer and require the platform-specific implementation to be held in such a structure, they could make a cross-platform library without any static global. However, such an object needs to be passed to all functions within such a library. Therefore, you either have to pass this opaque pointer everywhere, or make static global variables.
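The two options can be sketched side by side. Everything here is hypothetical (the names lib_init, lib_query, LibContext and the ctx_ functions, and the integer standing in for real platform state), and for brevity both the "header" and the "source" parts are shown in one file.

```cpp
#include <cassert>

// Variant 1: a file-local global hides the platform state from users.
namespace {
int g_platform_handle = -1;  // hypothetical platform-specific state
}

void lib_init() { g_platform_handle = 42; }  // 42 is a placeholder value
int lib_query() { return g_platform_handle; }

// Variant 2: an opaque context object that the caller passes everywhere.
struct LibContext;  // users of the header only ever see this declaration
LibContext* ctx_create(int handle);
int ctx_query(const LibContext* ctx);
void ctx_destroy(LibContext* ctx);

// The definition would normally live in the library's own source file.
struct LibContext {
    int platform_handle;
};
LibContext* ctx_create(int handle) { return new LibContext{handle}; }
int ctx_query(const LibContext* ctx) { return ctx->platform_handle; }
void ctx_destroy(LibContext* ctx) { delete ctx; }
```

The first variant is shorter to call but allows only one instance of the library state; the second costs an extra argument on every call and allows several.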
There's also the identifier limit issue. Compilers (especially older ones) have a limit to the number of identifiers they could handle within a scope. Many operating systems still use tons of #define instead of enumerations because their old compilers cannot handle the enumeration constants that bloat their identifiers. A proper rewrite of the header files could solve some of these.
Global variables are considered when you want to use them in every function including main. Also remember that if you initialize a variable globally, its initial value will be same in every function, however you can reinitialize it inside a function to use a different value for that variable in that function. In this way you don't have to declare the same variable again and again in each function. But yes they can cause trouble at times.
Global names are available everywhere. You may unknowingly end up using a global when you think you are using a local.
And if you make a mistake while declaring a global variable, you'll have to apply the fix across the whole program, for example if you accidentally declared it as int instead of float.

C++ performance of global variables

For clarification: I know how evil globals are and when not to use them :)
Is there any performance penalty when accessing/setting a global variable vs. a local one in a compiled C++ program?
That would depend entirely on your machine architecture. Global variables are accessed via a single known address, whereas local variables are typically accessed by indexing off an address register. The chances of the difference between the two being significant are extremely remote, but if you think it will be important, you should write a test for your target architecture and measure the difference.
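A very small timing harness along those lines might look like the following. The names g_counter, bump_global, bump_local and time_ns are made up for this sketch, and the workload is deliberately trivial; an optimizer may reduce either loop to a closed form, so the numbers only mean something after checking the generated code and repeating the measurement many times.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

std::uint64_t g_counter = 0;  // global under test (hypothetical name)

// Accumulates into the global on every iteration.
std::uint64_t bump_global(std::uint64_t n) {
    g_counter = 0;
    for (std::uint64_t i = 0; i < n; ++i) g_counter += i;
    return g_counter;
}

// Accumulates into a local instead.
std::uint64_t bump_local(std::uint64_t n) {
    std::uint64_t counter = 0;
    for (std::uint64_t i = 0; i < n; ++i) counter += i;
    return counter;
}

// Times a single call of f(n); stores its result and returns elapsed nanoseconds.
template <typename F>
long long time_ns(F f, std::uint64_t n, std::uint64_t& result) {
    const auto t0 = std::chrono::steady_clock::now();
    result = f(n);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

Calling time_ns(bump_global, n, r) and time_ns(bump_local, n, r) with the same n, on the target hardware and with the real build flags, gives a first rough comparison.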
It depends, but usually yes, although it is a micro issue. Global variables must be referenceable from many contexts, which means that keeping them in a register for their whole lifetime is generally not possible. In the case of local variables, that is possible and preferable. In fact, the narrower the scope, the more opportunity the compiler has to optimize accesses and modifications to that variable.
Local variables are probably "faster" in many cases, but I don't think the performance gain would be noticeable or outweigh the additional maintenance cost of having many global variables. Everything I list below either has a negligible cost or can easily be dwarfed by just about any other inefficiency in your program. I would consider these to be a perfect example of a micro-optimization.
Local variables are on the stack, which is more likely to be in the cache. This point is moot if your global variable is frequently used, since it will therefore also be in the cache.
Local variables are scoped to the function - therefore, the compiler can presume that they won't be changed by any other function calls. With a global, the compiler may be forced to reload the global value.
On some 64-bit machines, getting the address of a global variable is a two-step process - you must also add the 32-bit offset of the global to a 64-bit base address. Local variables can always be directly accessed off of the stack pointer.
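The reload point in the list above can be shown with a short sketch. The names g_limit, shrink_limit, count_with_global and count_with_local are hypothetical. When a loop calls a function the compiler cannot see into, any global may have changed afterwards, so the global must be read again; a local snapshot is immune to that. The callback here really does change the global, which makes the difference visible in the results.

```cpp
#include <cassert>

int g_limit = 0;  // hypothetical global read by the loops below

// A callback that modifies the global behind the loop's back.
void shrink_limit() { g_limit = 2; }

// Reads g_limit on every iteration: the value has to be reloaded after
// each call, because the callback may have written to it.
int count_with_global(void (*callback)()) {
    int n = 0;
    for (int i = 0; i < g_limit; ++i) {
        callback();
        ++n;
    }
    return n;
}

// Takes a local snapshot first: the bound cannot change during the loop,
// so the compiler is free to keep it in a register.
int count_with_local(void (*callback)()) {
    const int limit = g_limit;
    int n = 0;
    for (int i = 0; i < limit; ++i) {
        callback();
        ++n;
    }
    return n;
}
```

Starting from g_limit = 5, the first function stops after 2 iterations because the callback lowered the bound, while the second runs all 5.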
Strictly speaking, no.
Some things to consider:
Global variables increase the static size of your program in memory.
If access to the variable needs to be synchronized, that would incur some performance overhead.
There are a number of compiler optimisations that are possible with local variables but not with global variables, so in some cases you might see a difference in performance. I doubt that your global variable is being accessed in a performance-critical loop though (very bad design if it is !) so it's probably not going to be an issue.
It's more the way you use the data stored in your variables that matters performance-wise than how you declare them. I'm not sure about the correct terminology here, but one can define two types of data access: shared access (where you access the same data from different parts of the code) and private access (where each part has its own data). By default, global variables imply shared access and local variables imply private access. But both types of access can be achieved with both types of variables (e.g. local pointers pointing to the same chunk of memory, or a global array where each part of the code accesses a different part of the array).
Shared access has better caching and a lower memory footprint, but is harder to optimize, especially in a multithreaded environment. It is also bad for scaling, especially on NUMA architectures.
Private access is easier to optimize and better for scaling. Problems with private access usually arise in situations where you have multiple copies of the same data. The problems usually associated with these scenarios are a higher memory footprint, synchronization between copies, worse caching, etc.
The answer is tied to the overall structure of the program.
For example, I just disassembled this, and in both cases the looping variable was moved into a register, after which there was no difference:
#include <stdio.h>
int n = 9;
int main()
{
for (n = 0; n < 10; ++n)
printf("%d", n);
for (int r = 0; r < 10; ++r)
printf("%d", r);
return 0;
}
Just to be sure, I did similar things with classes and again saw no difference. But if the global is in a different compilation unit that might change.
There is no performance penalty, either way, that you should be concerned about. In addition to what everyone else has said, you should also bear in mind the paging overhead. Local instance variables are fetched from the object structure, which has likely (?) already been paged into cache memory. Global variables, on the other hand, may cause a different pattern of virtual memory paging.
Again, the performance is really not worth any consideration on your part.
In addition to other answers, I would just point out that accessing a global variable in a multithreaded environment is likely to be more expensive because you need to ensure it is locked properly and the threads may wait in line to access it. With local variables it is not an issue.