Imagine the following declaration:
void foo() {
    const std::array<int, 80000> arr = {/* a lot of different values */};
    // do stuff
}
And a second one:
void foo() {
    static const std::array<int, 80000> arr = {/* a lot of different values */};
    // do stuff
}
What are the possible performance differences between these two if any? And is there any danger associated with any of these solutions?
Forget the array for a moment. That muddles two separate issues. You've got answers that address the lifetime and storage issue. I'll address the initialization issue.
void f() {
    static const int x = get_x();
    // do something with x
}

void g() {
    const int x = get_x();
    // do something with x
}
The difference between these two is that the first one will only call get_x() the first time that f() is called; x retains that value through the remainder of the program. The second one will call get_x() each time that g() is called.
That matters if get_x() returns different values on subsequent calls:
int current_x = 0;
int get_x() { return current_x++; }
And is there any danger associated with any of these solutions?
Non-static is dangerous because the array is huge and the memory reserved for automatic storage is limited. An 80000-element array of 4-byte ints occupies about 320 KB; depending on the system and configuration, that could be around 30% of the space available for automatic storage (for example, against a 1 MB default stack). As such, it greatly increases the possibility of stack overflow.
While an optimiser might certainly avoid allocating memory on the stack, there are good reasons why you would want your non-optimised debug build to also not crash.
What are the possible performance differences between these two if any? And is there any danger associated with any of these solutions?
The difference depends on exactly how you use foo().

1st case (low probability): your implementation calls foo() only once, perhaps because you split the code into separate functions for readability. In this case declaring the array static is a bad idea, because a static variable or object remains in memory until the program ends, so the array would occupy memory unnecessarily for the rest of the run.

2nd case (high probability): your implementation calls foo() again and again. Then a non-static object will be allocated and deallocated over and over, which costs CPU clock cycles you'd rather not spend. Use static in this case.
In this particular context, one point to consider regarding using static on a variable with initialization:
From C++17 standard:
6.7.1 Static storage duration [basic.stc.static]
...
2 If a variable with static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused, except that a class object or its copy/move may be eliminated as specified in 15.8.
Related
Let's say that I have a function that I call a lot which has an array in it:
char foo[LENGTH];
Depending upon the value of LENGTH this may be expensive to allocate every time the function is called. I have seen:
static char foo[LENGTH];
So that it is only allocated once and that array is always used: https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables
Is that best practice for arrays?
EDIT:
I've seen several responses that static locals are not best. But what about initialization cost? What if I'd called:
char foo[LENGTH] = "lorem ipsum";
Isn't that going to have to be copied every time I call the function?
As LENGTH is supposed to be a compile time constant (C++, no C99 VLA), foo is just going to use space on the stack. Very fast.
First off, the time to allocate an automatic array of char does not depend on its size: on any sane implementation it is a constant-time increment of the stack pointer, which is super fast. Please note this would be the same even for a VLA (which is not valid in C++), except that the increment would be a run-time operand. Also note that the answer would be different if your array were initialized.
So it is really unclear what performance drawback you are referring to.
On the other hand, if you make said array static, you incur no penalty whatsoever in the provided example: since the char array has no initializer, there is no need for the synchronization that normally prevents a static variable from being initialized twice. However, your function will (likely) become thread-unsafe, because every call now shares the same buffer.
Bottom line: premature optimization is the root of evil.
"Allocating" an object of primitive data type with automatic storage duration is usually not a big deal. The question is rather: do you want the contents of foo to survive the execution of the function or not?
Consider, for example, following function:
char* bar() {
    char foo[LENGTH];
    strcpy(foo, "Hello!");
    return foo; // returning a pointer to a local variable; undefined behaviour if anyone uses it
}
In this case, foo will go out of scope and will not be (legally) accessible when bar has finished.
Everything is OK, however, if you write
char* bar() {
    static char foo[LENGTH];
    strcpy(foo, "Hello!");
    return foo; // foo has static storage duration and will be destroyed at the end of your program (not at the end of bar())
}
An issue with large variables with automatic storage duration might arise, if they get so large that they will exceed a (limited) stack size, or if you call the function recursively. To overcome this issue, however, you'd need to use dynamic memory allocation instead (i.e. new/delete).
A local variable (say an int) can be stored in a processor register, at least as long as its address is not needed anywhere. Consider a function computing something, say, a complicated hash:
int foo(int const* buffer, int size)
{
    int a; // local variable
    // perform heavy computations involving frequent reads and writes to a
    return a;
}
Now assume that the data does not fit into memory all at once. We write a class for computing the hash from chunks of data, calling foo multiple times:
struct A
{
    void foo(int const* buffer, int size)
    {
        // perform heavy computations involving frequent reads and writes to a
    }
    int a;
};

A object;
while (...more data...)
{
    object.foo(buffer, size);
}
// do something with object.a
The example may be a bit contrived. The important difference here is that a was a local variable in the free function and now is a member variable of the object, so the state is preserved across multiple calls.
Now the question: would it be legal for the compiler to load a at the beginning of the foo method into a register and store it back at the end? In effect this would mean that a second thread monitoring the object could never observe an intermediate value of a (synchronization and undefined behavior aside). Provided that speed is a major design goal of C++, this seems to be reasonable behavior. Is there anything in the standard that would keep a compiler from doing this? If no, do compilers actually do this? In other words, can we expect a (possibly small) performance penalty for using a member variable, aside from loading and storing it once at the beginning and the end of the function?
As far as I know, the C++ language itself does not even specify what a register is. However, I think the question is clear anyway. Wherever this matters, I would appreciate answers for a standard x86 or x64 architecture.
The compiler can do that if (and only if) it can prove that nothing else will access a during foo's execution.
That's a non-trivial problem in general; I don't think any compiler attempts to solve it.
Consider the (even more contrived) example
struct B
{
    B(int& y) : x(y) {}
    void bar() { x = 23; }
    int& x;
};

struct A
{
    int a;
    void foo(B& b)
    {
        a = 12;
        b.bar();
    }
};
Looks innocent enough, but then we say
A baz;
B b(baz.a);
baz.foo(b);
"Optimising" this would leave 12 in baz.a, not 23, and that is clearly wrong.
Short answer to "Can a member variable (attribute) reside in a register?": yes.
When iterating through a buffer and writing the temporary result to any sort of primitive, wherever it resides, keeping the temporary result in a register is a good optimization, and compilers do it frequently. However, it is implementation-dependent, even influenced by the flags passed, so to know the result you should check the generated assembly.
Do built-in types which are not defined dynamically, always stay in the same piece of memory during the duration of the program?
If it's something I should understand how do I go about and check it?
i.e.
int j = 0;
double k = 2.2;
double* p = &k;
Does the system architecture or compiler move around all these objects if a C/C++ program is, say, highly memory intensive?
Note: I'm not talking about containers such as std::vector<T>. These can obviously reallocate in certain situations, but again, that is dynamic.
side question:
The following scenario will obviously raise a few eyebrows. Just as an example, will this pointer always be valid during the duration of the program?
This side-question is obsolete, thanks to my ignorance!
struct null_deleter
{
    void operator()(void const*) const {}
};

int main()
{
    // define object
    double b = 0;
    // define shared pointer
    std::shared_ptr<double> ptr_store;
    ptr_store.reset(&b, null_deleter()); // this works and behaves how you would expect
}
In the abstract machine, an object's address does not change during that object's lifetime.
(The word "object" here does not refer to "object-oriented" anything; an "object" is merely a region of storage.)
That really means that a program must behave as if an object's address never changes. A compiler can generate code that plays whatever games it likes, including moving objects around or not storing them anywhere at all, as long as such games don't affect the visible behavior in a way that violates the standard.
For example, this:
int n;
int *addr1 = &n;
int *addr2 = &n;
if (addr1 == addr2) {
    std::cout << "Equal\n";
}
must print "Equal" -- but a clever optimizing compiler could legally eliminate everything but the output statement.
The ISO C standard states this explicitly, in section 6.2.4:
The lifetime of an object is the portion of program execution during
which storage is guaranteed to be reserved for it. An object exists,
has a constant address, and retains its last-stored value throughout
its lifetime.
with a (non-normative) footnote:
The term "constant address" means that two pointers to the object
constructed at possibly different times will compare equal. The
address may be different during two different executions of the same
program.
I haven't found a similar explicit statement in the C++ standard; either I'm missing it, or the authors considered it too obvious to bother stating.
The compiler is free to do whatever it wants, so long as it doesn't affect the observable program behaviour.
Firstly, consider that local variables might not even get put in memory (they might get stored in registers only, or optimized away entirely).
So even in your example where you take the address of a local variable, that doesn't mean that it has to live in a fixed location in memory. It depends what you go on to do with it, and whether the compiler is smart enough to optimize it. For example, this:
double k = 2.2;
double *p = &k;
*p = 3.3;
is probably equivalent to this:
double k = 3.3;
Yes and no.
Global variables will stay in the same place.
Stack variables (inside a function) will get allocated and deallocated each time the function is called and returns. For example:
void k(int);

void f() {
    int x;
    k(x);
}

void g() {
    f();
}

int main() {
    f();
    g();
}
Here, the second time f() is called, its x will be in a different location.
There are several answers to this question, depending on factors you haven't mentioned.
If a data object's address is never taken, then a conforming C program cannot tell whether or not it even has an address. It might exist only in registers, or be optimized completely out; if it does exist in memory, it need not have a fixed address.
Data objects with "automatic" storage duration (to first approximation, function-local variables not declared with static) are created each time their containing function is invoked and destroyed when it exits; there may be multiple copies of them at any given time, and there's no guarantee that a new instance of one has the same address as an old one.
We speak of the & operator as "taking the address" of a data object, but technically speaking that's not what it does. It constructs a pointer to that data object. Pointers are opaque entities in the C standard. If you inspect the bits (by converting to integer) the result is implementation-defined. And if you inspect the bits twice in a row there is no guarantee that you get the same number! A hypothetical garbage-collected C implementation could track all pointers to each datum and update them as necessary when it moved the heap around. (People have actually tried this. It tends to break programs that don't stick to the letter of the rules.)
What does the following statement mean?
Local and dynamically allocated variables have addresses that are not known by the compiler when the source file is compiled
I used to think that local variables are allocated addresses at compile time, but those addresses can change when a variable goes out of scope and then comes back into scope through another function call. But the statement above says the addresses of local variables are not known by the compiler. Then how are local variables allocated? And why can global variables' addresses be known at compile time?
Also, can you please provide a good link to read how local variables and other are allocated?
Thanks in advance!
The above quote is correct - the compiler typically doesn't know the address of local variables at compile-time. That said, the compiler probably knows the offset from the base of the stack frame at which a local variable will be located, but depending on the depth of the call stack, that might translate into a different address at runtime. As an example, consider this recursive code (which, by the way, is not by any means good code!):
int Factorial(int num) {
    int result;
    if (num == 0)
        result = 1;
    else
        result = num * Factorial(num - 1);
    return result;
}
Depending on the parameter num, this code might end up making several recursive calls, so there will be several copies of result in memory, each holding a different value. Consequently, the compiler can't know where they all will go. However, each instance of result will probably be offset the same amount from the base of the stack frame containing each Factorial invocation, though in theory the compiler might do other things like optimizing this code so that there is only one copy of result.
Typically, compilers allocate local variables by maintaining a model of the stack frame and tracking where the next free location in the stack frame is. That way, local variables can be allocated relative to the start of the stack frame, and when the function is called that relative address can be used, in conjunction with the stack address, to look up the location of that variable in the particular stack frame.
Global variables, on the other hand, can have their addresses known at compile-time. They differ from locals primarily in that there is always one copy of a global variable in a program. Local variables might exist 0 or more times depending on how execution goes. As a result of the fact that there is one unique copy of the global, the compiler can hardcode an address in for it.
As for further reading, if you'd like a fairly in-depth treatment of how a compiler can lay out variables, you may want to pick up a copy of Compilers: Principles, Techniques, and Tools, Second Edition by Aho, Lam, Sethi, and Ullman. Although much of this book concerns other compiler construction techniques, a large section of the book is dedicated to implementing code generation and the optimizations that can be used to improve generated code.
Hope this helps!
In my opinion the statement is not talking about runtime access to variables or scoping, but is trying to say something subtler.
The key here is that its "local and dynamically allocated" and "compile time".
I believe what the statement is saying is that those addresses can not be used as compile time constants. This is in contrast to the address of statically allocated variables, which can be used as compile time constants. One example of this is in templates:
template<int *>
class Klass
{
};

int x;

// OK as it uses the address of a static variable
Klass<&::x> x_klass;

int main()
{
    int y;
    Klass<&y> y_klass; // NOT OK since y is local
}
It seems there are additional constraints on templates that don't allow this to compile (before C++17, a pointer template argument had to refer to an entity with linkage, which a function-local static lacks):
int main()
{
    static int y;
    Klass<&y> y_klass;
}
However other contexts that use compile time constants may be able to use &y.
And similarly I'd expect this to be invalid:
static int * p;

int main()
{
    p = new int();
    Klass<p> p_klass;
}
Since p's data is now dynamically allocated (even though p is static).
The addresses of dynamic variables are not known for the expected reason: they are allocated at runtime from the memory pool (the heap).

The addresses of local variables are not known because they reside in the "stack" memory region, and the stack's winding and unwinding may differ based on the runtime conditions of the code flow.
For example:
void bar(); // forward declare

void foo()
{
    int i; // when main() calls foo(), 'i' sits nearer the stack base
    bar(); // and 'j' in bar() ends up one frame deeper
}

void bar()
{
    int j; // when main() calls bar() directly, 'j' takes the offset 'i' would have had
}

int main()
{
    if (...)
        foo();
    else
        bar();
}

The if condition can be true or false, and the result is known only at runtime. Based on that, int i or int j ends up at the appropriate offset on the stack.
It's a nice question.
By the time the code executes, the program has been loaded into memory, and that is when local variables get their addresses. At compile time, the source code is only translated into machine language code so that it can be executed later.
POD means "plain old data": roughly, a type without user-defined constructors or destructors.
I am curious, how compilers handle lazy initialization of POD static local variables. What is the implication of lazy initialization if the function are meant to be run inside tight loops in multithreaded applications? These are the possible choices. Which one is better?
void foo_1() {
    static const int v[4] = {1, 2, 3, 4};
}

void foo_2() {
    const int v[4] = {1, 2, 3, 4};
}
How about this? No lazy initialization, but slightly clumsy syntax?
struct Bar
{
    static const int v[4];
    void foo_3()
    {
        // do something
    }
};

const int Bar::v[4] = {1, 2, 3, 4};
When a static variable is initialized with constant data, all compilers that I'm familiar with will initialize the values at compile time so that there is no run time overhead whatsoever.
If the variable isn't static it must be allocated on each function invocation, and the values must be copied into it. I suppose it's possible that the compiler might optimize this into a static if it's a const variable, except that const-ness can be cast away.
In foo_1(), v is initialized sometime before main() starts. In foo_2(), v is created and initialized every time foo_2() is called. Use foo_1() to eliminate that extra cost.
In the second example, Bar::v is also initialized sometime before main().
Performance is more complex than just allocation. For example, the static variable could force an extra cache line to be resident, because it's not contiguous with the other local memory you're using, increasing cache pressure and cache misses. Compared to this cost, the incredibly tiny overhead of re-allocating the array on the stack every time is trivial. Not only that, but any compiler is excellent at optimizing things like that, whereas it can do very little about static variables.
In any case, I would suggest that the performance difference between the two is minimal - even for inside a tight loop.
Finally, you may as well use foo_2(): the compiler is perfectly within its rights to make a variable like that static. As it was initially defined const, const_casting the const away is undefined behaviour regardless of whether or not it's static. However, the compiler can't make a static constant non-static, as you could be depending on, for example, the ability to return its address.
An easy method to find out how variables are initialized is to print an assembly language listing of a function that has static and local variables.
Not all compilers initialize variables in the same way. Here is a common practice:
Before main() runs, global variables are initialized by copying a section of values into them. Many compilers place the constants into one area so that the data can be assigned using simple assembly move or copy instructions.
Local variables (variables with local scope) may be initialized upon entering the local scope and before the first statement in the scope is executed. This depends upon many factors, one of them is the constness of the variable.
Constants may be placed directly into the executable code, or they may be a pointer to a value in ROM, or copied into memory or register. This is decided by the compiler for best performance or code size, depending on the compiler's settings.
On the technical side, foo_1 and foo_3 are required to initialize their arrays before any function, including any class constructor, is called. That guarantee is essentially as good as compile-time initialization, and in practice most implementations need no runtime code at all to initialize them.
This guarantee applies only to objects of POD type with static storage duration which are initialized with "constant expressions". A few more contrasting examples:
void foo_4() {
    static const int v[4] = { firstv(), 2, 3, 4 };
}

namespace { // anonymous
    const int foo_5_data[4] = { firstv(), 2, 3, 4 };
}

void foo_5() {
    const int (&v)[4] = foo_5_data;
}
The data for foo_4 is initialized the first time foo_4 is called. (Check your compiler documentation to find out whether this is thread-safe!)
The data for foo_5 is initialized at some time before main() but might be after some other dynamic initializations.
But none of this really answers questions about performance, and I'm not qualified to comment on that. #DeadMG's answer looks helpful.
You have static initialization in all those cases; all your static variables will be initialized simply by virtue of the data segment being loaded into memory. The const array in foo_2 can be optimized away entirely if the compiler finds it possible.

If you had dynamic initialization instead, then initialization of variables at namespace scope could be deferred until their first use, and dynamic initialization of local static variables would be performed the first time control passes through the function, or earlier. Additionally, the compiler can statically initialize those variables if it is able to. I don't remember the exact verbiage from the Standard.