Lifetime of a declaration within a loop - C++

I have a loop as follows:
while (1)
{
    int i;
}
Does i get destroyed and recreated on the stack each time the loop occurs?

Theoretically, it gets recreated. In practice, it might be kept alive and reinitialized for optimization reasons.
But from your point of view, it gets recreated, and the compiler handles the optimization (i.e., keep it at its innermost scope, as long as it's a POD type).

Not necessarily. Your compiler could choose to change it into
int i;
while (1) {
    ...
    i = 0;
}
It may not be literally created and destroyed on the stack every time. However, semantically, that is what occurs, and when you use more complex types in C++ that have custom destruction behaviour, that is exactly what happens, although the compiler may still choose to hold the stack memory separately.
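To make the semantic point concrete, here is a minimal sketch (the Tracer type is purely illustrative) showing that a non-trivial type declared inside a loop really does have its constructor and destructor run on every iteration:

#include <iostream>

struct Tracer {
    Tracer()  { std::cout << "constructed\n"; }
    ~Tracer() { std::cout << "destroyed\n"; }
};

int main() {
    for (int n = 0; n < 3; ++n) {
        Tracer t;  // constructor runs here on every iteration
    }              // destructor runs here on every iteration
}

This prints "constructed"/"destroyed" three times each; with a trivially destructible type like int, the same lifetime rules apply but no code needs to run.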

Conceptually, yes. But since nothing is being done with the value, the compiler is very likely to generate code that does nothing with the variable on each iteration of the loop. It can, for instance, allocate it in advance (when the function is entered), since it's going to be used later.
Since you can't reference the variable outside the defining scope, that doesn't change the semantics.

In C you have to look at the assembly generated to know that (the compiler might have chosen to put it in a register).
What you know is that outside the loop you cannot access that particular object by any means (by name, by pointer, by hack, ...)

Related

Should I reset a primitive member variable in the destructor?

Please see the following code:

#include <iostream>
#include <windows.h>  // for Sleep

using namespace std;

class MyClass
{
public:
    int i;
    MyClass()
    {
        i = 10;
    }
};

MyClass* pObj = nullptr;

int main()
{
    {
        MyClass obj;
        pObj = &obj;
    }
    while (1)
    {
        cout << pObj->i; // pObj is a dangling pointer, still no crash.
        Sleep(1000);
    }
    return 0;
}
obj dies once it goes out of scope. But when I tested this in VS 2017, I saw no crash even after using it.
Is it good practice to reset the int member variable i?
Accessing a member after an object has been destroyed is undefined behavior. It may seem like a good idea to set members in a destructor to a predictable and most likely unexpected value, e.g., a rather large value or a value with a specific bit pattern, making it easy to recognize the value in a debugger.
However, this idea is flawed and thwarted by the system:
All classes would need to play along, and instead of concentrating on creating correct code, developers would spend time (both development time and run time) making pointless changes.
Compilers happen to be rather smart and can detect that the changes in the destructor are not needed. Since a correct program cannot detect whether the change was made, the compiler may not make the change at all. This effect is an actual issue for security applications where, e.g., a password should be erased from memory so it cannot be read (using some non-portable means).
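For illustration, a common (though not formally guaranteed by the standard) workaround for this dead-store elimination is to write through a volatile pointer, which the compiler must treat as observable; this is only a sketch of the idea:

#include <cstddef>

// Zero a buffer in a way the optimizer cannot prove to be a dead store:
// writes through a volatile lvalue count as observable behavior, so they
// are not eligible for dead-store elimination.
void secure_zero(void* p, std::size_t n) {
    volatile unsigned char* vp = static_cast<volatile unsigned char*>(p);
    while (n--) {
        *vp++ = 0;
    }
}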
Even if the value gets set to a specific value, memory gets reused and the values get overwritten. Especially with objects on the stack it is most likely that the memory is used for something else before you see the bad value in a debugger.
Even when resetting values you would not necessarily see a "crash": a crash is caused by something being set up to protect against something invalid. In your example you are accessing an int on the stack: the stack remains accessible from the CPU's point of view, and at best you'd get an unexpected value. Use of unusual pointer values typically leads to a crash because the memory management system tries to access a location which isn't mapped, but even that isn't guaranteed: on a busy 32-bit system pretty much all memory may be in use. That is, trying to rely on undefined behavior being detected is also futile.
Correspondingly, it is much better to use good coding practices which avoid dangling references right away and concentrate on using these. For example, I'm always initializing members in the member initializer list, even in the rare cases they end up getting changed in the body of the constructor (i.e., you'd write your constructor as MyClass(): i() {}).
As a debugging tool it may be reasonable to replace the allocation functions (ideally the allocator object, but potentially the global operator new()/operator delete() and family) with a version which doesn't quickly hand out released memory and instead fills the released memory with a predictable pattern. Since these actions slow down the program, you'd only use this code in a debug build, but as it is relatively simple to implement once and easy to enable/disable centrally, it may be worth the effort. In practice, I don't think even such a system pays off, as use of managed pointers and proper design of ownership and lifetime avoid most errors due to dangling references.
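A minimal sketch of the poison-on-free half of that idea, using C++14 sized deallocation (the 0xDD pattern is an arbitrary choice, and a real version would also quarantine freed blocks instead of returning them to malloc immediately):

#include <cstdlib>
#include <cstring>
#include <new>

void* operator new(std::size_t size) {
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc{};
}

void operator delete(void* p, std::size_t size) noexcept {
    if (p) {
        std::memset(p, 0xDD, size);  // 0xDD is easy to spot in a debugger
        std::free(p);
    }
}

void operator delete(void* p) noexcept {
    std::free(p);  // size unknown here, so no poisoning is possible
}

(The array forms operator new[]/operator delete[] would need the same treatment; they are omitted for brevity.)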
The behaviour of the code you gave is undefined. A particular case of undefined behaviour is working as expected, so there is nothing strange about the code working. The code can work now and it can break at any time depending on the compiler version, compiler options, stack contents, and the phase of the moon.
So first and most important is to avoid dangling pointers (and all other kinds of undefined behaviour) everywhere.
As for clearing variables in the destructor, here are the practices I have found work best:
Follow coding rules that save me from mistakes of accessing unallocated or destroyed objects. I cannot describe them in a few words, but the rules are pretty common (see here and elsewhere).
Analyze code by humans (code review) or by static analyzers (like cppcheck or PVS-Studio or others) to avoid cases similar to the one you described above.
Do not call delete manually; better to use scoped_ptr or similar object-lifetime managers. When delete is reasonable, I usually (usually) set the pointer to nullptr after deletion to keep myself from mistakes.
Use pointers as rarely as possible. References are preferred.
When objects of my class are used outside and I suspect that somebody can access them after deletion, I can put a signature field inside, set it to something like 0xDEAD in the destructor, and check it at entry to every public method (as sketched below). Here, be careful not to slow down your code to an unacceptable degree.
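A minimal sketch of that signature-field idea (the class name, marker values, and assert-based check are illustrative; note that reading the field after destruction is itself formally undefined behaviour, so this is a debugging heuristic, not a guarantee):

#include <cassert>
#include <cstdint>

class Widget {
    static constexpr std::uint32_t kAlive = 0xA15A15A1;  // arbitrary marker
    static constexpr std::uint32_t kDead  = 0x0000DEAD;
    std::uint32_t signature_ = kAlive;
public:
    ~Widget() { signature_ = kDead; }  // mark the object as destroyed
    void doWork() {
        // Cheap use-after-destruction check; compiled out when NDEBUG is set.
        assert(signature_ == kAlive && "Widget used after destruction");
        // ... actual work ...
    }
};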
After all of this, setting i from your example to 0 or -1 is redundant. As for me, it's not a thing you should focus your attention on.

Does declaring a variable inside a loop body have any disadvantages? [duplicate]

This question already has answers here:
Is there any overhead to declaring a variable within a loop? (C++) [duplicate]
Suppose we have a loop that iterates many times:
for (int i = 0; i < 1000000; ++i) {
    int s = 100;
    s += i;
    cout << s;
}
We are only using s inside the loop body, so ideally we'd like to declare it there so it won't pollute the enclosing scope.
I'm wondering if there's any disadvantage to that. For example, will it incur a performance cost, because the program re-declares s on every iteration?
Conceptually that variable is constructed and destructed on each iteration.
But does it affect performance? Well, you can check your case yourself in a compiler explorer: delete the int from the declaration inside the loop to switch between the loop-local and function-local variants.
Conclusion: no difference whatsoever. The assembly is the same!
So, just use what makes sense in your code. If you need one object per iteration, make one object per iteration. The optimizer is smarter than you think. If that weren't enough, you'd come back to it with profiling data and careful tweaking, not broad guidelines.
Yes. Declaring a variable inside a loop will cause it to be destroyed and reconstructed on every iteration. This might not be noticeable with small loops and simple data types, which the compiler would optimize anyway; however, when working with complex objects and large loops, it is best to declare the variables outside.
If the variables for a loop use too much memory, you can enclose the loop and the declarations in braces, causing all variables declared inside the braces to be destroyed on exit. Mostly such micro-optimizations do not matter, but if you're using complex classes and such, just initialize the variable outside and reset it every time.
Generally it is not a good idea to declare too many variables; it makes your code hard to read and increases memory usage. If you can, don't declare variables when you don't need to. Your example can be simplified to for (int i = 0; i < 1000000; i++) cout << i + 100;, for example. If such simplifications are possible and they do not make your code hard to read, use them.
Destroying an int is a no-op. The variable ceases to exist, but no runtime code needs to be run.
Using references or pointers to variables that have ceased to exist is undefined behaviour. Prior to initialization, newly created local variables have an indeterminate value. So simply reusing an old variable's storage for a new one is legal; the compiler doesn't have to prove there are no such outstanding references.
In this case, if it can prove that the value was the constant 100, it can even skip everything except the first initialization. And it can do this initialization "early", as there is no defined way to detect it happening early. In this case it is easy, and most compilers will do it readily; in more complex cases, less so. If you mark it const, the compiler no longer has to prove it was unmodified, but can simply assume it!
Many of the areas that C++ leaves undefined exist in order to make certain optimizations easy.
Now, if you had something more complex, like a vector<int>{1,2,3,4,5}, destruction and creation become less of a no-op. It still remains possible to 'hoist' the variable out of the loop, but it is much harder for the compiler, because dynamic allocation is sometimes hard to optimize out.
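When the compiler cannot hoist such a variable for you, a common manual pattern is to declare the container outside the loop and reuse its capacity; a minimal sketch (the buffer contents are illustrative):

#include <vector>

int main() {
    std::vector<int> buffer;              // allocated once, outside the loop
    for (int i = 0; i < 1000000; ++i) {
        buffer.clear();                   // keeps capacity: no reallocation
        buffer.assign({1, 2, 3, 4, 5});   // refill for this iteration
        buffer.push_back(i);
        // ... use buffer ...
    }
}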

What's better to use and why?

class MyClass {
private:
    unsigned int currentTimeMS;
public:
    void update() {
        currentTimeMS = getTimeMS();
        // ...
    }
};

class MyClass {
public:
    void update() {
        unsigned int currentTimeMS = getTimeMS();
        // ...
    }
};
update() is called in the main game loop, so in the second case we get a lot of allocation operations (unsigned int currentTimeMS). In the first case we get only one allocation and reuse the allocated variable.
Which of this code is better to use, and why?
I recommend the second variant because it is stateless and the scope of the variable is smaller. Use the first one only if you really experience a performance issue, which I consider unlikely.
If you do not modify the variable value later, you should also consider to make it const in order to express this intent in your code and to give the compiler additional optimization options.
It depends upon your needs. If currentTimeMS is needed only temporarily within update(), then surely declare it there (in your case, option #2).
But if its value is needed for the instance of the class (i.e., it is used in some other method), then you should declare it as a field (in your case, option #1).
In the first example, you are saving the state in the class object. In the second one, you're not, so currentTime will be lost the instant update() returns.
It is really up to you to decide which one you need.
The first case defines a member variable, the second a local variable. Basic class stuff: a private member variable is available to any function (method) in that class; a local variable is only available in the function in which it is declared.
Which of this code is better to use, and why?
First and foremost, the cited code is at best a tiny micro-optimization. Don't worry about such things unless you have to.
In fact, this is most likely a pessimization. Automatic variables are usually allocated on the stack, and stack allocation is extremely fast (and sometimes even free). There is no need to worry. Other times, the compiler may place a small automatic variable such as the unsigned int used here in a register; then there's no allocation whatsoever.
Compare that to making the variable a data member of the class solely for the purpose of avoiding that allocation. Accessing that variable involves going through the this pointer. A pointer dereference has a cost, potentially well beyond that of adding an offset to a pointer: the dereference might result in a cache miss. Even worse, this dereferencing may well be performed every time the variable is referenced.
That said, sometimes it is better to create data members solely for the purpose of avoiding automatic variables in various member functions. Large arrays declared as local automatic variables might well result in stack overflow. Note, however, that making double big_array[2000][2000] a data member of MyClass will most likely make it impossible to declare a variable of type MyClass as a local automatic variable in some function.
The standard solution to the problems created by placing large arrays on the stack is to instead allocate them on the heap. This leads to another place where creating a data member to avoid a local variable can be beneficial. While stack allocation is extremely fast, heap allocation (e.g., new) is quite slow. A member function that is called repeatedly may benefit from making the automatic variable std::unique_ptr<double[]> big_array = std::make_unique<double[]>(2000*2000) a data member of MyClass.
Note that neither of the above applies to the sample code in the question. Note also that the last concern (making a heap-allocated variable a data member so as to avoid repeated allocations and deallocations) means that the code has to go through the this pointer to access that memory. In tight code, I've sometimes been forced to create a local automatic pointer variable such as double* local_pointer = this->some_pointer_member to avoid repeated traversals through this.
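Putting those last two points together, a minimal sketch of the member-buffer pattern (the class name, buffer size, and fill work are all illustrative):

#include <cstddef>
#include <memory>

class Simulation {
    static constexpr std::size_t kSize = 2000 * 2000;
    // Allocated once at construction and reused by every call to step(),
    // avoiding both a per-call heap allocation and a stack overflow from
    // a huge local array.
    std::unique_ptr<double[]> big_array_ = std::make_unique<double[]>(kSize);
public:
    void step() {
        double* local = big_array_.get();  // cache to avoid repeated `this` hops
        for (std::size_t i = 0; i < kSize; ++i)
            local[i] = 0.0;
        // ... actual work on local[...] ...
    }
};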

How can a global variable produce an error on destruction while a local variable does not?

I have a class that I find difficult to debug. The curious thing is that when I use it as a local variable it behaves well, but when I use it as a global variable it gives an error on the program's exit.
For example:
A a;

int main() {
    dosomething(a);
}
gives a bad_alloc exception on the program's exit.
While
int main() {
    A a;
    dosomething(a);
}
works well.
I am hoping that, by knowing in what circumstances this can happen, I can get to the bottom of the problem with the class. I have tried the class in many contexts and the symptom is always the same: the global variable always has the problem, while the local variable always works well.
Note 1: the class has a destructor that calls a (non-virtual) member function called flush that writes an internal buffer to disk and executes a shell command to process that file.
~A() { this->flush(); }
Note 2: I will try to post an MWE while I keep debugging (and simplifying the code). For the moment, knowing the typical cases where exactly this can happen would help with debugging.
C++ defines neither the construction nor the destruction order of global objects across translation units. The consequence of this is that if there are two (or more) global objects that somehow depend on each other (in your case, A depends on buffers and shell execution), the dependent object might already be destroyed when execution gets into the destructor that uses it.
You need to guarantee the correct destruction order manually. The easiest way is to allocate the object on the stack (as your working example shows) or on the heap and destroy it using delete before returning from main (or calling exit, etc.).
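A minimal sketch of the heap-based variant (the trivial A and dosomething here only stand in for the question's class):

#include <iostream>
#include <memory>

struct A {
    ~A() { std::cout << "flushing\n"; }  // stands in for this->flush()
};

void dosomething(A&) {}

int main() {
    auto a = std::make_unique<A>();  // constructed after main begins
    dosomething(*a);
    return 0;
}   // ~A() runs here, while any globals it depends on are still alive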

Returning a Static Local Reference

Suppose I have a function that will return a large data structure, with the intention that the caller will immediately copy the return value:
Large large()
{
    return Large();
}
Now suppose I do not want to rely on any kind of compiler optimizations such as return value optimization etc. Also suppose that I cannot rely on the C++11 move constructor. I would like to gather some opinions on the "correctness" of the following code:
const Large& large()
{
    static Large large;
    large = Large();
    return large;
}
It should work as intended, but is it poor style to return a reference to a static local even if it is const-qualified?
It all depends on what "should work as expected" means. In this case, all callers will share references to the exact same variable. Also note that if callers copy the result, you are effectively disabling RVO (Return Value Optimization), which works in all current compilers [*].
I would stay away from that approach as much as possible, it is not idiomatic and will probably cause confusion in many cases.
[*] The calling convention in all compilers I know of determines that a function returning a large variable (i.e., one that does not fit in a register) receives a hidden pointer to the location in which the caller has allocated space for the variable. That is, the optimization is forced by the calling convention.
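To see the shared-reference pitfall concretely, a short sketch (the Large definition is a stand-in):

#include <iostream>

struct Large { int data[1000]; };

const Large& large() {
    static Large instance;
    instance = Large();   // reset on every call
    return instance;
}

int main() {
    const Large& a = large();
    const Large& b = large();         // a and b alias the same object
    std::cout << (&a == &b) << '\n';  // prints 1
    Large c = large();                // copying immediately is the only
                                      // safe way to use the result
}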
I don't think there's any issue with doing this. So long as this code base is, and forever will be, single-threaded.
Do this on a multithreaded piece of code, and you might never be able to figure out why your data are occasionally being randomly corrupted.