I have read that applying DEALLOCATE to an allocated array frees the space it was using. I deal with several allocatable arrays in my program, but never bother deallocating them. Is there a method to determine if/how not deallocating impacts the execution time?
Thanks in advance
PS: I am not prone to do this test directly (by comparing the execution time with and without deallocation) because the program depends on random variables whose values will eventually affect the performance.
Indeed, deallocation frees the memory occupied by the variables, but you do not always need to do it manually.
If you know you won't need the content of the variable anymore AND you need to free up memory for other variables to be allocated (or for the system), you can use the deallocate statement.
However, deallocation occurs automatically when the variable goes out of scope (Fortran 95 or later, as pointed out by @francescalus) or when you reach the end of the program.
Also, deallocation occurs automatically, when necessary, before assignment, if the array's dimensions don't coincide or if the variable is polymorphic and has to assume a conformable dynamic type. (This behavior is Fortran 2003 or later, and may need to be turned on in some compilers.)
Moreover, when an allocated object is argument-associated with a dummy argument that has the attribute INTENT(OUT), deallocation occurs before entering the procedure.
Warning for pointer variables:
If you allocated storage for a pointer variable explicitly (with the allocate statement), and after that you perform a pointer association ( => ), deallocation DOES NOT occur automatically. You are responsible for deallocating the storage before reassociating the pointer, or else memory leaks will happen.
As a final note, trying to deallocate a variable that is not allocated is an error (the program terminates unless you supply a STAT= specifier). You can check if an allocatable variable is allocated with the intrinsic function allocated.
Can deallocating variables that are no longer needed affect execution speed? Yes. Is it likely to in "normal" programs? No, unless the deallocation is what prevents a memory leak.
There is no valuable heuristic of which I'm aware to help you determine usefulness of deallocating "for speed".
As previously mentioned, deallocating may be necessary for correctness or avoiding memory leaks.
However, if a program requires finalization of an allocatable variable for correctness then it will be necessary to have a deallocate statement for it: finalization does not occur when termination of execution comes about by a stop or end program statement.
Allocatable variables declared within a procedure (subroutine or function) without the save attribute (so-called unsaved local variables) are deallocated automatically when the procedure ends execution.
As a historical note, though, this wasn't true in Fortran 90. In Fortran 90 such variables were not deallocated and worse was that the allocation status of them became undefined (so that even the allocation status couldn't be queried). One really wanted a deallocate there. This deficiency was corrected in Fortran 95 but habits and code may live for a long time.
Related
An experienced C++ user told me that I should strive for using heap variables, i.e.:
A* obj = new A("A");
as opposed to:
A obj("A");
Aside from all that stuff about using pointers being nice and flexible, he said it's better to put things on the heap rather than the stack (something about the stack being smaller than the heap?). Is it true? If so why?
NB: I know about issues with lifetime. Let's assume I have managed the lifetime of these variables appropriately. (i.e. the only criteria of concern is heap vs. stack storage with no lifetime concern)
Depending on the context, either the heap or the stack can be the right choice. Every thread gets a stack, and the thread executes instructions by invoking functions. When a function is called, its local variables are pushed onto the stack, and when the function returns the stack is unwound and that memory is reclaimed. There is a size limit on each thread's stack; it varies and can be tuned to some extent. Given this, if every object is created on the stack and the objects require a lot of memory, the stack space will be exhausted, resulting in a stack overflow. Besides this, if the object is to be accessed by multiple threads, then storing it on the stack makes no sense.
Thus small variables, small objects whose size can be determined at compile time, and pointers should be stored on the stack. The concern with storing objects on the heap (free store) is that memory management becomes difficult. There is a chance of memory leaks, which is bad. Also, if the application tries to access an object that has already been deleted, an access violation can occur, which can crash the application.
C++11 introduced smart pointers (shared, unique) to make memory management with the heap easier. The referenced object is on the heap but is encapsulated by the smart pointer, which is typically on the stack. Hence, when the stack unwinds on function return or during an exception, the destructor of the smart pointer deletes the object on the heap. In the case of a shared pointer, a reference count is maintained and the object is deleted when the reference count reaches zero.
http://en.wikipedia.org/wiki/Smart_pointer
There are no general rules regarding use of stack allocated vs heap allocated variables. There are only guidelines, depending on what you are trying to do.
Here are some pros and cons:
Heap Allocation:
Pros:
more flexible - in case you have a lot of information that is not available at compile-time
bigger in size - you can allocate more - however, it's not infinite, so at some point your program might run out of memory if allocations/deallocations are not handled correctly
Cons:
slower - dynamic allocation is usually slower than stack allocation
may cause memory fragmentation - allocating and deallocating objects of different sizes will make the memory look like Swiss cheese :) causing some allocations to fail if there is no memory block of the required size available
harder to maintain - as you know, each dynamic allocation must be followed by a deallocation, which has to be done by the user - this is error prone, as there are a lot of cases where people forget to match every malloc() call with a free() call, or every new with a delete
Stack allocation:
Pros:
faster - which is important mostly on embedded systems (I believe that for embedded there is a MISRA rule which forbids dynamic allocation)
does not cause memory fragmentation
makes the behavior of applications more deterministic - e.g. removes the possibility to run out of memory at some point
less error prone - as the user does not need to handle deallocation
Cons:
less flexible - you have to have all information available at compile-time (data size, data structure, etc.)
smaller in size - however there are ways to calculate total stack size of an application, so running out of stack can be avoided
I think this captures a few of the pros and cons. I'm sure there are more.
In the end it depends on what your application needs.
The stack should be preferred to the heap, as stack allocated variables are automatic variables: their destruction is done automatically when the program goes out of their context.
In fact, the lifespan of object created on the stack and on the heap is different:
The local variables of a function or a code block {} (not allocated by new), are on the stack. They are automatically destroyed when you are returning from the function. (their destructors are called and their memory is freed).
But if you need an object to be used outside of the function, you will have to allocate it on the heap (using new) or return a copy.
Example:
void myFun()
{
A onStack; // On the stack
A* onHeap = new A(); // On the heap
// Do things...
} // End of the function: onStack is destroyed, but the object onHeap points to is still alive
In this example, the object pointed to by onHeap will still have its memory allocated when the function ends. If you don't keep a pointer to it somewhere, you won't be able to delete it and free the memory. That is a memory leak, as the memory will be lost until the program ends.
However, if you were to return a pointer to onStack, then since onStack is destroyed when exiting the function, using that pointer would cause undefined behaviour, while using the object that onHeap points to is still perfectly valid.
To better understand how stack variables are working, you should search information about the call stack such as this article on Wikipedia. It explains how the variables are stacked to be used in a function.
It is always better to avoid using new as much as possible in C++.
However, there are times when you cannot avoid it.
For ex:
Wanting variables to exist beyond their scopes.
So it should be horses for courses really, but if you have a choice always avoid heap allocated variables.
The answer is not as clear cut as some would make you believe.
In general, you should prefer automatic variables (on the stack) because it's just plain easier. However some situations call for dynamic allocations (on the heap):
unknown size at compile time
extensible (containers use heap allocation internally)
large objects
The latter is a bit tricky. In theory, automatic variables could be allocated without limit, but computers are finite and, worst of all, most of the time the size of the stack is limited too (which is an implementation issue).
Personally, I use the following guideline:
local objects are allocated automatically
local arrays are deferred to std::vector<T> which internally allocates them dynamically
it has served me well (which is just anecdotal evidence, obviously).
Note: you can (and probably should) tie the life of the dynamically allocated object to that of a stack variable using RAII: smart pointers or containers.
C++ has no mention of the Heap or the Stack. As far as the language is concerned they do not exist/are not separate things.
As for a practical answer - use what works best - do you need fast - do you need guarantees. Application A might be much better with everything on the Heap, App B might fragment OS memory so badly it kills the machine - there is no right answer :-(
Simply put, don't manage your own memory unless you need to. ;)
Stack = data whose size is fixed at compile time. (not dynamic)
Heap = Dynamic data allocated at run time. (Very dynamic)
Although pointers are on the Stack...Those pointers are beautiful because they open the doors for dynamic, spontaneous creation of data (depending on how you code your program).
(But I'm just a savage, so why does it matter what i say)
Can anyone tell me how many times variable integer1 allocated and deallocated?
how about class_object? Is it true that both of them allocate and deallocate three times?
for(int i = 0; i < 3; i++){
int integer1;
Class1 class_object(some_parameter);
}
For local variables allocation and deallocation is something compiler specific. Allocation/deallocation for local variables means reserving space on the stack.
Most compilers though will move the allocation and deallocation of the variables out of the loop and reuse the same space for the variable every time.
So there would be one allocation, meaning changing the stack pointer, before the loop and one deallocation, meaning restoring the stack pointer, after the loop. Many compilers will compute the maximum space needed for the function and allocate it all only once on function entry. Stack space can also be reused when the compiler sees that the lifetime of a variable has ended or that it simply can't be accessed anymore by later code. So talking about allocation and deallocation is rather pointless here.
Aren't you more interested in the number of constructions and destructions happening? In that case yes, the constructor for Class1 is called 3 times and the destructor too. But again, the compiler can optimize that as long as the result behaves "as if" the constructor/destructor were called.
PS: if the address of something is never taken (or can be optimized away) then the compiler might not even reserve stack space and just keep the variable in a register for the whole lifetime.
For automatic (local stack) variables the compiler reserves some space on the stack.
In this case (if we ignore optimizations) the compiler will reserve space for integer1 and class_object that most probably will be reused in each loop iteration.
For basic data types nothing is done beyond this but for classes the compiler will call the constructor when entering the scope of the variable and call the destructor when the variable goes out of scope.
Most probably both variables get the same address on each loop iteration (but this does not have to be true from the standard's point of view).
The term allocation usually refers to requesting some heap memory or other resource from the operating system. By this definition, nothing is allocated here.
But assigning some stack space (or a register) to an automatic variable may also be called allocation. Most compilers will allocate that memory once (by setting the stack frame to a value big enough on entering the routine).
Summary:
In the end it is totally up to the compiler. You are only guaranteed to get a valid object within its scope.
I have found a problem when using some existing FORTRAN code. Although it had anticipated the need to deallocate arrays before re-allocating, this had never been necessary. I now need it to do this, and it doesn't function correctly.
The current pseudo-code is approximately:
MODULE mA
TYPE A
REAL, DIMENSION(:,:,:), ALLOCATABLE :: array
END TYPE
TYPE (A), POINTER :: sw(:)
END MODULE
Later, there is the code which allocates the size of 'array', which I'm now calling twice (hitherto only once):
...
IF (ALLOCATED(sw(1)%array)) DEALLOCATE(sw(1)%array, STAT=aviFail)
IF (aviFail.EQ.0) ALLOCATE(sw(1)%array(1,2,3), STAT=aviFail)
...
I've looked at the definition of ALLOCATE, DEALLOCATE and ALLOCATED, and I have found the following:
On the second time through, DEALLOCATE is called, but the STAT value is '1'
In case of failure (i.e. a positive STAT return), DEALLOCATE is meant to leave the original array untouched. It doesn't: it apparently clears it correctly (at least, according to the debugger).
In case of failure and no STAT being defined, DEALLOCATE is meant to terminate the program. It doesn't, but the following ALLOCATE statement fails with STAT value of '1'.
I had also inadvertently called ALLOCATE on the same array twice elsewhere, without DEALLOCATING first. According to the book, this should result in program termination. It not only works, but works correctly and the STAT return from the second ALLOCATE is '0'.
Does Intel FORTRAN handle these things differently, or is FORTRAN not as fussy about fulfilling its specification as C++?
Without seeing more of the implementation, it is difficult to give a detailed & targeted explanation, but I think it's likely to be the implementation of the pointer that is causing your problem. The "book" answers you gave on the behavior of ALLOCATE and DEALLOCATE sound correct, but you described how they behave when working directly with an allocatable array. ALLOCATE and DEALLOCATE may function differently (compiler dependent) when operating on a pointer. At the most basic level, allocating memory through a pointer requires more steps: 1) determine the type/dimension of object to be created for the pointer, 2) create and allocate an unnamed object of that type/dimension in memory, 3) associate the pointer with the new object. Depending on the implementation, compiler, and other factors these extra steps can add complexity to the observed behavior of a program.
Is there a particular reason for using a pointer in this implementation? If not, I would recommend switching to a simpler normal allocatable array to see if the problem persists.
Regarding your being able to ALLOCATE an array twice by mistake without the expected program termination: I think this is also related to your implementation using a pointer. The pointer you are re-allocating is already associated with a location in memory. It is likely that this association changes the manner in which the compiler handles the ALLOCATE statement as it is executed the second time. If the pointer is already associated with a memory position with the dimensions the ALLOCATE statement is asking for, then there is no reason to terminate the program or throw an error; the programmer is getting exactly what he or she asked for.
In closing, the ALLOCATE/DEALLOCATE statements and pointer association/nullification are handled differently by different compilers, so it's not surprising that you're observing behavior not in accordance with "the book." I would recommend taking a look at whether you really need the pointer implementation, and be sure to apply memory management best practices as you code.
I wish to know what the following code does to memory:
program A
While (t < large number)
allocate(a)
...
end program
Does "allocate(a)" refer to the same memory location at each iteration, and is there a memory leak if deallocate(a) is not called before the end of the program?
The answer is that it is an error to allocate an already allocated item, so this code example is erroneous.
Compilers that I tried notice the error at runtime if the item is declared as allocatable. They didn't notice if the item was declared with the pointer attribute. In that case you have a memory leak since memory has been reserved on earlier iterations but there is no longer a way to reach it since the pointer has been reused.
To answer your other question, it is impossible to leak memory with allocatable objects. For example, allocatable arrays with a local scope are deallocated upon reaching return or end (unless they are saved), allocatable type components are automatically deallocated along with their parent, etc.
Not deallocating an object before the end of a program is not really a leak in the sense of unaddressable memory, since your program still had access to it during execution. This memory will be reported by Valgrind as "still reachable". You might consider it better style to deallocate such objects, but you don't need to.