Organization of a c++ program in memory - stack and heap [duplicate] - c++

This question already has answers here:
What and where are the stack and heap?
(31 answers)
Closed 8 years ago.
I am learning c++ and would like to know how a program like this is organized in primary-memory. I understand that there are a stack (with stackframes) and a heap. And I know that dynamically allocating something allocates it on the heap. This is done by operators like malloc or new. But I cant see them in this small c++ program.
The program consists of a main-class and a class named MyClass. This class has:
one constructor
one member variable (int)
one member-function
The main-class defines an object to Myclass and as well defines a pointer to this object.
SO - how are all this organized in memory?
#include <iostream>
using namespace std;
class MyClass {
int i;
public:
MyClass(int n) {
i = n;
}
int get_nmbr() {
return this->i;
}
};
int main() {
MyClass myClass(100), *p;
cout << myClass.get_nmbr() << endl;
p = &myClass;
cout << p;
return 0;
}

Let's go through this line by line.
int main() {
A new function starts with this line, and therewith a new scope.
MyClass myClass(100), *p;
Two things happen here. One, a variable myClass is declared within the function's scope, which makes it a local variable and thus it's allocated on the stack. The compiler will emit machine instructions that reserve enough space on the stack (usually by bumping the sp stack pointer register), and then a call to the class constructor executes. The this pointer passed to the constructor is the base of the stack allocation.
The second variable p is just a local pointer, and the compiler (depending on optimizations) may store this value on the local stack or in a register.
cout << myClass.get_nmbr() << endl;
Call the get_nmbr() method of the local myClass instance. Again, the this pointer points to the local stack frame allocation. This function finds the value of instance variable i and returns it to the caller. Note that, because the object is allocated on the stack frame, i lives on the stack frame as well.
p = &myClass;
Store the address of the myClass instance in variable p. This is a stack address.
cout << p;
return 0;
}
Print out the local variable p and return.
All of your code is only concerned with stack allocations. The result of that is that when the function's scope is left/closed at execution time (e.g. the function returns), the object will be automatically "destructed" and its memory freed. If there are pointers like p that you return from that function, you're looking at a dangling pointer, i.e. a pointer that points at an object that's freed and destructed. (The behavior of a memory access through such a dangling pointer is "undefined" as per language standard.)
If you want to allocate an object on the heap, and therefore expand its lifetime beyond the scope wherein it's declared, then you use the new operator in C++. Under the hood, new calls malloc and then calls an appropriate constructor.
You could extend your above example to something like the following:
{
MyClass stackObj(100); // Allocate an instance of MyClass on the function's stack frame
MyClass *heapObj = new MyClass(100); // Allocate an instance of MyClass from the process heap.
printf("stack = %p heap = %p\n", stackObj, heapObj);
// Scope closes, thus call the stackObj destructor, but no need to free stackObj memory as this is done automatically when the containing function returns.
delete heapObj; // Call heapObj destructor and free the heap allocation.
}
Note: You may want to take a look at placement new, and perhaps auto pointers and shared pointers in this context.

First things first. In C++ you should not use malloc.
In this program, all memory used is on the stack. Let's look at them one at a time:
MyClass myClass(100);
myClass is an automatic variable on the stack with a size equal to sizeof(MyClass);. This includes the member i.
MyClass *p;
p is an automatic variable on the stack which points to an instance of MyClass. Since it's not initialized it can point anywhere and should not be used. Its size is equal to sizeof(MyClass*); and it's probably (but not necessarily) placed on the stack immediately after myClass.
cout << myClass.get_nmbr() << endl;
MyClass::get_nmbr() returns a copy of the member i of the myClass instance. This copy is probably optimized away and therefore won't occupy any additional memory. If it did, it would be on the stack. The behavior of ostream& operator<< (int val); is implementation specific.
p = &myClass;
Here p is pointing to a valid MyClass instance, but since the instance already existed, nothing changes in the memory layout.
cout << p;
Outputs the address of myClass. Once again the behavior of operator<< is implementation specific.
This program (kind of) visualizes the layout of the automatic variables.

You have myClass object and pointer p (declared as MyClass *) created on stack. So. within myClass object the data members i also gets created on stack as part of myClass object. The pointer p is stored on stack, but you are not allocating for p, rather you are having p assigned to address of myClass object which is already stored on stack. So, no heap allocation here, at least from the program segment that you posted.

myClass is allocated on the stack. This program doesn't use heap exept maybe for allocating a stdout buffer behind the scenes. When myClass goes out of scope (e.g. when main returns or throws an exception), myClass is destructed and the stack is released, making p invalid.
In modern C++ it is considered safer to use smart pointers, such as the shared_ptr, instead of raw pointers.
To allocate myClass on the heap you could write:
#include <memory>
int main() {
std::shared_ptr<MyClass> p (new MyClass (100)); // Two heap allocations: for reference counter and for MyClass.
auto p2 = std::make_shared<MyClass> (101); // One heap allocation: reference counter and MyClass stored together.
return 0;
}

Related

C++ structs and arrays on stack and heap, values, references, and dereferencing questions

I'm a bit rusty at this. I'm trying to figure out exactly what happens with object declaration, values and references, stack and heap. I know the basics, but am unsure about the following examples:
struct MyObject
{
int foo;
MyObject(int val)
{
foo = val;
}
MyObject()//[EDIT] default constructor
{
}
}
//...
//1)
//this reserves memory on the stack for MyObject0 and initializes foo:
MyObject a = MyObject(1); //[EDIT] fixed from comments
//2)
MyObject* b = new MyObject(2);//This I know reserves memory on the heap,
//returns a pointer to it (and calls constructor).
//There are also other ways like malloc or memset.
//3)
MyObject c = *(new MyObject(3));// So here we instantiate MyObject on the heap,
//return a pointer to it but the pointer is dereferenced outside the parenthesis.
//Is c now a newly created value type copy on the stack?
//And is the original new MyObject(3) now inaccessible in the heap memory (leaked)
//because the pointer to it was never stored?
//4)
// void myFunc( MyObject c){}
myFunc(c); //Am I doing a member-wise assignment from c to a new function-scope
//temporary MyObject reserved somewhere else in memory (in a stack frame?)?
//Or am I somehow passing a reference to c even though c is not a pointer or reference?
//[EDIT] myFunc(c); would pass as a reference if I had void myFunc( MyObject &c){}
Finally, if I have a MyObject[] myObjArr1 = new MyObject[10]() I have 10 uninitialized MyObject structs on the heap, and a pointer of type MyObject Array on the heap as well.
How do I declare a MyObject array, on the stack, where the MyObject structs within the array are also on the stack? Or is that not a thing you do?
Let me re-comment on a more precise way:
1) This creates an a variable on the stack initializing it with the value of the temporary MyObject(1), that is created on the stack as a MyObject calling MyObject::MyObject(1) as constructor. The compiler may eliminate the intermediate steps. You can do it simply as MyObject a(1); Note: the = is not an assignment
2) Creates a MyObject on heap by calling MyObject::MyObject(2) and returns a pointer to it, used initialize the b pointer.
Forget about malloc and memset: they allocate memory, but DON'T
CONSTRUCT objects. The sooner you will forgot them, the better for you
and anybody else.
3) a Myobject is created on the heap and constructed by calling Myobkect::Myobject(3), then The new returned pointer is dereferenced and the resulting value used to initialize c by copy. The on-heap created object is "forgotten" in there. (Memory leak: you have no chance to access it further)
4) The object c is copied in the local parameter c of myFunc (in fact the two c are two distinct objects) that will exist into the stack up to the function exit.

object created in function, is it saved on stack or on heap?

I am using c++ specifically:
when I create an object in a function, will this object be saved on the stack or on the heap?
reason I am asking is since I need to save a pointer to an object, and the only place the object can be created is within functions, so if I have a pointer to that object and the method finishes, the pointer might be pointing to garbage after.
--> if I add a pointer to the object to a list (which is a member of the class) and then the method finishes I might have the element in the list pointing to garbage.
so again - when the object is created in a method, is it saved on the stack (where it will be irrelevant after the function ends) or is it saved on the heap (therefore I can point to it without causing any issues..)?
example:
class blah{
private:
list<*blee> b;
public:
void addBlee() {
blee b;
blee* bp = &b;
list.push_front(bp);
}
}
you can ignore syntax issues -- the above is just to understand the concept and dilemma...
Thanks all!
Keep in mind following thing: the object is NEVER created in the heap (more correctly called 'dynamic storage' in C++) unless explicitly allocated on the heap using new operator or malloc variant.
Everything else is either stack/register (which in C++ should be called 'automatic storage') or a a statically allocated variable. An example of statically allocated variables are global variables in your program, variables local to the function which are declared static or static data members of classess.
You also need to very clear disntguish between a pointer and the object itself. In the following single line:
void foo() {
int* i = new int(42);
}
int is allocated dynamically (on the heap), while pointer to that allocated int has an automatic storage (stack or register). As a result, once foo() exits, the pointer is obliterated, but the dynamically allocated object remains without any means to access it. This is called classic memory leak.
Heap is the segment where dynamic memory allocation usually takes place so when ever you explicitly allocate memory to anything in a program you have given it memory from the heap segment.
Obj yo = new Obj; //something like this.
Stack is the another segment where automatic variables are stored, along with information that is saved each time a function is called.
Like when we declare:
*int* variable;
It will be on the stack and its validity is only till a particular function exits, whereas variables, objects etc on the heap remain there till main() finishes.
void addBlee() {
blee b; // this is created on the stack
blee* bp = &b;
list.push_front(bp); // this is a pointer to that object
} // after this brace b no longer exists, hence you have a dangling pointer
Do this instead:
list.push_front(new blee); // this is created on the heap and
// therefore it's lifetime is until you call
// delete on it`

Where are pointers in C++ stored, on the stack or in the heap?

I am trying to understand the difference between the stack and heap memory, and this question on SO as well as this explanation did a pretty good job explaining the basics.
In the second explanation however, I came across an example to which I have a specific question, the example is this:
It is explained that the object m is allocated on the heap, I am just wondering if this is the full story. According to my understanding, the object itself indeed is allocated on the heap as the new keyword has been used for its instantiation.
However, isn't it that the pointer to object m is on the same time allocated on the stack? Otherwise, how would the object itself, which of course is sitting in the heap be accessed. I feel like for the sake of completeness, this should have been mentioned in this tutorial, leaving it out causes a bit of confusion to me, so I hope someone can clear this up and tell me that I am right with my understanding that this example should have basically two statements that would have to say:
1. a pointer to object m has been allocated on the stack
2. the object m itself (so the data that it carries, as well as access to its methods) has been allocated on the heap
Your understanding may be correct, but the statements are wrong:
A pointer to object m has been allocated on the stack.
m is the pointer. It is on the stack. Perhaps you meant pointer to a Member object.
The object m itself (the data that it carries, as well as access to its methods) has been allocated on the heap.
Correct would be to say the object pointed by m is created on the heap
In general, any function/method local object and function parameters are created on the stack. Since m is a function local object, it is on the stack, but the object pointed to by m is on the heap.
"stack" and "heap" are general programming jargon. In particular , no storage is required to be managed internally via a stack or a heap data structure.
C++ has the following storage classes
static
automatic
dynamic
thread
Roughly, dynamic corresponds to "heap", and automatic corresponds to "stack".
Moving onto your question: a pointer can be created in any of these four storage classes; and objects being pointed to can also be in any of these storage classes. Some examples:
void func()
{
int *p = new int; // automatic pointer to dynamic object
int q; // automatic object
int *r = &q; // automatic pointer to automatic object
static int *s = p; // static pointer to dynamic object
static int *s = r; // static pointer to automatic object (bad idea)
thread_local int **t = &s; // thread pointer to static object
}
Named variables declared with no specifier are automatic if within a function, or static otherwise.
When you declare a variable in a function, it always goes on the stack. So your variable Member* m is created on the stack. Note that by itself, m is just a pointer; it doesn't point to anything. You can use it to point to an object on either the stack or heap, or to nothing at all.
Declaring a variable in a class or struct is different -- those go where ever the class or struct is instantiated.
To create something on the heap, you use new or std::malloc (or their variants). In your example, you create an object on the heap using new and assign its address to m. Objects on the heap need to be released to avoid memory leaks. If allocated using new, you need to use delete; if allocated using std::malloc, you need to use std::free. The better approach is usually to use a "smart pointer", which is an object that holds a pointer and has a destructor that releases it.
Yes, the pointer is allocated on the stack but the object that pointer points to is allocated on the heap. You're correct.
However, isn't it that the pointer to object m is on the same time
allocated on the stack?
I suppose you meant the Member object. The pointer is allocated on the stack and will last there for the entire duration of the function (or its scope). After that, the code might still work:
#include <iostream>
using namespace std;
struct Object {
int somedata;
};
Object** globalPtrToPtr; // This is into another area called
// "data segment", could be heap or stack
void function() {
Object* pointerOnTheStack = new Object;
globalPtrToPtr = &pointerOnTheStack;
cout << "*globalPtrToPtr = " << *globalPtrToPtr << endl;
} // pointerOnTheStack is NO LONGER valid after the function exits
int main() {
// This can give an access violation,
// a different value after the pointer destruction
// or even the same value as before, randomly - Undefined Behavior
cout << "*globalPtrToPtr = " << *globalPtrToPtr << endl;
return 0;
}
http://ideone.com/BwUVgm
The above code stores the address of a pointer residing on the stack (and leaks memory too because it doesn't free Object's allocated memory with delete).
Since after exiting the function the pointer is "destroyed" (i.e. its memory can be used for whatever pleases the program), you can no longer safely access it.
The above program can either: run properly, crash or give you a different result. Accessing freed or deallocated memory is called undefined behavior.

the scope of a pointer?

Ok, so I did find some questions that were almost similar but they actually confused me even more about pointers.
C++ Pointer Objects vs. Non Pointer Objects
In the link above, they say that if you declare a pointer it is actually saved on the heap and not on the stack, regardless of where it was declared at. Is this true ?? Or am I misunderstanding ??? I thought that regardless of a pointer or non pointer, if its a global variable, it lives as long as the application. If its a local variable or declared within a loop or function, its life is only as long as the code within it.
The variable itself is stored on the stack or DATA segment, but the memory it points to after being allocated with new is within the heap.
void main()
{
int* p; // p is stored on stack
p = new int[20]; // 20 ints are stored on heap
}
// p no longer exists, but the 20 ints DO EXSIST!
Hope that helps.
void func()
{
int x = 1;
int *p = &x;
}
// p goes out of scope, so does the memory it points to
void func()
{
int *p = new int;
}
// p goes out of scope, the memory it points to DOES NOT
void func()
{
int x = 1;
int **pp = new int*;
*pp = &x;
}
// pp goes out of scope, the pointer it points to does not, the memory it points to does
And so forth. A pointer is a variable that contains a memory location. Like all variables, it can be on the heap or the stack, depending on how it's declared. It's value -- the memory location -- can also exist on the heap or the stack.
Typically, if you statically allocate something, it's on the stack. If you dynamically allocate something (using either new or malloc) then it's on the heap. Generally speaking you can only access dynamically allocated memory using a pointer. This is probably where the confusion arises.
It is necessary to distinguish between the pointer (a variable that holds a memory location) and the object to which the pointer points (the object at the memory address held by the pointer). A pointer can point to objects on the stack or on the heap. If you use new to allocate the object, it will be on the heap. The pointer can, likewise, live on the heap. If you declare it in the body of a function, then it will be a local variable and live in local storage (i.e. on the stack), whereas if it is a global variable, it will live somewhere in your application's data section. You can also have pointers to pointers, and similarly one can allocate a pointer on the heap (and have a pointer-to-a-pointer pointing to that), etc. Note that while I have referenced the heap and stack, the C++ only mentions local/automatic storage and dynamic storage... it does not speak to the implementation. In practice, though, local=stack and dynamic=heap.
A pointer is a variable containing the address of some other object in memory. The pointer variable can be allocated:
on the stack (as a local auto variable in a function or statement block)
statically (as a global variable or static class member)
on the heap (as a new object or as a class object member)
The object that the pointer points to (references) can likewise be allocated in these three places as well. Generally speaking, though, a pointed-to object is allocated using the new operator.
Local variables go out of scope when the program flow leaves the block (or function) that they are declared within, i.e., their presence on the stack disappears. Similarly, member variables of an object disappear when their parent object goes out of scope or is deleted from the heap.
If a pointer goes out of scope or its parent object is deleted, the object that the pointer references still exists in memory. Thus the rule of thumb that any code that allocates (news) an object owns the object and should also delete that object when it's no longer needed.
Auto-pointers take some of the drudgery out of the management of the pointed-to object. An object that has been allocated through an auto_ptr is deleted when that pointer goes out of scope. The object can be assigned from its owning auto_ptr to another auto_ptr, which transfers object ownership to the second pointer.
References are essentially pointers in disguise, but that's a topic for another discussion.
I thought that regardless of a pointer
or non pointer, if its a global
variable, it lives as long as the
application. If its a local variable
or declared within a loop or function,
its life is only as long as the code
within it.
That's true.
they say that if you declare a pointer
it is actually saved on the heap and
not on the stack
That's wrong, partially. You can have a pointer on the heap or the stack. It's a matter of where and how you declare it.
void main()
{
char c = 0x25;
char *p_stack = &c; // pointer on stack
StructWithPointer struct_stack; // stack
StructWithPointer *struct_heap = new StructWithPointer(); // heap, thus its pointer member "p" (see next line) is also on the heap.
struct_heap->p = &c; // pointer on heap points to a stack
}
... and, a compiler might decide to use a register for a pointer!
Looks like you need to grab the classic K&R C book and read through chapters 4 & 5 for thorough understanding of the differences between declaration and definition, scope of a variable and about pointers.

memory questions, new and free etc. (C++)

I have a few questions regarding memory handling in C++.
What's the different with Mystruct *s = new Mystruct and Mystruct s? What happens in the memory?
Looking at this code:
struct MyStruct{
int i;
float f;
};
MyStruct *create(){
MyStruct tmp;
tmp.i = 1337;
tmp.j = .5f;
return &tmp;
}
int main(){
MyStruct *s = create();
cout << s->i;
return 0;
}
When is MyStruct tmp free'd?
Why doesn't MyStruct tmp get automatically free'd in the end of create()?
Thank you!
When you use the new keyword to get a pointer, your struct is allocated on the heap which ensures it will persist for the lifetime of your application (or until it's deleted).
When you don't, the struct is allocated on the stack and will be destroyed when the scope it was allocated in terminates.
My understanding of your example (please don't hesitate to inform me if I'm wrong, anyone):
tmp will indeed be "freed" (not the best word choice for a stack variable) at the end of the function since it was allocated on the stack and that stack frame has been lost. The pointer/memory address you return does not have any meaning anymore, and if the code works, you basically just got lucky (nothing has overwritten the old data yet).
For question 1, you are looking at the heap memory and the stack memory. In short,
Mystruct S;
creates S on the stack. When S goes out of scope, it will be destroyed. Hence if S is inside a function, when the function returns, S is destroyed.
Whereas
MyStruct *S = new MyStruct();
Is on the heap. It is a block of memory set aside for programs to store variables in, and S will store a pointer to the start memory block of the new MyStruct. It will always be within the heap until you free it; if you do not free it when your program ends, you get the nefarious memory leak.
On to question 2 - the local MyStruct is destroyed upon the function exiting; the MyStruct pointer which points to its return value is pointing to undefined region. It may still work, because the OS has not yet reclaimed the memory, but it is definitely not correct behavior - or a safe thing to do.
First:
Mystruct* s = new Mystruct;
The new Mystryct part allocates memory on the heap for an object of that type. In C++ it will also execute the default constructor of the type. The Mystruct* s part declares a pointer variable that points to the address of the first byte of the newly allocated object memory.
Second:
Mystruct s;
It'll do the same as the first with two differences, which can be simplified as: The allocated memory for the object is on the stack and there's no pointer variable pointing to the memory, instead s is that object. The address of to that object is &s, so a pointer that points at the object s shall be assigned the value &s.
Why doesn't MyStruct tmp get automatically free'd in the end of create()?
It does. The tmp destructor runs after the return statement, so the address returned by the function will be to a memory that will soon be overwritten by something else, which at best will cause a segmentation fault (or the equivalent) and at worst corrupt your data.
Both of your questions deal with storage duration and scope.
First, when you dynamically allocate an object, the object and a pointer to it is valid until you free it. If it is an automatic variable (i.e., not dynamically allocated by new, malloc, etc., and not declared static), the variable goes out of scope as soon as the object's scope ends (usually that's the } at the same "level" as the one in which the object was defined). It also has "automatic storage duration", which means that the storage for it also goes away when the object is not in scope.
For your second question, tmp has a scope that ends with the ending } of create. It also has the same storage duration. A pointer to tmp is only valid within that storage duration. Once create() exits, the pointer to tmp becomes invalid, and cannot be used.
Mystruct *s = new Mystruct;
Dynamically allocates s on the heap. It will not be automatically freed. (Also, s is a pointer, not directly a Mystruct).
Mystruct s;
Statically allocates s on the stack. It will be "freed" when it goes out of scope.*
Your code is invalid. When you refer to tmp outside of create, you are using a wild pointer to access dead memory. This results in undefined behavior.
Pretty much. Using it out of scope is undefined behavior, even if the value is still in memory.
Q1:
Mystruct *s = new Mystryct;
Creates the structure variable on the heap, which is being pointed by variable s.
Mystruct s;
Here the structure is created on the stack.
Q2:
MyStruct *create(){ MyStruct tmp; tmp.i = 1337; return &tmp; }
is wrong!! You are creating a local struct variable on stack which vanishes when the function return and any reference to it will be invalid. You should dynamically allocate the variable and will have to manually deallocate it later say in main.
In Mystruct *s = new Mystruct there is static variable, pointer, that is allocated in the start of the function call on the stack. When this line is running, it's allocating on the heap the struct. In Mystruct s it allocates the struct on the stack, "staticly" when the function runs (the constructor, if any, will run in the decleration line).
"Why doesn't MyStruct tmp get automatically free'd in the end of create()?" - well, it is. But, the memory still exist, so you can access it and it may contain the old values.
Mystruct *s = new Mystryct allocates on heap.
One needs to explicitly delete it.
Whereas Mystruct s allocates the structure on stack.
It automatically gets freed as the stack unwinds (Variable goes out of scope)
so for Case 2, the tmp will be freed as soon as create() exits. So what you'll have returned is a dangling pointer which is very dangerous.
If it's too difficult, follow this rule of thumb,
For every new operator called, delete must be called in the end.
For every new[] operator called, delete[] must be called in the end.
use smart pointers which automatically delete the memory allocated instead of normal pointers.
If you're returning objects from a function like in create, make sure you're allocating it using the new operator and not on stack like you've done in the example. Make sure that the caller calls delete on the pointer once he finishes with it.
struct MyStruct{ int i; };
MyStruct create(){ MyStruct tmp; tmp.i = 1337; return tmp; }
int main(){
MyStruct s = create();
cout << s.i;
return 0;
}
or
struct MyStruct{ int i; };
MyStruct* create(){ MyStruct* tmp = new MyStruct; tmp->i = 1337; return tmp; }
int main(){
MyStruct* s = create();
cout << s->i;
delete s;
return 0;
}
would work. Because the copy constructor would create a copy of the struct when it is assigned to s in the first case.
The whole new/delete stuff (dynamic memory allocation) belongs to the basics of C++. You do not have to use any new or delete to implement basic algorithms. Copy constructor and so on will always do the job and make the code easier to understand.
If you want to use new you should also read about autopointer and so on. There is no garbage collection of memory in C++.
I think Thinking C++ explains the concepts of dynamic memory very well(chapter 13).