I am new to C and C++. I understand that whenever a function is called, its variables get memory allocated on the stack, that includes the case where the variable happens to be a pointer that points to data allocated on the heap via malloc or new (but I heard it is not guaranteed that the storage allocated by malloc is 100% on the Heap, please correct me if I am wrong). For example,
void fn() {
    Member *p = new Member();
}
Or
void fn() {
    int *p = (int*) malloc( sizeof(int) * 10 );
}
Please correct if I am wrong, in both cases, variable p (which holds the address to the object allocated on the heap) is on the stack, and it points to the object on the heap.
So is it correct to say that all the variables we declare are on the stack even though they might point to something on the heap?
Let’s say the address of local variable pointer p is loaded at memory address 001, it has the address of the member object located on Heap, and that address is 002. We can draw a diagram like this.
If that is correct, my next question is: can we have a pointer that is actually located on the heap and that points to a variable located on the stack? If that is not possible, can that pointer point to a variable located on the heap?
Maybe another way to phrase this question is: in order to access something in heap, we can only access it via pointers on the stack??
A possible diagram could look like this
If that is possible, Can I have an example here?
Yes, you can put your pointer on the free store (heap) and have it point to a variable on the stack. The trick is to create a pointer to a pointer (int**):
int main()
{
int i = 0; // int on the stack
int** ip = new int*; // create an int* (int pointer) on the free store (heap)
// ip (the int**) is still on the stack
*ip = &i;
// Now your free store (heap) located pointer points
// to your stack based variable i
delete ip; // clean up
}
NOTE: The terms "heap" and "stack" are general, well understood, computing terms. In C++ they are referred to in the Standard as the "free store" and (although not directly named) a "stack" is 100% implied (eg. through references to "stack-unwinding") and therefore required.
Stack and heap are not specifically defined by the standard; they are implementation details.
Heap refers to a data structure that many operating systems use to help them safely manage the allocated space for different programs running at the same time.
Here is a diagram for a simple heap so that you can have a mental model of it:
Keep in mind that this is not exactly what operating systems use. In fact, operating systems use a far more advanced form of the heap data structure that allows them to perform many sorts of complex memory-related tasks. Also, not every OS implements the free store using the heap data structure. Some may use different techniques.
Whereas a stack is much simpler:
can we have a pointer that is actually located on the heap, and it points to a variable located on Stack?
Yes, it's possible but rarely needed:
#include <iostream>
int main( )
{
int a_variable_on_stack { 5 };
int** ptr_on_stack { new int*( &a_variable_on_stack ) };
std::cout << "address of `a_variable_on_stack`: " << &a_variable_on_stack << '\n'
<< "address of ptr on the heap: " << ptr_on_stack << '\n'
<< "value of ptr on the heap: " << *ptr_on_stack << '\n';
std::cin.get( );
}
Possible output:
address of `a_variable_on_stack`: 0x47eb5ffd2c
address of ptr on the heap: 0x1de33cc3810
value of ptr on the heap: 0x47eb5ffd2c
Notice how the address of a_variable_on_stack and value of ptr stored on heap are both 0x47eb5ffd2c. In other words, a pointer on the heap is holding the address of a variable that is on the stack.
In short:
Variables declared within a function are allocated on the stack, and can point to whatever you want (to the addresses of other variables on the stack or of variables on the heap).
The same goes for variables allocated on the heap. They can point to the addresses of other variables on the heap or of variables on the stack. There is no limitation here.
However, variables declared on the stack are by nature temporary, and when the function returns their memory is reclaimed. Therefore it is not good practice to keep pointers to stack variables unless you know the function has not finished yet (i.e. the address is used within the same function, or by functions called from it). A common mistake of novice C/C++ developers is to return the address of a variable declared on the stack. When the function returns, this memory is reclaimed and will soon be reused by other function calls, so accessing this address has undefined behavior.
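As a rough sketch of the safe alternatives to returning the address of a local variable (the function names here are made up for illustration): either return the value itself, or put the object on the free store and hand ownership to the caller.

```cpp
#include <cassert>
#include <memory>

// Safe: the value is copied out to the caller before the local variable dies.
int make_value() {
    int local = 7;
    return local;
}

// Safe: the int lives on the free store, and the unique_ptr that owns it is
// moved out to the caller, who now controls its lifetime.
std::unique_ptr<int> make_owned() {
    return std::unique_ptr<int>(new int(7));
}

// The unsafe pattern is shown only as a comment so the sketch stays well defined:
//   int* broken() { int local = 7; return &local; } // dangling after return

int use_both() {
    std::unique_ptr<int> owned = make_owned();
    assert(owned != nullptr);        // the heap object outlived make_owned()
    return make_value() + *owned;    // both values are still valid here
}
```

Calling use_both() never touches memory that belonged to a finished function call, which is the whole point.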
I am new to C and C++.
Your question is not C or C++ specific, but it is about programming languages in general.
... whenever a function is called, its variables get memory allocated on the stack ...
This is correct: Nearly all compilers do it this way.
However, there are exceptions - for example on SPARC or TriCore CPUs, which have a special feature...
... allocated on the heap via malloc ...
malloc never allocates memory on the stack but on the heap.
... is not guaranteed that the storage allocated by malloc is 100% on the heap ...
Unlike the word "stack", the meaning of the word "heap" differs a bit from situation to situation.
In some cases, the word "heap" is used to specify a certain memory area that is used by malloc and new.
If there is not enough memory in that memory area, malloc (or new) asks the operating system for memory in a different memory area.
However, other people would also call that memory area "heap".
... in both cases, variable p is on the stack, and it points to the object on the heap.
This is correct.
... can we have a pointer that is actually located on the heap, and it points to a variable located on Stack?
Sure:
int ** allocatedMemory;
void myFunction()
{
int variableOnStack;
allocatedMemory = (int **)malloc(sizeof(int *));
*allocatedMemory = &variableOnStack;
...
}
The variable allocatedMemory points to some data on the heap and that data is a pointer to a variable (variableOnStack) on the stack.
However, when the function myFunction() returns, the variable variableOnStack no longer exists. Let's say the function otherFunction() is called after myFunction():
void otherFunction()
{
int a;
int b;
...
}
Now we don't know if *allocatedMemory points to a, to b or even the "return address" because we don't know which of the two variables is stored at the same address as variableOnStack.
Bad things may happen if we write to **allocatedMemory now...
In order to access something in heap, we can only access it via pointers on the stack??
... diagram "B" ...
To access some data on the heap, you definitely need some pointer that is not stored on the heap.
This pointer can be:
A global or static variable
In my example above, allocatedMemory is a global variable.
Global and static variables are stored in a completely different memory area (neither heap nor stack)
A local variable on the stack
A local variable in a CPU register
(I already wrote that local variables are not always stored on the stack)
Theoretically, the situation in diagram "B" is possible: Simply overwrite the variable allocatedMemory by NULL (or another pointer).
However, a program cannot directly access data on the heap.
This means that *p (which is some data on the heap) cannot be accessed any more if there is no longer a pointer "outside" the heap that points to it.
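For the other half of the question (a pointer on the heap pointing at something else on the heap), a linked chain is probably the most common case. This is only a sketch with a made-up Node type: the local variable head is the one pointer "outside" the heap, and the second node is reachable only through a pointer stored inside the first.

```cpp
#include <cassert>

// Hypothetical node type: the `next` pointer is part of the Node,
// so it lives wherever the Node lives.
struct Node {
    int value;
    Node *next;
};

int sum_chain() {
    Node *head = new Node{1, nullptr};  // head is a local variable; the Node is on the heap
    head->next = new Node{2, nullptr};  // a heap-resident pointer now points at another heap object
    assert(head->next != head);

    int sum = 0;
    for (Node *n = head; n != nullptr; n = n->next) {
        sum += n->value;                // the second node is reached only through the first
    }

    // Free in order: save `next` before deleting the current node.
    while (head != nullptr) {
        Node *next = head->next;
        delete head;
        head = next;
    }
    return sum;
}
```

If head were overwritten before the cleanup loop, both nodes would become unreachable and leak.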
I have been searching for about an hour now, but haven't found a clear answer. If I use an object pointer, are the local variables allocated on the stack?
For example:
class SomeClass {
public:
int a;
int some_method() {
int local_variable = 5;
return local_variable + a;
}
};
SomeClass *obj_ptr = new SomeClass();
obj_ptr->a = 5; // I'm aware that this variable is allocated on the heap.
// Is local_variable allocated on the stack? Is the return value on the stack?
obj_ptr->some_method();
As a mental model, it is not wrong to imagine that local_variable is allocated on some kind of stack, since it has automatic storage and its scope ends with the scope of the function.
A member function ("method") is no different in that regard from any other function. The one implementation difference is that member functions receive the pointer to the current object as a hidden argument accessible inside the function as this.
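A small sketch of that hidden argument, assuming a trimmed-down version of the class from the question plus a hypothetical self() helper that simply returns this. The method behaves the same whether the object it is called on has automatic or dynamic storage.

```cpp
#include <cassert>

struct SomeClass {
    int a;
    // Hypothetical helper: returns the hidden `this` argument for comparison.
    const SomeClass *self() const { return this; }
    int some_method() const {
        int local_variable = 5;  // automatic storage, regardless of where *this lives
        return local_variable + a;
    }
};

bool this_matches_object() {
    SomeClass on_stack{5};                  // object with automatic storage
    SomeClass *on_heap = new SomeClass{5};  // object with dynamic storage
    assert(on_heap != nullptr);
    bool ok = on_stack.self() == &on_stack
           && on_heap->self() == on_heap
           && on_stack.some_method() == 10
           && on_heap->some_method() == 10;
    delete on_heap;
    return ok;
}
```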
If the question is literally where the variables are allocated during runtime, that will depend on your compiler, platform, and optimization level. The optimizer can transform the code in surprising ways, including eliminating much of it. Many automatic variables will be allocated in registers, and some will be optimized away completely - for example, in your case the compiler could emit a CPU instruction that increments the integer contents of a known memory address by 5. The constant 5 would only be present in the disassembly (the "text" segment), and not on the stack or the heap.
The implementation will, we hope, put them wherever is most efficient on that particular platform. For the examples you cited, that will likely be in registers.
There is no difference between object pointers and other local variables. In a non-optimizing compiler, usually all local variables will be on the stack. Although the object pointer itself is on the stack, it will point to an arbitrary memory location, most commonly on the heap.
In an optimizing compiler, local variables will often be placed in registers, but still, the pointer can point to any memory location.
dynamic memory allocation by pointers
What is the link between pointers and dynamic memory allocation? Why do we use pointers for dynamic memory allocation? Whenever we use the new operator we use only pointer variables; why?
Can anyone explain with an example?
Judging by your question, what you need to start with is not a programming example but a real-life one.
Imagine you live in your ordinary flat. It has its own address, and on the door there is a big sign, "Robin Mandela". That is like static memory allocation: at the start of your program you have some room in memory and a name associated with it. On every vacation you fly to another country, where you rent a room in a hotel. You can stay in one room one year and in another room the next, and even change rooms during your vacation. You may not even care exactly which room you will stay in, but you need to know that you will definitely have one.
When you ask for dynamic memory allocation, you get some portion of memory. This memory can be allocated almost anywhere, like a room in a hotel, and of course you need a key with a number on it to know where to find it. A pointer is like that number on the key: it grants you access to your allocated data.
Also, one year you may decide not to go on vacation and not rent a room at all. You can't do this with your flat: you have it whether you live there or not. That is like static memory in a program.
The size and location (memory address) of objects of automatic and static storage duration is known at compile time. The objects can be accessed through variables and the compiler handles the memory implicitly.
Dynamic memory, on the other hand, is allocated at run time, as the name implies. Because it happens at run time, the compiler cannot know where the memory is allocated and therefore cannot handle the memory for you. You must access the memory using its address.
How to get the address of dynamically allocated memory?
It is returned by the new operator which allocates the memory and constructs the object. Since using dynamic memory just in one statement is not useful, you must store the address in a variable.
What type of variable can store a memory address of an object?
A pointer type. The type of the return value of a new-expression is a pointer to the type of object you constructed.
An example:
void foo() {
Foo f; // The object of type Foo has automatic storage and can be accessed through the variable `f`.
Foo* f_ptr; // The pointer object has automatic storage and is not yet initialized.
new Foo; // Bad! The dynamically allocated memory can not be accessed in any way (it has leaked).
f_ptr = new Foo; // The address of another dynamically allocated object is assigned to the f_ptr variable. The object is accessible through this pointer.
delete f_ptr; // Dynamically allocated objects must be deleted explicitly.
// f and f_ptr are destroyed automatically when the scope ends. If the memory pointed by f_ptr was not deleted before this, the memory would leak.
}
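The statement that a new-expression yields a pointer to the constructed type can be checked at compile time. A minimal sketch, assuming a trivial Foo with one member (the real Foo is not shown in the answer):

```cpp
#include <cassert>
#include <type_traits>

struct Foo { int x = 3; };  // assumed definition, for illustration only

// decltype does not evaluate its operand, so nothing is allocated here.
static_assert(std::is_same<decltype(new Foo), Foo*>::value,
              "new Foo has type Foo*");
static_assert(std::is_same<decltype(new int[4]), int*>::value,
              "new int[4] yields a pointer to the first element");

int read_through_pointer() {
    Foo *f_ptr = new Foo;   // the pointer is the only handle to the dynamic object
    assert(f_ptr != nullptr);
    int x = f_ptr->x;
    delete f_ptr;
    return x;
}
```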
The question itself is a bit nonsensical. The keyword new is for allocating memory on the heap and returns either a pointer to the allocated object or throws a std::bad_alloc if allocation fails. It's in a sense like asking why int main(int argc, char** argv) returns an int.
In C++ you have two address spaces to work with; the stack and the heap.
Generally you want to use the stack for allocating your objects and pass them to functions either by reference, the preferred way, or by pointer. In some cases you can't use the stack, mainly when you either don't know how long the object you are creating is going to be alive or when you know that the object should live longer than the scope that creates it. For those cases you should be using std::unique_ptr or std::shared_ptr if you are using C++11, since the object will be deleted automatically when it's no longer needed.
Use of the keyword new is generally discouraged and should only be used when you're certain that neither of the above works.
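To make the two smart pointers a bit more concrete, here is a short sketch using a made-up Widget type. It sticks to C++11 (std::make_unique only arrived in C++14, so the unique_ptr is built from new directly).

```cpp
#include <cassert>
#include <memory>

struct Widget { int id; };  // hypothetical type, for illustration only

// Sole ownership: the Widget is destroyed when the returned unique_ptr is.
std::unique_ptr<Widget> make_widget(int id) {
    return std::unique_ptr<Widget>(new Widget{id});
}

// Shared ownership: the Widget lives until the last shared_ptr goes away.
long owners_while_shared() {
    std::shared_ptr<Widget> first = std::make_shared<Widget>(Widget{1});
    std::shared_ptr<Widget> second = first;  // both now own the same object
    assert(first.get() == second.get());
    return first.use_count();                // counts the owners alive right now
}
```

Neither function contains a delete; the destructors of the smart pointers take care of it.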
To read up on the operator new:
cppreference.com
wikipedia
shared_ptr:
cppreference.com
Difference between stack and heap:
learncpp.com
The function you use (for example, malloc() in C) tries to get a block of memory of the length you asked for. If the allocation succeeds, it gives you the address of that block; otherwise it gives you 0 (a null pointer, in C and C++ at least).
(When you want to have memory for an array of n items, you must ask for n * item size and you will get the address of the array which is also the address of the first element of your array in fact.)
Example: I want an array of 10 integers
// ask for allocation and get the address of the memory
int * my_array = (int *)malloc(10*sizeof(int));
// you must verify that the allocation was successful
if(my_array != 0) {
// the system gave you the memory you need
// you can do your operations
// " * my_array " and " my_array[0] " have the same meaning
my_array[0] = 22;
* my_array = 22;
// These two lines do the same thing: they put 22 in the memory at the address my_array
// same thing for " *(my_array + 9) " and " my_array[9] "
my_array[9] = 50;
*(my_array + 9) = 50;
// These two lines do the same thing: they put 50 in the memory 9 elements past my_array
// Pointer arithmetic counts in elements, not bytes, so the byte offset is 9*sizeof(int) = 36 (sizeof(int) = 4 in most cases)
free(my_array); // always free the memory to give it back to the system, so that it is not "lost"
}
It is the same in C++, but you replace malloc with new and free with delete (new[] and delete[] for arrays). They are keywords, but you can think of them as functions to understand what is being done.
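For comparison, a sketch of the same ten-integer array in C++ (the function name is invented). Note that pointer arithmetic counts in elements, so my_array + 9 already means "nine ints further on".

```cpp
#include <cassert>

int last_of_ten() {
    int *my_array = new int[10]();     // ten ints, value-initialized to 0
    my_array[0] = 22;
    *(my_array + 9) = 50;              // same element as my_array[9]
    assert(my_array == &my_array[0]);  // the array's address is its first element's address
    int last = my_array[9];
    delete[] my_array;                 // new[] pairs with delete[], not delete
    return last;
}
```

Unlike malloc, a failing new throws std::bad_alloc instead of returning 0, so there is no null check here.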
I am trying to understand the difference between the stack and heap memory, and this question on SO as well as this explanation did a pretty good job explaining the basics.
In the second explanation however, I came across an example to which I have a specific question, the example is this:
It is explained that the object m is allocated on the heap, I am just wondering if this is the full story. According to my understanding, the object itself indeed is allocated on the heap as the new keyword has been used for its instantiation.
However, isn't it that the pointer to object m is on the same time allocated on the stack? Otherwise, how would the object itself, which of course is sitting in the heap be accessed. I feel like for the sake of completeness, this should have been mentioned in this tutorial, leaving it out causes a bit of confusion to me, so I hope someone can clear this up and tell me that I am right with my understanding that this example should have basically two statements that would have to say:
1. a pointer to object m has been allocated on the stack
2. the object m itself (so the data that it carries, as well as access to its methods) has been allocated on the heap
Your understanding may be correct, but the statements are wrong:
A pointer to object m has been allocated on the stack.
m is the pointer. It is on the stack. Perhaps you meant pointer to a Member object.
The object m itself (the data that it carries, as well as access to its methods) has been allocated on the heap.
Correct would be to say the object pointed by m is created on the heap
In general, any function/method local object and function parameters are created on the stack. Since m is a function local object, it is on the stack, but the object pointed to by m is on the heap.
"stack" and "heap" are general programming jargon. In particular, no storage is required to be managed internally via a stack or a heap data structure.
C++ has the following storage classes
static
automatic
dynamic
thread
Roughly, dynamic corresponds to "heap", and automatic corresponds to "stack".
Moving onto your question: a pointer can be created in any of these four storage classes; and objects being pointed to can also be in any of these storage classes. Some examples:
void func()
{
int *p = new int; // automatic pointer to dynamic object
int q; // automatic object
int *r = &q; // automatic pointer to automatic object
static int *s = p; // static pointer to dynamic object
static int *s2 = r; // static pointer to automatic object (bad idea)
thread_local int **t = &s; // thread pointer to static object
}
Named variables declared with no specifier are automatic if within a function, or static otherwise.
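A quick way to see the difference between automatic and static objects is to count calls. This is only an illustrative sketch; the names are made up.

```cpp
#include <cassert>

int file_scope_counter = 0;  // no specifier, outside a function: static

int count_calls() {
    static int calls = 0;    // static: initialized once, survives between calls
    int fresh = 0;           // automatic: recreated on every call
    ++calls;
    ++fresh;
    ++file_scope_counter;
    assert(fresh == 1);      // fresh never outlives a single call
    return calls;
}
```

Each call to count_calls() returns one more than the previous call, while fresh starts from zero every time.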
When you declare a variable in a function, it always goes on the stack. So your variable Member* m is created on the stack. Note that by itself, m is just a pointer; it doesn't point to anything. You can use it to point to an object on either the stack or heap, or to nothing at all.
Declaring a variable in a class or struct is different -- those go wherever the class or struct is instantiated.
To create something on the heap, you use new or std::malloc (or their variants). In your example, you create an object on the heap using new and assign its address to m. Objects on the heap need to be released to avoid memory leaks. If allocated using new, you need to use delete; if allocated using std::malloc, you need to use std::free. The better approach is usually to use a "smart pointer", which is an object that holds a pointer and has a destructor that releases it.
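Smart pointers are not limited to new and delete. As a sketch (FreeDeleter, MallocInts and the two functions are names invented here), a std::unique_ptr can be told to release a malloc'ed buffer with std::free:

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical deleter: releases memory with std::free, never with delete.
struct FreeDeleter {
    void operator()(void *p) const { std::free(p); }
};
using MallocInts = std::unique_ptr<int[], FreeDeleter>;

MallocInts make_ints(std::size_t n) {
    return MallocInts(static_cast<int *>(std::malloc(n * sizeof(int))));
}

int fill_and_sum(std::size_t n) {
    MallocInts buf = make_ints(n);
    if (!buf) return -1;               // malloc reported failure
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<int>(i);  // unique_ptr<T[]> provides operator[]
        sum += buf[i];
    }
    assert(sum >= 0);
    return sum;                        // FreeDeleter runs when buf goes out of scope
}
```

The pairing rule from the answer still holds; the destructor just applies it for you.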
Yes, the pointer is allocated on the stack but the object that pointer points to is allocated on the heap. You're correct.
However, isn't it that the pointer to object m is on the same time allocated on the stack?
I suppose you meant the Member object. The pointer is allocated on the stack and will last there for the entire duration of the function (or its scope). After that, the code might still work:
#include <iostream>
using namespace std;
struct Object {
int somedata;
};
Object** globalPtrToPtr; // This lives in another area, often called the
// "data segment"; it is neither heap nor stack
void function() {
Object* pointerOnTheStack = new Object;
globalPtrToPtr = &pointerOnTheStack;
cout << "*globalPtrToPtr = " << *globalPtrToPtr << endl;
} // pointerOnTheStack is NO LONGER valid after the function exits
int main() {
function(); // leaves globalPtrToPtr holding the address of a local that no longer exists
// This can give an access violation,
// a different value after the pointer destruction
// or even the same value as before, randomly - Undefined Behavior
cout << "*globalPtrToPtr = " << *globalPtrToPtr << endl;
return 0;
}
http://ideone.com/BwUVgm
The above code stores the address of a pointer residing on the stack (and leaks memory too because it doesn't free Object's allocated memory with delete).
Since after exiting the function the pointer is "destroyed" (i.e. its memory can be used for whatever pleases the program), you can no longer safely access it.
The above program can run properly, crash, or give you a different result. Accessing memory that has been freed or whose lifetime has ended is undefined behavior.
Ok, so I did find some questions that were almost similar but they actually confused me even more about pointers.
C++ Pointer Objects vs. Non Pointer Objects
In the link above, they say that if you declare a pointer it is actually saved on the heap and not on the stack, regardless of where it was declared. Is this true?? Or am I misunderstanding??? I thought that regardless of a pointer or non pointer, if it's a global variable, it lives as long as the application. If it's a local variable or declared within a loop or function, its life is only as long as the code within it.
The variable itself is stored on the stack or DATA segment, but the memory it points to after being allocated with new is within the heap.
int main()
{
    int* p; // p is stored on stack
    p = new int[20]; // 20 ints are stored on heap
}
// p no longer exists, but the 20 ints DO EXIST!
Hope that helps.
void func()
{
int x = 1;
int *p = &x;
}
// p goes out of scope, so does the memory it points to
void func()
{
int *p = new int;
}
// p goes out of scope, the memory it points to DOES NOT
void func()
{
int x = 1;
int **pp = new int*;
*pp = &x;
}
// pp goes out of scope, the pointer it points to does not, the memory it points to does
And so forth. A pointer is a variable that contains a memory location. Like all variables, it can be on the heap or the stack, depending on how it's declared. Its value -- the memory location -- can also exist on the heap or the stack.
Typically, if you statically allocate something, it's on the stack. If you dynamically allocate something (using either new or malloc) then it's on the heap. Generally speaking you can only access dynamically allocated memory using a pointer. This is probably where the confusion arises.
It is necessary to distinguish between the pointer (a variable that holds a memory location) and the object to which the pointer points (the object at the memory address held by the pointer). A pointer can point to objects on the stack or on the heap. If you use new to allocate the object, it will be on the heap. The pointer can, likewise, live on the heap. If you declare it in the body of a function, then it will be a local variable and live in local storage (i.e. on the stack), whereas if it is a global variable, it will live somewhere in your application's data section. You can also have pointers to pointers, and similarly one can allocate a pointer on the heap (and have a pointer-to-a-pointer pointing to that), etc. Note that while I have referenced the heap and stack, the C++ standard only mentions local/automatic storage and dynamic storage... it does not speak to the implementation. In practice, though, local=stack and dynamic=heap.
A pointer is a variable containing the address of some other object in memory. The pointer variable can be allocated:
on the stack (as a local auto variable in a function or statement block)
statically (as a global variable or static class member)
on the heap (as a new object or as a class object member)
The object that the pointer points to (references) can likewise be allocated in these three places as well. Generally speaking, though, a pointed-to object is allocated using the new operator.
Local variables go out of scope when the program flow leaves the block (or function) that they are declared within, i.e., their presence on the stack disappears. Similarly, member variables of an object disappear when their parent object goes out of scope or is deleted from the heap.
If a pointer goes out of scope or its parent object is deleted, the object that the pointer references still exists in memory. Thus the rule of thumb that any code that allocates (news) an object owns the object and should also delete that object when it's no longer needed.
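To see that the pointed-to object really outlives the pointer, you can count live objects. A sketch with a hypothetical Tracked type that bumps a counter in its constructor and destructor:

```cpp
#include <cassert>

struct Tracked {
    static int alive;  // number of Tracked objects currently in existence
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

int alive_after_pointer_scope_ends() {
    int before = Tracked::alive;
    Tracked *keeper = nullptr;
    {
        Tracked *p = new Tracked;  // p is local to this block; the object is dynamic
        keeper = p;                // keep a copy of the address so it can be deleted later
    }                              // p is gone here, the object is not
    int still_alive = Tracked::alive - before;
    delete keeper;                 // the code that news the object also deletes it
    assert(Tracked::alive == before);
    return still_alive;
}
```

Without the keeper copy, the address would be lost when p went out of scope and the object would leak.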
Auto-pointers take some of the drudgery out of the management of the pointed-to object. An object that has been allocated through an auto_ptr is deleted when that pointer goes out of scope. The object can be assigned from its owning auto_ptr to another auto_ptr, which transfers object ownership to the second pointer. (Note that std::auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement.)
References are essentially pointers in disguise, but that's a topic for another discussion.
I thought that regardless of a pointer or non pointer, if it's a global variable, it lives as long as the application. If it's a local variable or declared within a loop or function, its life is only as long as the code within it.
That's true.
they say that if you declare a pointer it is actually saved on the heap and not on the stack
That's wrong, partially. You can have a pointer on the heap or the stack. It's a matter of where and how you declare it.
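struct StructWithPointer // assumed definition; the original snippet did not show it
{
    char *p;
};
int main()
{
    char c = 0x25;
    char *p_stack = &c; // pointer on stack
    StructWithPointer struct_stack; // stack
    StructWithPointer *struct_heap = new StructWithPointer(); // heap, thus its pointer member "p" (see next line) is also on the heap.
    struct_heap->p = &c; // pointer on heap points to a stack
    delete struct_heap; // release the heap object before main returns
}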
... and, a compiler might decide to use a register for a pointer!
Looks like you need to grab the classic K&R C book and read through chapters 4 & 5 for thorough understanding of the differences between declaration and definition, scope of a variable and about pointers.
I know that memory allocated using new gets its space on the heap, and so we need to delete it before the program ends to avoid a memory leak.
Let's look at this program...
Case 1:
char *MyData = new char[20];
_tcscpy(MyData,"Value");
.
.
.
delete[] MyData; MyData = NULL;
Case 2:
char *MyData = new char[20];
MyData = "Value";
.
.
.
delete[] MyData; MyData = NULL;
In case 2, instead of allocating value to the heap memory, it is pointing to a string literal.
Now when we do a delete it would crash, AS EXPECTED, since it is not trying to delete a heap memory.
Is there a way to know where the pointer is pointing to heap or stack?
By this the programmer
Will not try to delete any stack memory
They can investigate why this pointer, which initially pointed to heap memory, was made to refer to literals. What happened to the heap memory in the meantime? Was it handed to another pointer and deleted elsewhere, and so on?
Is there a way to know where the pointer is pointing to heap or stack?
You can know this only if you remember it at the point of allocation. What you do in this case is to store your pointers in smart pointer classes and store this in the class code.
If you use boost::shared_ptr as an example you can do this:
template<typename T> void no_delete(T* ptr) { /* do nothing here */ }
class YourDataType; // defined elsewhere
boost::shared_ptr<YourDataType> heap_ptr(new YourDataType()); // delete at scope end
YourDataType stackData;
boost::shared_ptr<YourDataType> stack_ptr(&stackData, &no_delete<YourDataType>); // never deleted
As soon as you need that knowledge you have already lost. Why? Because then even if you omit the wrong delete[], you still have a memory leak.
The one who creates the memory should always be the one who deletes it. If at some occasion a pointer might get lost (or overwritten) then you have to keep a copy of it for the proper delete.
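The same idea works with std::shared_ptr from C++11, which is what this sketch uses in place of boost::shared_ptr. Data and the function name are invented for the example; the lambda plays the role of no_delete and counts how often it is called.

```cpp
#include <cassert>
#include <memory>

struct Data { int v = 0; };  // hypothetical type, for illustration only

// Returns how many times the custom deleter ran for a stack-backed shared_ptr.
int deleter_calls_for_stack_object() {
    int calls = 0;
    Data stack_data;  // automatic storage: must never be passed to delete
    {
        // The lambda replaces delete; it only counts invocations.
        std::shared_ptr<Data> p(&stack_data, [&calls](Data *) { ++calls; });
        p->v = 1;
    }                 // last owner gone: the lambda runs instead of delete
    assert(stack_data.v == 1);  // the object is still alive and intact
    return calls;
}
```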
There is no way in Standard C++ of determining whether a pointer points to dynamically allocated memory or not. And note that string literals are not allocated on the stack.
As most of the users said here there's no standard way to discover which memory you're dealing with.
Also, as many users pointed out, it's a rather perverse situation when you pass a pointer to a function that should delete it automatically if it's allocated on the heap.
But if you insist, nevertheless there are some ways to discover which memory belongs to which type.
You actually deal with 3 types of memory
Stack
Heap
Global
For instance:
char* p = new char[10]; // p is a pointer, points to heap-allocated memory
const char* p = "Hello, world!"; // p is a pointer, points to the global memory
char p[] = "Hello, world!"; // p is a buffer allocated on the stack and initialized with the string
Now let's distinguish them. I'll describe this in terms of Windows API and x86 assembler (since this is what I know :))
Let's start from stack memory.
bool IsStackPtr(PVOID pPtr)
{
// Get the stack pointer
PBYTE pEsp;
_asm {
mov pEsp, esp
};
// Query the accessible stack region
MEMORY_BASIC_INFORMATION mbi;
VERIFY(VirtualQuery(pEsp, &mbi, sizeof(mbi)));
// the accessible stack memory starts at mbi.BaseAddress and lasts for mbi.RegionSize
return (pPtr >= mbi.BaseAddress) && (pPtr < PBYTE(mbi.BaseAddress) + mbi.RegionSize);
}
If the pointer is allocated on the stack of another thread, you should get that thread's stack pointer with GetThreadContext instead of just taking the ESP register value.
Global memory
bool IsGlobalPtr(PVOID pPtr)
{
MEMORY_BASIC_INFORMATION mbi;
VERIFY(VirtualQuery(pPtr, &mbi, sizeof(mbi)));
// Global memory allocated (mapped) at once for the whole executable
return mbi.AllocationBase == GetModuleHandle(NULL);
}
If you're writing a DLL you should put its module handle (which is actually its base mapping pointer) instead of GetModuleHandle(NULL).
Heap
Theoretically you may assume that if the memory is neither global nor stack - it's allocated on heap.
But there is actually a big ambiguity here.
You should know that there are different implementations of the heap (such as the raw Windows heap accessed by HeapAlloc/HeapFree, or CRT-wrapped malloc/free or new/delete).
You may release such a block with the delete operator only if you know for sure that it is not a stack or global pointer and that it was allocated via new.
In conclusion:
It's a rather perverse trick and should not be used generally. It is better to provide some extra information with the pointer that tells how to release it.
You can only use it if you know for sure on which heap the memory was allocated (in case it's a heap memory).
I think there is no (easy) way to tell where the memory is allocated (you might be able to determine it using a debugger, perhaps, but that is obviously not what you want). The bottom line is: never do the thing you did in case 2.
In case 2, MyData = "Value" causes a memory leak since there is no longer a reference to the memory returned from new.
There is no easy way or standard way for doing this. You can intercept the heap allocation function(s) and put each memory allocated zone in a list. Your "IsHeap" function should check if the zone passed to the function is the one from the list. This is just a hint - it is almost impossible to do this in a cross-platform manner.
But then again - why would you need that?
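For what it's worth, the interception idea could look roughly like the sketch below. All the names (tracked_alloc, tracked_free, is_tracked_heap) are invented here, it only recognizes the exact start address of a block, it is not thread-safe, and it only knows about memory that went through these wrappers.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>

// Hypothetical registry of every block handed out by tracked_alloc.
static std::set<void *> g_live_blocks;

void *tracked_alloc(std::size_t size) {
    assert(size > 0);
    void *p = std::malloc(size);
    if (p != nullptr) g_live_blocks.insert(p);
    return p;
}

void tracked_free(void *p) {
    if (g_live_blocks.erase(p) == 1) {
        std::free(p);  // only free what the registry handed out
    }
}

// True only for the exact start address of a live tracked block.
bool is_tracked_heap(const void *p) {
    return g_live_blocks.count(const_cast<void *>(p)) == 1;
}
```

A pointer to a local variable or to a string literal is never in the registry, so is_tracked_heap returns false for it and tracked_free leaves it alone.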