Consider the following function prototype:
swap(int &a,int &b);
'x' and 'y' are two integers local to main function.
Assume that the control is in main() function.
Now,when the compiler encounters this type of statement:swap(x,z);
What does actually happen underneath the hood(I mean,in terms of memory locations)?
My questions are:-
Is there any memory allocated for a,b in swap() function's stack?If allocated ,what would it store?
How does call by references work?
Is there any memory allocated for a,b in swap() function's stack?If allocated, what would it store?
First, consider the answer to the same question in a call to swap_ptr(int *a, int *b) Assuming that stack is used for passing parameters, some memory would be allocated to a and b in a call to swap_ptr(&x, &y). That memory would hold pointers to x and y.
The same thing happens when you pass references: some memory is allocated on the stack to pass references a and b, and initialized with references to x and y. This memory often has layout identical to a pointer, but the standard does not require this: the structure used for passing references is implementation-specific and non-transparent.
How does call by references work?
In the same way that call by value works, except the compiler knows to construct a reference, rather than making a copy of the object being passed.
Is there any memory allocated for a,b in swap() function's stack?
Excluding the possibility that the function call was expanded inline as an optimization: Yes, memory will be allocated on the call stack.
If allocated ,what would it store?
Something that can be used to access the referred object. In practice, the memory address of the referred object is used.
When you pass by value there is stack memory allocated for your object. When you pass by pointer or reference - memory on stack is only allocated for the pointer/address of the variable.
In syntax - calling by refrence works like by calling by value. But under the hood there are pointers, so dynamic dispatch using virtual functions is possible when using references.
The fundamental answer here is the difference between stack memory and heap memory.
When you pass by value you push a value onto the stack "pass by value" a space on the stack is allocated to be assigned that value (benefit here is it's very fast! but the stack is also very tightly controlled by the system).
When you pass by reference you are passing the pointer to an allocated memory object some where in the system and C++ will create a pointer to its best guess for resources out in the Heap.
Because C++ cannot anticipate the size the referenced resource will become as the program runs to completion it uses junk memory which it configures at instantiation of the application (yeah buzz words weeee)! (next time some one says "blew the stack" they are wrong, they over ran the heap, whoops! very hard to truly blow the stack, most of the time that memory is restricted)
Great article on this: http://tooslowexception.com/heap-vs-stack-value-type-vs-reference-type/
Cheers! and dont work over time its not good for you :)
Related
Suppose that you have the following function:
void doSomething(){
int *data = new int[100];
}
Why will this produce a memory leak? Since I can not access this variable outside the function, why doesn't the compiler call delete by itself every time a call to this function ends?
Why will this produce a memory leak?
Because you're responsible for deleting anything you create with new.
Why doesn't the compiler call delete by itself every time a call to this function ends?
Often, the compiler can't tell whether or not you still have a pointer to the allocated object. For example:
void doSomething(){
int *data = new int[100];
doSomethingElse(data);
}
Does doSomethingElse just use the pointer during the function call (in which case we still want to delete the array here)? Does it store a copy of the pointer for use later (in which case we don't want to delete it yet)? The compiler has no way of knowing; it's up to you to tell it. Rather than making a complicated, error-prone rule (like "you must delete it unless you can figure out that the compiler must know that there are no other references to it"), the rule is kept simple: you must delete it.
Luckily, we can do better than juggling a raw pointer and trying to delete it at the right time. The principle of RAII allows objects to take ownership of allocated resources, and automatically release them when their destructor is called as they go out of scope. Containers allow dynamic objects to be maintained within a single scope, and copied if needed; smart pointers allow ownership to be moved or shared between scopes. In this case, a simple container will give us a dynamic array:
void doSomething(){
std::vector<int> data(100);
} // automatically deallocated
Of course, for a small fixed-size array like this, you might as well just make it automatic:
void doSomething(){
int data[100];
} // all automatic variables are deallocated
The whole point of dynamic allocation like this is that you manually control the lifetime of the allocated object. It is needed and appropriate a lot less often than many people seem to think. In fact, I cannot think of a valid use of new[].
It is indeed a good idea to let the compiler handle the lifetime of objects. This is called RAII. In this particular case, you should use std::vector<int> data(100). Now the memory gets freed automatically as soon as data goes out of scope in any way.
Actually you can access this variable outside of your doSomething() function, you just need to know address which this pointer holds.
In picture below, rectangle is one memory cell, on top of rectangle is memory cell address, inside of rectangle is memory cell value.
For example, in picture above, if you know that allocated array starts from address 0x200, then outside of your function you can do next:
int *data = (int *)0x200;
std::cout << data[0];
So compiler can't delete this memory for you, but please pay attention to compiler warning:
main.cpp: In function ‘void doSomething()’:
main.cpp:3:10: warning: unused variable ‘data’ [-Wunused-variable]
int *data = new int[100];
When you do new, OS allocates memory in RAM for you, so you need to make OS know, when you don't need this memory anymore, doing delete, so it's only you, who knows when execute delete, not a compiler.
This is memory allocated on the heap, and is therefore available outside the function to anything which has its address. The compiler can't be expected to know what you intend to do with heap-allocated memory, and it will not deduce at compile time the need to free the memory by (for example) tracking whether anything has an active reference to the memory. There is also no automatic run-time garbage collection as in Java. In C++ the responsibility is on the programmer to free all memory allocated on the heap.
See also: Why doesn't C++ have a garbage collector?
Here you have an int array dynamically allocated on the heap with a pointer data pointing to the first element of your array.
Since you explicitly allocated your array on the heap with new, this one will not be deleted from the memory when you go out of scope. When you go out of scope, your pointer will be deleted, and then you will not have access to the array -> Memory leak
Since I can not access this variable outside the function, why doesn't
the compiler call delete by itself every time a call to this function
ends?
Because you should not allocate memory on the heap that you wouldn't use anyway in the first place.
That's how C++ works.
EDIT:
also if compiler would delete the pointer after function returns then there would be way of returning a pointer from function.
In C++, this pointer get passed to method as a hidden argument which actually points to current object, but where 'this' pointer stored in memory... in stack, heap, data where?
The standard doesn't specify where the this pointer is stored.
When it's passed to a member function in a call of that function, some compilers pass it in a register, and others pass it on the stack. It can also depend on the compiler options.
About the only thing you can be sure of is that this is an rvalue of basic type, so you can't take its address.
It wasn't always that way.
In pre-standard C++ you could assign to this, e.g. in order to indicate constructor failure. This was before exceptions were introduced. The modern standard way of indicating construction failure is to throw an exception, which guarantees an orderly cleanup (if not foiled by the user's code, such as the infamous MFC placement new bug).
The this pointer is allocated on the stack of your class functions (or sometimes a register).
This is however not likely the question you are actually asking.
In C++, this is "simply" a pointer to the current object. It allows you to access object-specific data.
For example, when code in a class has the following snippet:
this->temperature = 40.0f;
it sets the temperature for whatever object is being acted upon (assuming temperature is not a class-level static, shared amongst all objects of the class).
The this pointer itself doesn't have to be allocated (in terms of dynamic memory), it depends entirely on how it's all handled under the covers, something the standard doesn't actually mandate (the standard tends to focus more on behaviour than internals).
There are any number of places it could be: on the stack, at a specific memory location, in a register, and so on. All you have to concern yourself with is its behaviour which is basically how you use it to get access to the object.
What this points to is the object itself and it's usually allocated with new for dynamic allocation, or on the stack.
The this pointer is in the object itself*. Sort of. It is the memory location of the object.
If the object is on the stack, the this pointer is on the stack.
If the object is on the heap, the this pointer is on the heap.
Bottom line, it's nothing you need to worry about.
*[UPDATE] Let me backpedal/clarify/correct my answer. The physical this pointer is not in the object.
I would conclude the this pointer is derived by the compiler and is simply the address of the object which is stored in the symbol table. Semantically, it is inside the object but that was not what the OP was asking.
Here is the memory layout of 3 variables on the stack. The middle one is an object. You can see it holds it's one variable and nothing else:
I've been doing some research on how C++ handles pointers passed as arguments, but haven't been able to find a solid 'yes/no' answer to my question so far.
Do I need to call delete/free on them, or is C++ smart enough to clear those on it's own?
I've never seen anyone call delete on passed pointers, so I'd assume that you don't need to, but I'd like to know if anyone knows any situations when doing it is actually required.
Thanks!
The storage used for pointers passed as arguments will be cleaned up. But the memory that they point to will not be cleaned up.
So, for instance:
void foo(int *p) // p is a copy here
{
return;
// The memory used to hold p is cleared up,
// but not the memory used to hold *p
}
int main()
{
int *p = new int;
foo(p); // A copy of p is passed to the function (but not a copy of *p)
}
You will often hear people talk about "on the heap" and "on the stack".* Local variables (e.g. arguments) are stored on the stack, which is automatically cleaned up. Data allocated using new (or malloc) is stored on the heap, which is not automatically cleaned up.
* However, the C++ standard doesn't talk about "stack" and "heap" (these are actually implementation-specific details). It uses the terminology "automatic storage" and "allocated storage", respectively.
Arguments passed to a method are stored on the stack, therefore are automatically destroyed when the function returns - just as local variables. The memory where they point to is not automatically free-ed.
If you receive a pointer from your caller, it's the caller's responsibility to free that pointer, unless documented otherwise.
Pointers will be copied - accepted "by value". When the function exists they will be destroyed but because their destructors are trivial nothing will be done to the memory they point to.
On the pointers themselves, no. They go when the parameter stack frame is eliminated upon the function return, just like int or char parameters. Whether you have to do anything with the data pointed to is between you and your code..
I don't quite get the point of dynamically allocated memory and I am hoping you guys can make things clearer for me.
First of all, every time we allocate memory we simply get a pointer to that memory.
int * dynInt = new int;
So what is the difference between doing what I did above and:
int someInt;
int* dynInt = &someInt;
As I understand, in both cases memory is allocated for an int, and we get a pointer to that memory.
So what's the difference between the two. When is one method preferred to the other.
Further more why do I need to free up memory with
delete dynInt;
in the first case, but not in the second case.
My guesses are:
When dynamically allocating memory for an object, the object doesn't get initialized while if you do something like in the second case, the object get's initialized. If this is the only difference, is there a any motivation behind this apart from the fact that dynamically allocating memory is faster.
The reason we don't need to use delete for the second case is because the fact that the object was initialized creates some kind of an automatic destruction routine.
Those are just guesses would love it if someone corrected me and clarified things for me.
The difference is in storage duration.
Objects with automatic storage duration are your "normal" objects that automatically go out of scope at the end of the block in which they're defined.
Create them like int someInt;
You may have heard of them as "stack objects", though I object to this terminology.
Objects with dynamic storage duration have something of a "manual" lifetime; you have to destroy them yourself with delete, and create them with the keyword new.
You may have heard of them as "heap objects", though I object to this, too.
The use of pointers is actually not strictly relevant to either of them. You can have a pointer to an object of automatic storage duration (your second example), and you can have a pointer to an object of dynamic storage duration (your first example).
But it's rare that you'll want a pointer to an automatic object, because:
you don't have one "by default";
the object isn't going to last very long, so there's not a lot you can do with such a pointer.
By contrast, dynamic objects are often accessed through pointers, simply because the syntax comes close to enforcing it. new returns a pointer for you to use, you have to pass a pointer to delete, and (aside from using references) there's actually no other way to access the object. It lives "out there" in a cloud of dynamicness that's not sitting in the local scope.
Because of this, the usage of pointers is sometimes confused with the usage of dynamic storage, but in fact the former is not causally related to the latter.
An object created like this:
int foo;
has automatic storage duration - the object lives until the variable foo goes out of scope. This means that in your first example, dynInt will be an invalid pointer once someInt goes out of scope (for example, at the end of a function).
An object created like this:
int foo* = new int;
Has dynamic storage duration - the object lives until you explicitly call delete on it.
Initialization of the objects is an orthogonal concept; it is not directly related to which type of storage-duration you use. See here for more information on initialization.
Your program gets an initial chunk of memory at startup. This memory is called the stack. The amount is usually around 2MB these days.
Your program can ask the OS for additional memory. This is called dynamic memory allocation. This allocates memory on the free store (C++ terminology) or the heap (C terminology). You can ask for as much memory as the system is willing to give (multiple gigabytes).
The syntax for allocating a variable on the stack looks like this:
{
int a; // allocate on the stack
} // automatic cleanup on scope exit
The syntax for allocating a variable using memory from the free store looks like this:
int * a = new int; // ask OS memory for storing an int
delete a; // user is responsible for deleting the object
To answer your questions:
When is one method preferred to the other.
Generally stack allocation is preferred.
Dynamic allocation required when you need to store a polymorphic object using its base type.
Always use smart pointer to automate deletion:
C++03: boost::scoped_ptr, boost::shared_ptr or std::auto_ptr.
C++11: std::unique_ptr or std::shared_ptr.
For example:
// stack allocation (safe)
Circle c;
// heap allocation (unsafe)
Shape * shape = new Circle;
delete shape;
// heap allocation with smart pointers (safe)
std::unique_ptr<Shape> shape(new Circle);
Further more why do I need to free up memory in the first case, but not in the second case.
As I mentioned above stack allocated variables are automatically deallocated on scope exit.
Note that you are not allowed to delete stack memory. Doing so would inevitably crash your application.
For a single integer it only makes sense if you need the keep the value after for example, returning from a function. Had you declared someInt as you said, it would have been invalidated as soon as it went out of scope.
However, in general there is a greater use for dynamic allocation. There are many things that your program doesn't know before allocation and depends on input. For example, your program needs to read an image file. How big is that image file? We could say we store it in an array like this:
unsigned char data[1000000];
But that would only work if the image size was less than or equal to 1000000 bytes, and would also be wasteful for smaller images. Instead, we can dynamically allocate the memory:
unsigned char* data = new unsigned char[file_size];
Here, file_size is determined at runtime. You couldn't possibly tell this value at the time of compilation.
Read more about dynamic memory allocation and also garbage collection
You really need to read a good C or C++ programming book.
Explaining in detail would take a lot of time.
The heap is the memory inside which dynamic allocation (with new in C++ or malloc in C) happens. There are system calls involved with growing and shrinking the heap. On Linux, they are mmap & munmap (used to implement malloc and new etc...).
You can call a lot of times the allocation primitive. So you could put int *p = new int; inside a loop, and get a fresh location every time you loop!
Don't forget to release memory (with delete in C++ or free in C). Otherwise, you'll get a memory leak -a naughty kind of bug-. On Linux, valgrind helps to catch them.
Whenever you are using new in C++ memory is allocated through malloc which calls the sbrk system call (or similar) itself. Therefore no one, except the OS, has knowledge about the requested size. So you'll have to use delete (which calls free which goes to sbrk again) for giving memory back to the system. Otherwise you'll get a memory leak.
Now, when it comes to your second case, the compiler has knowledge about the size of the allocated memory. That is, in your case, the size of one int. Setting a pointer to the address of this int does not change anything in the knowledge of the needed memory. Or with other words: The compiler is able to take care about freeing of the memory. In the first case with new this is not possible.
In addition to that: new respectively malloc do not need to allocate exactly the requsted size, which makes things a bit more complicated.
Edit
Two more common phrases: The first case is also known as static memory allocation (done by the compiler), the second case refers to dynamic memory allocation (done by the runtime system).
What happens if your program is supposed to let the user store any number of integers? Then you'll need to decide during run-time, based on the user's input, how many ints to allocate, so this must be done dynamically.
In a nutshell, dynamically allocated object's lifetime is controlled by you and not by the language. This allows you to let it live as long as it is required (as opposed to end of the scope), possibly determined by a condition that can only be calculated at run-rime.
Also, dynamic memory is typically much more "scalable" - i.e. you can allocate more and/or larger objects compared to stack-based allocation.
The allocation essentially "marks" a piece of memory so no other object can be allocated in the same space. De-allocation "unmarks" that piece of memory so it can be reused for later allocations. If you fail to deallocate memory after it is no longer needed, you get a condition known as "memory leak" - your program is occupying a memory it no longer needs, leading to possible failure to allocate new memory (due to the lack of free memory), and just generally putting an unnecessary strain on the system.
I have this situation:
{
float foo[10];
for (int i = 0; i < 10; i++) {
foo[i] = 1.0f;
}
object.function1(foo); // stores the float pointer to a const void* member of object
}
object.function2(); // uses the stored void pointer
Are the contents of the float pointer unknown in the second function call? It seems that I get weird results when I run my program. But if I declare the float foo[10] to be const and initialize it in the declaration, I get correct results. Why is this happening?
For the first question, yes using foo once it goes out of scope is incorrect. I'm not sure if it's defined behavior in the spec or not but it's definitely incorrect to do so. Best case scenario is that your program will immediately crash.
As for the second question, why does making it const work? This is an artifact of implementation. Likely what's happenning is the data is being written out to the data section of the DLL and hence is valid for the life of the program. The original sample instead puts the data on the stack where it has a much shorter lifetime. The code is still wrong, it just happens to work.
Yes, foo[] is out of scope when you call function2. It is an automatic variable, stored on the stack. When the code exits the block it was defined in, it is deallocated. You may have stored a reference (pointer) to it elsewhere, but that is meaningless.
In both cases you are getting undefined behaviour. Anything might happen.
You are storing a pointer to the locally declared array, but once the scope containing the array definition is exited the array - and all its members are destroyed.
The pointer that you have stored now no longer points to a float or even a valid memory address that could be used for a float. It might be an address that is reused for something else or it might continue to contain the original data unchanged. Either way, it is still not valid to attempt to dereference the pointer, either for reading or writing a float value.
For any declaration like this:
{
type_1 variable_name_1;
type_2 variable_name_2;
type_3 variable_name_3;
}
declaration, the variables are allocated on the stack.
You can print out the address of each variable:
printf("%p\n", variable_name )
and you'll see that addresses increase by small amount roughly (but not always exactly equal to), the amount of space each variable needs to store its data.
The memory used by stack variables is recycled when the '}' is reached and the variables go out of scope. This is done nice an efficiently just by subtracting some number from a special pointer called the 'stack pointer', which says where the data for new stack variables will have their data allocated. By incrementing and decrementing the stack pointer, programs have an extremely fast way of working out were the memory for variables will live. Its such and important concept that every major processor maintains a special piece of memory just for the stack pointer.
The memory for your array is also pushed and popped from the program's data stack and your array pointer is a pointer into the program's stack memory. While the language specification says accessing the data owned by out-of-scope variables has undefined consequences, the result is typically easy to predict. Usually, your array pointer will continue to hold its original data until new stack variables are allocated and assigned data (i.e. the memory is reused for other purposes).
So don't do it. Copy the array.
I'm less clear about what the standard says about constant arrays (probably the same thing -- the memory is invalid when the original declaration goes out of scope). However, your different behavior is explainable if your compiler allocated a chunk of memory for constants that is initialized when your program starts, and later, foo is made to point to that data when it comes into scope. At least, if I were writing a compiler, that's probably what I'd do as its both very fast and leads to using the smallest amount of memory. This theory is easily testable in the following way:
void f()
{
const float foo[2] = {99, 101};
fprintf( "-- %f\n", foo[0] );
const_cast<foo*>(foo)[0] = 666;
}
Call foo() twice. If the printed value changed between calls (or an invalid memory access exception is thrown), its a fair bet that the data for foo is allocated in special area for constants that the above code wrote over.
Allocating the memory in a special area doesn't work for non-const data because recursive functions may cause many separate copies of a variable to exist on the stack at the same time, each of which may hold different data.
It's undefined behavior in both cases. You should consider the stack based variable deallocated when control leaves the block.
What's happening is currently you're probably just setting a pointer (can't see the code, so I can't be sure). This pointer will point to the object foo, which is in scope at that point. But when it goes out of scope, all hell can break loose, and the C standard can make no guarantees about what happens to that data once it goes out of scope. It can be overwritten by anything. It works for a const array because you're lucky. Don't do that.
If you want the code to work correctly as it is, function1() is going to need to copy the data into the object member. Which means you'll also have to know the length of the array, which means you'll have to pass it in or have some nice termination method.
The memory associated with foo goes out of scope and is reclaimed.
Outside the {}, the pointer is invalid.
It is a good idea to make objects manage their own memory rather than refer to an external pointer. In this specific case your object could allocate its own foo internally and copy the data into it. However it really depends on what you are trying to achieve.
For simple problems like this it is better to give a simple answer, not 3 paragraphs about stacks and memory addresses.
There are 2 pairs of braces {}, one is inside the other. The array was declared after the first left brace { so it stops existing before the last brace }
The end
When answering a question you must answer it at the level of the person asking regardless of how well you yourself comprehend the issue or you may confuse the student.
-experienced ESL teacher