I have the following C++ code, and I'm running g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0:
#include <cstdio>
const int& getConst() {
    int x = 10;
    printf("x ptr: %p\n", &x);
    const int &y = 10;
    printf("y ptr: %p\n", &y);
    return y;
}
int main() {
    int start = 0;
    printf("start ptr: %p\n", &start);
    const int &t = getConst();
    printf("t: %d\n", t);
    printf("t ptr: %p\n", &t);
    int end = 0;
    printf("end ptr: %p\n", &end);
    return 0;
}
And the output of this code is as follows:
root#78600f6683dd:/home/test/question# ./a.out
start ptr: 0x7ffdcd2381f8
x ptr: 0x7ffdcd2381c8
y ptr: 0x7ffdcd2381cc
t: 10
t ptr: 0x7ffdcd2381cc
end ptr: 0x7ffdcd2381fc
Two things about this result confuse me:
The addresses of start and end in main() are 0x7ffdcd2381f8 and 0x7ffdcd2381fc, so the variables of main() sit at ascending addresses. main() calls getConst(), yet the variables inside getConst() are at 0x7ffdcd2381c8 and 0x7ffdcd2381cc, both lower than the addresses in main(). Since main() calls getConst(), shouldn't the variables of getConst() sit above those of main() on the stack?
Within getConst(), y is a const reference to 10. As far as I understand, a temporary int with the value 10 is created and y refers to it. As the output shows, y and t refer to the same memory location. But the temporary lives on the stack, so shouldn't it be cleaned up when getConst() returns? If so, how can t still yield the correct value?
Your code has undefined behaviour: getConst() returns a reference to a temporary whose lifetime ends when the function returns, so reading t afterwards could do anything.
What is probably happening is this. The returned reference is implemented as a pointer. The memory it points to is no longer valid, but the pointer itself is just a number, so it is unsurprising that you can print its value. Reading through the reference will probably appear to work because the runtime doesn't "clean up" deallocated stack memory; that would be a waste of time, since the memory is re-initialised when it is reused. If you then call another function that contains uninitialised variables, it wouldn't be surprising if they held the values set in getConst(). All of this, of course, is undefined behaviour.
Traditionally, heap memory grew up from the bottom of memory and the stack grew down from the top; when the two met, your program was out of memory. With modern virtual memory schemes this is no longer literally the case, but the stack is still generally a fixed-size block of memory that is used from the end back to the front, so it is not unusual for new stack allocations to have lower addresses than old ones. This is what makes overflowing stack variables so dangerous: you aren't overwriting unused memory, you are overwriting variables earlier in the stack.
Case 1:
#include <iostream>
using namespace std;
int main()
{
    int T=5;
    while(T--){
        int a;
        cout<<&a<<"\n";
    }
}
This prints the same address 5 times.
I expected it to print 5 different addresses.
Case 2:
#include <iostream>
using namespace std;
int main()
{
    int T=5;
    while(T--){
        int* a=new int;
        cout<<a<<"\n";
    }
}
This prints 5 different addresses.
My question is:
Why isn't new memory allocated each time the variable declaration is encountered in the first case?
And what is the difference between the first and second cases?
In the first case, a is located on the stack. Basically, a gets "constructed" (a better wording might be "assigned space") there in each iteration and released afterwards. So after each iteration, the space previously allocated to a is free again, and the new a gets that space in the next iteration. This is why the address is the same.
In the second case, you allocate memory on the heap and (additionally) do not free it again. So the memory can't be reassigned in the next iteration.
In theory, the absolute position on the stack is allocated every time the variable comes into scope and deallocated every time it goes out of scope. The LIFO nature of the stack in which it is allocated then makes sure the same location is allocated each time.
But in practice, the compiler assigns relative positions on the stack at compile time whenever doing so is practical (which in this case is trivially true). With preallocated relative positions, the simple act of entering the function effectively allocates all instances of all local variables. A local object in a loop like that is constructed and/or initialized for each instance, but the allocation was done once in advance for all instances. So the addresses are the same for an even more fundamental reason than the LIFO nature of a stack: they are the same because the allocation was only done once.
If your C++ compiler supports variable-length arrays (a C99 feature that some C++ compilers accept as an extension), you could construct tests that might distinguish the above two cases. Something roughly like:
for (int i=0; i<2; ++i) {
    int unpredictable[ f(i) ];
    for (int j=0; j<2; ++j) {
        int T=5;
        // does the location of T vary as i changes ??
        int U[ f(j) ]; // I'm pretty sure the location of U varies
    }
}
We want the values of f(0) and f(1) to be cheap to compute at run time but hard for the optimizer to see at compile time. That is most robust if f is declared in this module but defined in another.
By preventing the compiler from doing all the allocation at compile time, maybe we also prevent it from doing some easy allocation at compile time; or maybe it still sorts out which variables can be allocated at compile time and uses run-time allocation only as needed.
It depends on the compiler. Since the variable is declared in the innermost scope, the compiler is free to reuse the same location for it on every iteration. Nothing requires that, though; the variable could just as well end up at a different address each time.
Here is some C++ code.
#include <iostream>
using namespace std;
class test{
int a;
public:
test(int b){
a = b;
cout << "test constructed with data " << b << endl;
}
void print(){
cout << "printing test: " << a << endl;
}
};
test * foo(){
test x(5);
return &x;
}
int main()
{
test* y = foo();
y->print();
return 0;
}
Here is its output:
test constructed with data 5
printing test: 5
My question: Why does the pointer to x still "work" outside of the context of function foo? As far as I understand, the function foo creates an instance of test and returns the address of that object.
After the function exits, the variable x is out of scope. I know that C++ isn't garbage collected, so what happens to a variable when it goes out of scope? Why does the address returned by foo() still point to what seems like a valid object?
If I create an object in some scope and want to use it in another, should I allocate it on the heap and return the pointer? If so, when and where would I delete it?
Thanks
x is a local variable. After foo returns there's no guarantee that the memory on the stack that x resided in is either corrupt or intact. That's the nature of undefined behavior. Run a function before reading x and you'll see the danger of referencing a "dead" variable:
void nonsense(void)
{
int arr[1000] = {0};
}
int main()
{
test* y = foo();
nonsense();
y->print();
return 0;
}
Output on my machine:
test constructed with data 5
printing test: 0
When a variable goes out of scope, its destructor is called (for non-POD types) and the location it occupied is considered unallocated, but nothing is written to that memory, so the old value remains. This doesn't mean you can still safely access the value: it resides in a location that is now marked as free, so new variables or other allocations can occupy the same space.
The memory isn't erased because you can't actually erase memory. All you could do is write something to it, such as all zeros, all ones or random data, which is not only pointless but also costs performance.
It has nothing to do with garbage collection. A garbage collector doesn't "erase" memory either; it marks it as free. The reason the behaviour you describe exists in C and C++ but not in Java, for instance, is not the garbage collector but the fact that C lets you access any memory you want through pointers, allocated or not, valid or not, and Java doesn't. (To be fair, the garbage collector is one reason Java can guarantee that you never reach invalid memory.)
An analogy can be made with what happens on disk when you delete a file. The file contents remain (they are not overwritten); instead, pointers (handles) in the file system are modified so that the space on the disk is considered free. That's why special tools can recover deleted files: the information is still there until something new writes over it, and if you can point to it you can obtain it. It is almost the same with pointers in C. Think what it would mean to actually write 4GB to disk every time you delete a 4GB file. In the same way, there is no need to overwrite the full size of every variable that goes out of scope. You just mark the space as free.
Doesn't the space occupied by a variable get deallocated as soon as control returns from the function?
I thought it did.
Here I have written a function, CoinDenom, that returns a pointer to a local array. The caller uses it to print the minimum number of coins required to make up a sum, and it works fine.
How is it able to print the right answer if the space got deallocated?
int* CoinDenom(int CoinVal[],int NumCoins,int Sum) {
int min[Sum+1];
int i,j;
min[0]=0;
for(i=1;i<=Sum;i++) {
min[i]=INT_MAX;
}
for(i=1;i<=Sum;i++) {
for(j=0;j< NumCoins;j++) {
if(CoinVal[j]<=i && min[i-CoinVal[j]]+1<min[i]) {
min[i]=min[i-CoinVal[j]]+1;
}
}
}
return min; //returning address of a local array
}
int main() {
int Coins[50],Num,Sum,*min;
cout<<"Enter Sum:";
cin>>Sum;
cout<<"Enter Number of coins :";
cin>>Num;
cout<<"Enter Values";
for(int i=0;i<Num;i++) {
cin>>Coins[i];
}
min=CoinDenom(Coins,Num,Sum);
cout<<"Min Coins required are:"<< min[Sum];
return 0;
}
The contents of the memory taken by local variables are undefined after the function returns, but in practice they stay unchanged until something actively overwrites them.
If you change your code to do some significant work between populating that memory and then using it, you'll see it fail.
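What you have posted is not standard C++. The following is a variable-length array, a C99 feature that some compilers such as GCC accept as an extension, and it is illegal in standard C++: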
int min[Sum+1];
But in general, your program exhibits undefined behaviour. That means anything could happen - it could even appear to work.
The space is "deallocated" when the function returns, but that doesn't mean the data isn't still there in memory. The data will still be on the stack until some other function overwrites it. That is why these kinds of bugs are so tricky: sometimes it works just fine (until all of a sudden it doesn't).
You need to allocate memory on the heap for the array you return.
int* CoinDenom(int CoinVal[],int NumCoins,int Sum) {
int *min= new int[Sum+1];
int i,j;
min[0]=0;
for(i=1;i<=Sum;i++) {
min[i]=INT_MAX;
}
for(i=1;i<=Sum;i++) {
for(j=0;j< NumCoins;j++) {
if(CoinVal[j]<=i && min[i-CoinVal[j]]!=INT_MAX && min[i-CoinVal[j]]+1<min[i]) {
min[i]=min[i-CoinVal[j]]+1;
}
}
}
return min; //returning a pointer to a heap array; the caller must delete[] it
}
min=CoinDenom(Coins,Num,Sum);
cout<<"Min Coins required are:"<< min[Sum];
delete[] min;
return 0;
In your case you were able to see the correct values only because nothing had overwritten them yet. In general the outcome is unpredictable.
That array is on the stack, which in most implementations, is a pre-allocated contiguous block of memory. You have a stack pointer that points to the top of the stack, and growing the stack means just moving the pointer along it.
When the function returned, the stack pointer was set back, but the memory is still there and if you have a pointer to it, you could access it, but it's not legal to do so -- nothing will stop you, though. The memory values in the array's old space will change the next time the stack depth runs over the area where the array is.
The variable you use for the array is allocated on the stack, and the stack is fully accessible to the program: the space is not blocked or otherwise hidden.
It is deallocated in the sense that it can be reused later for other function calls and in the sense that destructors get called for variables allocated there. Destructors for integers are trivial and don't do anything. That's why you can access it and it can happen that the data has not been overwritten yet and you can read it.
The answer is that there's a difference between what the language standard allows, and what turns out to work (in this case) because of how the specific implementation works.
The standard says that the memory is no longer used, and so must not be referenced.
In practice, local variables live on the stack. The stack memory is not freed until the application terminates, which means you will typically not get an access violation/segmentation fault for touching stack memory that a returned function used. But you're still violating the rules of C++, and it won't always work. The compiler is free to overwrite that memory at any time.
In your case, the array data has simply not been overwritten by anything else yet, so your code appears to work. Call another function, and the data gets overwritten.
How is it able to print the right answer if the space got deallocated??
When memory is deallocated, it still exists, but it may be reused for something else.
In your example, the array has been deallocated but its memory hasn't yet been reused, so its contents haven't been overwritten with other values, which is why you're still able to read from it the values that you wrote.
The fact that it won't have been reused yet is not guaranteed; and the fact that you can even read from it at all after it's deallocated is also not guaranteed: so don't do it.
This might or might not work, behaviour is undefined and it's definitely wrong to do it like this. Most compilers also give a compiler warning, for example GCC:
test.cpp:8: warning: address of local variable `min' returned
Memory is like clay that never hardens. Allocating memory is like taking some clay out of the clay pot. Maybe you make a cat and a cheeseburger. When you have finished, you deallocate the clay by putting your figures back into the pot, but just being put into the pot does not make them lose their shape: if you or someone else looks into the pot, they will continue to observe your cat and cheeseburger sitting on the top of the clay stack until someone else comes along and makes them into something else.
The memory cells in the memory chips are the clay, the substrate that holds the cells is the clay pot, and the particular voltages that represent the values of your variables are your sculptures. Those voltages do not change just because your program has taken them off the list of things it cares about.
You need to understand the stack. Add this function,
void f()
{
int a[5000];
memset( a, 0, sizeof(a) );
}
and then call it immediately after calling CoinDenom() but before writing to cout. You'll find that it no longer works.
Your local variables are stored on the stack. CoinDenom() returns a memory address that points into the stack. Very simplified and leaving out lots of details, say the stack pointer is pointing to 0x1000 just before you call CoinDenom. An int* (Coins) is pushed on the stack; this becomes CoinVal[]. Then an int, Num, which becomes NumCoins. Then another int, Sum, which becomes Sum. That's 3 values at 4 bytes each, or 12 bytes. Then comes space for the local variables:
int min[Sum+1];
int i,j;
which would be (Sum + 3) * 4 bytes/int. Say Sum = 2; that gives us another 20 bytes, 32 bytes in total, so the stack pointer gets incremented by 32 bytes to 0x1020. (All of main's locals are below 0x1000 on the stack.) min is going to point to 0x100c. When CoinDenom() returns, the stack pointer is decremented, "freeing" that memory, but unless another function is called and has its own locals allocated in that memory, nothing is going to change what's stored there.
See http://en.wikipedia.org/wiki/Calling_convention for more detail on how the stack is managed.
I have a simple add program.
int main() {
int x=10,y=10,result=0;
result=x+y;
return 0;
}
I created an LLVM frontend module pass that traverses the entire module.
The pass iterates over each basic block and visits its instructions.
FORE(iter, (*bb)) {
if(isa<AllocaInst>(iter)) {
errs()<<"The address of allocated variable is "<<&(*iter);
}
}
This prints the address of the alloca instruction object, not the real stack address of the local variable.
Is there any way I can get the stack address of a local variable from a pass?
You can't.
It is not even guaranteed the address of the variables will be the same when you run the program multiple times (see Address Space Layout Randomization), so there's no way one could predict the address statically.
Even if we did know that the stack always started at a fixed address, it is perfectly normal for the same variable to have a different address during different calls of the function. Take this for example:
#include <stdio.h>
void f() {
int x;
printf("The address of x is: %p\n", &x);
}
void g() {
int y;
f();
}
int main() {
f();
g();
return 0;
}
Assuming you compile this without optimizations (which would remove the definition of y), this will print two different addresses for x. So when looking at the definition of f, we couldn't possibly predict the address of its variables because it isn't even going to be the same within the same run of the program.
Furthermore, your pass isn't going to know which optimizations will run after it, which variables will be kept in registers, or which registers will be spilled to stack memory, all of which would affect the addresses.