Why does this C++ code work? Stack memory and pointers - c++

Here is some C++ code.
#include <iostream>
using namespace std;
class test{
int a;
public:
test(int b){
a = b;
cout << "test constructed with data " << b << endl;
}
void print(){
cout << "printing test: " << a << endl;
}
};
test * foo(){
test x(5);
return &x;
}
int main()
{
test* y = foo();
y->print();
return 0;
}
Here is its output:
test constructed with data 5
printing test: 5
My question: Why does the pointer to x still "work" outside of the context of function foo? As far as I understand, the function foo creates an instance of test and returns the address of that object.
After the function exits, the variable x is out of scope. I know that C++ isn't garbage collected- what happens to a variable when it goes out of scope? Why does the address returned in foo() still point to what seems like a valid object?
If I create an object in some scope, and want to use it in another, should I allocate it in the heap and return the pointer? If so, when/where would I delete it
Thanks

x is a local variable. After foo returns there's no guarantee that the memory on the stack that x resided in is either corrupt or intact. That's the nature of undefined behavior. Run a function before reading x and you'll see the danger of referencing a "dead" variable:
void nonsense(void)
{
int arr[1000] = {0};
}
int main()
{
test* y = foo();
nonsense();
y->print();
return 0;
}
Output on my machine:
test constructed with data 5
printing test: 0

When a variable goes out of scope, the destructor is called (for non POD data) and the location occupied by that variable is now considered unallocated, but the memory isn't actually written, so the old value remains. This doesn't mean that you can still safely access this value because it resides in a location marked as 'free'. New variables can reside or allocation can occur in this memory space.
The reason why the memory isn't erased is because you can't actually erase memory, what you could do is write something to it like all zeros or all ones or random, which is not only pointless, but performance-degrading.
It has nothing to do with garbage-collection. A garbage collector doesn't "erase" memory, but marks it as being free. The reason why the behaviour you described exists in C and not in Java for instance is not the garbage-collector, but the fact that C lets you access via pointers any memory you want, allocated or not, valid or not, and Java doesn't (To be fair the garbage collector is a reason why Java can make so that you can't access any memory).
An analogy can be made with what happens on disk when you delete a file. The file contents remain (they are not overwritten), but instead pointers (handles) in the file system are modified so that that memory on the disk is considered free. That's why special tools can recover deleted files: the information is still there until something new writes over it, and if you can point to it you can obtain it. Is almost the same thing with pointers in C. Think what would it mean to actually write 4GB on disk every time you delete a 4GB file. There is no need to write in memory for each variable that goes out of scope the entire size of that variable. You just mark it's free.

Related

How variable 'a' will exist without any object creation?

How variable int a is in existence without object creation? It is not of static type also.
#include <iostream>
using namespace std;
class Data
{
public:
int a;
void print() { cout << "a is " << a << endl; }
};
int main()
{
Data *cp;
int Data::*ptr = &Data::a;
cp->*ptr = 5;
cp->print();
}
Your code shows some undefined behavior, let's go through it:
Data *cp;
Creates a pointer on the stack, though, does not initialize it. On it's own not a problem, though, it should be initialized at some point. Right now, it can contain 0x0badc0de for all we know.
int Data::*ptr=&Data::a;
Nothing wrong with this, it simply creates a pointer to a member.
cp->*ptr=5;
Very dangerous code, you are now using cp without it being initialized. In the best case, this crashes your program. You are now assigning 5 to the memory pointed to by cp. As this was not initialized, you are writing somewhere in the memory space. This can include in the best case: memory you don't own, memory without write access. In both cases, your program can crash. In the worst case, this actually writes to memory that you do own, resulting in corruption of data.
cp->print();
Less dangerous, still undefined, so will read the memory. If you reach this statement, the memory is most likely allocated to your program and this will print 5.
It becomes worse
This program might actually just work, you might be able to execute it because your compiler has optimized it. It noticed you did a write, followed by a read, after which the memory is ignored. So, it could actually optimize your program to: cout << "a is "<< 5 <<endl;, which is totally defined.
So if this actually, for some unknown reason, would work, you have a bug in your program which in time will corrupt or crash your program.
Please write the following instead:
int main()
{
int stackStorage = 0;
Data *cp = &stackStorage;
int Data::*ptr=&Data::a;
cp->*ptr=5;
cp->print();
}
I'd like to add a bit more on the types used in this example.
int Data::*ptr=&Data::a;
For me, ptr is a pointer to int member of Data. Data::a is not an instance, so the address operator returns the offset of a in Data, typically 0.
cp->*ptr=5;
This dereferences cp, a pointer to Data, and applies the offset stored in ptr, namely 0, i.e., a;
So the two lines
int Data::*ptr=&Data::a;
cp->*ptr=5;
are just an obfuscated way of writing
cp->a = 5;

Null Pointers & Object Pointers

I was given instructions to show that a pointer variable can contain a pointer to a valid object, deleted object, null, or a random value. Set four pointer variables a,b,c, and d to show these possibilities. The the thing im not sure about are the pointer objects. Can someone explain what I need to do to show case these pointers. or if I did it right.
#include <iostream>
#include <string>
#include <time.h>
using namespace std;
class Pointer
{
public:
Pointer()
{
int num = 2;
}
Pointer(int num)
{
this->numb = num;
}
void set(int num)
{
numb = num;
}
int Get()
{
return numb;
}
private:
int numb;
};
int main ()
{
Pointer point;
Pointer* a;
a = &point;
Pointer*b = new Pointer(10);
delete b;
int* c = NULL;
srand(unsigned(time(0)));
int randNum = rand()%100;
int *d;
d = &randNum;
cout <<"Pointer a: " << a << endl;
cout <<"Pointer b: " << b << endl;
cout <<"Pointer c: " << c << endl;
cout <<"Pointer d: " << *d << endl;
//(*a) = (*b);
return 0;
}
Not a complete answer, because this is homework, but here’s a hint: once you generate random bits (The C++ STL way to do this is with <random> and perhaps <chrono>), you can store them in a uintptr_t, an unsigned integer the same size as a pointer, and convert those bits to a pointer with reinterpret_cast<void*>(random_bits). The results are undefined behavior. Trying to dereference the pointer might crash the program, or it might appear to work and corrupt some other memory location, or it might do different things on different runs of the program, but nothing you do with it is guaranteed to work predictably. This is Very Bad.
If “random” really means arbitrary, for this assignment, you could just declare a pointer off the stack (that is, inside a function and not static) and not initialize it. Ask your instructor if you aren’t sure.
That’s a very common source of irreproducible bugs, and I would recommend you get into the habit now of always initializing your pointer variables when you declare them. Often, that lets you declare them int * const, which is good practice, but even if you need to wait to assign a real value to them, if you initialize your pointers to NULL or nullptr, you will always see immediately in the debugger that the value is uninitialized, and you will always crash the program if you try to use it uninitialized.
I would like to first point out what the word NULL means or nullptr, which is what you should be using in C++11. The null pointer just assigns the pointer to an inaccessible address or location in the memory. This allows for safer code, because the "dangling pointers" left over could be bad.
A header file and another cpp file to define the class
Class Something{
// Class implementation here
};
Now lets go to the main.cpp file
#include "The header you made.hpp" // (.h or .hpp)
int main(int argc, char *argv[]){
Something objOne;
Something *objPointer = &objOne;
// Do stuff with objPointer
// Like use your member functions
Something *objPointerTwo = &objOne;
// Now objPointerTwo points to the same object
// Lets try some runtime allocation of memory
Something *objHeap = new Something();
// Do something with the pointer
// Watch this
delete objHeap;
objHeap = nulltpr;
// Now what happens when you try to access methods again with objHeap
// Your program will display a segmentation fault error
// Which means you are trying to mess with memory
// that the compiler does not want you too
}
All right so what is this memory? Well to put it simply you have the heap and the stack. All the stuff you put into your program is put into the stack when it is compiled. So when the program runs the code follows main and traces the stack kind of like a stack of boxes. You may get need to search through all the top ones to get to the bottom. Now what if a person didn't know for example how many students were going to be in the class that year. Well the program is already running how do you make more room? This is where the heap comes in. The new keyword allows you to allocate memory at runtime, and you can "point" to memory on the "heap" if you will. Now while this is all good and cool, it can be potentially dangerous which is why people consider C++ and C dangerous languages. When you are done using that memory on the heap you have to "delete" it so when the pointer moves, the memory does not get lost and cause problems. It is kind of like a sneaky ninja and it goes and causes trouble.
A good way to think about how pointers can be good for objects is if you create say a deck class, well how many cards does the user want we wont know until runtime, so lets have them type in the number, and we can allocate it on the heap! Look Below
int main(void){
int count;
Deck *deck = nullptr;
std::cout << "Enter amount of cards please:";
std::cin >> count;
deck = new Deck(count); // RUNTIME ALLOCATION ON HEAP!!!!
// This is so cool right, you can have a deck of any size
// Do stuff with deck
// Now that we are done with the deck don't forget to delete it
// We do not need all those cards on the heap anymore so....
delete deck; // Ahhh almost done
deck = nullptr; // Just in case since dangling pointers are weird
}
The key thing to understand here is, what is a pointer, what is it pointing to, and how do they work with the memory!

stack variable lifetime curious example [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 8 years ago.
Please consider this simple example:
#include <iostream>
const int CALLS_N = 3;
int * hackPointer;
void test()
{
static int callCounter = 0;
int local = callCounter++;
hackPointer = &local;
}
int main()
{
for(int i = 0; i < CALLS_N; i++)
{
test();
std::cout << *hackPointer << "(" << hackPointer << ")";
std::cout << *hackPointer << "(" << hackPointer << ")";
std::cout << std::endl;
}
}
The output (VS2010, MinGW without optimization) has the same structure:
0(X) Y(X)
1(X) Y(X)
2(X) Y(X)
...
[CALLS_N](X) Y(X)
where X - some address in memory, Y - some rubbish number.
What is done here is the case of undefined behaviour. However I want to understand why there is such behaviour in current conditions (and it is rather stable for two compilers).
It seems that after test() call first read of hackPointer leads to valid memory, but second successive instant read of it leads to rubbish. Also on any call address of local is the same. I always thought that memory for stack variable is allocated on every function call and is released after return but I can't explain output of the program from this point of view.
"Releasing" automatic storage doesn't make the memory go away, or change the pattern of bits stored there. It just makes it available for reuse, and causes undefined behaviour if you try to access the object that used to be there.
Immediately after returning from the function, the memory occupied by the local probably hasn't been overwritten, so reading it will probably give the value that was assigned within the function.
After calling another function (in this case, operator<<()), the memory is likely to have been reused for a variable within that function, so probably has a different value.
You are quite right that this is undefined behaviour.
That aside, what's happening is that std::cout << *hackPointer involves a function call: operator<<() gets called after the value of *hackPointer has been read. In all likelihood, operator<<() uses its own local variables that end up on the stack where local was, wiping out the latter.

how can I corrupt the data segment of a linux process?

I am coding in C/C++. What are the possible ways to corrupt a static variable that is stored in the data segment? Is this considered a memory leak?
#include <stdio.h>
int aaa[5];
int bbb;
int main()
{
int i;
bbb=41;
for (i = 0; i < 6; ++i)
aaa[i] = 42;
printf("%d\n", bbb);
return 0;
}
the code above prints bbb=42 and not 41. this is a possible cause. another way is to modify
static data accessed via multiple threads.
Any other ways?
No, this is not a memory leak. A memory leak is when you allocate on the free store (with malloc/new) and then never free/delete the allocated block.
Note that this is undefined behaviour and is not guaranteed:
int bbb;
int aaa[5];
int main()
{
int i;
bbb=41;
for (i = 0; i < 6; ++i)
aaa[i] = 42;
printf("%d\n", bbb);
return 0;
}
g++ -o test test.cpp && ./test
41
In this particular case bbb is stored after aaa, but you shouldn't be relying on this at all cause it could be somewhere else.
Yes, there is more than one method to destroy the contents of global variables (your variables are not static in the example you posted).
Pointers are a good tool to corrupt memory and write where your program should not. Casting can also add some excitement:
#include <iostream>
using namespace std;
int aaa[5];
int bbb;
int main(void) // Do *your* main() functions always return a value????
{
double * ptr_double = 0;
// Assign the pointer to double to point to the last integer.
// Treat the last integer in the array as a double.
aaa[4] = 45;
cout << "aaa[4] = " << aaa[4] << endl;
ptr_double = (double *)(&aaa[4]);
*ptr_double = 3.14159267;
cout << "aaa[4] = " << aaa[4] << endl;
return -1;
}
With multiple threads, you can have each thread write to the global variable, then have them read back the value. Placing random delays, before writing, and after writing, can show you in more detail how it works.
Another method is to assign the address of your variable to destination register of an I/O hardware device, like a UART. When the UART receives data, it will place that data in that variable, with no regards to the purpose of the variable.
In general values are corrupted by code writing to a location it should not. The primary cause is buffer overrun: Writing more data than is allocated for the variable. Overruns can also occur from hardware devices such as DMA controllers and USB controllers. Another cause is via pointers: the pointer is pointing to an invalid location.
Variables can become corrupted by Stack Overflows and Heap Overflows. On many architectures these structures expand towards each other. Too many variables on the stack, or function recursions (or calling depth) can make the stack overwrite into the heap. Likewise, allocating too much memory from the heap can make the heap overwrite the stack.
Rather than exploring how to damage variables, I believe you should work on improving code safety: design and write your code so it has no buffer overruns, writes to correct locations and shared variables are protected from simultaneous writes by multiple tasks and threads.
All global variables, static or otherwise are initialized to the value specified in their declaration (or zero if not specified) by the loader prior to the execution of any code in the process. The default values will only be modified by code executed within the process (barring any outside interferrence from a debugger).
If you're seeing a "corrupted" value at the beginning of your program's main() function, it's most likely due to a bad action preformed from within the constructor of a global C++ object. The constructors for all global C++ objects are run prior to the invocation of main().
The easiest way to track down the source of this kind of corruption is probably to run the process under a debugger and set a watchpoint on the address of the global variable. The debugger will break when the data is modified. Odds are you have an array overrun or errant pointer problem. They can be a bitch to track down manually.
This is a typical programming error.
You define 5 items of an array of aaa while using 6 aaa's, which means you can only use aaa[0], aaa[1], aaa[2], aaa[3], and aaa[4]. aaa[5] is undefined.
This is a bug, a programming error, and nothing else. Period.

The meaning of static in C++

I thought I was fairly good with C++, it turns out that I'm not. A previous question I asked: C++ const lvalue references had the following code in one of the answers:
#include <iostream>
using namespace std;
int& GenX(bool reset)
{
static int* x = new int;
*x = 100;
if (reset)
{
delete x;
x = new int;
*x = 200;
}
return *x;
}
class YStore
{
public:
YStore(int& x);
int& getX() { return my_x; }
private:
int& my_x;
};
YStore::YStore(int& x)
: my_x(x)
{
}
int main()
{
YStore Y(GenX(false));
cout << "X: " << Y.getX() << endl;
GenX(true); // side-effect in Y
cout << "X: " << Y.getX() << endl;
return 0;
}
The above code outputs X: 100, X:200. I do not understand why.
I played with it a bit, and added some more output, namely, a cout before the delete x; and a cout after the new x; within the reset control block.
What I got was:
before delete: 0x92ee018
after new: 0x92ee018
So, I figured that static was silently failing the update to x, and the second getX was playing with (after the delete) uninitialized memory; To test this, I added a x = 0; after the delete, before the new, and another cout to ensure that x was indeed reset to 0. It was.
So, what is going on here? How come the new returns the exact same block of memory that the previous delete supposedly free'd? Is this just because that's what the OS's memory manager decided to do, or is there something special about static that I'm missing?
Thank you!
That's just what the memory manager decided to do. If you think about it, it makes a lot of sense: You just freed an int, then you ask for an int again... why shouldn't the memory manager give you back the int you just freed?
More technically, what is probably happening when you delete is that the memory manager is appending the memory block you freed to the beginning of the free list. Then when you call new, the memory manager goes scanning through its free list and finds a suitably sized block at the very first entry.
For more information about dynamic memory allocation, see "Inside storage allocation".
To your first question:
X: 100, X:200. I do not understand why.
Since Y.my_x is just a reference to the static *x in GenX, this is exactly how it supposed it to be - both are referencing to the same address in memory, and when you change the content of *x, you get a side effect.
You are accessing the memory block that is deallocated. By the c++ standard, that is an undefined behaviour, therefore anything can happen.
EDIT
I guess I have to draw :
You allocate the memory for an int, and you pass the object allocated on the heap to the constructor of Y, which stores that in the reference
then you deallocate that memory, but your object Y still holds the reference to the deallocated object
then you access the Y object again, which holds an invalid reference, referencing deallocated object, and the result you got is the result of an undefined behaviour.
EDIT2
The answer to why : implementation defined. The compiler can create new object at any location it likes.
i test your code in VC2008, the output is X: 100, X: -17221323. I think the reason is that the static x is freed. i think my test is reasonable.
This makes perfect sense for the code.
Remember that it is your pointer that is static, so when you enter this function a second time you do not need to make a new pointer, but ever time you enter this function you are making a new int for the pointer to point to.
You are also probably in debug mode where a bit more time is spent giving you nice addresses.
Exactly why the int that your pointer points to is in the same space is probably just down to pure luck, that and you are not declaring any other variables before it so the same space in memory is still free