C++: How to verify a deleted pointer - c++

I am learning about pointers in C++ currently, in college. I have coded a program that is a binary tree of objects that points to a linked list of sub-objects. IF I am even wording that correctly. Anyways, my program seems to work correctly, but I am having trouble wrapping my head around how to test pointer deletion.
For instance, my delete function for single object of the binary tree is:
void EmployeeRecord::destroyCustomerList()
{
if(m_oCustomerList != NULL)
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
}
When printing my tree, everything populates and is taken off correctly (meaning the tree is kept intact through every removal of a node)...but how do I confirm what happens to the deallocated memory? I know that since I am setting the pointer *m_oCustomerList to NULL, that I can test for a NULL value on a previously populated object, but what happens to the actual memory?
I am using Visual Studio/C++ and have read that the debugger will use a code starting at 0xCC for deallocated memory...but I can't seem to figure out how to use that information.

Note that your code
void EmployeeRecord::destroyCustomerList()
{
if(m_oCustomerList != NULL)
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
}
Simplifies to:
void EmployeeRecord::destroyCustomerList()
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
It is safe to invoke the delete operator on a null pointer in C++. It does nothing. In other words, the check for null is already "built in".
Once you delete an object, it no longer exists, and the pointer to that object becomes and indeterminate value (so it's not a bad idea to null out all copies of that pointer).
What really happens to the memory in actual C++ implementations, rather than in the abstract sense, is that it continues to exist at the same address, but is marked as free, so that it can be allocated for another purpose. An allocation request coming from the program (possibly a completely unrelated module) or possibly from another program in the system, could obtain that memory for its own use.
Any uses of a pointer to an object which no longer exists are "undefined behavior". Functions for safely verifying such a pointer do exist, but they are very platform-specific and rarely perfect.
The problem is that whereas it is not particularly hard for an implementation to confirm that a pointer is bad, it is not possible to confirm that a pointer is good. We can walk the internal memory data structures of the memory allocator to determine that some pointer refers to free storage. But what if the storage is subsequently allocated? Then the pointer no longer refers to free storage. But it does not refer to the original object which was allocated, either! This is known as an "ABA ambiguity": because some A changed into a B, but then back into A, indistinguishable from the original A.
Approaches exist to solve the ABA ambiguity (if not completely than at least partially). For instance, pointers be made "fat" so the they have an extra field in addition to the address bits. The field could contain a sequence number which is used to stamp the pointer that are returned from the allocator. Now when an object is deleted and reallocated, the new pointer to the same location has a different sequence number: we have ABA'. The pointer A has gone bad, making it B, but the when it is resurrected it comes back as A'. If we ask the system to validate A, it will correctly determine that A is bad, because it does not have the expected sequence number. The correct, valid pointer to the object is A', which does not match A.
However, sequence number fields are only so many bits wide and they will wrap around eventually. So the ABA problem has not really been solved. The validation of good versus bad pointers has only been made substantially more reliable. To absolutely deal with the ABA problem, the system must always hand out new pointers which are not equal to any pointers which could still be in use. This means never actually freeing anything (thereby running out of memory) or implementing garbage collection. (Meaning that delete actually does nothing: deleted objects are destructed, but stick around in memory until they are garbage-collected, which happens when the program no longer remembers any copies of the pointer. At that point, the program no longer remembers A, and so A can be re-introduced, and there is no ABA problem.)
To make all pointers "fat", you have to change the entire toolchain and runtime: compilers, libraries, et cetera. There are further difficulties because large programs tend to have multiple memory allocators. If you ask the wrong allocator "is this pointer valid", all it can say is "this pointer is not from my arena". Another approach you can do is to invent your own pointers and implement them as smart pointers in C++. Your pointers can support an is_valid method which tries to be as reliable as possible (dealing with the ABA problem somehow: either partially with some sequence numbers and such, or by implementing your own garbage collection scheme.)

Accessing deleted memory is undefined behaviour by the standard. For instance, if this was a multithreaded application (or some other process had injected a thread into your application) then a new allocation could allocate the memory you just deallocated before you are able to "verify" it.

Once you delete your memory and set your pointer to NULL you no longer have access to that memory even if you want it. So, there is no way to verify that it really gone. However, if you did something wrong and the memory was never deleted it would consist of a memory leak which would cause your program to increase the amount of ram it uses, you could see this as a symptom of a pointer not properly disposed of.
You will probably learn later that you will not have to worry about the deletion of your pointers because of std::shared_ptr which will delete your object when the pointer goes out of scope. Which will be safer later on because you will probably will learn that exceptions can cause your destructor to never fire leaving a memory leak.

...
...
delete m_oCustomerList;
// Try using the deleted pointer here
// This should cause a runtime exception
// which means you did free the pointer
m_oCustomerList->someStrMemberVariable = "This will fail"
...
...
Needless to say, don't do this in the actual code. Hope this helps.

Related

A pointer that can point to anywhere, how to determine if "delete" can be safely called on it?

Is there any way to distinguish two following situations at run time:
double ptr * = new double(3.14159);
double variable = 3.14159
double * testPtr_1 = ptr;
double * testPtr_2 = &variable;
delete testPtr_1 // fine...
delete testPtr_2 // BIG RUN TIME ERROR !!!
I have find myself in situation in with I need to call delete operator for some unknown pointer. The pointer can point to anywhere (to a "local" variable or to dynamically allocated variable).
How can I find out where my "unknown" pointer points, and therefore choose when to and when not to call operator delete on it
EDIT:
Ok I see that everyone is pointing me to the smart pointers, but what if I am trying to write my own set of smart pointers (that is The reason behind my question)?
There is no way to test if a pointer is pointing to a memory area that would be valid to delete. Moreover,
There is no way to tell between pointers that must be freed with delete vs. delete[],
There is no way to tell between the pointers that have been freed and pointers that have not been freed,
There is no way to tell among pointers to an automatic variable, pointers to static variable, and pointers to dynamically allocated blocks.
The approach that you should take is tracking allocations/deallocations by some other means, such as storing flags along with your pointers. However, this is rather tedious: a much better practice is to switch to smart pointers, which would track resources for you.
You need to set some better coding practices for yourself (or for your project).
Especially since most platforms have, at the very least, a C++11-compliant compiler, there's no reason not to be using the following paradigm:
Raw Pointers (T*) should ONLY be used as non-owning pointers. If you receive a T* as the input for a function or constructor, you should assume you have no responsibility for deleting it. If you have an instance or local variable that is a T*, you should assume you have no responsibility for deleting it.
Unique Pointers (std::unique_ptr<T>) should be used as single-ownership pointers, and in general, these should be your default go-to choice for any situation where you need to dynamically allocate memory. std::make_unique<T>() should be preferred for creating any kind of Unique Pointer, as this prevents you from ever seeing the raw pointer in use, and it prevents issues like you described in your original post.
Shared Pointers (std::shared_ptr<T> and std::weak_ptr<T>) should ONLY be used in situations where it is logically correct to have multiple owners of an object. These situations occur less often than you think, by the way! std::make_shared<T>() is the preferred method of creating Shared Pointers, for the same reasons as std::make_unique, and also because std::make_shared can perform some optimizations on the allocations, improving performance.
Vectors (std::vector<T>) should be used in situations where you need to allocate multiple objects into heap space, the same as if you called new T[size]. There's no reason to use pointers at all except in very exotic situations.
It should go without saying that you need to take my rules of "ONLY do 'x'" with a grain of salt: Occasionally, you will have to break those rules, and you might be in a situation where you need a different set of rules. But for 99% of use-cases, those rules are correct and will best convey the semantics you need to prevent memory leaks and properly reason about the behavior of your code.
You cannot.
Avoid raw pointers and use smart pointers, particularly std::unique_ptr. It conveys clearly who is responsible for deleting the object, and the object will be deleted when the std::unique_ptr goes out of scope.
When creating objects, avoid using new. Wrap them in a smart pointer directly and do not take addresses of anything to wrap it in a smart pointer. This way, all raw pointers will never need freeing and all smart pointers will get cleaned up properly when their time has come.
Okay, some things you can distinguish in a very platform-specific, implementation-defined manner. I won’t go into details here, because it’s essentially insane to do (and, again, depends on the platform and implementation), but you are asking for it.
Distinguish local, global and heap variables. This is actually possible on many modern architectures, simply because those three are different ranges of the address space. Global variables live in the data section (as defined by the linker and run-time loader), local variables on the stack (usually at the end of the address space) and heap variables live in memory obtained during run-time (usually not at the end of the address space and of course not overlapping the data and code sections, a.k.a. "mostly everything else"). The memory allocator knows which range that is and can tell you details about the blocks in there, see below.
Detect already-freed variables: you can ask the memory allocator that, possibly by inspecting its state. You can even find out when a pointer points into a allocated region and then find out the block to which it belongs. This is however probably computationally expensive to do.
Distinguishing heap and stack is a bit tricky. If your stack grows large and your program is running long and some piece of heap has been returned to the OS, it is possible that an address which formerly belonged to the heap now belongs to the stack (and the opposite may be possible too). So as I mentioned, it is insane to do this.
You can't reliably. This is why owning raw pointers are dangerous, they do not couple the lifetime to the pointer but instead leave it up to you the programmers to know all the things that could happen and prepare for them all.
This is why we have smart pointers now. These pointers couple the life time to the pointer which means the pointer is only deleted once it is no longer in use anywhere. This makes dealing with pointer much more manageable.
The cpp core guildlines suggests that a raw pointer should never be deleted as it is just a view. You are just using it like a reference and it's lifetime is managed by something else.
Ok I see that everyone is pointing me to the smart pointers, but what if I am trying to write my own set of smart pointers (that is The reason behind my question)?
In that case do like the standard smart pointers do and take a deleter which you default to just using delete. That way if the user of the class wants to pass in a pointer to a stack object they can specify a do nothing deleter and you smart pointer will use that and, well, do nothing. This puts the onus on the person using the smart pointer to tell the pointer how to delete what it points to. Normally they will never need to use something other than the default but if they happen to use a custom allocator and need to use a custom deallocator they can do so using this method.
Actually you can. But memory overhead occurs.
You overload new and delete operator and then keep track of allocations and store it somewhere(void *)
#include<iostream>
#include<algorithm>
using namespace std;
void** memoryTrack=(void **)malloc(sizeof(void *)*100); //This will store address of newly allocated memory by new operator(really malloc)
int cnt=0;//just to count
//New operator overloaded
void *operator new( size_t stAllocateBlock ) {
cout<<"in new";
void *ptr = malloc(stAllocateBlock); //Allocate memory using malloc
memoryTrack[cnt] = ptr;//Store it in our memoryTrack
cnt++; //Increment counter
return ptr; //return address generated by malloc
}
void display()
{
for(int i=0;i<cnt;i++)
cout<<memoryTrack[i]<<endl;
}
int main()
{
double *ptr = new double(3.14159);
double variable = 3.14159;
double * testPtr_1 = ptr;
double * testPtr_2 = &variable;
delete testPtr_1; // fine...
delete testPtr_2;
return 0;
}
Now the most important function(You will have to work on this because it is not complete)
void operator delete( void *pvMem )
{
//Just printing the address to be searched in our memoryTrack
cout<<pvMem<<endl;
//If found free the memory
if(find(memoryTrack,memoryTrack+cnt,pvMem)!=memoryTrack+cnt)
{
//cout<<*(find(memoryTrack,memoryTrack+cnt,pvMem));
cout<<"Can be deleted\n";
free (pvMem);
//After this make that location of memoryTrack as NULL
//Also keep track of indices that are NULL
//So that you can insert next address there
//Or better yet implement linked list(Sorry was too lazy to do)
}
else
cout<<"Don't delete memory that was not allocated by you\n";
}
Output
in new
0xde1360
0xde1360
Can be deleted
0xde1360
0x7ffe4fa33f08
Dont delete memory that was not allocated by you
0xde1360
Important Node
This is just basics and just code to get you started
Open for others to edit and make necessary changes/optimization
Cannot use STL, they use new operator(if some can implement them please do,it would help to reduce and optimize the code)

Nulling pointers defined within function

In the codebase I inherited, I notice that the previous coder would null pointers init'ed within the function before the function closes.
Something like:
void MainClass::run() {
MyClass* _classPtr = GetClassPtr(); // Assume no problems here.
// do stuff to _classPtr
_classPtr = nullptr; // Is this even necessary?
return;
}
I find it unnecessary since the pointer's memory (Not the object itself, just the pointer) should be freed up at function close. Is that true?
It's not necessary to do this. When the function goes out of scope the the memory used by the pointer will be release regardless of what it's set to. However, the object pointed to will not be released.
This is probably just a coding standard adopted by the previous developer.
The variable _classPtr will disappear when it goes out of scope, which happens when the function returns, so no there's no need to reassign it to be a null pointer.
Talking about returning, once the function reaches its end, it will return automatically, there's no need for an explicit return at the end of a function without return value.
You are correct, it's just a personal programming practice, and does not accomplish anything. The compiler is likely to optimize the whole thing away, so this will not actually end up producing any actual code at run time.
It could also be remnants of prior versions of this functions, that might have had more code that followed, but it was removed, except that the re-initialization part was forgotten.
only the memory allocated in the heap i.e memory initialized using new operator are not freed at closing the functions .
Local variables or auto variables defined inside the functions would be automatically freed .
One useful pattern that could be used is that each pointer is nulled immediately after the resources it holds are recycled.
If a pointers resources need not be recycled, and is no longer going to be used, it is nulled without the recycling.
Then, at any point in the function while debugging or reasoning about it, the only non-zero pointers are those that have "live" (valid) memory, or are in the midst of being cleaned up.
If you know a given pointer always holds a resource, you can "blindly" do an "if (ptr) delete ptr; ptr=0;` without worrying that someone has already cleaned it up.
On the other hand, a line of code saying ptr=0; without a delete right next to it tells you that the pointer was not owning any memory, as opposed to it being leaked. (or, someone deleted the delete accidentally, which is less likely)
You might think this is inefficient; but if the compiler can prove that nobody will read a pointer after a point where you set it to zero, the compiler can (under the as-if rule) remove the nulling of the pointer.
So the cost is zero, if the compiler can figure it out. If the code is complex enough that the compiler cannot figure it out, maybe nulling it redundantly would be a good idea anyhow?

Pointer pointing to deleted stack memory

I believe what I have just experienced is called "undefined behavior", but I'm not quite sure. Basically, I had an instance declared in an outer scope that holds addresses of a class. In the inner level I instantiated an object on the stack and stored the address of that instance into the holder.
After the inner scope had escaped, I checked to see if I could still access methods and properties of the removed instance. To my surprise it worked without any problem.
Is there a simple way to combat this? Is there a way I can clear deleted pointers from the list?
example:
std::vector<int*> holder;
{
int inside = 12;
holder.push_back(&inside);
}
cout << "deleted variable:" << holder[0] << endl;
Is there a simple way to combat this?
Sure, there are a number of ways to avoid this sort of problem.
The easiest way would be to not use pointers at all -- pass objects by value instead. i.e. In your example code, you could use a std::vector<int> instead of a std::vector<int *>.
If your objects are not copy-able for some reason, or are large enough that you think it will be too expensive to make copies of them, you could allocate them on the heap instead, and manage their lifetimes automatically using shared_ptr or unique_ptr or some other smart-pointer class. (Note that passing objects by value is more efficient than you might think, even for larger objects, since it avoids having to deal with the heap, which can be expensive... and modern CPUs are most efficient when dealing with contiguous memory. Finally, modern C++ has various optimizations that allow the compiler to avoid actually doing a data copy in many circumstances)
In general, retaining pointers to stack objects is a bad idea unless you are 100% sure that the pointer's lifetime will be a subset of the lifetime of the stack object it points to. (and even then it's probably a bad idea, because the next programmer who takes over the code after you've moved on to your next job might not see this subtle hazard and is therefore likely to inadvertently introduce dangling-pointer bugs when making changes to the code)
After the inner scope had escaped, I checked to see if I could still
access methods and properties of the removed instance. To my surprise
it worked without any problem.
That can happen if the memory where the object was hasn't been overwritten by anything else yet -- but definitely don't rely on that behavior (or any other particular behavior) if/when you dereference an invalid pointer, unless you like spending a lot of quality time with your debugger chasing down random crashes and/or other odd behavior :)
Is there a way I can clear deleted pointers from the list?
In principle, you could add code to the objects' destructors that would go through the list and look for pointers to themselves and remove them. In practice, I think that is a poor approach, since it uses up CPU cycles trying to recover from an error that a better design would not have allowed to be made in the first place.
Btw this is off topic but it might interest you that the Rust programming language is designed to detect and prevent this sort of error by catching it at compile-time. Maybe someday C++ will get something similar.
There is no such thing as deleted pointer. Pointer is just a number, representing some address in your process virtual address space. Even if stack frame is long gone, memory, that was holding it is still available, since it was allocated when thread started, so technically speaking, it is still a valid pointer, valid in terms, that you could dereference it and get something. But since object it was pointing is already gone, valid term will be dangling pointer. Moral is that if you have pointer to the object in the stack frame, there is no way to determine is it valid or not, not even using functions like IsBadReadPtr (Win32 API just for example). The best way to prevent such situations is avoid returning and storing pointers to the stack objects.
However, if you wish to track your heap allocated memory and automatically deallocate it after it is no longer used, you could utilize smart pointers (std::shared_ptr, boost::shared_ptr, etc).

Do I really have to nullify all the members in move constructor/move asigment or just pointers?

I'm having really hard time to learn nullifying the "other" object, I've just read the whole big article about move semantics here and I'm disappointed because it does not cover nullifying.
Please explain me do we really need to nullify all the members of "other" or just pointers that are pointing to dynamically allocated memory?
Why would we care about nullifying the members (of other object) that are not on the heap? any good reason?
After you've moved from the object, its destructor is still going to get called once it's cleaned up. This means that it needs to know whether or not it should free resources it owns, and hence they need to be nullified (or whatever you call it!) so pointers don't get double-deleted etc.
You need to leave the object that has been moved from in a valid state. I take this to mean that no use of that object should result in undefined behaviour. In practice, that means not leaving pointers pointing to released memory. I can imagine there could be other scenarios, such as if the object is holding onto any other resources (open sockets, files, whatever).
I dont know if you actually mean this in case of just constructor or even in case of normal pointers allocated dynamically .
But Nullyfing a pointer saves it from being termed as Dangling pointer as it points to null after the memory allocated for it is free. Further use of these pointers and dereferencing those will lead to core dumps which are easy to locate than when not done so . Not doing so can give unexpected results and lead to core dumps hard to find and even data corruption.
In short You should take care of only nullyfing the pointer.
As usual, "it depends".
The moved-from object will still have to be destroyed, or possibly assigned a new value. You have to leave it in a state where these operations will work.
If you have "stolen" pointers to dynamic data, you will probably have to assign null to the original pointers so they can be deleted. If one of the integer members holds the size of the object pointed to, you might have to zero that (to be consistent with the null pointer). Other values can possibly be left as they are.

Deleting memory and effect on data in concerned locations

I am relatively new to programming so this may well sound like a stupid question to you seasoned pros out there. Here goes:
In C++, when I use the delete operator on arrays, I have noticed that the data contained in the released memory locations is preserved. For example:
int* testArray=new int[5];
testArray[3]=24;
cout<<testArray[3]; //prints 24
delete [] testArray;
cout<<testArray[3]; // still prints 24
Subsequently, am I right in assuming that since testArray[3] still prints 42 , the data in the deleted memory location is still preserved? If so, does this notion hold true for other languages, and is there any particular reason for this?
Shouldn't "freed" memory locations have null data, or is "free memory" just a synonym for memory that can be used by other applications, irrespective of whether the locations contain data or not?
I've noticed this is not the case when it comes to non array types such as int, double etc. Dereferencing and outputting the deleted variable prints 0 rather than the data. I also have a sneaking suspicion that I might be using wrong syntax to delete testArray, which will probably make this question all the more stupid. I'd love to hear your thoughts nonetheless.
Once you deallocate the memory by calling delete and try to access the memory at that address again it is an Undefined Behavior.
The Standard does not mandate the compilers to do anything special in this regard. It does not ask the compilers to mark the de-allocated memory with 0 or some special magic numbers.It is left out as the implementation detail for the compilers. Some compiler implementations do mark such memory with some special magic numbers but it is left up to each compiler implementation.
In your case, the data, still exists at the deallocated addresses because perhaps there is no other memory requirement which needed that memory to be re-utilized and the compiler didn't clear the contents of previous allocation(since it is not needed to).
However, You should not rely on this at all as this might not be the case always. It still is and will be an Undefined Behavior.
EDIT: To answer the Q in comment.
The delete operator does not return any value so you cannot check the return status however the Standard guarantees that the delete operator will sucessfully do it's job.
Relevant quote from the C++03 Standard:
Section §3.7.3.2.4:
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, render-ing invalid all pointers referring to any part of the deallocated storage.
The data is still there, because when you free, it frees it in the allocation table -- the system would be very slow if it had to zero over all the memory each time free() or delete is called.
This is the same in any language.
I think the non-array types were set to zero because they were in fact statically allocated rather than dynamically allocated.
non-POD data will be altered in a destructor (which might appear as being null-ed in a debugger).
Freed data is just usable, indeed.
You can NOT depend on the data being unaltered after delete. On a related note, debugging malloc's or runtime libraries will frequently reset the data to a specific signature (0xdeadbeef, 0xdcdcdcdc etc) so you can easily spot accesses to deleted memory in a debugger.