It has been my observation that if free( ptr ) is called where ptr is not a valid pointer to system-allocated memory, an access violation occurs. Let's say that I call free like this:
LPVOID ptr = (LPVOID)0x12345678;
free( ptr );
This will most definitely cause an access violation. Is there a way to test that the memory location pointed to by ptr is valid system-allocated memory?
It seems to me that the memory management part of the Windows OS kernel must know what memory has been allocated and what memory remains for allocation. Otherwise, how could it know if enough memory remains to satisfy a given request? (rhetorical) That said, it seems reasonable to conclude that there must be a function (or set of functions) that would allow a user to determine whether a pointer is valid system-allocated memory. Perhaps Microsoft has not made these functions public. If Microsoft has not provided such an API, I can only presume that it was for an intentional and specific reason. Would providing such a hook into the system pose a significant threat to system security?
Situation Report
Although knowing whether a memory pointer is valid could be useful in many scenarios, this is my particular situation:
I am writing a driver for a new piece of hardware that is to replace an existing piece of hardware that connects to the PC via USB. My mandate is to write the new driver such that calls to the existing API for the current driver will continue to work in the PC applications in which it is used. Thus the only required change to existing applications is to load the appropriate driver DLL(s) at startup. The problem here is that the existing driver uses a callback to send received serial messages to the application; a pointer to allocated memory containing the message is passed from the driver to the application via the callback. It is then the responsibility of the application to call another driver API to free the memory by passing back the same pointer from the application to the driver. In this scenario the second API has no way to determine whether the application has actually passed back a pointer to valid memory.
There are actually some functions called IsBadReadPtr(), IsBadWritePtr(), IsBadStringPtr(), and IsBadCodePtr() that might seem to do the job, but do not ever use them. I mention them only so that you are aware that these options are not to be pursued.
You're much better off making sure you set all your pointers to NULL or 0 when they point to nothing and checking against that.
For example:
// Set ptr to zero right after deleting the pointee.
delete ptr; // It's okay to call delete on zero pointers, but it
            // certainly doesn't hurt to check.
ptr = 0;
Note: The check against zero might be a performance issue on some compilers (see the section "Code Size" on this page), so it might actually be worth it to do a self-test against zero first.
// Set ptr to zero right after freeing the pointee.
if(ptr != 0)
{
    free(ptr); // According to Matteo Italia (see comments)
               // it's also okay to pass a zero pointer, but
               // again it doesn't hurt.
    ptr = 0;
}
// Initialize to zero right away if this won't take on a value for now.
void* ptr = 0;
Even better is to use some variant of RAII and never have to deal with pointers directly:
class Resource
{
public:
// You can also use a factory pattern and make this constructor
// private.
Resource() : ptr(0)
{
ptr = malloc(42); // Or new[] or AcquireArray() or something
// Fill ptr buffer with some valid values
}
// Allow users to work directly with the resource, if applicable
void* GetPtr() const { return ptr; }
~Resource()
{
if(ptr != 0)
{
free(ptr); // Or delete[] or ReleaseArray() or something
// Assignment not actually necessary in this case since
// the destructor is always the last thing that is called
// on an object before it dies.
ptr = 0;
}
}
private:
void* ptr;
};
Or use the standard containers if applicable (which is really an application of RAII):
std::vector<char> arrayOfChars;
Short answer: No.
There is a function in Windows that supposedly tells you if a pointer points to real memory (IsBadReadPtr() and its ilk) but it doesn't work and you should never use it!
The true solution to your problem is to always initialize pointers to NULL, and reset them to NULL once you've deleted them.
EDIT based on your edits:
You're really hinting at a much larger question: How can you be sure your code continues to function properly in the face of client code that screws up?
This really should be a question on its own. There are no simple answers. But it depends on what you mean by "continue to function properly."
There are two theories. One says that even if client code sends you complete crap, you should be able to trudge along, discarding the garbage and processing the good data. A key to accomplishing this is exception handling. If you catch an exception when processing the client's data, roll your state back and try to return as if they had never called you at all.
The other theory is to not even try to continue, and to just fail. Failing can be graceful, and should include some comprehensive logging so that the problem can be identified and hopefully fixed in the lab. Kick up error messages. Tell the user some things to try next time. Generate minidumps, and send them automatically back to the shop. But then, shut down.
I tend to subscribe to the second theory. When client code starts sending crap, the stability of the system is often at risk. They might have corrupted heaps. Needed resources might not be available. Who knows what the problem might be. You might get some good data interspersed with bad, but you dont even know if the good data really is good. So shut down as quickly as you can, to mitigate the risk.
To address your specific concern, I don't think you have to worry about checking the pointer. If the application passes your DLL an invalid address, it represents a memory management problem in the application. No matter how you code your driver, you can't fix the real bug.
To help the application developers debug their problem, you could add a magic number to the object you return to the application. When your library is called to free an object, check for the number, and if it isn't there, print a debug warning and don't free it! I.e.:
#define DATA_MAGIC 0x12345678

struct data {
    int foo;   /* The actual object data. */
    int magic; /* Magic number for memory debugging. */
};

struct data *api_recv_data() {
    struct data *d = malloc(sizeof(*d));
    d->foo = whatever;      /* fill in the real payload here */
    d->magic = DATA_MAGIC;
    return d;
}

void api_free_data(struct data *d) {
    if (d != NULL && d->magic == DATA_MAGIC) {
        d->magic = 0;       /* so a second free of the same block is caught */
        free(d);
    } else {
        fprintf(stderr, "api_free_data() asked to free invalid data %p\n", (void *)d);
    }
}
This is only a debugging technique. It will work correctly if the application has no memory errors. If the application does have problems, this will probably alert the developer to the mistake. It only works because your actual problem is much more constrained than your initial question indicates.
No, you are supposed to know if your pointers point to correctly allocated memory.
No. You are only supposed to have a pointer to memory that you know is valid, usually because you allocated it in the same program. Track your memory allocations properly and then you won't even need this!
Also, you are invoking Undefined Behaviour by attempting to free an invalid pointer, so it may crash or do anything at all.
Also, free is a function of the C++ Standard Library inherited from C, not a WinAPI function.
First of all, in the standard there's nothing that guarantees such a thing (freeing a non-malloced pointer is undefined behavior).
Anyhow, going through free is just a twisted route to simply trying to access that memory; if you wanted to check whether the memory pointed to by a pointer is readable/writable on Windows, you really should just try it and be ready to deal with the SEH exception; this is actually what the IsBadXxxPtr functions do, by translating such an exception into their return code.
However, this is an approach that hides subtle bugs, as explained in this post by Raymond Chen; so, long story short, no, there's no safe way to determine if a pointer points to something valid, and I think that, if you need to have such a test somewhere, there's some design flaw in that code.
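For illustration only, here is a minimal sketch (assuming MSVC structured exception handling and the Win32 headers) of the kind of probe IsBadReadPtr performs internally. It shows why the result is nearly meaningless: any readable page "passes", including pages that belong to the heap's own bookkeeping or to unrelated live objects.

#include <windows.h>

// Returns true if every byte in [p, p + len) is readable *right now*.
// This tells you nothing about whether p came from malloc, is safe to
// free, or will still be readable a moment later.
bool probe_readable(const void* p, size_t len)
{
    __try
    {
        const volatile char* q = static_cast<const volatile char*>(p);
        volatile char sink = 0;
        for (size_t i = 0; i < len; ++i)
            sink = q[i];          // faults if the page is not readable
        (void)sink;
        return true;
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        return false;             // swallowed an access violation
    }
}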
I'm not going to echo what every one has already said, just to add to those answers though, this is why smart pointers exist - use them!
Any time you find yourself having to work around crashes due to memory errors - take a step back, a large breath, and fix the underlying problem - it's dangerous to attempt to work around them!
EDIT based on your update:
There are two sane ways that I can think of to do this.
The client application provides a buffer where you put the message, meaning your API does not have worry about managing that memory - this requires changes to your interface and client code.
You change the semantics of the interface, and force the clients of the interface not to worry about memory management (i.e. you call back with a pointer to something that is only valid in the context of the callback; if the client requires it, they make their own copy of the data). This does not change your interface - you can still call back with a pointer - however your clients will need to check that they don't use the buffer outside of that context. Potentially, if they do, it's probably not what you want, so it could be a good thing that they fix it(?)
Personally I would go for the latter as long as you can be sure that the buffer is not used outside of the callback. If it is, then you'll have to use hackery (such as has been suggested with the magic number - though this is not always guaranteed to work, for example let's say there was some form of buffer overrun from the previous block, and you somehow over-write the magic number with crap - what happens there?)
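A rough sketch of what the second option could look like on the driver side; the type and function names here are hypothetical, not the existing API:

typedef void (*MessageCallback)(const unsigned char* data, unsigned int length);

// Driver side: the buffer is owned by the driver and is only guaranteed
// to be valid for the duration of the callback. Clients that need the
// message for longer must copy it, and no "free" API is needed at all.
void deliver_message(const unsigned char* payload, unsigned int length,
                     MessageCallback client_callback)
{
    client_callback(payload, length);
    // After this point the driver may reuse or release the buffer.
}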
Application memory management is up to the application developer to maintain, not the operating system (even in managed languages, the operating system doesn't do that job, a garbage collector does). If you allocate an object on the heap, it is your responsibility to free it properly. If you fail to do so, your application will leak memory. The operating system (in the case of Windows at least) does know how much memory it has let your application have, and will reclaim it when your application closes (or crashes), but there is no documented way (that works) to query a memory address to see if it is an allocated block.
The best suggestion I can give you: learn to manage your memory properly.
Not without access to the internals of the malloc implementation.
You could perhaps identify some invalid pointers (e.g., ones that don't point anywhere within your process's virtual memory space), but if you take a valid pointer and add 1 to it, it will be invalid for calling free() but will still point within system-allocated memory. (Not to mention the usual problem of calling free on the same pointer more than once).
Aside from the obvious point made by others about this being very bad practice, I see another problem.
Just because a particular address doesn't cause free() to generate an access violation, does not mean it's safe to free that memory. The address could actually be an address within the heap so that no access violation occurs, and freeing it would result in heap corruption. Or it might even be a valid address to free, in which case you've freed some block of memory that might still be in use!
You've really offered no explanation of why such a poor approach should even be considered.
You apparently have determined that you're done with an object that you currently have a pointer to and if that object was malloced you want to free it. This doesn't sound like an unreasonable idea, but the fact that you have a pointer to an object doesn't tell you anything about how that object was allocated (with malloc, with new, with new[], on the stack, as shared memory, as a memory-mapped file, as an APR memory pool, using the Boehm-Demers-Weiser garbage collector, etc.) so there is no way to determine the correct way to deallocate the object (or if deallocation is needed at all; you may have a pointer to an object on the stack). That's the answer to your actual question.
But sometimes it's better to answer the question that should have been asked. And that question is "how can I manage memory in C++ if I can't always tell things like 'how was this object allocated, and how should it be deallocated'?" That's a tricky question, and, while it's not easy, it is possible to manage memory if you follow a few policies. Whenever you hear people complain about properly pairing each malloc with free, each new with delete and each new[] with delete[], etc., you know that they are making their lives harder than necessary by not following a disciplined memory management regime.
I'm going to make a guess that you're passing pointers to a function and when the function is done you want it to clean up the pointers. This policy is generally impossible to get right. Instead I would recommend following a policy that (1) if a function gets a pointer from somebody else, then that "somebody else" is expected to clean up (after all, that "somebody else" knows how the memory was allocated) and (2) if a function allocates an object, then that function's documentation will say what method should be used to deallocate the object. Second, I would highly recommend smart pointers and similar classes.
Stroustrup's advice is:
If I create 10,000 objects and have pointers to them, I need to delete those 10,000 objects, not 9,999, and not 10,001. I don't know how to do that. If I have to handle the 10,000 objects directly, I'm going to screw up. ... So, quite a long time ago I thought, "Well, but I can handle a low number of objects correctly." If I have a hundred objects to deal with, I can be pretty sure I have correctly handled 100 and not 99. If I can get the number down to 10 objects, I start getting happy. I know how to make sure that I have correctly handled 10 and not just 9."
For instance, you want code like this:
#include <cstdlib>
#include <iostream>
#include "boost/shared_ptr.hpp"
namespace {
// as a side note, there is no reason for this sample function to take int*s
// instead of ints; except that I need a simple function that uses pointers
int foo(int* bar, int* baz)
{
// note that since bar and baz come from outside the function, somebody
// else is responsible for cleaning them up
return *bar + *baz;
}
}
int main()
{
boost::shared_ptr<int> quux(new int(2));
// note, I would not recommend using malloc with shared_ptr in general
// because the syntax sucks and you have to initialize things yourself
boost::shared_ptr<int> quuz(reinterpret_cast<int*>(std::malloc(sizeof(int))), std::free);
*quuz = 3;
std::cout << foo(quux.get(), quuz.get()) << '\n';
}
Why would 0x12345678 necessarily be invalid? If your program uses a lot of memory, something could be allocated there. Really, there's only one pointer value you should absolutely rely on being an invalid allocation: NULL.
C++ does not use 'malloc' but 'new', which usually has a different implementation; therefore 'delete' and 'free' can't be mixed – neither can 'delete' and 'delete[]' (its array-version).
A DLL may have its own heap (for example, its own copy of the C runtime), so memory allocated inside the DLL can't safely be handed to the allocator of the non-DLL code, and vice versa.
Every API and language has its own memory management for its own type of memory-objects.
I.e.: You do not 'free()' or 'delete' open files, you 'close()' them. The same goes for every other API, even if the type is a pointer to memory instead of a handle.
Related
Writing a DLL for file manipulation, I'm running into some issues.
To read bytes from a file via file.read I require a char* array of the desired length.
Since the length is variable, I cannot use
char* ret_chars[next_bytes];
It gives the error that next_bytes is not a constant.
Another topic here in StackOverflow says to use:
char* ret_chars = new char[next_bytes];
Creating it with "new" requires to use "delete" later though, as far as I know.
Now, how am I supposed to delete the array if the return-value of this function is supposed to be exactly this array?
Isn't it a memory leak if I don't use "delete" anywhere?
If that helps anything: This is a DLL I'll be calling from "Game Maker". Therefore I don't have the possibility to delete anything afterwards.
Hope someone can help me!
When you're writing a callback which will be invoked by existing code, you have to follow its rules.
Assuming that the authors of "Game Maker" aren't complete idiots, they will free the memory you return. So you have to check the documentation to find out what function they will use to free the memory, and then you have to call the matching allocator.
In these cases, the framework usually will provide an allocation function which is specially designed for you to use to allocate a return buffer.
Another common approach is that you never return a buffer allocated by the callback. Instead, the framework passes a buffer to your callback, and you simply fill it in. Check the documentation for that possibility as well.
Is there no sample code for writing "Game Maker" plugins/extensions?
It looks like the developers are indeed complete idiots, at least when it comes to design of plugin interfaces, but they do provide some guidance.
Note that you have to be careful with memory management. That is why I declared the resulting string global.
This implies that the Game Maker engine makes no attempt to free the returned buffer.
You too can use a global, or indeed any variable with static storage duration such as a function-local static variable. std::vector<char> would be a good choice, because it's easy to resize. This way, every time the function is called, the memory allocated for the previous call will be reused or freed. So your "leak" will be limited to the amount you return at once.
#include <vector>

char* somefunc( void )
{
    static std::vector<char> ret_buffer;
    ret_buffer.resize(next_bytes);   // next_bytes determined as before
    // fill it in, blah blah
    return &ret_buffer[0];
}
// std::string and return ret_string.c_str(); is another reasonable option
Your script in Game Maker Language will be responsible for making a copy of that result string before it calls your function again and overwrites it.
The new char[ n ] trick works with a runtime value, and yes - you need to delete[] the array when you're done with it or it leaks.
If you are unable to change how "Game Maker" (whatever that is) works, then the memory will be leaked.
If you can change "Game Maker" to do the right thing, then it must manage the lifetime of the returned array.
That's the real problem here - the DLL code can't know when it's no longer needed, so the calling code needs to delete it when it's done, but the calling code cannot delete it directly - it must call back to the DLL to delete it, since it was the DLL's memory manager that allocated it in the first place.
Since you say the return value must be a char[], you therefore need to export a second function from your DLL that takes the char[], and calls delete[] on it. The calling code can then call that function when it's finished with the array returned previously.
Use vector <char *> (or vector <char> depending on which you really want - the question isn't entirely clear), that way, you don't need to delete anything.
You cannot use new inside a function without calling delete somewhere, or your application will leak memory (which is a bad thing, because EVENTUALLY you'll have no memory left). There is no EASY solution for this that doesn't have some relatively strict restrictions in some way or another.
The first code sample you quoted allocates memory on the stack.
The second code sample you quote allocates memory on the heap. (Two totally different concepts).
If you are returning the array, then the function allocating the memory does not free it. It is up to the caller to delete the memory. If the caller forgets, then yes, it is a memory leak.
First, if you use new char[], you can't use delete; you have to use delete [].
But like you said, if you use new [] in this function without using delete [] at the end, your program will be leaking.
If you want a kind of garbage collection, you can use the smart pointers now in the standard C++ library.
I think a `shared_ptr` would be good to achieve what you want.
> **Shared ptr** : Manages the storage of a pointer, providing a limited garbage-collection facility, possibly sharing that management with other objects.
Here is some documentation about it : http://www.cplusplus.com/reference/memory/shared_ptr/
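As a minimal sketch (inside your own C++ code only; a smart pointer cannot cross the DLL boundary into Game Maker, which is why other answers suggest a static buffer instead), a runtime-sized char buffer can be given automatic cleanup like this:

#include <cstddef>
#include <memory>

// delete[] is called automatically when the last owner goes away.
std::shared_ptr<char> make_buffer(std::size_t next_bytes)
{
    return std::shared_ptr<char>(new char[next_bytes],
                                 std::default_delete<char[]>());
}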
Ok, I'll jump in as well.
If the Game Maker doesn't explicitly say it will delete this memory, then you should check to see just how big a buffer it wants and pass in a static buffer of that size instead. This avoids all sorts of nastiness relating to cross dll versioning issues with memory management. There has to be some documentation on this in their code or their API and I strongly suggest you find and read it. Game Maker is a pretty large and well known API so Google should work for info if you don't have the docs yourself.
If you're returning a char pointer, which it looks as though you are, then you can simply call delete on that pointer.
Example:
char * getString()
{
    char* ret_chars = new char[next_bytes];  // next_bytes as computed earlier
    strcpy(ret_chars, "Hello world");        // assumes next_bytes is large enough
    return ret_chars;
}

void displayChars()
{
    char* chars = getString();
    cout << chars;
    delete [] chars;
}
Just be sure to deallocate (delete) all allocated (new'd) pointers, or else you'll have memory leaks, where memory is allocated, never reclaimed while the program runs, and becomes unusable. A quick and dirty way to check whether you've deallocated all allocated space is to count your news and count your deletes; they should match 1-to-1 unless some appear in conditional or looped blocks.
I have a plugin architecture, where I call functions in a dynamic library and they return me a char* which is the answer, it is used at some later stage.
This is the signature of a plugin function:
char* execute(ALLOCATION_BEHAVIOR* free_returned_value, unsigned int* length);
where ALLOCATION_BEHAVIOR must be either DO_NOT_FREE_ME, FREE_ME, or DELETE_ME, through which the plugin (in the library) tells me how it allocated the string it has just returned: DO_NOT_FREE_ME tells me this is a variable I'm not supposed to touch (such as a const static char* which never changes), FREE_ME tells me I should use free() to free the returned value, and DELETE_ME tells me to use delete[] to release it.
Obviously, I don't trust the plugins, so I would like to be able to check that if he tells me to free() the variable, indeed it is something that can really be freed ... Is this possible using todays' C/C++ technology on Linux/Windows?
Distinguishing between malloc/free and new/delete is generally not possible, at least not in a reliable and/or portable way. Even more so as new simply wraps malloc anyway in many implementations.
None of the following alternatives to distinguish heap/stack have been tested, but they should all work.
Linux:
Solution proposed by Luca Tettananti, parse /proc/self/maps to get the address range of the stack.
As the first thing at startup, clone your process, this implies supplying a stack. Since you supply it, you automatically know where it is.
Call GCC's __builtin_frame_address function with increasing level parameter until it returns 0. You then know the depth. Now call __builtin_frame_address again with the maximum level, and once with a level of 0. Anything that lives on the stack must necessarily be between these two addresses.
sbrk(0) as the first thing at startup, and remember the value. Whenever you want to know if something is on the heap, sbrk(0) again -- something that's on the heap must be between the two values. Note that this will not work reliably with allocators that use memory mapping for large allocations.
Knowing the location and size of the stack (alternatives 1 and 2), it's trivial to find out if an address is within that range. If it's not, it's necessarily "heap" (unless someone tries to be super smart-ass and gives you a pointer to a static global, or a function pointer, or such...).
Windows:
Use CaptureStackBackTrace, anything living on the stack must be between the returned pointer array's first and last element.
Use GCC-MinGW (and __builtin_frame_address, which should just work) as above.
Use GetProcessHeaps and HeapWalk to check every allocated block for a match. If none match for none of the heaps, it's consequently allocated on the stack (... or a memory mapping, if someone tries to be super-smart with you).
Use HeapReAlloc with HEAP_REALLOC_IN_PLACE_ONLY and with exactly the same size. If this fails, the memory block starting at the given address is not allocated on the heap. If it "succeeds", it is a no-op.
Use GetCurrentThreadStackLimits (Windows 8 / 2012 only) - see the sketch after this list
Call NtCurrentTeb() (or read fs:[18h]) and use the fields StackBase and StackLimit of the returned TEB.
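A minimal sketch of the GetCurrentThreadStackLimits alternative (Windows 8 / Server 2012 and later); note it only answers "is this address on the calling thread's stack", nothing more:

#include <windows.h>

// True if p lies within the calling thread's stack. Pointers into another
// thread's stack will (correctly, for this thread) report false here.
bool is_on_current_thread_stack(const void* p)
{
    ULONG_PTR low = 0, high = 0;
    GetCurrentThreadStackLimits(&low, &high);
    ULONG_PTR addr = reinterpret_cast<ULONG_PTR>(p);
    return addr >= low && addr < high;
}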
I asked the same question a couple of years ago on comp.lang.c, and I liked the response from James Kuyper:
Yes. Keep track of it when you allocate it.
The way to do this is to use the concept of ownership of memory. At
all times during the lifetime of a block of allocated memory, you
should always have one and only one pointer that "owns" that block.
Other pointers may point into that block, but only the owning pointer
should ever be passed to free().
If at all possible, an owning pointer should be reserved for the
purpose of owning pointers; it should not be used to store pointers to
memory it does not own. I generally try to arrange that an owning
pointer is initialized with a call to malloc(); if that's not
feasible, it should be set to NULL sometime before first use. I also
try to make sure that the lifetime of an owning pointer ends
immediately after I free() the memory it owns. However, when that's
not possible, set it to NULL immediately after free()ing that memory.
With those precautions in place, you should not let the lifetime of a
non-null owning pointer end without first passing it to free().
If you have trouble keeping track of which pointers are 'owning'
pointers, put a comment about that fact next to their declaration. If
you have lots of trouble, use a naming convention to keep track of
this feature.
If, for any reason, it is not possible to reserve an owning pointer
variable exclusively for ownership of the memory it points at, you
should set aside a separate flag variable to keep track of whether or
not that pointer currently owns the memory it points at. Creating a
struct that contains both the pointer and the ownership flag is a very
natural way to handle this - it ensures that they don't get separated.
If you have a rather complicated program, it may be necessary to
transfer ownership of memory from one owning pointer variable to
another. If so, make sure that any memory owned by target pointer is
free()d before the transfer, and unless the lifetime of the source
pointer ends immediately after the transfer, set the source pointer to
NULL. If you're using ownership flags, reset them accordingly.
The plugin/library/whatever should not be returning an enum through a passed 'ALLOCATION_BEHAVIOR*' pointer. It's messy, at best. The 'deallocation' scheme belongs with the data and should be encapsulated with it.
I would prefer to return an object pointer of some base class that has a virtual 'release()' function member that the main app can call whenever it wants/needs to, and which handles the 'deallocation' as required for that object. release() could do nothing, repool the object in a cache specified in a private data member of the object, or just delete it, depending on whatever override is applied by the plugin subclasses.
If this is not possible because the plugin is written in a different language, or built with a different compiler, the plugin could return a function as well as the data so that the main app can call it back with the data pointer as a parameter for the purpose of deallocation. This at least allows you to put the char* and function* into the same object/struct on the C++ side, so maintaining at least some semblance of encapsulation and allowing the plugin to choose any deallocation scheme it wants to.
Edit - a scheme like this would also work safely if the plugin used a different heap than the main app - maybe it's in a DLL that has its own sub-allocator.
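A minimal sketch of that idea (the names here are illustrative, not an existing API): the plugin hands back an object that knows how to release its own payload, so the host never has to guess between free() and delete[]:

struct PluginResult
{
    virtual const char* data() const = 0;
    virtual unsigned int length() const = 0;
    virtual void release() = 0;   // plugin decides: delete this, repool, no-op...
protected:
    virtual ~PluginResult() {}    // destruction only via release()
};

// Host side:
//   PluginResult* r = plugin->execute();
//   consume(r->data(), r->length());
//   r->release();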
On Linux you can parse /proc/self/maps to extract the location of the stack and of the heap and then check whether the pointer falls into one of ranges.
This won't tell you if the memory should be handled by free or delete though. If you control the architecture you can let the plugin free the allocated memory adding the appropriate API (IOW, a plugin_free function that is symmetrical to your execute). Another common pattern is to keep track of the allocations in a context object (created at init time) that is passed to the plugin at each call and is then used by the plugin at shutdown to do the clean up.
I'm using the following code to check student assignments. Returning stack memory is a common pitfall, so I wanted to automatically check for it.
Using sbrk
This method should work on all Unix variants and on all CPU architectures.
#include <unistd.h>
#include <stdlib.h>
#include <stdbool.h>
#include <assert.h>
bool points_to_heap(void* init_brk, void* pointer){
void* cur_brk = sbrk(0);
return ((init_brk <= pointer) && (pointer <= cur_brk));
}
int main(void){
void* init_brk = sbrk(0);
int* heapvar = malloc(10);
int i = 0;
int* stackvar = &i;
assert(points_to_heap(init_brk, heapvar));
assert(!points_to_heap(init_brk, stackvar));
return 0;
}
Using /proc/self/maps
Two issues with this method:
This code is specific to Linux running on a 64-bit x86 CPU.
This method doesn't seem to work in unit tests written using the libcheck framework. There, all stack variables are also seen as heap variables.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <stdint.h>
#include <inttypes.h>
void get_heap_bounds(uint64_t* heap_start, uint64_t* heap_end){
FILE *stream;
char *line = NULL;
size_t len = 0;
ssize_t nread;
stream = fopen("/proc/self/maps", "r");
while ((nread = getline(&line, &len, stream)) != -1) {
if (strstr(line, "[heap]")){
sscanf(line, "%" SCNx64 "-%" SCNx64 "", heap_start, heap_end);
break;
}
}
free(line);
fclose(stream);
}
bool is_heap_var(void* pointer){
uint64_t heap_start = 0;
uint64_t heap_end = 0;
get_heap_bounds(&heap_start, &heap_end);
if (pointer >= (void*)heap_start && pointer <= (void*)heap_end){
return true;
}
return false;
}
Feedback on this code is welcome!
How do they allocate something on the stack that you can then free, as they've returned? That's just going to die horribly. Even using it is going to die horribly.
If you want to check whether they've returned you a pointer to static data, then you probably want to get hold of your heap top and bottom (which I'm pretty sure is available on Linux, using sbrk), and see if the returned pointer is in that range or not.
Of course, it's possible that even a valid pointer in that range shouldn't be freed because they've stashed another copy to it which they're going to use later. And if you're not going to trust them, you should not trust them at all.
You have to use some debugging tools to determine whether the pointer is on stack or on heap.
On windows, download Sysinternals Suite. This provides various tool for debugging.
I am learning about pointers in C++ currently, in college. I have coded a program that is a binary tree of objects that points to a linked list of sub-objects. IF I am even wording that correctly. Anyways, my program seems to work correctly, but I am having trouble wrapping my head around how to test pointer deletion.
For instance, my delete function for single object of the binary tree is:
void EmployeeRecord::destroyCustomerList()
{
if(m_oCustomerList != NULL)
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
}
When printing my tree, everything populates and is taken off correctly (meaning the tree is kept intact through every removal of a node)...but how do I confirm what happens to the deallocated memory? I know that since I am setting the pointer *m_oCustomerList to NULL, that I can test for a NULL value on a previously populated object, but what happens to the actual memory?
I am using Visual Studio/C++ and have read that the debugger will use a code starting at 0xCC for deallocated memory...but I can't seem to figure out how to use that information.
Note that your code
void EmployeeRecord::destroyCustomerList()
{
if(m_oCustomerList != NULL)
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
}
Simplifies to:
void EmployeeRecord::destroyCustomerList()
{
delete m_oCustomerList;
m_oCustomerList = NULL;
}
It is safe to invoke the delete operator on a null pointer in C++. It does nothing. In other words, the check for null is already "built in".
Once you delete an object, it no longer exists, and the pointer to that object becomes an indeterminate value (so it's not a bad idea to null out all copies of that pointer).
What really happens to the memory in actual C++ implementations, rather than in the abstract sense, is that it continues to exist at the same address, but is marked as free, so that it can be allocated for another purpose. An allocation request coming from the program (possibly a completely unrelated module) or possibly from another program in the system, could obtain that memory for its own use.
Any uses of a pointer to an object which no longer exists are "undefined behavior". Functions for safely verifying such a pointer do exist, but they are very platform-specific and rarely perfect.
The problem is that whereas it is not particularly hard for an implementation to confirm that a pointer is bad, it is not possible to confirm that a pointer is good. We can walk the internal memory data structures of the memory allocator to determine that some pointer refers to free storage. But what if the storage is subsequently allocated? Then the pointer no longer refers to free storage. But it does not refer to the original object which was allocated, either! This is known as an "ABA ambiguity": because some A changed into a B, but then back into A, indistinguishable from the original A.
Approaches exist to solve the ABA ambiguity (if not completely, then at least partially). For instance, pointers can be made "fat" so that they have an extra field in addition to the address bits. The field could contain a sequence number which is used to stamp the pointers that are returned from the allocator. Now when an object is deleted and reallocated, the new pointer to the same location has a different sequence number: we have ABA'. The pointer A has gone bad, making it B, but when it is resurrected it comes back as A'. If we ask the system to validate A, it will correctly determine that A is bad, because it does not have the expected sequence number. The correct, valid pointer to the object is A', which does not match A.
However, sequence number fields are only so many bits wide and they will wrap around eventually. So the ABA problem has not really been solved. The validation of good versus bad pointers has only been made substantially more reliable. To absolutely deal with the ABA problem, the system must always hand out new pointers which are not equal to any pointers which could still be in use. This means never actually freeing anything (thereby running out of memory) or implementing garbage collection. (Meaning that delete actually does nothing: deleted objects are destructed, but stick around in memory until they are garbage-collected, which happens when the program no longer remembers any copies of the pointer. At that point, the program no longer remembers A, and so A can be re-introduced, and there is no ABA problem.)
To make all pointers "fat", you have to change the entire toolchain and runtime: compilers, libraries, et cetera. There are further difficulties because large programs tend to have multiple memory allocators. If you ask the wrong allocator "is this pointer valid", all it can say is "this pointer is not from my arena". Another approach you can do is to invent your own pointers and implement them as smart pointers in C++. Your pointers can support an is_valid method which tries to be as reliable as possible (dealing with the ABA problem somehow: either partially with some sequence numbers and such, or by implementing your own garbage collection scheme.)
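To make the sequence-number idea concrete, here is a toy sketch (not any real library, and deliberately simplified): allocations are handed out as handles that carry a generation stamp, and is_valid only succeeds while the stamp still matches the live entry.

#include <cstdint>
#include <unordered_map>

struct Handle
{
    std::uint64_t id;          // which slot this handle refers to
    std::uint64_t generation;  // stamp taken when the slot was allocated
};

class HandleTable
{
public:
    Handle allocate(void* p)
    {
        Handle h{next_id_++, next_generation_++};
        live_[h.id] = Entry{p, h.generation};
        return h;
    }
    bool is_valid(const Handle& h) const
    {
        auto it = live_.find(h.id);
        return it != live_.end() && it->second.generation == h.generation;
    }
    void release(const Handle& h)
    {
        if (is_valid(h))
            live_.erase(h.id);   // a stale handle now fails is_valid()
    }
private:
    struct Entry { void* ptr; std::uint64_t generation; };
    std::unordered_map<std::uint64_t, Entry> live_;
    std::uint64_t next_id_ = 1;
    std::uint64_t next_generation_ = 1;
};

As the answer notes, the stamp is finite, so this makes validation much more reliable, not absolute.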
Accessing deleted memory is undefined behaviour by the standard. For instance, if this was a multithreaded application (or some other process had injected a thread into your application) then a new allocation could allocate the memory you just deallocated before you are able to "verify" it.
Once you delete your memory and set your pointer to NULL, you no longer have access to that memory even if you want it. So there is no way to verify that it is really gone. However, if you did something wrong and the memory was never deleted, that would constitute a memory leak, which would cause your program to increase the amount of RAM it uses; you could see this as a symptom of a pointer not properly disposed of.
You will probably learn later that you will not have to worry about deleting your pointers yourself, because of std::shared_ptr, which will delete your object when the last pointer to it goes out of scope. That is safer, because you will probably also learn that exceptions can cause your delete to never execute, leaving a memory leak.
...
...
delete m_oCustomerList;
// Try using the deleted pointer here.
// This is undefined behaviour; in a Visual C++ debug build the freed
// block is filled with the pattern 0xDD, so this will usually crash or
// trip the debug heap checks, but none of that is guaranteed.
m_oCustomerList->someStrMemberVariable = "This will fail";
...
...
Needless to say, don't do this in the actual code. Hope this helps.
I stumbled upon Stack Overflow question Memory leak with std::string when using std::list<std::string>, and one of the comments says this:
Stop using new so much. I can't see any reason you used new anywhere you did. You can create objects by value in C++ and it's one of the huge advantages to using the language. You do not have to allocate everything on the heap. Stop thinking like a Java programmer.
I'm not really sure what he means by that.
Why should objects be created by value in C++ as often as possible, and what difference does it make internally? Did I misinterpret the answer?
There are two widely-used memory allocation techniques: automatic allocation and dynamic allocation. Commonly, there is a corresponding region of memory for each: the stack and the heap.
Stack
The stack always allocates memory in a sequential fashion. It can do so because it requires you to release the memory in the reverse order (First-In, Last-Out: FILO). This is the memory allocation technique for local variables in many programming languages. It is very, very fast because it requires minimal bookkeeping and the next address to allocate is implicit.
In C++, this is called automatic storage because the storage is claimed automatically at the end of the scope. As soon as execution of the current code block (delimited using {}) is completed, memory for all variables in that block is automatically collected. This is also the moment when destructors are invoked to clean up resources.
Heap
The heap allows for a more flexible memory allocation mode. Bookkeeping is more complex and allocation is slower. Because there is no implicit release point, you must release the memory manually, using delete or delete[] (free in C). However, the absence of an implicit release point is the key to the heap's flexibility.
Reasons to use dynamic allocation
Even if using the heap is slower and potentially leads to memory leaks or memory fragmentation, there are perfectly good use cases for dynamic allocation, as it's less limited.
Two key reasons to use dynamic allocation:
You don't know how much memory you need at compile time. For instance, when reading a text file into a string, you usually don't know what size the file has, so you can't decide how much memory to allocate until you run the program.
You want to allocate memory which will persist after leaving the current block. For instance, you may want to write a function string readfile(string path) that returns the contents of a file. In this case, even if the stack could hold the entire file contents, you could not return from a function and keep the allocated memory block.
Why dynamic allocation is often unnecessary
In C++ there's a neat construct called a destructor. This mechanism allows you to manage resources by aligning the lifetime of the resource with the lifetime of a variable. This technique is called RAII and is the distinguishing point of C++. It "wraps" resources into objects. std::string is a perfect example. This snippet:
int main ( int argc, char* argv[] )
{
std::string program(argv[0]);
}
actually allocates a variable amount of memory. The std::string object allocates memory using the heap and releases it in its destructor. In this case, you did not need to manually manage any resources and still got the benefits of dynamic memory allocation.
In particular, it implies that in this snippet:
int main ( int argc, char* argv[] )
{
std::string * program = new std::string(argv[0]); // Bad!
delete program;
}
there is unneeded dynamic memory allocation. The program requires more typing (!) and introduces the risk of forgetting to deallocate the memory. It does this with no apparent benefit.
Why you should use automatic storage as often as possible
Basically, the last paragraph sums it up. Using automatic storage as often as possible makes your programs:
faster to type;
faster when run;
less prone to memory/resource leaks.
Bonus points
In the referenced question, there are additional concerns. In particular, the following class:
class Line {
public:
Line();
~Line();
std::string* mString;
};
Line::Line() {
mString = new std::string("foo_bar");
}
Line::~Line() {
delete mString;
}
Is actually a lot more risky to use than the following one:
class Line {
public:
Line();
std::string mString;
};
Line::Line() {
mString = "foo_bar";
// note: there is a cleaner way to write this.
}
The reason is that std::string properly defines a copy constructor. Consider the following program:
int main ()
{
Line l1;
Line l2 = l1;
}
Using the original version, this program will likely crash, as it uses delete on the same string twice. Using the modified version, each Line instance will own its own string instance, each with its own memory and both will be released at the end of the program.
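As an aside, the cleaner way hinted at in the comment inside Line::Line() is presumably a constructor initializer list, which constructs the member directly instead of default-constructing it and then assigning:

Line::Line() : mString("foo_bar") {}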
Other notes
Extensive use of RAII is considered a best practice in C++ because of all the reasons above. However, there is an additional benefit which is not immediately obvious. Basically, it's better than the sum of its parts. The whole mechanism composes. It scales.
If you use the Line class as a building block:
class Table
{
Line borders[4];
};
Then
int main ()
{
Table table;
}
allocates four std::string instances, four Line instances, one Table instance and all the strings' contents, and everything is freed automagically.
Because the stack is faster and leak-proof
In C++, it takes but a single instruction to allocate space—on the stack—for every local scope object in a given function, and it's impossible to leak any of that memory. That comment intended (or should have intended) to say something like "use the stack and not the heap".
The reason why is complicated.
First, C++ is not garbage collected. Therefore, for every new, there must be a corresponding delete. If you fail to put this delete in, then you have a memory leak. Now, for a simple case like this:
std::string *someString = new std::string(...);
//Do stuff
delete someString;
This is simple. But what happens if "Do stuff" throws an exception? Oops: memory leak. What happens if "Do stuff" issues return early? Oops: memory leak.
And this is for the simplest case. If you happen to return that string to someone, now they have to delete it. And if they pass it as an argument, does the person receiving it need to delete it? When should they delete it?
Or, you can just do this:
std::string someString(...);
//Do stuff
No delete. The object was created on the "stack", and it will be destroyed once it goes out of scope. You can even return the object, thus transferring its contents to the calling function. You can pass the object to functions (typically as a reference or const-reference: void SomeFunc(std::string &iCanModifyThis, const std::string &iCantModifyThis)). And so forth.
All without new and delete. There's no question of who owns the memory or who's responsible for deleting it. If you do:
std::string someString(...);
std::string otherString;
otherString = someString;
It is understood that otherString has a copy of the data of someString. It isn't a pointer; it is a separate object. They may happen to have the same contents, but you can change one without affecting the other:
someString += "More text.";
if(otherString == someString) { /*Will never get here */ }
See the idea?
Objects created by new must be eventually deleted lest they leak. The destructor won't be called, memory won't be freed, the whole bit. Since C++ has no garbage collection, it's a problem.
Objects created by value (i. e. on stack) automatically die when they go out of scope. The destructor call is inserted by the compiler, and the memory is auto-freed upon function return.
Smart pointers like unique_ptr, shared_ptr solve the dangling reference problem, but they require coding discipline and have other potential issues (copyability, reference loops, etc.).
Also, in heavily multithreaded scenarios, new is a point of contention between threads; there can be a performance impact for overusing new. Stack object creation is by definition thread-local, since each thread has its own stack.
The downside of value objects is that they die once the host function returns - you cannot pass a reference to those back to the caller, only by copying, returning or moving by value.
C++ doesn't employ any memory manager of its own. Other languages like C# and Java have a garbage collector to handle the memory.
C++ implementations typically use operating system routines to allocate the memory, and too much new/delete could fragment the available memory.
With any application, if the memory is frequently being used it's advisable to preallocate it and release it when not required.
Improper memory management can lead to memory leaks, and these are really hard to track down. So using stack objects within the scope of a function is a proven technique.
The downside of using stack objects is that it creates multiple copies of objects on returning, passing to functions, etc. However, smart compilers are well aware of these situations and have been optimized well for performance.
It's really tedious in C++ if the memory is allocated and released in two different places. The responsibility for release is always a question, and mostly we rely on some commonly accessible pointers, stack objects (maximum possible) and techniques like auto_ptr (RAII objects).
The best thing is that you have control over the memory, and the worst thing is that you will not have any control over the memory if you employ improper memory management for the application. The crashes caused by memory corruption are the nastiest and hardest to trace.
I see that a few important reasons for doing as few new's as possible are missed:
Operator new has a non-deterministic execution time
Calling new may or may not cause the OS to allocate a new physical page to your process. This can be quite slow if you do it often. Or it may already have a suitable memory location ready; we don't know. If your program needs to have consistent and predictable execution time (like in a real-time system or game/physics simulation), you need to avoid new in your time-critical loops.
Operator new is an implicit thread synchronization
Yes, you heard me. Your OS needs to make sure your page tables are consistent and as such calling new will cause your thread to acquire an implicit mutex lock. If you are consistently calling new from many threads you are actually serialising your threads (I've done this with 32 CPUs, each hitting on new to get a few hundred bytes each, ouch! That was a royal p.i.t.a. to debug.)
The rest, such as slow, fragmentation, error prone, etc., have already been mentioned by other answers.
Pre-C++17:
Because it is prone to subtle leaks even if you wrap the result in a smart pointer.
Consider a "careful" user who remembers to wrap objects in smart pointers:
foo(shared_ptr<T1>(new T1()), shared_ptr<T2>(new T2()));
This code is dangerous because there is no guarantee that either shared_ptr is constructed before either T1 or T2. Hence, if one of new T1() or new T2() fails after the other succeeds, then the first object will be leaked because no shared_ptr exists to destroy and deallocate it.
Solution: use make_shared.
Post-C++17:
This is no longer a problem: C++17 imposes a constraint on the order of these operations, in this case ensuring that each call to new() must be immediately followed by the construction of the corresponding smart pointer, with no other operation in between. This implies that, by the time the second new() is called, it is guaranteed that the first object has already been wrapped in its smart pointer, thus preventing any leaks in case an exception is thrown.
A more detailed explanation of the new evaluation order introduced by C++17 was provided by Barry in another answer.
Thanks to #Remy Lebeau for pointing out that this is still a problem under C++17 (although less so): the shared_ptr constructor can fail to allocate its control block and throw, in which case the pointer passed to it is not deleted.
Solution: use make_shared.
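For reference, the make_shared form of the call above would look something like this (foo, T1 and T2 are the placeholder names from the snippet):

// Each object is constructed and owned by its shared_ptr in a single step,
// so there is no window in which a raw, unowned pointer can leak.
foo(std::make_shared<T1>(), std::make_shared<T2>());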
To a great extent, that's someone elevating their own weaknesses to a general rule. There's nothing wrong per se with creating objects using the new operator. What there is some argument for is that you have to do so with some discipline: if you create an object you need to make sure it's going to be destroyed.
The easiest way of doing that is to create the object in automatic storage, so C++ knows to destroy it when it goes out of scope:
{
File foo = File("foo.dat");
// Do things
}
Now, observe that when you fall off that block after the end-brace, foo is out of scope. C++ will call its destructor automatically for you. Unlike Java, you don't need to wait for the garbage collection to find it.
Had you written
{
File * foo = new File("foo.dat");
you would want to match it explicitly with
delete foo;
}
or even better, allocate your File * as a "smart pointer". If you aren't careful about that it can lead to leaks.
The answer itself makes the mistaken assumption that if you don't use new you don't allocate on the heap; in fact, in C++ you don't know that. At most, you know that a small amount of memory, say one pointer, is certainly allocated on the stack. However, consider if the implementation of File is something like:
class File {
private:
FileImpl * fd;
public:
File(String fn){ fd = new FileImpl(fn);}
Then FileImpl will still be allocated on the heap.
And yes, you'd better be sure to have
~File(){ delete fd ; }
in the class as well; without it, you'll leak memory from the heap even if you didn't apparently allocate on the heap at all.
new() shouldn't be used as little as possible. It should be used as carefully as possible. And it should be used as often as necessary as dictated by pragmatism.
Allocation of objects on the stack, relying on their implicit destruction, is a simple model. If the required scope of an object fits that model then there's no need to use new(), with the associated delete() and checking of NULL pointers.
In the case where you have lots of short-lived objects allocation on the stack should reduce the problems of heap fragmentation.
However, if the lifetime of your object needs to extend beyond the current scope then new() is the right answer. Just make sure that you pay attention to when and how you call delete() and the possibilities of NULL pointers, using deleted objects and all of the other gotchas that come with the use of pointers.
When you use new, objects are allocated to the heap. It is generally used when you anticipate expansion. When you declare an object such as,
Class var;
it is placed on the stack.
You will always have to call destroy on the object that you placed on the heap with new. This opens the potential for memory leaks. Objects placed on the stack are not prone to memory leaking!
One notable reason to avoid overusing the heap is for performance -- specifically involving the performance of the default memory management mechanism used by C++. While allocation can be quite quick in the trivial case, doing a lot of new and delete on objects of non-uniform size without strict order leads not only to memory fragmentation, but it also complicates the allocation algorithm and can absolutely destroy performance in certain cases.
That's the problem that memory pools were created to solve, allowing you to mitigate the inherent disadvantages of traditional heap implementations, while still allowing you to use the heap as necessary.
Better still, though, to avoid the problem altogether. If you can put it on the stack, then do so.
I tend to disagree with the idea of using new "too much". Though the original poster's use of new with system classes is a bit ridiculous. (int *i; i = new int[9999];? really? int i[9999]; is much clearer.) I think that is what was getting the commenter's goat.
When you're working with system objects, it's very rare that you'd need more than one reference to the exact same object. As long as the value is the same, that's all that matters. And system objects don't typically take up much space in memory. (one byte per character, in a string). And if they do, the libraries should be designed to take that memory management into account (if they're written well). In these cases, (all but one or two of the news in his code), new is practically pointless and only serves to introduce confusions and potential for bugs.
When you're working with your own classes/objects, however (e.g. the original poster's Line class), then you have to begin thinking about the issues like memory footprint, persistence of data, etc. yourself. At this point, allowing multiple references to the same value is invaluable - it allows for constructs like linked lists, dictionaries, and graphs, where multiple variables need to not only have the same value, but reference the exact same object in memory. However, the Line class doesn't have any of those requirements. So the original poster's code actually has absolutely no needs for new.
I think the poster meant to say You do not have to allocate everything on the heap rather than the stack.
Basically, objects are allocated on the stack (if the object size allows, of course) because of the cheap cost of stack-allocation, rather than heap-based allocation which involves quite some work by the allocator, and adds verbosity because then you have to manage data allocated on the heap.
Two reasons:
It's unnecessary in this case. You're making your code needlessly more complicated.
It allocates space on the heap, and it means that you have to remember to delete it later, or it will cause a memory leak.
Many answers have gone into various performance considerations. I want to address the comment which puzzled OP:
Stop thinking like a Java programmer.
Indeed, in Java, as explained in the answer to this question,
You use the new keyword when an object is being explicitly created for the first time.
but in C++, objects of type T are created like so: T{} (or T{ctor_argument1,ctor_arg2} for a constructor with arguments). That's why usually you just have no reason to want to use new.
So, why is it ever used at all? Well, for two reasons:
You need to create many values the number of which is not known at compile time.
Due to limitations of the C++ implementation on common machines: to prevent a stack overflow that would result from allocating too much space by creating values the regular way.
Now, beyond what the comment you quoted implied, you should note that even those two cases above are covered well enough without you having to "resort" to using new yourself:
You can use container types from the standard libraries which can hold a runtime-variable number of elements (like std::vector).
You can use smart pointers, which give you a pointer similar to new, but ensure that memory gets released where the "pointer" goes out of scope.
and for this reason, it is an official item in the C++ community Coding Guidelines to avoid explicit new and delete: Guideline R.11.
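A minimal, self-contained illustration of those two points (Widget is just a placeholder type here):

#include <cstddef>
#include <memory>
#include <vector>

struct Widget { int x = 0; };

int main()
{
    std::size_t runtime_count = 1000;          // not known until run time
    std::vector<int> values(runtime_count);    // case 1: runtime-sized, no explicit new/delete
    auto widget = std::make_unique<Widget>();  // case 2: heap object, released automatically
    return static_cast<int>(values.size()) + widget->x;
}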
The core reason is that objects on the heap are always more difficult to use and manage than simple values. Writing code that is easy to read and maintain is always the first priority of any serious programmer.
Another scenario is that the library we are using provides value semantics and makes dynamic allocation unnecessary. std::string is a good example.
For object-oriented code, however, using a pointer - which means using new to create it beforehand - is a must. In order to simplify the complexity of resource management, we have dozens of tools to make it as simple as possible, such as smart pointers. The object-based paradigm or generic paradigm assumes value semantics and requires less or no new, just as the posters elsewhere stated.
Traditional design patterns, especially those mentioned in GoF book, use new a lot, as they are typical OO code.
new is the new goto.
Recall why goto is so reviled: while it is a powerful, low-level tool for flow control, people often used it in unnecessarily complicated ways that made code difficult to follow. Furthermore, the most useful and easiest-to-read patterns were encoded in structured programming statements (e.g. for or while); the ultimate effect is that code where goto is the appropriate tool is rather rare. If you are tempted to write goto, you're probably doing things badly (unless you really know what you're doing).
new is similar - it is often used to make things unnecessarily complicated and harder to read, and the most useful usage patterns have been encoded into various classes. Furthermore, if you need to use any new usage patterns for which there aren't already standard classes, you can write your own classes that encode them!
I would even argue that new is worse than goto, due to the need to pair new and delete statements.
Like goto, if you ever think you need to use new, you are probably doing things badly — especially if you are doing so outside of the implementation of a class whose purpose in life is to encapsulate whatever dynamic allocations you need to do.
One more point to add to all the above correct answers: it depends on what sort of programming you are doing. Kernel development in Windows, for example -> the stack is severely limited and you might not be able to take page faults like in user mode.
In such environments, new or C-like API calls are preferred and even required.
Of course, this is merely an exception to the rule.
new allocates objects on the heap. Otherwise, objects are allocated on the stack. Look up the difference between the two.
Do you know if there is a way to bring malloc back to its initial state, as if the program were just starting?
reason : I am developing an embedded application with the nintendods devkitpro and I would like to be able to improve debugging support in case of software faults. I can already catch most errors and e.g. return to the console menu, but this fails to work when catching std::bad_alloc.
I suspect that the code I use for "soft reboot" involves malloc() itself at some point I cannot control, so I'd like to "forget everything about the running app and get a fresh start".
There is no way of doing this portably, though concievably an embedded implementation of C++ might supply it as an extension. You should instead look at writing your own allocation system, using memory pools, or use an existing library.
The only time I did something similar, we used our own allocator which would keep a reference to each allocated block. If we wanted to roll back, we would free all the allocated blocks and do a longjmp to restart the program.
Squirrel away a bit of memory in a global location e.g.
int* not_used = new int[1024];
Then when you get a std::bad_alloc, delete[] not_used and move on to your error console. The idea is to give your crash handler just enough space to do what you need. You'll have to tune how much memory is reserved so that your console doesn't also receive out-of-memory errors.
If you're clever, not_used could actually be used. But you'd have to be careful that whatever was using memory could be deleted without notice.
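A minimal sketch of that idea (the 1024-int reserve and the run_error_console() and game_tick() functions are assumptions for illustration; tune the size for your application):

#include <new>

static int* not_used = new int[1024];   // reserved at startup

void run_error_console();               // hypothetical: shows the console menu

void handle_out_of_memory()
{
    delete[] not_used;                  // give the crash handler some headroom
    not_used = nullptr;
    run_error_console();
}

// Somewhere in the main loop:
//   try { game_tick(); } catch (const std::bad_alloc&) { handle_out_of_memory(); }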
The only way to get a fresh start is to reload the application from storage. The DS loads everything into RAM which means that the data section is modified in place.
I suppose if nothing else is running you could zero-write the whole memory block that the API provides on the Nintendo? But otherwise just keep track of your allocates.
In fact, if you create a CatchAndRelease class to keep a reference to each and every allocated memory block, at the required time you could go back and clear those out.
Otherwise, you may need to write your own memory pool, as mentioned by Neil.
Do you ever need to free memory in anything other than last-in-first-out order? If not, I'd suggest that you define an array to use all available memory (you'll probably have to tweak the linker files to do this) and then initialize a pointer to the start of that array. Then write your own malloc() function:
char *allocation_ptr = big_array;  /* big_array and END_OF_ALLOCATION_AREA come
                                      from the linker-defined region described
                                      above */

void *malloc(size_t n)
{
    void *temp = (void*)allocation_ptr;
    if (allocation_ptr > END_OF_ALLOCATION_AREA - n)
        return 0;                  /* out of memory */
    allocation_ptr += n;           /* note: no alignment handling here */
    return temp;
}

void free_all_after(void *ptr)
{
    if (ptr)
        allocation_ptr = (char*)ptr;
}
In this implementation, free_all_after() will free the indicated pointer and everything allocated after it. Note that unlike other implementations of malloc(), this one has zero overhead. The LIFO allocation is very limiting, but for many embedded systems it would be entirely adequate.
std::bad_alloc occurs when new fails and cannot allocate the memory requested. This will normally occur when the heap has run out of memory and therefore cannot honour the request. For this reason, you will not be able to allocate any new memory reliably in the cleanup.
This means that you may not allocate new memory for cleanup. Your only hope of cleaning up successfully is to ensure that memory for the cleanup code is pre-allocated well before you actually need it.
Objects can still be newed into this cleanup memory using placement new (i.e. new where you supply a memory address).
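A minimal sketch of that approach (the buffer size and the CleanupReport type are assumptions for illustration):

#include <cstddef>
#include <new>

// Memory for the cleanup path, reserved before the heap can run out.
alignas(std::max_align_t) static unsigned char cleanup_arena[1024];

struct CleanupReport { int fault_code; };

CleanupReport* make_report(int code)
{
    // Placement new: construct the object in the pre-reserved buffer,
    // so no heap allocation happens while the heap may be exhausted.
    return new (cleanup_arena) CleanupReport{code};
}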