Writing a dll for file manipulation, I'm running into some issue.
To read bytes from a file via file.read I require a char* array of the desired length.
Since the length is variable, I cannot use
char* ret_chars[next_bytes];
It gives the error that next_bytes is not a constant.
Another topic here in StackOverflow says to use:
char* ret_chars = new char[next_bytes];
Creating it with "new" requires to use "delete" later though, as far as I know.
Now, how am I supposed to delete the array if the return-value of this function is supposed to be exactly this array?
Isn't it a memory leak if I don't use "delete" anywhere?
If that helps anything: This is a DLL I'll be calling from "Game Maker". Therefore I don't have the possibility to delete anything afterwards.
Hope someone can help me!
When you're writing a callback which will be invoked by existing code, you have to follow its rules.
Assuming that the authors of "Game Maker" aren't complete idiots, they will free the memory you return. So you have to check the documentation to find out what function they will use to free the memory, and then you have to call the matching allocator.
In these cases, the framework usually will provide an allocation function which is specially designed for you to use to allocate a return buffer.
Another common approach is that you never return a buffer allocated by the callback. Instead, the framework passes a buffer to your callback, and you simply fill it in. Check the documentation for that possibility as well.
Is there no sample code for writing "Game Maker" plugins/extensions?
It looks like the developers are indeed complete idiots, at least when it comes to design of plugin interfaces, but they do provide some guidance.
Note that you have to be careful with memory management. That is why I declared the resulting string global.
This implies that the Game Maker engine makes no attempt to free the returned buffer.
You too can use a global, or indeed any variable with static storage duration such as a function-local static variable. std::vector<char> would be a good choice, because it's easy to resize. This way, every time the function is called, the memory allocated for the previous call will be reused or freed. So your "leak" will be limited to the amount you return at once.
char* somefunc( void )
{
static std::vector<char> ret_buffer;
ret_buffer.resize(next_bytes);
// fill it in, blah blah
return &ret_buffer[0];
}
// std::string and return ret_string.c_str(); is another reasonable option
Your script in Game Maker Language will be responsible for making a copy of that result string before it calls your function again and overwrites it.
The new char[ n ] trick works with a runtime value, and yes - you need to delete[] the array when you're done with it or it leaks.
If you are unable to change how "Game Maker" (whatever that is) works, then the memory will be leaked.
If you can change "Game Maker" to do the right thing, then it must manage the lifetime of the returned array.
That's the real problem here - the DLL code can't know when it's no longer needed, so the calling code needs to delete it when it's done, but the calling code cannot delete it directly - it must call back to the DLL to delete it, since it was the DLL's memory manager that allocated it in the first place.
Since you say the return value must be a char[], you therefore need to export a second function from your DLL that takes the char[], and calls delete[] on it. The calling code can then call that function when it's finished with the array returned previously.
Use vector <char *> (or vector <char> depending on which you really want - the question isn't entirely clear), that way, you don't need to delete anything.
You can not use new inside a function, without calling delete, or your application will be leaking memory (which is a bad thing, because EVENTUALLY, you'll have no memory left). There is no EASY solution for this, that doesn't have some relatively strict restrictions in some way or another.
The first code sample you quoted allocates memory on the stack.
The second code sample you quote allocates memory on the heap. (Two totally different concepts).
If you are returning the array, then the function allocating the memory does not free it. It is up to the caller to delete the memory. If the caller forgets, then yes, it is a memory leak.
First, if you use new char[]; you can't use delete, but you have to use delete [].
But like you said, if you use new [] in this function without using delete [] at the end your program will be leaking.
If you want a kind of garbage collection, you can use the *smart ptr* now in the standard c++ library.
I think a `shared_ptr` would be good to achieve what you want.
> **Shared ptr** : Manages the storage of a pointer, providing a limited garbage-collection facility, possibly sharing that management with other objects.
Here is some documentation about it : http://www.cplusplus.com/reference/memory/shared_ptr/
Ok, I'll jump in as well.
If the Game Maker doesn't explicitly say it will delete this memory, then you should check to see just how big a buffer it wants and pass in a static buffer of that size instead. This avoids all sorts of nastiness relating to cross dll versioning issues with memory management. There has to be some documentation on this in their code or their API and I strongly suggest you find and read it. Game Maker is a pretty large and well known API so Google should work for info if you don't have the docs yourself.
If you're returning a char pointer, which it looks as though you are, then you can simply call delete on that pointer.
Example:
char * getString()
{
char* ret_chars = new char[next_bytes];
strcpy(ret_chars, "Hello world")
return ret_chars
}
void displayChars()
{
char* chars = getString()
cout << chars
delete [] chars
}
Just be sure to deallocate (delete) all allocated (new'd) pointers or else you'll have memory leaks where the memory is allocated and then not collected after runtime and becomes unusable. A quick and dirty way to glance through and see if you've deallocated all allocated space to count your new's and count your deletes, and they should be 1-to-1 unless some appear in condition or looped blocks.
Related
I got something like this:
ClassA& ClassA::operator>>(char*& string) {
string = (char*)malloc(size);
//read something in that string then return
return *this;
}
Now i want to free the memory allocated in this function without losing the information stored.If i free it before the return i lose information.
I tried to free it inside the constructor but the parameter is not known there or in another function. So how can I do it? I use valgrind to check memory leaks and it pointed me to this malloc.
first, using "malloc" is bad practice, in c++ (you should use new, even if it's for making a char array; use new char[size]);
if you must, add this pointer to a list of pointers this class is responsible for, and in the destructor of this class, go over that list and call delete for each (or free, if you choose to keep using malloc).
another option, is to call delete/free, where this function is being used (not as clean, and may result in memory leaks, as you came across).
as mentioned in another answer here, using string may also solve your problem (although, you'll need to rewrite the code that calls this function).
one other way, may be using a smart pointer. but that may be over complicating things for what you need here (as it means rewriting the code that calls this function).
I'm not normally a C++ developer. My usual languages are all garbage collected and they do the work for me, but C++ interests me.
There's a question or two I have about returning dynamically allocated objects or structs. It's my understanding that they have to be dynamically allocated so the data is in the heap and not the stack. Please, correct me if I'm wrong.
What is the best practice to return pointers? Say I'm writing a library, how do I indicate in code if/when returned pointers should be deleted? If I'm returning an array, how do I return the size of the array?
These aren't problems I have to face in C# or javascript. These questions do go both ways: if I'm using somebody else's library, what do I look for?
C++ has an idiom called RAII. What it means for you is that you won't have to worry about cleaning up, and the resource will be freed at a defined point in the code.
For example, making an array in a function and returning it. Here's one typical implementation without RAII (another being the caller allocating memory and passing it in):
int *makeIntArray(std::size_t length) {
return new int[length];
}
Now the caller needs to remember to free this memory. Compare this with an RAII version:
std::vector<int> makeIntArray(std::size_t length) {
return std::vector<int>(length);
}
What is returned from this will have its memory deallocated when the vector goes out of scope, which is up to the caller. It also provides, among others, a size() member function to obtain the number of elements.
That said, it's best to keep things not dynamically allocated when possible. If you need to return a structure, say Str, just return it by value:
Str makeStr() {
return Str();
}
No dynamic allocation means no extra work, whether to free the memory, or wrap it in something (a smart pointer such as std::unique_ptr in this case).
As for other libraries, you need to read the documentation to ensure you must own what it returns. If you must, one thing you can do is make an RAII object out of it. For example:
int *makeSomeInt(int value) {
return new int(value);
}
...
std::unique_ptr<int> myInt(makeSomeInt(5));
//memory freed when myInt goes out of scope
I see chris has already provided a nice answer. Few things to add to that:
Stay away from allocating dynamic memory in your code. Let the dynamic memory allocation (and deallocation) be done by the library as much as possible. (See the example of vector above.)
If you must do dynamic memory allocation by yourself, then every memory (i.e. pointer) must have a owner. It is the owner who should construct and destruct the memory, others can only use it.
If you are using C++11, then familiarize yourself to unique_ptr, that is the one you would most commonly need.
From Dr.Dobbs:
There are a lot of great features in C++11, but unique_ptr stands out
in the area of code hygiene. Simply put, this is a magic bullet for
dynamically created objects.
I stumbled upon Stack Overflow question Memory leak with std::string when using std::list<std::string>, and one of the comments says this:
Stop using new so much. I can't see any reason you used new anywhere you did. You can create objects by value in C++ and it's one of the huge advantages to using the language. You do not have to allocate everything on the heap. Stop thinking like a Java programmer.
I'm not really sure what he means by that.
Why should objects be created by value in C++ as often as possible, and what difference does it make internally? Did I misinterpret the answer?
There are two widely-used memory allocation techniques: automatic allocation and dynamic allocation. Commonly, there is a corresponding region of memory for each: the stack and the heap.
Stack
The stack always allocates memory in a sequential fashion. It can do so because it requires you to release the memory in the reverse order (First-In, Last-Out: FILO). This is the memory allocation technique for local variables in many programming languages. It is very, very fast because it requires minimal bookkeeping and the next address to allocate is implicit.
In C++, this is called automatic storage because the storage is claimed automatically at the end of scope. As soon as execution of current code block (delimited using {}) is completed, memory for all variables in that block is automatically collected. This is also the moment where destructors are invoked to clean up resources.
Heap
The heap allows for a more flexible memory allocation mode. Bookkeeping is more complex and allocation is slower. Because there is no implicit release point, you must release the memory manually, using delete or delete[] (free in C). However, the absence of an implicit release point is the key to the heap's flexibility.
Reasons to use dynamic allocation
Even if using the heap is slower and potentially leads to memory leaks or memory fragmentation, there are perfectly good use cases for dynamic allocation, as it's less limited.
Two key reasons to use dynamic allocation:
You don't know how much memory you need at compile time. For instance, when reading a text file into a string, you usually don't know what size the file has, so you can't decide how much memory to allocate until you run the program.
You want to allocate memory which will persist after leaving the current block. For instance, you may want to write a function string readfile(string path) that returns the contents of a file. In this case, even if the stack could hold the entire file contents, you could not return from a function and keep the allocated memory block.
Why dynamic allocation is often unnecessary
In C++ there's a neat construct called a destructor. This mechanism allows you to manage resources by aligning the lifetime of the resource with the lifetime of a variable. This technique is called RAII and is the distinguishing point of C++. It "wraps" resources into objects. std::string is a perfect example. This snippet:
int main ( int argc, char* argv[] )
{
std::string program(argv[0]);
}
actually allocates a variable amount of memory. The std::string object allocates memory using the heap and releases it in its destructor. In this case, you did not need to manually manage any resources and still got the benefits of dynamic memory allocation.
In particular, it implies that in this snippet:
int main ( int argc, char* argv[] )
{
std::string * program = new std::string(argv[0]); // Bad!
delete program;
}
there is unneeded dynamic memory allocation. The program requires more typing (!) and introduces the risk of forgetting to deallocate the memory. It does this with no apparent benefit.
Why you should use automatic storage as often as possible
Basically, the last paragraph sums it up. Using automatic storage as often as possible makes your programs:
faster to type;
faster when run;
less prone to memory/resource leaks.
Bonus points
In the referenced question, there are additional concerns. In particular, the following class:
class Line {
public:
Line();
~Line();
std::string* mString;
};
Line::Line() {
mString = new std::string("foo_bar");
}
Line::~Line() {
delete mString;
}
Is actually a lot more risky to use than the following one:
class Line {
public:
Line();
std::string mString;
};
Line::Line() {
mString = "foo_bar";
// note: there is a cleaner way to write this.
}
The reason is that std::string properly defines a copy constructor. Consider the following program:
int main ()
{
Line l1;
Line l2 = l1;
}
Using the original version, this program will likely crash, as it uses delete on the same string twice. Using the modified version, each Line instance will own its own string instance, each with its own memory and both will be released at the end of the program.
Other notes
Extensive use of RAII is considered a best practice in C++ because of all the reasons above. However, there is an additional benefit which is not immediately obvious. Basically, it's better than the sum of its parts. The whole mechanism composes. It scales.
If you use the Line class as a building block:
class Table
{
Line borders[4];
};
Then
int main ()
{
Table table;
}
allocates four std::string instances, four Line instances, one Table instance and all the string's contents and everything is freed automagically.
Because the stack is faster and leak-proof
In C++, it takes but a single instruction to allocate space—on the stack—for every local scope object in a given function, and it's impossible to leak any of that memory. That comment intended (or should have intended) to say something like "use the stack and not the heap".
The reason why is complicated.
First, C++ is not garbage collected. Therefore, for every new, there must be a corresponding delete. If you fail to put this delete in, then you have a memory leak. Now, for a simple case like this:
std::string *someString = new std::string(...);
//Do stuff
delete someString;
This is simple. But what happens if "Do stuff" throws an exception? Oops: memory leak. What happens if "Do stuff" issues return early? Oops: memory leak.
And this is for the simplest case. If you happen to return that string to someone, now they have to delete it. And if they pass it as an argument, does the person receiving it need to delete it? When should they delete it?
Or, you can just do this:
std::string someString(...);
//Do stuff
No delete. The object was created on the "stack", and it will be destroyed once it goes out of scope. You can even return the object, thus transfering its contents to the calling function. You can pass the object to functions (typically as a reference or const-reference: void SomeFunc(std::string &iCanModifyThis, const std::string &iCantModifyThis). And so forth.
All without new and delete. There's no question of who owns the memory or who's responsible for deleting it. If you do:
std::string someString(...);
std::string otherString;
otherString = someString;
It is understood that otherString has a copy of the data of someString. It isn't a pointer; it is a separate object. They may happen to have the same contents, but you can change one without affecting the other:
someString += "More text.";
if(otherString == someString) { /*Will never get here */ }
See the idea?
Objects created by new must be eventually deleted lest they leak. The destructor won't be called, memory won't be freed, the whole bit. Since C++ has no garbage collection, it's a problem.
Objects created by value (i. e. on stack) automatically die when they go out of scope. The destructor call is inserted by the compiler, and the memory is auto-freed upon function return.
Smart pointers like unique_ptr, shared_ptr solve the dangling reference problem, but they require coding discipline and have other potential issues (copyability, reference loops, etc.).
Also, in heavily multithreaded scenarios, new is a point of contention between threads; there can be a performance impact for overusing new. Stack object creation is by definition thread-local, since each thread has its own stack.
The downside of value objects is that they die once the host function returns - you cannot pass a reference to those back to the caller, only by copying, returning or moving by value.
C++ doesn't employ any memory manager by its own. Other languages like C# and Java have a garbage collector to handle the memory
C++ implementations typically use operating system routines to allocate the memory and too much new/delete could fragment the available memory
With any application, if the memory is frequently being used it's advisable to preallocate it and release when not required.
Improper memory management could lead memory leaks and it's really hard to track. So using stack objects within the scope of function is a proven technique
The downside of using stack objects are, it creates multiple copies of objects on returning, passing to functions, etc. However, smart compilers are well aware of these situations and they've been optimized well for performance
It's really tedious in C++ if the memory being allocated and released in two different places. The responsibility for release is always a question and mostly we rely on some commonly accessible pointers, stack objects (maximum possible) and techniques like auto_ptr (RAII objects)
The best thing is that, you've control over the memory and the worst thing is that you will not have any control over the memory if we employ an improper memory management for the application. The crashes caused due to memory corruptions are the nastiest and hard to trace.
I see that a few important reasons for doing as few new's as possible are missed:
Operator new has a non-deterministic execution time
Calling new may or may not cause the OS to allocate a new physical page to your process. This can be quite slow if you do it often. Or it may already have a suitable memory location ready; we don't know. If your program needs to have consistent and predictable execution time (like in a real-time system or game/physics simulation), you need to avoid new in your time-critical loops.
Operator new is an implicit thread synchronization
Yes, you heard me. Your OS needs to make sure your page tables are consistent and as such calling new will cause your thread to acquire an implicit mutex lock. If you are consistently calling new from many threads you are actually serialising your threads (I've done this with 32 CPUs, each hitting on new to get a few hundred bytes each, ouch! That was a royal p.i.t.a. to debug.)
The rest, such as slow, fragmentation, error prone, etc., have already been mentioned by other answers.
Pre-C++17:
Because it is prone to subtle leaks even if you wrap the result in a smart pointer.
Consider a "careful" user who remembers to wrap objects in smart pointers:
foo(shared_ptr<T1>(new T1()), shared_ptr<T2>(new T2()));
This code is dangerous because there is no guarantee that either shared_ptr is constructed before either T1 or T2. Hence, if one of new T1() or new T2() fails after the other succeeds, then the first object will be leaked because no shared_ptr exists to destroy and deallocate it.
Solution: use make_shared.
Post-C++17:
This is no longer a problem: C++17 imposes a constraint on the order of these operations, in this case ensuring that each call to new() must be immediately followed by the construction of the corresponding smart pointer, with no other operation in between. This implies that, by the time the second new() is called, it is guaranteed that the first object has already been wrapped in its smart pointer, thus preventing any leaks in case an exception is thrown.
A more detailed explanation of the new evaluation order introduced by C++17 was provided by Barry in another answer.
Thanks to #Remy Lebeau for pointing out that this is still a problem under C++17 (although less so): the shared_ptr constructor can fail to allocate its control block and throw, in which case the pointer passed to it is not deleted.
Solution: use make_shared.
To a great extent, that's someone elevating their own weaknesses to a general rule. There's nothing wrong per se with creating objects using the new operator. What there is some argument for is that you have to do so with some discipline: if you create an object you need to make sure it's going to be destroyed.
The easiest way of doing that is to create the object in automatic storage, so C++ knows to destroy it when it goes out of scope:
{
File foo = File("foo.dat");
// Do things
}
Now, observe that when you fall off that block after the end-brace, foo is out of scope. C++ will call its destructor automatically for you. Unlike Java, you don't need to wait for the garbage collection to find it.
Had you written
{
File * foo = new File("foo.dat");
you would want to match it explicitly with
delete foo;
}
or even better, allocate your File * as a "smart pointer". If you aren't careful about that it can lead to leaks.
The answer itself makes the mistaken assumption that if you don't use new you don't allocate on the heap; in fact, in C++ you don't know that. At most, you know that a small amount of memory, say one pointer, is certainly allocated on the stack. However, consider if the implementation of File is something like:
class File {
private:
FileImpl * fd;
public:
File(String fn){ fd = new FileImpl(fn);}
Then FileImpl will still be allocated on the stack.
And yes, you'd better be sure to have
~File(){ delete fd ; }
in the class as well; without it, you'll leak memory from the heap even if you didn't apparently allocate on the heap at all.
new() shouldn't be used as little as possible. It should be used as carefully as possible. And it should be used as often as necessary as dictated by pragmatism.
Allocation of objects on the stack, relying on their implicit destruction, is a simple model. If the required scope of an object fits that model then there's no need to use new(), with the associated delete() and checking of NULL pointers.
In the case where you have lots of short-lived objects allocation on the stack should reduce the problems of heap fragmentation.
However, if the lifetime of your object needs to extend beyond the current scope then new() is the right answer. Just make sure that you pay attention to when and how you call delete() and the possibilities of NULL pointers, using deleted objects and all of the other gotchas that come with the use of pointers.
When you use new, objects are allocated to the heap. It is generally used when you anticipate expansion. When you declare an object such as,
Class var;
it is placed on the stack.
You will always have to call destroy on the object that you placed on the heap with new. This opens the potential for memory leaks. Objects placed on the stack are not prone to memory leaking!
One notable reason to avoid overusing the heap is for performance -- specifically involving the performance of the default memory management mechanism used by C++. While allocation can be quite quick in the trivial case, doing a lot of new and delete on objects of non-uniform size without strict order leads not only to memory fragmentation, but it also complicates the allocation algorithm and can absolutely destroy performance in certain cases.
That's the problem that memory pools where created to solve, allowing to to mitigate the inherent disadvantages of traditional heap implementations, while still allowing you to use the heap as necessary.
Better still, though, to avoid the problem altogether. If you can put it on the stack, then do so.
I tend to disagree with the idea of using new "too much". Though the original poster's use of new with system classes is a bit ridiculous. (int *i; i = new int[9999];? really? int i[9999]; is much clearer.) I think that is what was getting the commenter's goat.
When you're working with system objects, it's very rare that you'd need more than one reference to the exact same object. As long as the value is the same, that's all that matters. And system objects don't typically take up much space in memory. (one byte per character, in a string). And if they do, the libraries should be designed to take that memory management into account (if they're written well). In these cases, (all but one or two of the news in his code), new is practically pointless and only serves to introduce confusions and potential for bugs.
When you're working with your own classes/objects, however (e.g. the original poster's Line class), then you have to begin thinking about the issues like memory footprint, persistence of data, etc. yourself. At this point, allowing multiple references to the same value is invaluable - it allows for constructs like linked lists, dictionaries, and graphs, where multiple variables need to not only have the same value, but reference the exact same object in memory. However, the Line class doesn't have any of those requirements. So the original poster's code actually has absolutely no needs for new.
I think the poster meant to say You do not have to allocate everything on the heap rather than the the stack.
Basically, objects are allocated on the stack (if the object size allows, of course) because of the cheap cost of stack-allocation, rather than heap-based allocation which involves quite some work by the allocator, and adds verbosity because then you have to manage data allocated on the heap.
Two reasons:
It's unnecessary in this case. You're making your code needlessly more complicated.
It allocates space on the heap, and it means that you have to remember to delete it later, or it will cause a memory leak.
Many answers have gone into various performance considerations. I want to address the comment which puzzled OP:
Stop thinking like a Java programmer.
Indeed, in Java, as explained in the answer to this question,
You use the new keyword when an object is being explicitly created for the first time.
but in C++, objects of type T are created like so: T{} (or T{ctor_argument1,ctor_arg2} for a constructor with arguments). That's why usually you just have no reason to want to use new.
So, why is it ever used at all? Well, for two reasons:
You need to create many values the number of which is not known at compile time.
Due to limitations of the C++ implementation on common machines - to prevent a stack overflow by allocating too much space creating values the regular way.
Now, beyond what the comment you quoted implied, you should note that even those two cases above are covered well enough without you having to "resort" to using new yourself:
You can use container types from the standard libraries which can hold a runtime-variable number of elements (like std::vector).
You can use smart pointers, which give you a pointer similar to new, but ensure that memory gets released where the "pointer" goes out of scope.
and for this reason, it is an official item in the C++ community Coding Guidelines to avoid explicit new and delete: Guideline R.11.
The core reason is that objects on heap are always difficult to use and manage than simple values. Writing code that are easy to read and maintain is always the first priority of any serious programmer.
Another scenario is the library we are using provides value semantics and make dynamic allocation unnecessary. Std::string is a good example.
For object oriented code however, using a pointer - which means use new to create it beforehand - is a must. In order to simplify the complexity of resource management, we have dozens of tools to make it as simple as possible, such as smart pointers. The object based paradigm or generic paradigm assumes value semantics and requires less or no new, just as the posters elsewhere stated.
Traditional design patterns, especially those mentioned in GoF book, use new a lot, as they are typical OO code.
new is the new goto.
Recall why goto is so reviled: while it is a powerful, low-level tool for flow control, people often used it in unnecessarily complicated ways that made code difficult to follow. Furthermore, the most useful and easiest to read patterns were encoded in structured programming statements (e.g. for or while); the ultimate effect is that the code where goto is the appropriate way to is rather rare, if you are tempted to write goto, you're probably doing things badly (unless you really know what you're doing).
new is similar — it is often used to make things unnecessarily complicated and harder to read, and the most useful usage patterns can be encoded have been encoded into various classes. Furthermore, if you need to use any new usage patterns for which there aren't already standard classes, you can write your own classes that encode them!
I would even argue that new is worse than goto, due to the need to pair new and delete statements.
Like goto, if you ever think you need to use new, you are probably doing things badly — especially if you are doing so outside of the implementation of a class whose purpose in life is to encapsulate whatever dynamic allocations you need to do.
One more point to all the above correct answers, it depends on what sort of programming you are doing. Kernel developing in Windows for example -> The stack is severely limited and you might not be able to take page faults like in user mode.
In such environments, new, or C-like API calls are prefered and even required.
Of course, this is merely an exception to the rule.
new allocates objects on the heap. Otherwise, objects are allocated on the stack. Look up the difference between the two.
It has been my observation that if free( ptr ) is called where ptr is not a valid pointer to system-allocated memory, an access violation occurs. Let's say that I call free like this:
LPVOID ptr = (LPVOID)0x12345678;
free( ptr );
This will most definitely cause an access violation. Is there a way to test that the memory location pointed to by ptr is valid system-allocated memory?
It seems to me that the the memory management part of the Windows OS kernel must know what memory has been allocated and what memory remains for allocation. Otherwise, how could it know if enough memory remains to satisfy a given request? (rhetorical) That said, it seems reasonable to conclude that there must be a function (or set of functions) that would allow a user to determine if a pointer is valid system-allocated memory. Perhaps Microsoft has not made these functions public. If Microsoft has not provided such an API, I can only presume that it was for an intentional and specific reason. Would providing such a hook into the system prose a significant threat to system security?
Situation Report
Although knowing whether a memory pointer is valid could be useful in many scenarios, this is my particular situation:
I am writing a driver for a new piece of hardware that is to replace an existing piece of hardware that connects to the PC via USB. My mandate is to write the new driver such that calls to the existing API for the current driver will continue to work in the PC applications in which it is used. Thus the only required changes to existing applications is to load the appropriate driver DLL(s) at startup. The problem here is that the existing driver uses a callback to send received serial messages to the application; a pointer to allocated memory containing the message is passed from the driver to the application via the callback. It is then the responsibility of the application to call another driver API to free the memory by passing back the same pointer from the application to the driver. In this scenario the second API has no way to determine if the application has actually passed back a pointer to valid memory.
There's actually some functions called IsBadReadPtr(), IsBadWritePtr(), IsBadStringPtr(), and IsBadCodePtr() that might do the job, but do not use it ever. I mention this only so that you are aware that these options are not to be pursued.
You're much better off making sure you set all your pointers to NULL or 0 when it points to nothing and check against that.
For example:
// Set ptr to zero right after deleting the pointee.
delete ptr; // It's okay to call delete on zero pointers, but it
// certainly doesn't hurt to check.
Note: This might be a performance issue on some compilers (see the section "Code Size" on this page) so it might actually be worth it to do a self-test against zero first.
ptr = 0;
// Set ptr to zero right after freeing the pointee.
if(ptr != 0)
{
free(ptr); // According to Matteo Italia (see comments)
// it's also okay to pass a zero pointer, but
// again it doesn't hurt.
ptr = 0;
}
// Initialize to zero right away if this won't take on a value for now.
void* ptr = 0;
Even better is to use some variant of RAII and never have to deal with pointers directly:
class Resource
{
public:
// You can also use a factory pattern and make this constructor
// private.
Resource() : ptr(0)
{
ptr = malloc(42); // Or new[] or AcquiteArray() or something
// Fill ptr buffer with some valid values
}
// Allow users to work directly with the resource, if applicable
void* GetPtr() const { return ptr; }
~Resource()
{
if(ptr != 0)
{
free(ptr); // Or delete[] or ReleaseArray() or something
// Assignment not actually necessary in this case since
// the destructor is always the last thing that is called
// on an object before it dies.
ptr = 0;
}
}
private:
void* ptr;
};
Or use the standard containers if applicable (which is really an application of RAII):
std::vector<char> arrayOfChars;
Short answer: No.
There is a function in windows that supposedly tells you if a pointer points to real memory (IsBadreadPtr() and it's ilk) but it doesn't work and you should never use it!
The true solution to your problem is to always initialize pointers to NULL, and reset them to NULL once you've deleted them.
EDIT based on your edits:
You're really hinting at a much larger question: How can you be sure your code continues to function properly in the face of client code that screws up?
This really should be a question on its own. There are no simple answers. But it depends on what you mean by "continue to function properly."
There are two theories. One says that even if client code sends you complete crap, you should be able to trudge along, discarding the garbage and processing the good data. A key to accomplishing this is exception handling. If you catch an exception when processing the client's data, roll your state back and try to return as if they had never called you at all.
The other theory is to not even try to continue, and to just fail. Failing can be graceful, and should include some comprehensive logging so that the problem can be identified and hopefully fixed in the lab. Kick up error messages. Tell the user some things to try next time. Generate minidumps, and send them automatically back to the shop. But then, shut down.
I tend to subscribe to the second theory. When client code starts sending crap, the stability of the system is often at risk. They might have corrupted heaps. Needed resources might not be available. Who knows what the problem might be. You might get some good data interspersed with bad, but you dont even know if the good data really is good. So shut down as quickly as you can, to mitigate the risk.
To address your specific concern, I don't think you have to worry about checking the pointer. If the application passes your DLL an invalid address, it represents a memory management problem in the application. No matter how you code your driver, you can't fix the real bug.
To help the application developers debug their problem, you could add a magic number to the object you return to the application. When the your library is called to free an object, check for the number, and if it isn't there, print a debug warning and don't free it! I.e.:
#define DATA_MAGIC 0x12345678
struct data {
int foo; /* The actual object data. */
int magic; /* Magic number for memory debugging. */
};
struct data *api_recv_data() {
struct data *d = malloc(sizeof(*d));
d->foo = whatever;
d->magic = DATA_MAGIC;
return d;
}
void api_free_data(struct data *d) {
if (d->magic == DATA_MAGIC) {
d->magic = 0;
free(d);
} else {
fprintf(stderr, "api_free_data() asked to free invalid data %p\n", d);
}
}
This is only a debugging technique. This will work the correctly if the application has no memory errors. If the application does have problems, this will probably alert the developer to the mistake. It only works because you're actual problem is much more constrained that your initial question indicates.
No, you are supposed to know if your pointers point to correctly allocated memory.
No. You are only supposed to have a pointer to memory that you know is valid, usually because you allocated it in the same program. Track your memory allocations properly and then you won't even need this!
Also, you are invoking Undefined Behaviour by attempting to free an invalid pointer, so it may crash or do anything at all.
Also, free is a function of the C++ Standard Library inherited from C, not a WinAPI function.
First of all, in the standard there's nothing that guarantees such a thing (freeing a non-malloced pointer is undefined behavior).
Anyhow, passing by free is just a twisted route to just trying to access that memory; if you wanted to check if the memory pointed by a pointer is readable/writable on Windows, you really should just try and be ready to deal with the SEH exception; this is actually what the IsBadxxxPtr functions do, by translating such exception in their return code.
However, this is an approach that hides subtle bugs, as explained in this Raymond Chen's post; so, long story short, no there's no safe way to determine if a pointer points to something valid, and I think that, if you need to have such a test somewhere, there's some design flaw in that code.
I'm not going to echo what every one has already said, just to add to those answers though, this is why smart pointers exist - use them!
Any time you find yourself having to work around crashes due to memory errors - take a step back, a large breath, and fix the underlying problem - it's dangerous to attempt to work around them!
EDIT based on your update:
There are two sane ways that I can think of to do this.
The client application provides a buffer where you put the message, meaning your API does not have worry about managing that memory - this requires changes to your interface and client code.
You change the semantics of the interface, and force the clients of the interface to not worry about memory management (i.e. you call with a pointer to something that is only valid in the context of the callback - if client requires, they make their own copy of the data). This does not change your interface - you can still callback with a pointer, however your clients will need to check that they don't use the buffer outside of that context - potentially if they do, it's probably not what you want and so it could be a good thing that they fix it(?)
Personally I would go for the latter as long as you can be sure that the buffer is not used outside of the callback. If it is, then you'll have to use hackery (such as has been suggested with the magic number - though this is not always guaranteed to work, for example let's say there was some form of buffer overrun from the previous block, and you somehow over-write the magic number with crap - what happens there?)
Application memory management is up to the application developer to maintain, not the operating system (even in managed languages, the operating system doesn't do that job, a garbage collector does). If you allocate an object on the heap, it is your responsibility to free it properly. If you fail to do so, your application will leak memory. The operating system (in the case of Windows at least) does know how much memory it has let your application have, and will reclaim it when your application closes (or crashes), but there is no documented way (that works) to query a memory address to see if it is an allocated block.
The best suggestion I can give you: learn to manage your memory properly.
Not without access to the internals of the malloc implementation.
You could perhaps identify some invalid pointers (e.g., ones that don't point anywhere within your process's virtual memory space), but if you take a valid pointer and add 1 to it, it will be invalid for calling free() but will still point within system-allocated memory. (Not to mention the usual problem of calling free on the same pointer more than once).
Aside from the obvious point made by others about this being very bad practice, I see another problem.
Just because a particular address doesn't cause free() to generate an access violation, does not mean it's safe to free that memory. The address could actually be an address within the heap so that no access violation occurs, and freeing it would result in heap corruption. Or it might even be a valid address to free, in which case you've freed some block of memory that might still be in use!
You've really offered no explanation of why such a poor approach should even be considered.
You apparently have determined that you're done with an object that you currently have a pointer to and if that object was malloced you want to free it. This doesn't sound like an unreasonable idea, but the fact that you have a pointer to an object doesn't tell you anything about how that object was allocated (with malloc, with new, with new[], on the stack, as shared memory, as a memory-mapped file, as an APR memory pool, using the Boehm-Demers-Weiser garbage collector, etc.) so there is no way to determine the correct way to deallocate the object (or if deallocation is needed at all; you may have a pointer to an object on the stack). That's the answer to your actual question.
But sometimes it's better to answer the question that should have been asked. And that question is "how can I manage memory in C++ if I can't always tell things like 'how was this object allocated, and how should it be deallocated'?" That's a tricky question, and, while it's not easy, it is possible to manage memory if you follow a few policies. Whenever you hear people complain about properly pairing each malloc with free, each new with delete and each new[] with delete[], etc., you know that they are making their lives harder than necessary by not following a disciplined memory management regime.
I'm going to make a guess that you're passing pointers to a function and when the function is done you want it to clean up the pointers. This policy is generally impossible to get right. Instead I would recommend following a policy that (1) if a function gets a pointer from somebody else, then that "somebody else" is expected to clean up (after all, that "somebody else" knows how the memory was allocated) and (2) if a function allocates an object, then that function's documentation will say what method should be used to deallocate the object. Second, I would highly recommend smart pointers and similar classes.
Stroustrup's advice is:
If I create 10,000 objects and have pointers to them, I need to delete those 10,000 objects, not 9,999, and not 10,001. I don't know how to do that. If I have to handle the 10,000 objects directly, I'm going to screw up. ... So, quite a long time ago I thought, "Well, but I can handle a low number of objects correctly." If I have a hundred objects to deal with, I can be pretty sure I have correctly handled 100 and not 99. If I can get then number down to 10 objects, I start getting happy. I know how to make sure that I have correctly handled 10 and not just 9."
For instance, you want code like this:
#include <cstdlib>
#include <iostream>
#include "boost/shared_ptr.hpp"
namespace {
// as a side note, there is no reason for this sample function to take int*s
// instead of ints; except that I need a simple function that uses pointers
int foo(int* bar, int* baz)
{
// note that since bar and baz come from outside the function, somebody
// else is responsible for cleaning them up
return *bar + *baz;
}
}
int main()
{
boost::shared_ptr<int> quux(new int(2));
// note, I would not recommend using malloc with shared_ptr in general
// because the syntax sucks and you have to initialize things yourself
boost::shared_ptr<int> quuz(reinterpret_cast<int*>(std::malloc(sizeof(int))), std::free);
*quuz = 3;
std::cout << foo(quux.get(), quuz.get()) << '\n';
}
Why would 0x12345678 necessarily be invalid? If your program uses a lot of memory, something could be allocated there. Really, there's only one pointer value you should absolutely rely on being an invalid allocation: NULL.
C++ does not use 'malloc' but 'new', which usually has a different implementation; therefore 'delete' and 'free' can't be mixed – neither can 'delete' and 'delete[]' (its array-version).
DLLs have their own memory-area and can't be mixed with the memory management system of the non-DLL memory-area.
Every API and language has its own memory management for its own type of memory-objects.
I.e.: You do not 'free()' or 'delete' open files, you 'close()' them. The same goes for very other API, even if the type is a pointer to memory instead of a handle.
While learning different languages, I've often seen objects allocated on the fly, most often in Java and C#, like this:
functionCall(new className(initializers));
I understand that this is perfectly legal in memory-managed languages, but can this technique be used in C++ without causing a memory leak?
Your code is valid (assuming functionCall() actually guarantees that the pointer gets deleted), but it's fragile and will make alarm bells go off in the heads of most C++ programmers.
There are multiple problems with your code:
First and foremost, who owns the pointer? Who is responsible for freeing it? The calling code can't do it, because you don't store the pointer. That means the called function must do it, but that's not clear to someone looking at that function. Similarly, if I call the code from somewhere else, I certainly don't expect the function to call delete on the pointer I passed to it!
If we make your example slightly more complex, it can leak memory, even if the called function calls delete. Say it looks like this: functionCall(new className(initializers), new className(initializers)); Imagine that the first one is allocated successfully, but the second one throws an exception (maybe it's out of memory, or maybe the class constructor threw an exception). functionCall never gets called then, and can't free the memory.
The simple (but still messy) solution is to allocate memory first, and store the pointer, and then free it in the same scope as it was declared (so the calling function owns the memory):
className* p = new className(initializers);
functionCall(p);
delete p;
But this is still a mess. What if functionCall throws an exception? Then p won't be deleted. Unless we add a try/catch around the whole thing, but sheesh, that's messy.
What if the function gets a bit more complex, and may return after functionCall but before delete? Whoops, memory leak. Impossible to maintain. Bad code.
So one of the nice solutions is to use a smart pointer:
boost::shared_ptr<className> p = boost::shared_ptr<className>(new className(initializers));
functionCall(p);
Now ownership of the memory is dealt with. The shared_ptr owns the memory, and guarantees that it'll get freed. We could use std::auto_ptr instead, of course, but shared_ptr implements the semantics you'd usually expect.
Note that I still allocated the memory on a separate line, because the problem with making multiple allocations on the same line as you make the function call still exists. One of them may still throw, and then you've leaked memory.
Smart pointers are generally the absolute minimum you need to handle memory management.
But often, the nice solution is to write your own RAII class.
className should be allocated on the stack, and in its constructor, make what allocations with new are necessary. And in its destructor, it should free that memory. This way, you're guaranteed that no memory leaks will occur, and you can make the function call as simple as this:
functionCall(className(initializers));
The C++ standard library works like this. std::vector is one example. You'd never allocate a vector with new. You allocate it on the stack, and let it deal with its memory allocations internally.
Yes, as long as you deallocate the memory inside the function. But by no means this is a best practice for C++.
It depends.
This passes "ownership" of the memory to functionCAll(). It will either need to free the object or save the pointer so that it can be freed later. Passing the ownership of raw pointers like this is one of the easiest ways to build memory issues into your code -- either leaks or double deletes.
In C++ we would not create the memory dynamically like that.
Instead you would create a temporary stack object.
You only need to create a heap object via new if you want the lifetime of the object to be greater than the call to the function. In this case you can use new in conjunction with a smart pointer (see other answers for an example).
// No need for new or memory management just do this
functionCall(className(initializers));
// This assumes you can change the functionCall to somthing like this.
functionCall(className const& param)
{
<< Do Stuff >>
}
If you want to pass a non const reference then do it like this:
calssName tmp(initializers);
functionCall(tmp);
functionCall(className& param)
{
<< Do Stuff >>
}
It is safe if the function that you are calling has acceptance-of-ownership semantics. I don't recall a time where I needed this, so I would consider it unusual.
If the function works this way, it should take its argument as a smart pointer object so that the intent is clea; i.e.
void functionCall(std::auto_ptr<className> ptr);
rather than
void functionCall(className* ptr);
This makes the transfer of ownership explicit, and the calling function will dispose of the memory pointed to by ptr when execution of the function falls out of scope.
This will work for objects created on the stack, but not a regular pointer in C++.
An auto pointer maybe able to handle it, but I haven't messed with them enough to know.
In general, no, unless you want to leak memory. In fact, in most cases, this won't work, since the result of
new T();
in C++ is a T*, not a T (in C#, new T() returns a T).
Have a look at Smart Pointers or A garbage collector for C and C++.