C++ Dynamic Allocation Mismatch: Is this problematic? - c++

I have been assigned to work on some legacy C++ code in MFC. One of the things I am finding all over the place are allocations like the following:
struct Point
{
float x,y,z;
};
...
void someFunc( void )
{
int numPoints = ...;
Point* pArray = (Point*)new BYTE[ numPoints * sizeof(Point) ];
...
//do some stuff with points
...
delete [] pArray;
}
I realize that this code is atrociously wrong on so many levels (C-style cast, using new like malloc, confusing, etc). I also realize that if Point had defined a constructor it would not be called and weird things would happen at delete [] if a destructor had been defined.
Question: I am in the process of fixing these occurrences wherever they appear as a matter of course. However, I have never seen anything like this before and it has got me wondering. Does this code have the potential to cause memory leaks/corruption as it stands currently (no constructor/destructor, but with pointer type mismatch) or is it safe as long as the array just contains structs/primitive types?

Formally the code causes undefined behavior because of the pointer type mismatch in new[]/delete[]. In practice it should work fine.
The pointer type mismatch issue can easily be fixed by adding a cast to the delete-expression
delete [] (BYTE *) pArray;
If Point type is defined as shown in the question (i.e. with trivial constructor and destructor), then this correction solves all formal issues there are in this code. From the language point of view, the lifetime of an object with trivial constructor (destructor) begins (ends) simultaneously with its storage duration. I.e. there's no requirement to perform the actual invocation of constructor (destructor).

As long as the constructors and destructors do nothing, then you're safe.

As long as you assure that it really does invoke a matching delete[] for every new[], it shouldn't leak -- but if an exception might be thrown by any of the code that's been commented out, that's going to be difficult to assure (basically, you need to catch any possible exceptions, delete the memory, then re-throw the exception).

I would first try to figure out why the code was written that way in the first place. It might be simply because the programmer didn't know any better, or because they were trying to work around some funky defect int he compiler. But there might be a real reason that you are unaware of. If there is, then unless you understand that reason and its side effects, you may introduce a defect by changing this code.
That out of the way, and assuming there is no particular reason why the code needs to be this way now, you should be safe in changing the code to use more modern and correct constructs.
But why? I understand the motivation to make the code more correct. But what do you really gain by this? If the code works the way it is now (a big assumption), then by changing the code you possibly gain the benefit of making the code more understandable to future programmers, but every line of code you change introduces the possibility for a new bug to be written.
And if finally you do decide to go ahead with the change, why stop halfway? Consider getting rid of all the news and deletes altogether, and replace them with vectors etc.

Related

Should I reset primitive member variable in destructor?

Please see following code,
class MyClass
{
public:
int i;
MyClass()
{
i = 10;
}
};
MyClass* pObj = nullptr;
int main()
{
{
MyClass obj;
pObj = &obj;
}
while (1)
{
cout << pObj->i; //pObj is dangling pointer, still no crash.
Sleep(1000);
}
return 0;
}
obj will die once it comes out of scope. But I tested in VS 2017, I see no crash even after I use it.
Is it good practice to reset int member varialbe i?
Accessing a member after an object got destroyed is undefined behavior. It may seem like a good to set members in a destructor to a predictable and most likely unexpected value, e.g., a rather large value or a value with specific bit pattern making it easy to recognize the value in a debugger.
However, this idea is flawed and dwarved by the system:
All classes would need to play along and instead of concentrating on creating correct code developers would spent time (both development time as well as run-time) making pointless change.
Compilers happen get rather smart and can detect that changes in destructor are not needed. Since a correct program cannot detect whether the change was made they may not make the change at all. This effect is an actual issue for security applications where, e.g., a password should be erased from memory so it cannot be read (using some non-portable mean).
Even if the value gets set to a specific value, memory gets reused and the values get overwritten. Especially with objects on the stack it is most likely that the memory is used for something else before you see the bad value in a debugger.
Even when resetting values you would necessarily see a "crash": a crash is caused by something being setup to protect against something invalid. In your example you are accessing an int on the stack: the stack will remain accessible from a CPU point of view and at best you'd get an unexpected value. Use of unusual pointer values typically leads to a crash because the memory management system tries to access a location which isn't mapped but even that isn't guaranteed: on a busy 32 bit system pretty much all memory may be in use. That is, trying to rely on undefined behavior being detect is also futile.
Correspondingly, it is much better to use good coding practices which avoid dangling references right away and concentrate on using these. For example, I'm always initializing members in the member initializer list, even in the rare cases they end up getting changed in the body of the constructor (i.e., you'd write your constructor as MyClass(): i() {}).
As a debugging tool it may be reasonable to replace the allocation functions (ideally the allocator object but potentially the global operator new()/operator delete() and family with a version which doesn't quickly hand out released memory and instead fills the released memory with a predictable pattern. Since these actions slow down the program you'd only use this code in a debug build but it is relatively simple to implement once and easy to enable/disable centrally it may be worth the effort. In practice I don't think even such a system pays off as use of managed pointers and proper design of ownership and lifetime avoid most errors due to dangling references.
The behaviour of code you gave is undefined. Partial case of undefined behaviour is working as expected, so here is nothing strange that the code works. Code can work now and it can broke anyway at any time depending of compiler version, compiler options, stack content and a moon phase.
So first and most important is to avoid dangling pointers (and all other kinds of undefined behaviour) everywhere.
What about clearing variables in destructor, I found a best practice:
Follow coding rules saving me from mistakes of access to unallocated or destroyed objects. I cannot describe it in a few words but rules are pretty common (see here and anywhere).
Analyze code by humans (code review) or by statical analyzers (like cppcheck or PVS-Studio or another) to avoid cases similar to one you described above.
Do not call delete manually, better use scoped_ptr or similar object lifetime managers. When delete is reasonable, I usually (usually) set pointer to nullptr after deletion to keep myself from mistakes.
Use pointers as rare as it possible. References are preferred.
When objects of my class used outside and I suspect that somebody can access it after deletion I can put signature field inside, set it to something like 0xDEAD in destructor and check at enter or every public method. Here be careful to not slow down your code to unacceptable speed.
After all of this setting i from your example to 0 or -1 is redundant. As for me it's not a thing you should focus your attention.

malloc & placement new vs. new

I've been looking into this for the past few days, and so far I haven't really found anything convincing other than dogmatic arguments or appeals to tradition (i.e. "it's the C++ way!").
If I'm creating an array of objects, what is the compelling reason (other than ease) for using:
#define MY_ARRAY_SIZE 10
// ...
my_object * my_array=new my_object [MY_ARRAY_SIZE];
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i]=my_object(i);
over
#define MEMORY_ERROR -1
#define MY_ARRAY_SIZE 10
// ...
my_object * my_array=(my_object *)malloc(sizeof(my_object)*MY_ARRAY_SIZE);
if (my_object==NULL) throw MEMORY_ERROR;
for (int i=0;i<MY_ARRAY_SIZE;++i) new (my_array+i) my_object (i);
As far as I can tell the latter is much more efficient than the former (since you don't initialize memory to some non-random value/call default constructors unnecessarily), and the only difference really is the fact that one you clean up with:
delete [] my_array;
and the other you clean up with:
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i].~T();
free(my_array);
I'm out for a compelling reason. Appeals to the fact that it's C++ (not C) and therefore malloc and free shouldn't be used isn't -- as far as I can tell -- compelling as much as it is dogmatic. Is there something I'm missing that makes new [] superior to malloc?
I mean, as best I can tell, you can't even use new [] -- at all -- to make an array of things that don't have a default, parameterless constructor, whereas the malloc method can thusly be used.
I'm out for a compelling reason.
It depends on how you define "compelling". Many of the arguments you have thus far rejected are certainly compelling to most C++ programmers, as your suggestion is not the standard way to allocate naked arrays in C++.
The simple fact is this: yes, you absolutely can do things the way you describe. There is no reason that what you are describing will not function.
But then again, you can have virtual functions in C. You can implement classes and inheritance in plain C, if you put the time and effort into it. Those are entirely functional as well.
Therefore, what matters is not whether something can work. But more on what the costs are. It's much more error prone to implement inheritance and virtual functions in C than C++. There are multiple ways to implement it in C, which leads to incompatible implementations. Whereas, because they're first-class language features of C++, it's highly unlikely that someone would manually implement what the language offers. Thus, everyone's inheritance and virtual functions can cooperate with the rules of C++.
The same goes for this. So what are the gains and the losses from manual malloc/free array management?
I can't say that any of what I'm about to say constitutes a "compelling reason" for you. I rather doubt it will, since you seem to have made up your mind. But for the record:
Performance
You claim the following:
As far as I can tell the latter is much more efficient than the former (since you don't initialize memory to some non-random value/call default constructors unnecessarily), and the only difference really is the fact that one you clean up with:
This statement suggests that the efficiency gain is primarily in the construction of the objects in question. That is, which constructors are called. The statement presupposes that you don't want to call the default constructor; that you use a default constructor just to create the array, then use the real initialization function to put the actual data into the object.
Well... what if that's not what you want to do? What if what you want to do is create an empty array, one that is default constructed? In this case, this advantage disappears entirely.
Fragility
Let's assume that each object in the array needs to have a specialized constructor or something called on it, such that initializing the array requires this sort of thing. But consider your destruction code:
for (int i=0;i<MY_ARRAY_SIZE;++i) my_array[i].~T();
For a simple case, this is fine. You have a macro or const variable that says how many objects you have. And you loop over each element to destroy the data. That's great for a simple example.
Now consider a real application, not an example. How many different places will you be creating an array in? Dozens? Hundreds? Each and every one will need to have its own for loop for initializing the array. Each and every one will need to have its own for loop for destroying the array.
Mis-type this even once, and you can corrupt memory. Or not delete something. Or any number of other horrible things.
And here's an important question: for a given array, where do you keep the size? Do you know how many items you allocated for every array that you create? Each array will probably have its own way of knowing how many items it stores. So each destructor loop will need to fetch this data properly. If it gets it wrong... boom.
And then we have exception safety, which is a whole new can of worms. If one of the constructors throws an exception, the previously constructed objects need to be destructed. Your code doesn't do that; it's not exception-safe.
Now, consider the alternative:
delete[] my_array;
This can't fail. It will always destroy every element. It tracks the size of the array, and it's exception-safe. So it is guaranteed to work. It can't not work (as long as you allocated it with new[]).
Of course, you could say that you could wrap the array in an object. That makes sense. You might even template the object on the type elements of the array. That way, all the desturctor code is the same. The size is contained in the object. And maybe, just maybe, you realize that the user should have some control over the particular way the memory is allocated, so that it's not just malloc/free.
Congratulations: you just re-invented std::vector.
Which is why many C++ programmers don't even type new[] anymore.
Flexibility
Your code uses malloc/free. But let's say I'm doing some profiling. And I realize that malloc/free for certain frequently created types is just too expensive. I create a special memory manager for them. But how to hook all of the array allocations to them?
Well, I have to search the codebase for any location where you create/destroy arrays of these types. And then I have to change their memory allocators accordingly. And then I have to continuously watch the codebase so that someone else doesn't change those allocators back or introduce new array code that uses different allocators.
If I were instead using new[]/delete[], I could use operator overloading. I simply provide an overload for operators new[] and delete[] for those types. No code has to change. It's much more difficult for someone to circumvent these overloads; they have to actively try to. And so forth.
So I get greater flexibility and reasonable assurance that my allocators will be used where they should be used.
Readability
Consider this:
my_object *my_array = new my_object[10];
for (int i=0; i<MY_ARRAY_SIZE; ++i)
my_array[i]=my_object(i);
//... Do stuff with the array
delete [] my_array;
Compare it to this:
my_object *my_array = (my_object *)malloc(sizeof(my_object) * MY_ARRAY_SIZE);
if(my_object==NULL)
throw MEMORY_ERROR;
int i;
try
{
for(i=0; i<MY_ARRAY_SIZE; ++i)
new(my_array+i) my_object(i);
}
catch(...) //Exception safety.
{
for(i; i>0; --i) //The i-th object was not successfully constructed
my_array[i-1].~T();
throw;
}
//... Do stuff with the array
for(int i=MY_ARRAY_SIZE; i>=0; --i)
my_array[i].~T();
free(my_array);
Objectively speaking, which one of these is easier to read and understand what's going on?
Just look at this statement: (my_object *)malloc(sizeof(my_object) * MY_ARRAY_SIZE). This is a very low level thing. You're not allocating an array of anything; you're allocating a hunk of memory. You have to manually compute the size of the hunk of memory to match the size of the object * the number of objects you want. It even features a cast.
By contrast, new my_object[10] tells the story. new is the C++ keyword for "create instances of types". my_object[10] is a 10 element array of my_object type. It's simple, obvious, and intuitive. There's no casting, no computing of byte sizes, nothing.
The malloc method requires learning how to use malloc idiomatically. The new method requires just understanding how new works. It's much less verbose and much more obvious what's going on.
Furthermore, after the malloc statement, you do not in fact have an array of objects. malloc simply returns a block of memory that you have told the C++ compiler to pretend is a pointer to an object (with a cast). It isn't an array of objects, because objects in C++ have lifetimes. And an object's lifetime does not begin until it is constructed. Nothing in that memory has had a constructor called on it yet, and therefore there are no living objects in it.
my_array at that point is not an array; it's just a block of memory. It doesn't become an array of my_objects until you construct them in the next step. This is incredibly unintuitive to a new programmer; it takes a seasoned C++ hand (one who probably learned from C) to know that those aren't live objects and should be treated with care. The pointer does not yet behave like a proper my_object*, because it doesn't point to any my_objects yet.
By contrast, you do have living objects in the new[] case. The objects have been constructed; they are live and fully-formed. You can use this pointer just like any other my_object*.
Fin
None of the above says that this mechanism isn't potentially useful in the right circumstances. But it's one thing to acknowledge the utility of something in certain circumstances. It's quite another to say that it should be the default way of doing things.
If you do not want to get your memory initialized by implicit constructor calls, and just need an assured memory allocation for placement new then it is perfectly fine to use malloc and free instead of new[] and delete[].
The compelling reasons of using new over malloc is that new provides implicit initialization through constructor calls, saving you additional memset or related function calls post an malloc And that for new you do not need to check for NULL after every allocation, just enclosing exception handlers will do the job saving you redundant error checking unlike malloc.
These both compelling reasons do not apply to your usage.
which one is performance efficient can only be determined by profiling, there is nothing wrong in the approach you have now. On a side note I don't see a compelling reason as to why use malloc over new[] either.
I would say neither.
The best way to do it would be:
std::vector<my_object> my_array;
my_array.reserve(MY_ARRAY_SIZE);
for (int i=0;i<MY_ARRAY_SIZE;++i)
{ my_array.push_back(my_object(i));
}
This is because internally vector is probably doing the placement new for you. It also managing all the other problems associated with memory management that you are not taking into account.
You've reimplemented new[]/delete[] here, and what you have written is pretty common in developing specialized allocators.
The overhead of calling simple constructors will take little time compared the allocation. It's not necessarily 'much more efficient' -- it depends on the complexity of the default constructor, and of operator=.
One nice thing that has not been mentioned yet is that the array's size is known by new[]/delete[]. delete[] just does the right and destructs all elements when asked. Dragging an additional variable (or three) around so you exactly how to destroy the array is a pain. A dedicated collection type would be a fine alternative, however.
new[]/delete[] are preferable for convenience. They introduce little overhead, and could save you from a lot of silly errors. Are you compelled enough to take away this functionality and use a collection/container everywhere to support your custom construction? I've implemented this allocator -- the real mess is creating functors for all the construction variations you need in practice. At any rate, you often have a more exact execution at the expense of a program which is often more difficult to maintain than the idioms everybody knows.
IMHO there both ugly, it's better to use vectors. Just make sure to allocate the space in advance for performance.
Either:
std::vector<my_object> my_array(MY_ARRAY_SIZE);
If you want to initialize with a default value for all entries.
my_object basic;
std::vector<my_object> my_array(MY_ARRAY_SIZE, basic);
Or if you don't want to construct the objects but do want to reserve the space:
std::vector<my_object> my_array;
my_array.reserve(MY_ARRAY_SIZE);
Then if you need to access it as a C-Style pointer array just (just make sure you don't add stuff while keeping the old pointer but you couldn't do that with regular c-style arrays anyway.)
my_object* carray = &my_array[0];
my_object* carray = &my_array.front(); // Or the C++ way
Access individual elements:
my_object value = my_array[i]; // The non-safe c-like faster way
my_object value = my_array.at(i); // With bounds checking, throws range exception
Typedef for pretty:
typedef std::vector<my_object> object_vect;
Pass them around functions with references:
void some_function(const object_vect& my_array);
EDIT:
IN C++11 there is also std::array. The problem with it though is it's size is done via a template so you can't make different sized ones at runtime and you cant pass it into functions unless they are expecting that exact same size (or are template functions themselves). But it can be useful for things like buffers.
std::array<int, 1024> my_array;
EDIT2:
Also in C++11 there is a new emplace_back as an alternative to push_back. This basically allows you to 'move' your object (or construct your object directly in the vector) and saves you a copy.
std::vector<SomeClass> v;
SomeClass bob {"Bob", "Ross", 10.34f};
v.emplace_back(bob);
v.emplace_back("Another", "One", 111.0f); // <- Note this doesn't work with initialization lists ☹
Oh well, I was thinking that given the number of answers there would be no reason to step in... but I guess I am drawn in as the others. Let's go
Why your solution is broken
C++11 new facilities for handling raw memory
Simpler way to get this done
Advices
1. Why your solution is broken
First, the two snippets you presented are not equivalent. new[] just works, yours fails horribly in the presence of Exceptions.
What new[] does under the cover is that it keeps track of the number of objects that were constructed, so that if an exception occurs during say the 3rd constructor call it properly calls the destructor for the 2 already constructed objects.
Your solution however fails horribly:
either you don't handle exceptions at all (and leak horribly)
or you just try to call the destructors on the whole array even though it's half built (likely crashing, but who knows with undefined behavior)
So the two are clearly not equivalent. Yours is broken
2. C++11 new facilities for handling raw memory
In C++11, the comittee members have realized how much we liked fiddling with raw memory and they have introduced facilities to help us doing so more efficiently, and more safely.
Check cppreference's <memory> brief. This example shows off the new goodies (*):
#include <iostream>
#include <string>
#include <memory>
#include <algorithm>
int main()
{
const std::string s[] = {"This", "is", "a", "test", "."};
std::string* p = std::get_temporary_buffer<std::string>(5).first;
std::copy(std::begin(s), std::end(s),
std::raw_storage_iterator<std::string*, std::string>(p));
for(std::string* i = p; i!=p+5; ++i) {
std::cout << *i << '\n';
i->~basic_string<char>();
}
std::return_temporary_buffer(p);
}
Note that get_temporary_buffer is no-throw, it returns the number of elements for which memory has actually been allocated as a second member of the pair (thus the .first to get the pointer).
(*) Or perhaps not so new as MooingDuck remarked.
3. Simpler way to get this done
As far as I am concered, what you really seem to be asking for is a kind of typed memory pool, where some emplacements could not have been initialized.
Do you know about boost::optional ?
It is basically an area of raw memory that can fit one item of a given type (template parameter) but defaults with having nothing in instead. It has a similar interface to a pointer and let you query whether or not the memory is actually occupied. Finally, using the In-Place Factories you can safely use it without copying objects if it is a concern.
Well, your use case really looks like a std::vector< boost::optional<T> > to me (or perhaps a deque?)
4. Advices
Finally, in case you really want to do it on your own, whether for learning or because no STL container really suits you, I do suggest you wrap this up in an object to avoid the code sprawling all over the place.
Don't forget: Don't Repeat Yourself!
With an object (templated) you can capture the essence of your design in one single place, and then reuse it everywhere.
And of course, why not take advantage of the new C++11 facilities while doing so :) ?
You should use vectors.
Dogmatic or not, that is exactly what ALL the STL container do to allocate and initialize.
They use an allocator then allocates uninitialized space and initialize it by means of the container constructors.
If this (like many people use to say) "is not c++" how can be the standard library just be implemented like that?
If you just don't want to use malloc / free, you can allocate "bytes" with just new char[]
myobjet* pvext = reinterpret_cast<myobject*>(new char[sizeof(myobject)*vectsize]);
for(int i=0; i<vectsize; ++i) new(myobject+i)myobject(params);
...
for(int i=vectsize-1; i!=0u-1; --i) (myobject+i)->~myobject();
delete[] reinterpret_cast<char*>(myobject);
This lets you take advantage of the separation between initialization and allocation, still taking adwantage of the new allocation exception mechanism.
Note that, putting my first and last line into an myallocator<myobject> class and the second ands second-last into a myvector<myobject> class, we have ... just reimplemented std::vector<myobject, std::allocator<myobject> >
What you have shown here is actually the way to go when using a memory allocator different than the system general allocator - in that case you would allocate your memory using the allocator (alloc->malloc(sizeof(my_object))) and then use the placement new operator to initialize it. This has many advantages in efficient memory management and quite common in the standard template library.
If you are writing a class that mimics functionality of std::vector or needs control over memory allocation/object creation (insertion in array / deletion etc.) - that's the way to go. In this case, it's not a question of "not calling default constructor". It becomes a question of being able to "allocate raw memory, memmove old objects there and then create new objects at the olds' addresses", question of being able to use some form of realloc and so on. Unquestionably, custom allocation + placement new are way more flexible... I know, I'm a bit drunk, but std::vector is for sissies... About efficiency - one can write their own version of std::vector that will be AT LEAST as fast ( and most likely smaller, in terms of sizeof() ) with most used 80% of std::vector functionality in, probably, less than 3 hours.
my_object * my_array=new my_object [10];
This will be an array with objects.
my_object * my_array=(my_object *)malloc(sizeof(my_object)*MY_ARRAY_SIZE);
This will be an array the size of your objects, but they may be "broken". If your class has virtual funcitons for instance, then you won't be able to call those. Note that it's not just your member data that may be inconsistent, but the entire object is actully "broken" (in lack of a better word)
I'm not saying it's wrong to do the second one, just as long as you know this.

Is "delete p; p = NULL(nullptr);" an antipattern?

Searching something on SO, I stumbled across this question and one of the comments to the most voted answer (the fifth comment to that most voted answer) suggests that delete p; p = NULL; is an antipattern. I must confess that I happen to use it quite often paring it sometimes\most of the times with the check if (NULL != p). The Man himself seems to suggest it (please see the destroy() function example) so I'm really confused to why it might be such a dreaded thing to be considered an antipattern. I use it for the following reasons:
when I'm releasing a resource I also want to kind of invalidate it for further usage and NULL is the right tool to say a pointer is invalid
I don't want to leave behind dangling pointers
I want to avoid double\multiple free bugs - deleting a NULL pointer is like a nop but deleting a dangling pointer is like "shooting yourself in the foot"
Please note that I'm not asking the question in the context of the "this" pointer and let's assume we don't live in a perfect C++ world and that legacy code does exist and it has to be maintained, so please do not suggest any kind of smart pointer :).
Yes, I would not recommended doing it.
The reason is that the extra setting to null will only help in very limited contexts. If you are in a destructor, the pointer itself will not exist right after the destructor execution, which means that whether it is null or not does not matter.
If the pointer was copied, setting p to null will not set the rest of the pointers, and you can then hit the same problems with the extra issue that you will be expecting to find deleted pointers being null, and it won't make sense how your pointer became non-zero and yet the object is not there....
Additionally it might hide other errors, like for example if your application is trying to delete a pointer many times, by setting the pointer to null, the effect is that the second delete will be converted to a no-op, and while the application will not crash the error in the logic is still there (consider that a later edit accesses the pointer right before the delete that is not failing... how can that fail?
I recommend doing so.
Obviously, it's valid in a context where NULL is a valid value of the pointer. This, of course, means if it's used in other places, it must be checked.
Even if the pointer may, technically, not be NULL, it does help in real-world scenarios when customers send you a crash dump. If it's NULL and it's not supposed to be (and it wasn't trapped in testing with the assert() that you should do), then it's easy to see that this is the problem - you'll crash at something like mov eax, [edx+4] or something, you'll see that edx is 0, and you know what the problem is. If, on the other hand, you don't NULL the pointer, but it is deleted, then all sorts of things may happen. It may work, it may crash immediately, it may crash later, it may show weird things - at this point, anything that happens is soft-non-deterministic.
Defensive programming is King. That includes setting a pointer to NULL extraneously, even if you think you don't have to, and doing that extra NULL check in a few places even though you technically know it isn't supposed to happen.
Even if you have the logic error of a pointer going through delete twice, it's better to not crash and handle it safely than to crash. That may mean you do that extra check, and you'll probably want to log that, or maybe even end the program gracefully, but it's not just a crash. Customers don't like that.
Personally, I use an inline function that takes the pointer as a reference and sets it to NULL.
No, it's not an anti-pattern.
Setting a pointer to NULL is a perfectly good way of indicating that the pointer is no longer pointing at anything valid. In fact, that's exactly what NULL value is intended to mean.
If there's an anti-pattern to be avoided here, the anti-pattern would be not having a simple and well-defined set of rules/conventions for how your program manages its memory. It doesn't matter so much what those rules are, as long as they work to avoid leaks and crashes, and as long as you can and do follow them consistently.
It depends on how the pointer is used. If all the code that can see the pointer should "know" that it's no longer valid — especially in a destructor where the pointer is about to go out of scope anyway — there's no need to set it to null.
On the other hand, the pointer may represent an object that sometimes exists, sometimes doesn't, and you have code like if (p) { p->doStuff(); } to act upon the object when it exists. In that case, obviously, you should set it to null after deleting the object.
The important distinction in the latter case is that the lifetime of the pointer variable is much longer than the lifetime of the objects it (sometimes) points to, and its null-ness carries some significant meaning that has to be communicated to other parts of the program.
In my opinion the anti-pattern is not delete p; p = NULL;, it's assert(this != NULL).
I use the anti-pattern as you call it for two reasons - first to enhance the likelyhood that bad code will crash spectacularly without hiding, and second to make the core problem more obvious in debugging. I wouldn't litter my code with asserts just on the off chance that it might catch something.
Although I think #Mark Ransom is on sort of the right track, I would suggest there's an even more fundamental problem than just assert(this!=NULL).
I think it's much more telling that you're using raw pointers and new (directly) often enough to care. While this isn't (necessarily) a code smell/anti-pattern/whatever in itself, it often points toward code that's unnecessarily similar to C, and isn't taking advantage of the things like containers in the standard library.
Even if the standard containers don't fit your needs well, you can still normally wrap your pointer up into a small enough, simple enough package that techniques like this are irrelevant -- you've restricted access to the pointer in question to such a small amount of code that you can glance through it and assure that it's only ever used when it's valid. As Hoare said, there are two ways of doing things: one is to keep the code so simple there are obviously no deficiencies; the other is to make the code so complex there are no obvious deficiencies. This technique only appears relevant (to me) when you're already dealing with the latter case.
Ultimately, I see the desire to do this at all as basically admitting defeat -- rather than even attempting to assure that the code is correct, you've done the equivalent of setting up a bug-zapper next to a swamp. It reduces the appearance of bugs in a small area by a small percentage, but leaves so many more free to breed that if there's any effect on the total population, it's far too small to measure.
The reason is that the extra setting to null will only help in very limited contexts. If you are in a destructor, the pointer itself will not exist right after the destructor execution, which means that whether it is null or not does not matter.
Got to correct this statement since it is false in C++.
The other functions of an object being destroyed may be called while the object is getting destroyed (because somehow the destruction process requires it.) It is generally viewed as ugly, but there not always good work around.
Therefore, clearing your pointers may be the only good solution to avoid problems (i.e. these other functions being called can then test to see whether the object is valid before using it.)
Yet, a good idea in C++ is to use smart objects (probably what you are talking about). More or less, a class that holds a reference to an object and which makes sure said object is released when the destructor is hit and not add many objects to destroy all at once in one object (although the result is the same, it is cleaner.)

Which version of safe_delete is better?

#define SAFE_DELETE(a) if( (a) != NULL ) delete (a); (a) = NULL;
OR
template<typename T> void safe_delete(T*& a) {
delete a;
a = NULL;
}
or any other better way
I would say neither, as both will give you a false sense of security. For example, suppose you have a function:
void Func( SomePtr * p ) {
// stuff
SafeDelete( p );
}
You set p to NULL, but the copies of p outside the function are unaffected.
However, if you must do this, go with the template - macros will always have the potential for tromping on other names.
Clearly the function, for a simple reason. The macro evaluates its argument multiple times. This can have evil side effects. Also the function can be scoped. Nothing better than that :)
delete a;
ISO C++ specifies, that delete on a NULL pointer just doesn't do anything.
Quote from iso 14882:
5.3.5 Delete [expr.delete]
2 [...] In either alternative, if the value of the operand of delete is the
null pointer the operation has no effect. [...]
Regards, Bodo
/edit: I didn't notice the a=NULL; in the original post, so new version: delete a; a=NULL; however, the problem with setting a=NULL has already been pointed out (false feeling of security).
Generally, prefer inline functions over macros, as macros don't respect scope, and may conflict with some symbols during preprocessing, leading to very strange compile errors.
Of course, sometimes templates and functions won't do, but here this is not the case.
Additionally, the better safe-delete is not necessary, as you could use smart-pointers, therefore not requiring to remember using this method in the client-code, but encapsulating it.
(edit) As others have pointed out, safe-delete is not safe, as even if somebody does not forget to use it, it still may not have the desired effect. So it is actually completely worthless, because using safe_delete correctly needs more thought than just setting to 0 by oneself.
You don't need to test for nullity with delete, it is equivalent to a no-op. (a) = NULL makes me lift an eyebrow. The second option is better.
However, if you have a choice, you should use smart pointers, such as std::auto_ptr or tr1::shared_ptr, which already do this for you.
I think
#define SAFE_DELETE(pPtr) { delete pPtr; pPtr = NULL } is better
its ok to call delete if pPtr is NULL. So if check is not required.
in case if you call SAFE_DELETE(ptr+i), it will result in compilation error.
Template definition will create multiple instances of the function for each data type. In my opinion in this case, these multiple definitions donot add any value.
Moreover, with template function definition, you have overhead of function call.
Usage of SAFE_DELETE really appears to be a C programmers approach to commandeering the built in memory management in C++. My question is: Will C++ allow this method of using a SAFE_DELETE on pointers that have been properly encapsulated as Private? Would this macro ONLY work on pointer that are declared Public? OOP BAD!!
As mentioned quite a bit above, the second one is the better one, not a macro with potential unintended side effects, doesn't have the unneeded check against NULL (although I suspect you are doing that as a type check), etc. But neither are promising any safety. If you do use something like tr1::smart_ptr, please make sure you read the docs on them and are sure that it has the right semantics for your task. I just recently had to hunt down and clean up a huge memory leak due to a co-worker putting smart_ptrs into a data structure with circular links :) (he should have used weak_ptrs for back references)
I prefer this version:
~scoped_ptr() {
delete this->ptr_; //this-> for emphasis, ptr_ is owned by this
}
Setting the pointer to null after deleting it is quite pointless, as the only reason that you would use pointers is to allow an object to be referenced in multiple places at once. Even if the pointer in one part of the program is 0 there may well be others that are not set to 0.
Furthermore the safe_delete macro / function template is very difficult to use right, because there are only two places that it can be used if there is code that may throw between the new and delete for the given pointer.
1) Inside either a catch (...) block that rethrows the exception and also duplicated next to the catch (...) block for the path that doesn't throw. (Also duplicated next to every break, return, continue etc that may allow the pointer to fall out of scope)
2) Inside a destructor for an object that owns the pointer (unless there is no code between the new and delete that can throw).
Even if there is no code that could throw when you write the code, this could change in the future (all it takes is for someone to came along and add another new after the first one). It is better write code in a way that stays correct even in the face of exceptions.
Option 1 creates so much code duplication and is so easy to get wrong that I am doubtful to even call it an option.
Option 2 makes safe_delete redundant, as the ptr_ that you are setting to 0 will go out of scope on the next line.
In summary -- don't use safe_delete as it is not safe at all (it is very difficult to use correctly, and leads to redundant code even when its use is correct). Use SBRM and smart pointers.

Tracking automatic variable lifetime?

This may not be possible, but I figured I'd ask...
Is there any way anyone can think of to track whether or not an automatic variable has been deleted without modifying the class of the variable itself? For example, consider this code:
const char* pStringBuffer;
{
std::string sString( "foo" );
pStringBuffer = sString.c_str();
}
Obviously, after the block, pStringBuffer is a dangling pointer which may or may not be valid. What I would like is a way to have a wrapper class which contains pStringBuffer (with a casting operator for const char*), but asserts that the variable it's referencing is still valid. By changing the type of the referenced variable I can certainly do it (boost shared_ptr/weak_ptr, for example), but I would like to be able to do it without imposing restrictions on the referenced type.
Some thoughts:
I'll probably need to change the assignment syntax to include the referenced variable (which is fine)
I might be able to look at the stack pointer to detect if my wrapper class was allocated "later" than the referenced class, but this seems hacky and not standard (C++ doesn't define stack behavior). It could work, though.
Thoughts / brilliant solutions?
In general, it's simply not possible from within C++ as pointers are too 'raw'. Also, looking to see if you were allocated later than the referenced class wouldn't work, because if you change the string, then the c_str pointer may well change.
In this particular case, you could check to see if the string is still returning the same value for c_str. If it is, you are probably still valid and if it isn't then you have an invalid pointer.
As a debugging tool, I would advise using an advanced memory tracking system, like valgrind (available only for linux I'm afraid. Similar programs exist for windows but I believe they all cost money. This program is the only reason I have linux installed on my mac). At the cost of much slower execution of your program, valgrind detects if you ever read from an invalid pointer. While it isn't perfect, I've found it detects many bugs, in particular ones of this type.
One technique you may find useful is to replace the new/delete operators with your own implementations which mark the memory pages used (allocated by your operator new) as non-accessible when released (deallocated by your operator delete). You will need to ensure that the memory pages are never re-used however so there will be limitations regarding run-time length due to memory exhaustion.
If your application accesses memory pages once they've been deallocated, as in your example above, the OS will trap the attempted access and raise an error. It's not exactly tracking per se as the application will be halted immediately but it does provide feedback :-)
This technique is applicable in narrow scenarios and won't catch all types of memory abuses but it can be useful. Hope that helps.
You could make a wrapper class that works in the simple case you mentioned. Maybe something like this:
X<const char*> pStringBuffer;
{
std::string sString( "foo" );
Trick trick(pStringBuffer).set(sString.c_str());
} // trick goes out of scope; its destructor marks pStringBuffer as invalid
But it doesn't help more complex cases:
X<const char*> pStringBuffer;
{
std::string sString( "foo" );
{
Trick trick(pStringBuffer).set(sString.c_str());
} // trick goes out of scope; its destructor marks pStringBuffer as invalid
}
Here, the invalidation happens too soon.
Mostly you should just write code which is as safe as possible (see: smart pointers), but no safer (see: performance, low-level interfaces), and use tools (valgrind, Purify) to make sure nothing slips through the cracks.
Given "pStringBuffer" is the only part of your example existing after sString goes out of scope, you need some change to it, or a substitute, that will reflect this. One simple mechanism is a kind of scope guard, with scope matching sString, that affects pStringBuffer when it is destroyed. For example, it could set pStringBuffer to NULL.
To do this without changing the class of "the variable" can only be done in so many ways:
Introduce a distinct variable in the same scope as sString (to reduce verbosity, you might consider a macro to generate the two things together). Not nice.
Wrap with a template ala X sString: it's arguable whether this is "modifying the type of the variable"... the alternative perspective is that sString becomes a wrapper around the same variable. It also suffers in that the best you can do is have templated constructor pass arguments to wrapped constructors up to some finite N arguments.
Neither of these help much as they rely on the developer remembering to use them.
A much better approach is to make "const char* pStringBuffer" simply "std::string some_meaningful_name", and assign to it as necessary. Given reference counting, it's not too expensive 99.99% of the time.