Can I use std::stack as object pool container? - c++

I need to create a pool of objects to eliminate dynamic allocations. Is it efficient to use std::stack to contain pointers of allocated objects?
I suspect every time I push the released object back to stack a new stack element will be dynamically allocated. Am I right? Should I use std::vector to make sure nothing new is allocated?

Whether a stack is suited for your particular purpose or not is an issue I will not deal with. Now, if you are concerned about the number of allocations, the default internal container for a std::stack is an std::deque<>. It will not need to allocate new memory for the stack in each push (as long as it has space) and when it allocates it does not need to relocate all existing elements as an std::vector<> would.
You can tell the stack to use an std::vector<> as underlying container with the second template argument:
std::stack< int, std::vector<int> > vector_stack;

STL containers of pointers don't do anything with the objects they point to; that's up to you, so you are responsible for not leaking any memory etc. Have a look at the Boost Pointer Container Library or try storing the actual objects; you will save yourself hassle in the long run.
If you want to reduce the amount of dynamic allocations made by the container and you know roughly how many objects you need to store, you can use vector's 'reserve()' method, which will preallocate the memory you request in one shot.
You can also specify the number of elements you want in the constructor, but that creates x objects for you and stores them, which might not be what you want.
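To make the difference between reserve() and the sizing constructor concrete, here is a minimal sketch. The Obj type and both helper names are made up for illustration; they are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Obj { int id = 0; };  // hypothetical pooled type

// reserve(): storage for n pointers is allocated up front, but the vector stays empty.
std::vector<Obj*> makeReserved(std::size_t n) {
    std::vector<Obj*> v;
    v.reserve(n);
    assert(v.empty() && v.capacity() >= n);
    return v;
}

// Sizing constructor: the vector already holds n value-initialized (null) pointers.
std::vector<Obj*> makeSized(std::size_t n) {
    return std::vector<Obj*>(n);
}
```

With reserve() you still push_back each pointer yourself; with the sizing constructor the n slots already exist and count towards size().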
If, for some technical reason, dynamic allocation is out of the question completely, you might want to try using boost::pool as your allocator (as you know, you can specify a different standard library memory allocator if you don't want to use the default one).
That said, when I tested it, the default one was always faster, at least with g++, so it may not be worth the effort. Make sure you profile it rather than assume you can out-code the standards committee!

Doing ANY allocs during a free is WRONG due to nothrow guarantees. If you have to do an alloc to do a free and the alloc throws, where do you put the pointer? You either quietly capture the exception and leak, or propagate the exception. Propagating the exception means objects that use your YourObject can't be put in STL containers. And leaking is, well, leaking. In either case you have violated the rules.
But what data structure to use depends on your object lifetime control idiom.
Is the idiom an object pool to be used with a factory method(s) and freeInstance
YourObject* pO = YourObject::getInstance(... reinitialize parameters ...);
......object public lifetime....
pO->freeInstance();
or a memory pool to be used with a class specific operator new/operator delete (or an allocator)?
YourObject::operator new(size_t);
......object public lifetime....
delete pO;
If it is an object pool and you have an idea of the number of YourObject*s, use a vector in release code, and a deque or preferably a circular buffer in debug code, and reserve the approximate number. (deque has no reserve, so you would have to add that yourself, whereas a dynamically self-sizing circular buffer is precisely what you want.) Allocate LIFO in release and FIFO in debug so you have history in debug.
In the path where there are no free objects, remember to do the reserve(nMade+1) on the YourObject* collection before you dynamically create an object.
(The reason for this reserve is twofold. First, it must be done at createInstance time. Second, it simplifies the code. Otherwise you have the possibility of throwing a std::bad_alloc in freeInstance, which may make destructor guarantees hard to uphold. OUCH! E.g., class Y has a YourObject* in it and does a freeInstance for that YourObject* in its destructor; if you don't reserve the space for the YourObject* when you make it, where do you store that pointer at freeInstance time? If you reserve the space afterwards in getInstance, then you have to catch the std::bad_alloc for the reserve, release the just-made YourObject, and rethrow.)
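The reserve-before-create rule can be sketched roughly as follows. Widget, WidgetPool, created() and idle() are hypothetical names chosen for this example, and the pool owns its objects through unique_ptr, which is one possible choice rather than something the text above prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Widget { int value = 0; };  // hypothetical pooled type

class WidgetPool {
public:
    Widget* getInstance(int value) {
        if (free_.empty()) {
            // Reserve the bookkeeping slots first: if either reserve throws,
            // no Widget has been created yet, so nothing can leak.
            free_.reserve(owned_.size() + 1);
            owned_.reserve(owned_.size() + 1);
            owned_.push_back(std::make_unique<Widget>());
            free_.push_back(owned_.back().get());
        }
        Widget* w = free_.back();  // LIFO: hand out the most recently freed object
        free_.pop_back();
        w->value = value;          // re-establish the "public" invariants
        return w;
    }

    void freeInstance(Widget* w) noexcept {
        // A slot was reserved when w was created, so this cannot reallocate.
        assert(free_.size() < free_.capacity());
        free_.push_back(w);
    }

    std::size_t created() const noexcept { return owned_.size(); }
    std::size_t idle() const noexcept { return free_.size(); }

private:
    std::vector<std::unique_ptr<Widget>> owned_;  // owns every Widget ever made
    std::vector<Widget*> free_;                   // currently idle Widgets
};
```

Reserving exactly one extra slot each time keeps the sketch close to the text; a real pool would probably reserve in larger steps to avoid repeated reallocation while it grows.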
If it is a memory pool then in the memory blocks use an intrusive singly linked list in release and doubly linked list in debug (I am assuming that sizeof(YourObject)>=2*sizeof(void*)) BTW there are a lot of MemoryPool implementations out there. Again allocate LIFO in release and FIFO in debug so you have history in debug.
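As an illustration of the intrusive singly linked free list (the release-mode variant), here is a minimal sketch. FixedBlockPool and its member names are assumptions made for this example, the pool has a fixed capacity, and it is not thread safe.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

template <std::size_t BlockSize, std::size_t BlockCount>
class FixedBlockPool {
    static_assert(BlockSize >= sizeof(void*), "a block must be able to hold the link");
    static_assert(BlockSize % alignof(void*) == 0, "blocks must keep the link aligned");

    struct Node { Node* next; };  // lives inside a block while that block is free

public:
    FixedBlockPool() noexcept {
        for (std::size_t i = 0; i < BlockCount; ++i) {
            push(storage_ + i * BlockSize);
        }
    }

    // Pops the most recently freed block (LIFO); returns nullptr when exhausted.
    void* allocate() noexcept {
        if (head_ == nullptr) return nullptr;
        Node* n = head_;
        head_ = n->next;
        return n;
    }

    // Pushes the block back; needs no memory of its own, so it cannot fail.
    void deallocate(void* p) noexcept {
        assert(p != nullptr);
        push(p);
    }

private:
    void push(void* p) noexcept {
        head_ = ::new (p) Node{head_};  // store the link inside the free block
    }

    alignas(std::max_align_t) unsigned char storage_[BlockSize * BlockCount];
    Node* head_ = nullptr;
};
```

Because the link is stored in the free block itself, freeing never allocates, which is exactly the property asked for above.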
BTW if you use the factory idiom, don't skimp on the overloaded getInstance() methods and add reinit methods instead. That just opens up the possibility of leaving out a reinit call. The getInstance methods are your "constructors" in the sense that it is they that get the object into the state that you want. Note that in the object pool case you need a freeInstance, which may have to do "destructor like" things to the object.
In this case it makes some sense to speak of "public class invariants" and "private class invariants": the object sits in a limbo state where public class invariants may NOT be satisfied while it is in the free pool. It's a YourObject as far as a YourObject is concerned, but not all of the public class invariants may be satisfied. It is the job of YourObject::getInstance to both get an instance AND ensure that its public invariants are satisfied. In a complementary fashion, freeInstance releases resources that getInstance acquired to satisfy the "public class invariants", so that they are not held during the object's "idle time" on the free list.
LIFO in release also has the SIGNIFICANT benefit of caching the last used objects/blocks, whereas FIFO is guaranteed not to cache if there is a sufficiently large number of objects/blocks, or even to page if the number is larger! But you probably already realized this, as you decided to use a stack.

Related

Is it safe to "dissolve" c++ arrays on the heap?

I am currently implementing my own vector container and I encountered a pretty interesting issue (at least for me). It may be a stupid question but idk.
My vector uses a heap array of pointers to heap allocated objects of unknown type (T**).
I did this because I wanted the pointers and references to individual elements to stay same, even after resizing.
This comes at performance cost when constructing and copying, because I need to create the array on the heap and each object of the array on the heap too. (Heap allocation is slower than on the stack, right?)
T** arr = new T*[size]{}; // value-initializes every pointer to nullptr
and then for each element
arr[i] = new T{data};
Now I wonder if it would be safe, beneficial (faster) and possible if, instead of allocating each object individually, I could create a second array on the heap and save the pointer of each object in the first one. Then use (and delete) these objects later as if they were allocated separately.
=> Is allocating arrays on the heap faster than allocating each object individually?
=> Is it safe to allocate objects in an array and forgetting about the array later? (sounds pretty dumb i think)
Link to my github repo: https://github.com/LinuxGameGeek/personal/tree/main/c%2B%2B/vector
Thanks for your help :)
First a remark, you should not think of the comparison heap/stack in terms of efficiency, but on object lifetime:
automatic arrays (what you call on stack) end their life at the end of the block where they are defined
dynamic arrays (what you call on heap) exist until they are explicitly deleted
Now it is generally more efficient to allocate a bunch of objects in one array than to allocate them separately. You save a number of internal calls and the various data structures needed to maintain the heap. The catch is that you can only deallocate the array as a whole and not the individual objects.
Finally, except for trivially copyable objects, only the compiler and not the programmer knows about the exact allocation detail. For example (and for common implementations) an automatic string (so on stack) contains a pointer to a dynamic char array (so on heap)...
Said differently, unless you plan to only use your container for POD or trivially copyable objects, do not expect to handle all the allocation and deallocation yourself: non trivial objects have internal allocations.
Heap allocation is slower than on the stack, right?
Yes. Dynamic allocation has a cost.
Is allocating arrays on the heap faster than allocating each object individually?
Yes. Multiple allocations have that cost multiplied.
I wonder if it would be ... possible, if instead of allocating each object individually, I could create a second array on the heap and save the pointer of each object in the first one
It would be possible, but not trivial. Think hard how you would implement element erasure. And then think about how you would implement other features such as random access correctly into the container with arrays that contain indices from which elements have been erased.
... safe
It can be implemented safely.
... beneficial(faster)
Of course, reducing allocations from N to 1 would be beneficial by itself. But it comes at the cost of some scheme to implement the erasure. Whether this cost is greater than the benefit of reduced allocations depends on many things such as how the container is used.
Is it safe to allocate objects in an array and forgetting about the array later?
"Forgetting" about an allocation seems like a way to say "memory leak".
You could achieve similar advantages with a custom "pool" allocator. Implementing support for custom allocators to your container might be more generally useful.
P.S. Boost already has a "ptr_vector" container that supports custom allocators. No need to reinvent the wheel.
I did this because I wanted the pointers and references to individual
elements to stay same, even after resizing.
You should just use std::vector::reserve to prevent reallocation of vector data when it is resized.
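A small sketch of that guarantee: as long as the size stays within the reserved capacity, push_back does not reallocate, so addresses of existing elements remain valid. The function name addressesStayStable is invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the address of the first element survives the later push_backs.
bool addressesStayStable(std::size_t n) {
    assert(n > 0);
    std::vector<int> v;
    v.reserve(n);              // one allocation up front
    v.push_back(0);
    const int* first = &v[0];  // pointer taken before the remaining insertions
    for (std::size_t i = 1; i < n; ++i) {
        v.push_back(static_cast<int>(i));  // stays within capacity: no reallocation
    }
    return first == &v[0];
}
```

Once the vector grows beyond the reserved capacity, all pointers and references into it are invalidated again, so this only helps when an upper bound is known.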
Vector is quite primitive, but it is highly optimized. It will be extremely hard for you to beat it with your own code. Just inspect its API and try out all of its functionality. Creating something better requires advanced knowledge of template programming (which apparently you do not have yet).
What you are trying to come up with is a use of placement new allocation for a deque-like container. It's a viable optimization, but usually it's done to reduce allocation calls and memory fragmentation, e.g. on some RT or embedded systems. The array may even be a static array in that case. But if you also require that instances of T occupy adjacent space, that's a contradictory requirement; re-sorting them would kill any performance gains.
... beneficial(faster)
Depends on T. E.g. there is no point to do that to something like strings or shared pointers. Or anything that actually allocates resources elsewhere, unless T allows to change that behaviour too.
I wonder if it would be ... possible, if instead of allocating each
object individually, I could create a second array on the heap and
save the pointer of each object in the first one
Yes it is possible, even with standard ISO containers, thanks to allocators.
Thread safety is a concern if this "array" turns out to be a resource shared between multiple writer and reader threads. You might want to use thread-local storage instead of a shared one, and use semaphores for the crossover cases.
The usual application for this is to allocate not on the heap but in a statically allocated array of predetermined size, or in an array that was allocated once at the start of the program.
Note that if you use placement new you should not use delete on the created objects; you have to call the destructor directly. The placement new overload is not a true new as far as delete is concerned. You may or may not get an error, but you will certainly cause a crash if you used a static array, and you will cause heap corruption when deleting an element that has the same address as the beginning of the dynamically allocated array.
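The placement new rule can be illustrated with a short sketch. Probe and constructAndDestroyInPlace are hypothetical names; the counter only exists to make the constructor and destructor calls observable.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts how many instances are currently alive.
struct Probe {
    explicit Probe(int& counter) : counter_(counter) { ++counter_; }
    ~Probe() { --counter_; }
    int& counter_;
};

// Builds a Probe inside a local buffer and reports how many were alive meanwhile.
int constructAndDestroyInPlace() {
    int alive = 0;
    alignas(Probe) unsigned char buffer[sizeof(Probe)];  // raw storage, not from new

    Probe* p = ::new (static_cast<void*>(buffer)) Probe(alive);  // placement new
    const int during = alive;

    p->~Probe();  // explicit destructor call; `delete p` would be wrong here
    assert(alive == 0);
    return during;
}
```

The storage itself is released by whatever owns it (here the enclosing scope), never by delete on the object pointer.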
This comes at performance cost when constructing and copying, because I need to create the array on the heap and each object of the array on the heap too.
Copying a POD is extremely cheap. If you research perfect forwarding you can achieve the zero cost abstraction for constructors and the emplace_back() function. When copying, use std::copy() as it is very fast.
Is allocating arrays on the heap faster than allocating each object individually?
Each allocation is a request to the memory allocator, which may in turn have to ask the operating system for memory. Unless you are asking for a particularly large amount of memory, you can assume each request takes a roughly constant amount of time. Instead of asking for a parking space 10 times, ask for 10 parking spaces.
Is it safe to allocate objects in an array and forgetting about the array later? (sounds pretty dumb i think)
Depends what you mean by safe. If you can't answer this question on your own, then you must clean up the memory and not leak under any circumstance.
An example of a time you might skip cleaning up memory is when you know the program is going to end and cleaning up memory just to exit is kinda pointless. Still, you should clean it up. Read Serge Ballesta's answer for more information about lifetime.

custom allocator for stl map in c++

I want to create a custom allocator for a multimap that will allocate the elements in shared memory. I came across boost.interprocess but found it quite complicated to implement. Is there any other workaround?
I will not give an implementation here, only some directions.
Suppose your shared memory region starts at address void* shMemAddr and you decide that your STL container should use shared memory.
What needs to be done is to make the container allocate memory starting at shMemAddr and onward, for as long as there is memory available in your shared pool. You can implement that with any allocation strategy, for example using malloc or placement new. Further, for your container to use your allocator, you need to provide the allocator as a template argument; for multimap it is exposed as multimap::allocator_type and declared as
class Alloc = allocator<pair<const Key, T> > as the fourth template argument, after less as the compare function. For example, if you store int, double pairs as key, value pairs in your multimap, it would likely be something like this
multimap<int,double,less<int>,CustomAlloc<pair<const int,double>>>
Now, your CustomAlloc allocator needs to satisfy the Allocator concept, which encapsulates the specific low-level memory management. In particular, if shared memory is the resource to allocate from, you need to arrange proper allocation of memory in a multithreaded environment. That means that, first, you need some structure that keeps a record of used memory. It can be some chained data structure, for example (implementations like that are pretty common), and you need to keep the invariants of that structure consistent. What that means is that if your bookkeeping structure for used (or free) memory needs to be updated after a successful allocation or deallocation, the update must be done atomically, so that a thread which tries to allocate memory sees the structure only in the state before the CustomAlloc allocation job started or after it finished. For example, your first choice could be to use a mutex to protect the data, avoid races and keep the invariants. These are just directions, and considering that writing your own allocator is not very hard, I hope this will help as a good starting point.
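The directions above could be sketched as follows. This is only an illustration under loud assumptions: Arena and ArenaAllocator are invented names, the arena is an ordinary in-process buffer standing in for the region at shMemAddr, deallocation is a no-op (a bump allocator), and std::mutex only protects threads of one process. Real shared memory would additionally need an inter-process lock and pointers that stay valid across address spaces.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <new>
#include <utility>

// Stand-in for the shared region: hands out bytes front to back under a mutex.
struct Arena {
    alignas(std::max_align_t) unsigned char bytes[4096];
    std::size_t used = 0;
    std::mutex lock;

    void* take(std::size_t size, std::size_t align) {
        std::lock_guard<std::mutex> guard(lock);  // keep the bookkeeping consistent
        const std::size_t start = (used + align - 1) / align * align;
        if (start + size > sizeof(bytes)) throw std::bad_alloc();
        used = start + size;
        return bytes + start;
    }
};

// Minimal allocator: value_type, rebinding constructor, allocate, deallocate, ==, !=.
template <class T>
struct ArenaAllocator {
    using value_type = T;

    explicit ArenaAllocator(Arena& a) noexcept : arena(&a) {}
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->take(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) noexcept {}  // bump allocator: nothing is reused

    Arena* arena;
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return a.arena == b.arena;
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return !(a == b);
}

using Entry = std::pair<const int, double>;
using ArenaMultimap = std::multimap<int, double, std::less<int>, ArenaAllocator<Entry>>;

// Inserts three entries into an arena-backed multimap and counts those with key 1.
std::size_t demoArenaMultimap() {
    Arena arena;  // must outlive the container that allocates from it
    ArenaAllocator<Entry> alloc(arena);
    ArenaMultimap m(alloc);
    m.emplace(1, 1.5);
    m.emplace(1, 2.5);
    m.emplace(2, 3.5);
    return m.count(1);
}
```

Note that the allocator's value_type is pair<const int, double>, matching the multimap's value_type; the container rebinds the allocator internally to its node type.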

C++ using a Garbage Collector is overkill, what is a better solution?

I am currently using Boehm Garbage Collector for a large application in C++. While it works, it seems to me that the GC is overkill for my purpose (I do not like having this as a dependency and I have to continually make allowances and think about the GC in everything I do so as to not step on its toes). I would like to find a better solution that is more suited to my needs, rather than a blanket solution that happens to cover it.
In my situation I have one specific class (and everything that inherits from that class) that I want to "collect". I do not need general garbage collection, in all situations except for this particular class I can easily manage my own memory.
Before I started using the GC, I used reference counting, but reference cycles and the frequent updates made this a less than ideal solution.
Is there a better way for me to keep track of this class? One that does not involve additional library dependencies like boost.
Edit:
It is probably best if I give a rundown on the potential lifespan of my object(s).
A function creates a new instance of my class and may (or may not) use it. Regardless, it passes this new instance back to the caller as a return value. The caller may (or may not) use it as well, and again it passes it back up the stack, eventually getting to the top level function which just lets the pointer fade into oblivion.
I cannot just delete the pointer in the top level, because part of the "possible use", involves passing the pointer to other functions which may (or may not) store the pointer for use somewhere else, at some future time.
I hope this better illustrates the problem that I am trying to solve. I currently solve it with Boehm Garbage Collector, but would like simpler, non dependency involving, solution if possible.
In the Embedded Systems world, or in programs that are real-time event critical, garbage collection is frowned upon. The very use of dynamic memory is considered bad practice.
With dynamic memory allocation, fragmentation occurs. A Garbage Collector is used to periodically arrange memory to reduce the fragmentation, such as combining sequential freed blocks. The primary issue is when to perform this defragmentation or running of the GC.
Some suggested alternatives:
Redesign your system to avoid dynamic memory allocation.
Allocate static buffers and use them. For example in an RTOS system, preallocate space for messages, rather than dynamically allocating them.
Use the Stack, not the Heap.
Use the stack instead of dynamic allocation for variables, if possible. This is not a good idea if variables need a lifetime beyond the function execution.
Place limits on variable sized data.
Along with static buffers, place limits on variable length data or incoming data of unknown size. This may mean that the incoming data must be paused, or that multiple buffers must be used when the input cannot be stopped.
Create your own memory allocator.
Create many memory pools that allocate different sized blocks. This will reduce fragmentation. For example, for small blocks, maybe a bitset could be used to determine which bytes are in use and which are available. Maybe another pool for 64 byte blocks is necessary. All depends on your system's needs.
If you really just need special handling for the memory allocations associated with a single class, then you should look at overloading the new operator for that class.
class MyClass
{
public:
    // Class-specific allocation functions, used by `new MyClass` and `delete ptr`.
    void *operator new(size_t);
    void operator delete(void *);
};
You can implement these operators to do whatever you need to track the memory: allocate it from a special pool, place references on a linked list for tracking, etc.
void* MyClass::operator new(size_t size)
{
void *p = my_allocator(size); // e.g., instead of malloc()
// place p on a linked list, etc.
return p;
}
void MyClass::operator delete(void *p)
{
// remove p from list...
my_free(p);
}
You can then write external code that can walk through the list you are keeping to inspect every currently-allocated instance of MyClass, GC'ing instances as appropriate for your situation.
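For a version that compiles as it stands, here is a minimal sketch in which std::malloc and std::free stand in for the my_allocator and my_free placeholders, and a simple live-instance counter stands in for the tracking list. CountedNode and liveCount are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

class CountedNode {
public:
    // Called for `new CountedNode`; malloc is only a stand-in for a custom pool.
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        ++liveCount();  // tracking hook: a real version might link p into a list
        return p;
    }

    // Called for `delete ptr`; must not throw.
    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        --liveCount();
        std::free(p);
    }

    // Number of CountedNode objects currently allocated through operator new.
    static int& liveCount() noexcept {
        static int count = 0;
        return count;
    }

    int payload = 0;
};
```

Objects created in automatic storage do not pass through these operators, so the counter only reflects instances created with new.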
With memory, you should always try to have clear ownership and knowledge of lifetime. Lifetime determines where you take the memory from (as do other factors), i.e. stack for scope lived, pool for reused, etc. Ownership will tell you when and if to free memory. In your case, the GC has the ownership and makes the decision when to free. With ref counting, the wrapper class does this logic. Unclear ownership leads to hard to maintain code if manual memory management is used. You must avoid use after free, double frees, and memory leaks.
To solve your problem, figure out who should keep ownership. This will dictate the algorithm to use. GC and ref counting are popular choices, but there are infinitely many. If ownership is unclear, give it to a 3rd party whose job it is to keep track of it. If ownership is shared, make sure all parties are aware of it, perhaps by enforcing it via specialized classes. This can also be enforced by simple convention, i.e. objects of type foo should never keep ptrs of type bar internally as they do not own them, and if they do, they cannot assume them to be always valid and might have to check for validity first. Etc.
If you find this hard to determine, it could be a sign that the code is very complex. Could it be made in a more simple manner?
Understanding how your memory is used and accessed is key to writing clean code for maintenance and performance optimizations. This is true regardless of language used.
Best of luck.

Why should C++ programmers minimize use of 'new'?

I stumbled upon Stack Overflow question Memory leak with std::string when using std::list<std::string>, and one of the comments says this:
Stop using new so much. I can't see any reason you used new anywhere you did. You can create objects by value in C++ and it's one of the huge advantages to using the language. You do not have to allocate everything on the heap. Stop thinking like a Java programmer.
I'm not really sure what he means by that.
Why should objects be created by value in C++ as often as possible, and what difference does it make internally? Did I misinterpret the answer?
There are two widely-used memory allocation techniques: automatic allocation and dynamic allocation. Commonly, there is a corresponding region of memory for each: the stack and the heap.
Stack
The stack always allocates memory in a sequential fashion. It can do so because it requires you to release the memory in the reverse order (First-In, Last-Out: FILO). This is the memory allocation technique for local variables in many programming languages. It is very, very fast because it requires minimal bookkeeping and the next address to allocate is implicit.
In C++, this is called automatic storage because the storage is reclaimed automatically at the end of scope. As soon as execution of the current code block (delimited using {}) is completed, memory for all variables in that block is automatically collected. This is also the moment where destructors are invoked to clean up resources.
Heap
The heap allows for a more flexible memory allocation mode. Bookkeeping is more complex and allocation is slower. Because there is no implicit release point, you must release the memory manually, using delete or delete[] (free in C). However, the absence of an implicit release point is the key to the heap's flexibility.
Reasons to use dynamic allocation
Even if using the heap is slower and potentially leads to memory leaks or memory fragmentation, there are perfectly good use cases for dynamic allocation, as it's less limited.
Two key reasons to use dynamic allocation:
You don't know how much memory you need at compile time. For instance, when reading a text file into a string, you usually don't know what size the file has, so you can't decide how much memory to allocate until you run the program.
You want to allocate memory which will persist after leaving the current block. For instance, you may want to write a function string readfile(string path) that returns the contents of a file. In this case, even if the stack could hold the entire file contents, you could not return from a function and keep the allocated memory block.
Why dynamic allocation is often unnecessary
In C++ there's a neat construct called a destructor. This mechanism allows you to manage resources by aligning the lifetime of the resource with the lifetime of a variable. This technique is called RAII and is the distinguishing point of C++. It "wraps" resources into objects. std::string is a perfect example. This snippet:
int main ( int argc, char* argv[] )
{
std::string program(argv[0]);
}
actually allocates a variable amount of memory. The std::string object allocates memory using the heap and releases it in its destructor. In this case, you did not need to manually manage any resources and still got the benefits of dynamic memory allocation.
In particular, it implies that in this snippet:
int main ( int argc, char* argv[] )
{
std::string * program = new std::string(argv[0]); // Bad!
delete program;
}
there is unneeded dynamic memory allocation. The program requires more typing (!) and introduces the risk of forgetting to deallocate the memory. It does this with no apparent benefit.
Why you should use automatic storage as often as possible
Basically, the last paragraph sums it up. Using automatic storage as often as possible makes your programs:
faster to type;
faster when run;
less prone to memory/resource leaks.
Bonus points
In the referenced question, there are additional concerns. In particular, the following class:
class Line {
public:
Line();
~Line();
std::string* mString;
};
Line::Line() {
mString = new std::string("foo_bar");
}
Line::~Line() {
delete mString;
}
Is actually a lot more risky to use than the following one:
class Line {
public:
Line();
std::string mString;
};
Line::Line() {
mString = "foo_bar";
// note: there is a cleaner way to write this.
}
The reason is that std::string properly defines a copy constructor. Consider the following program:
int main ()
{
Line l1;
Line l2 = l1;
}
Using the original version, this program will likely crash, as it uses delete on the same string twice. Using the modified version, each Line instance will own its own string instance, each with its own memory and both will be released at the end of the program.
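As a quick check of that claim, here is a sketch of the modified Line class (written with a member initializer list, which is probably the "cleaner way" hinted at in the comment above) together with a hypothetical helper, copiesAreIndependent, that copies a Line and modifies only the copy.

```cpp
#include <cassert>
#include <string>

class Line {
public:
    Line() : mString("foo_bar") {}  // initialize directly instead of assigning
    std::string mString;            // owned by value: copied and destroyed automatically
};

// Copies a Line, changes the copy, and reports whether the original is untouched.
bool copiesAreIndependent() {
    Line l1;
    Line l2 = l1;  // compiler-generated copy constructor copies the string
    assert(l2.mString == l1.mString);
    l2.mString += "!";
    return l1.mString == "foo_bar" && l2.mString == "foo_bar!";
}
```

No destructor or copy constructor has to be written by hand, which is why the double delete cannot happen here.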
Other notes
Extensive use of RAII is considered a best practice in C++ because of all the reasons above. However, there is an additional benefit which is not immediately obvious. Basically, it's better than the sum of its parts. The whole mechanism composes. It scales.
If you use the Line class as a building block:
class Table
{
Line borders[4];
};
Then
int main ()
{
Table table;
}
allocates four std::string instances, four Line instances, one Table instance and all the string's contents and everything is freed automagically.
Because the stack is faster and leak-proof
In C++, it takes but a single instruction to allocate space—on the stack—for every local scope object in a given function, and it's impossible to leak any of that memory. That comment intended (or should have intended) to say something like "use the stack and not the heap".
The reason why is complicated.
First, C++ is not garbage collected. Therefore, for every new, there must be a corresponding delete. If you fail to put this delete in, then you have a memory leak. Now, for a simple case like this:
std::string *someString = new std::string(...);
//Do stuff
delete someString;
This is simple. But what happens if "Do stuff" throws an exception? Oops: memory leak. What happens if "Do stuff" issues return early? Oops: memory leak.
And this is for the simplest case. If you happen to return that string to someone, now they have to delete it. And if they pass it as an argument, does the person receiving it need to delete it? When should they delete it?
Or, you can just do this:
std::string someString(...);
//Do stuff
No delete. The object was created on the "stack", and it will be destroyed once it goes out of scope. You can even return the object, thus transferring its contents to the calling function. You can pass the object to functions (typically as a reference or const-reference: void SomeFunc(std::string &iCanModifyThis, const std::string &iCantModifyThis)). And so forth.
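The exception case mentioned earlier can be made visible with a small sketch. ScopeFlag, doStuff and cleanedUpDespiteException are invented for this illustration; the flag simply records that the destructor ran.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical automatic object that records when its destructor runs.
struct ScopeFlag {
    explicit ScopeFlag(bool& destroyed) : destroyed_(destroyed) {}
    ~ScopeFlag() { destroyed_ = true; }
    bool& destroyed_;
};

// "Do stuff" that fails halfway through.
void doStuff(bool& destroyed) {
    ScopeFlag local(destroyed);         // automatic storage, no new
    throw std::runtime_error("boom");   // stack unwinding still destroys `local`
}

// Returns true if the local object was destroyed even though doStuff threw.
bool cleanedUpDespiteException() {
    bool destroyed = false;
    assert(!destroyed);
    try {
        doStuff(destroyed);
    } catch (const std::runtime_error&) {
        // nothing to clean up by hand
    }
    return destroyed;
}
```

With a raw new in doStuff and the delete placed after the throw, the same flow would skip the delete and leak.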
All without new and delete. There's no question of who owns the memory or who's responsible for deleting it. If you do:
std::string someString(...);
std::string otherString;
otherString = someString;
It is understood that otherString has a copy of the data of someString. It isn't a pointer; it is a separate object. They may happen to have the same contents, but you can change one without affecting the other:
someString += "More text.";
if(otherString == someString) { /*Will never get here */ }
See the idea?
Objects created by new must be eventually deleted lest they leak. The destructor won't be called, memory won't be freed, the whole bit. Since C++ has no garbage collection, it's a problem.
Objects created by value (i. e. on stack) automatically die when they go out of scope. The destructor call is inserted by the compiler, and the memory is auto-freed upon function return.
Smart pointers like unique_ptr, shared_ptr solve the dangling reference problem, but they require coding discipline and have other potential issues (copyability, reference loops, etc.).
Also, in heavily multithreaded scenarios, new is a point of contention between threads; there can be a performance impact for overusing new. Stack object creation is by definition thread-local, since each thread has its own stack.
The downside of value objects is that they die once the host function returns - you cannot pass a reference to those back to the caller, only by copying, returning or moving by value.
C++ doesn't employ any memory manager of its own. Other languages like C# and Java have a garbage collector to handle memory.
C++ implementations typically use operating system routines to allocate memory, and too much new/delete could fragment the available memory.
With any application, if memory is used frequently, it's advisable to preallocate it and release it when it is no longer required.
Improper memory management can lead to memory leaks, which are really hard to track down. So using stack objects within the scope of a function is a proven technique.
The downside of using stack objects is that it creates multiple copies of objects when returning them, passing them to functions, etc. However, smart compilers are well aware of these situations and optimize them well for performance.
It's really tedious in C++ if memory is allocated and released in two different places. The responsibility for release is always a question, and mostly we rely on some commonly accessible pointers, stack objects (as much as possible) and techniques like auto_ptr (RAII objects).
The best thing is that you have control over the memory, and the worst thing is that you will not have any control over the memory if you employ improper memory management for the application. Crashes caused by memory corruption are the nastiest and hardest to trace.
I see that a few important reasons for doing as few new's as possible are missed:
Operator new has a non-deterministic execution time
Calling new may or may not cause the OS to allocate a new physical page to your process. This can be quite slow if you do it often. Or it may already have a suitable memory location ready; we don't know. If your program needs to have consistent and predictable execution time (like in a real-time system or game/physics simulation), you need to avoid new in your time-critical loops.
Operator new is an implicit thread synchronization
Yes, you heard me. Your OS needs to make sure your page tables are consistent and as such calling new will cause your thread to acquire an implicit mutex lock. If you are consistently calling new from many threads you are actually serialising your threads (I've done this with 32 CPUs, each hitting on new to get a few hundred bytes each, ouch! That was a royal p.i.t.a. to debug.)
The rest, such as slow, fragmentation, error prone, etc., have already been mentioned by other answers.
Pre-C++17:
Because it is prone to subtle leaks even if you wrap the result in a smart pointer.
Consider a "careful" user who remembers to wrap objects in smart pointers:
foo(shared_ptr<T1>(new T1()), shared_ptr<T2>(new T2()));
This code is dangerous because there is no guarantee that either shared_ptr is constructed before either T1 or T2. Hence, if one of new T1() or new T2() fails after the other succeeds, then the first object will be leaked because no shared_ptr exists to destroy and deallocate it.
Solution: use make_shared.
Post-C++17:
This is no longer a problem: C++17 imposes a constraint on the order of these operations, in this case ensuring that each call to new() must be immediately followed by the construction of the corresponding smart pointer, with no other operation in between. This implies that, by the time the second new() is called, it is guaranteed that the first object has already been wrapped in its smart pointer, thus preventing any leaks in case an exception is thrown.
A more detailed explanation of the new evaluation order introduced by C++17 was provided by Barry in another answer.
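Thanks to @Remy Lebeau for pointing out that this may still be a problem under C++17 (although less so): a smart pointer's constructor can fail to allocate its control block and throw. Note that for std::shared_ptr the standard specifies that the constructor calls delete on the pointer passed to it if it throws, so the leak can only arise with smart pointer types that lack this guarantee.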
Solution: use make_shared.
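As a rough illustration of the make_shared advice, here is a minimal sketch. T1, T2 and foo are placeholder definitions invented for this example (the original only names them), and call_foo_safely is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>

// Placeholder types standing in for T1 and T2 from the text.
struct T1 { int value = 1; };
struct T2 { int value = 2; };

// Hypothetical consumer standing in for foo() from the text.
int foo(std::shared_ptr<T1> a, std::shared_ptr<T2> b) {
    return a->value + b->value;
}

int call_foo_safely() {
    // Each make_shared call returns a fully constructed shared_ptr, so no
    // raw pointer is ever left unowned while the other argument is evaluated.
    return foo(std::make_shared<T1>(), std::make_shared<T2>());
}
```

Because no raw new expression appears at the call site, there is no window in which an exception could strand an unowned object.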
To a great extent, that's someone elevating their own weaknesses to a general rule. There's nothing wrong per se with creating objects using the new operator. What there is some argument for is that you have to do so with some discipline: if you create an object you need to make sure it's going to be destroyed.
The easiest way of doing that is to create the object in automatic storage, so C++ knows to destroy it when it goes out of scope:
{
File foo = File("foo.dat");
// Do things
}
Now, observe that when you fall off that block after the end-brace, foo is out of scope. C++ will call its destructor automatically for you. Unlike Java, you don't need to wait for the garbage collection to find it.
Had you written
{
File * foo = new File("foo.dat");
you would want to match it explicitly with
delete foo;
}
or even better, allocate your File * as a "smart pointer". If you aren't careful about that it can lead to leaks.
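A minimal sketch of the smart pointer variant, assuming a stand-in File type that only counts live instances (the real File class is not shown in the original, and live_files and live_during_and_after are hypothetical helpers):

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical counter of File objects that are currently alive.
inline int& live_files() { static int n = 0; return n; }

// Stand-in for the File class in the text.
struct File {
    explicit File(const std::string& n) : name(n) { ++live_files(); }
    ~File() { --live_files(); }
    std::string name;
};

int live_during_and_after() {
    int live_during = 0;
    {
        // unique_ptr owns the heap-allocated File.
        std::unique_ptr<File> foo(new File("foo.dat"));
        live_during = live_files();   // one File is alive here
    }   // foo goes out of scope and deletes the File; no explicit delete
    return live_during * 10 + live_files();
}
```

With C++14 or later, std::make_unique<File>("foo.dat") would avoid writing new at all.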
The answer itself makes the mistaken assumption that if you don't use new you don't allocate on the heap; in fact, in C++ you don't know that. At most, you know that a small amount of memory, say one pointer, is certainly allocated on the stack. However, consider if the implementation of File is something like:
class File {
private:
FileImpl * fd;
public:
File(String fn){ fd = new FileImpl(fn);}
Then FileImpl will still be allocated on the heap, even though the File object itself is on the stack.
And yes, you'd better be sure to have
~File(){ delete fd ; }
in the class as well; without it, you'll leak memory from the heap even if you didn't apparently allocate on the heap at all.
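Putting the fragments above together, one possible complete version might look like the sketch below. FileImpl's contents, the live_impls counter and the use of std::string in place of String are assumptions made for the example; copying is disabled so that two File objects can never delete the same FileImpl.

```cpp
#include <cassert>
#include <string>

// Hypothetical counter of FileImpl objects that are currently alive.
inline int& live_impls() { static int n = 0; return n; }

// Hypothetical implementation class; the original does not show it.
struct FileImpl {
    explicit FileImpl(const std::string& fn) : name(fn) { ++live_impls(); }
    ~FileImpl() { --live_impls(); }
    std::string name;
};

class File {
public:
    explicit File(const std::string& fn) : fd(new FileImpl(fn)) {}
    ~File() { delete fd; }                   // releases the heap allocation
    File(const File&) = delete;              // prevent double delete via copies
    File& operator=(const File&) = delete;
private:
    FileImpl* fd;
};

int live_impls_during_and_after() {
    int during = 0;
    {
        File foo("foo.dat");     // foo itself is an automatic (stack) object...
        during = live_impls();   // ...but its FileImpl lives on the heap
    }                            // ~File runs here and deletes the FileImpl
    return during * 10 + live_impls();
}
```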
new() shouldn't be used as little as possible. It should be used as carefully as possible. And it should be used as often as necessary as dictated by pragmatism.
Allocation of objects on the stack, relying on their implicit destruction, is a simple model. If the required scope of an object fits that model then there's no need to use new(), with the associated delete() and checking of NULL pointers.
In the case where you have lots of short-lived objects allocation on the stack should reduce the problems of heap fragmentation.
However, if the lifetime of your object needs to extend beyond the current scope then new() is the right answer. Just make sure that you pay attention to when and how you call delete() and the possibilities of NULL pointers, using deleted objects and all of the other gotchas that come with the use of pointers.
When you use new, objects are allocated on the heap. It is generally used when the amount of memory you need is not known in advance and may grow. When you declare an object such as,
Class var;
it is placed on the stack.
You will always have to call delete on an object that you placed on the heap with new. This opens the potential for memory leaks. Objects placed on the stack are not prone to memory leaks!
One notable reason to avoid overusing the heap is for performance -- specifically involving the performance of the default memory management mechanism used by C++. While allocation can be quite quick in the trivial case, doing a lot of new and delete on objects of non-uniform size without strict order leads not only to memory fragmentation, but it also complicates the allocation algorithm and can absolutely destroy performance in certain cases.
That's the problem that memory pools were created to solve: they mitigate the inherent disadvantages of traditional heap implementations while still allowing you to use the heap as necessary.
Better still, though, to avoid the problem altogether. If you can put it on the stack, then do so.
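To make the pool idea concrete, here is a minimal sketch of a fixed-capacity object pool. ObjectPool and its member names are hypothetical, and it assumes T is default-constructible; a production pool would also need to consider thread safety and object reset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity pool: all T objects are created once up front,
// and acquire()/release() only move pointers around.
template <typename T>
class ObjectPool {
public:
    explicit ObjectPool(std::size_t capacity) : storage_(capacity) {
        free_.reserve(capacity);   // free list is sized once and never regrown
        for (T& obj : storage_) free_.push_back(&obj);
    }
    ObjectPool(const ObjectPool&) = delete;             // copies would leave
    ObjectPool& operator=(const ObjectPool&) = delete;  // dangling pointers

    // Returns nullptr when the pool is exhausted instead of allocating.
    T* acquire() {
        if (free_.empty()) return nullptr;
        T* obj = free_.back();
        free_.pop_back();
        return obj;
    }
    // The caller must pass a pointer previously obtained from acquire().
    void release(T* obj) { free_.push_back(obj); }
    std::size_t available() const { return free_.size(); }

private:
    std::vector<T> storage_;   // owns the pooled objects
    std::vector<T*> free_;     // stack of currently unused objects
};
```

After construction, acquire and release perform no heap allocation, because the free list never exceeds the capacity reserved up front.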
I tend to disagree with the idea of using new "too much". Though the original poster's use of new with system classes is a bit ridiculous. (int *i; i = new int[9999];? really? int i[9999]; is much clearer.) I think that is what was getting the commenter's goat.
When you're working with system objects, it's very rare that you'd need more than one reference to the exact same object. As long as the value is the same, that's all that matters. And system objects don't typically take up much space in memory. (one byte per character, in a string). And if they do, the libraries should be designed to take that memory management into account (if they're written well). In these cases, (all but one or two of the news in his code), new is practically pointless and only serves to introduce confusions and potential for bugs.
When you're working with your own classes/objects, however (e.g. the original poster's Line class), then you have to begin thinking about the issues like memory footprint, persistence of data, etc. yourself. At this point, allowing multiple references to the same value is invaluable - it allows for constructs like linked lists, dictionaries, and graphs, where multiple variables need to not only have the same value, but reference the exact same object in memory. However, the Line class doesn't have any of those requirements. So the original poster's code actually has absolutely no needs for new.
I think the poster meant to say "You do not have to allocate everything on the heap rather than on the stack."
Basically, objects are allocated on the stack (if the object size allows, of course) because of the cheap cost of stack-allocation, rather than heap-based allocation which involves quite some work by the allocator, and adds verbosity because then you have to manage data allocated on the heap.
Two reasons:
It's unnecessary in this case. You're making your code needlessly more complicated.
It allocates space on the heap, and it means that you have to remember to delete it later, or it will cause a memory leak.
Many answers have gone into various performance considerations. I want to address the comment which puzzled OP:
Stop thinking like a Java programmer.
Indeed, in Java, as explained in the answer to this question,
You use the new keyword when an object is being explicitly created for the first time.
but in C++, objects of type T are created like so: T{} (or T{ctor_argument1,ctor_arg2} for a constructor with arguments). That's why usually you just have no reason to want to use new.
So, why is it ever used at all? Well, for two reasons:
You need to create many values the number of which is not known at compile time.
Due to limitations of the C++ implementation on common machines - to prevent a stack overflow by allocating too much space creating values the regular way.
Now, beyond what the comment you quoted implied, you should note that even those two cases above are covered well enough without you having to "resort" to using new yourself:
You can use container types from the standard libraries which can hold a runtime-variable number of elements (like std::vector).
You can use smart pointers, which give you a pointer similar to new, but ensure that memory gets released where the "pointer" goes out of scope.
and for this reason, it is an official item in the C++ Core Guidelines to avoid explicit new and delete: Guideline R.11.
The core reason is that objects on the heap are always more difficult to use and manage than simple values. Writing code that is easy to read and maintain is always the first priority of any serious programmer.
Another scenario is when the library we are using provides value semantics and makes dynamic allocation unnecessary. std::string is a good example.
For object-oriented code, however, using a pointer (which means using new to create the object beforehand) is a must. To reduce the complexity of resource management, we have dozens of tools that make it as simple as possible, such as smart pointers. The object-based paradigm and the generic paradigm assume value semantics and require little or no new, just as the other posters stated.
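A small sketch of how the two cases above can be handled without writing new or delete. The function names are hypothetical, and std::make_unique assumes C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Case 1: the number of values is known only at run time -> std::vector.
std::vector<int> make_squares(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);   // one up-front allocation for n elements
    for (std::size_t i = 0; i < n; ++i) v.push_back(static_cast<int>(i * i));
    return v;
}

// Case 2: a buffer too large for the stack -> a smart pointer owns it.
std::unique_ptr<int[]> make_big_buffer(std::size_t n) {
    return std::make_unique<int[]>(n);   // elements are value-initialized to 0
}
```

In both cases the memory is released automatically when the returned object goes out of scope.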
Traditional design patterns, especially those mentioned in GoF book, use new a lot, as they are typical OO code.
new is the new goto.
Recall why goto is so reviled: while it is a powerful, low-level tool for flow control, people often used it in unnecessarily complicated ways that made code difficult to follow. Furthermore, the most useful and easiest-to-read patterns were encoded in structured programming statements (e.g. for or while). The ultimate effect is that code where goto is the appropriate tool is rather rare; if you are tempted to write goto, you're probably doing things badly (unless you really know what you're doing).
new is similar: it is often used to make things unnecessarily complicated and harder to read, and the most useful usage patterns have already been encoded into various classes. Furthermore, if you need a usage pattern for which there isn't already a standard class, you can write your own class that encodes it!
I would even argue that new is worse than goto, due to the need to pair new and delete statements.
Like goto, if you ever think you need to use new, you are probably doing things badly — especially if you are doing so outside of the implementation of a class whose purpose in life is to encapsulate whatever dynamic allocations you need to do.
One more point to add to all the correct answers above: it depends on what sort of programming you are doing. In Windows kernel development, for example, the stack is severely limited and you might not be able to take page faults as you can in user mode.
In such environments, new or C-like API calls are preferred and even required.
Of course, this is merely an exception to the rule.
new allocates objects on the heap. Otherwise, objects are allocated on the stack. Look up the difference between the two.

How to guard against memory leaks?

I was recently interviewing for a C++ position, and I was asked how I guard against creating memory leaks. I know I didn't give a satisfactory answer to that question, so I'm throwing it to you guys. What are the best ways to guard against memory leaks?
Thanks!
What all the answers given so far boil down to is this: avoid having to call delete.
Any time the programmer has to call delete, you have a potential memory leak.
Instead, make the delete call happen automatically. C++ guarantees that local objects have their destructors called when they go out of scope. Use that guarantee to ensure your memory allocations are automatically deleted.
At its most general, this technique means that every memory allocation should be wrapped inside a simple class, whose constructor allocates the necessary memory, and destructor releases it.
Because this is such a commonly used and widely applicable technique, smart pointer classes have been created that reduce the amount of boilerplate code. Rather than allocating memory, their constructors take a pointer to a memory allocation that has already been made and store it. When the smart pointer goes out of scope, it deletes the allocation.
Of course, depending on usage, different semantics may be called for. Do you just need the simple case, where the allocation should last exactly as long as the wrapper class lives? Then use boost::scoped_ptr or, if you can't use boost, std::auto_ptr. Do you have an unknown number of objects referencing the allocation with no knowledge of how long each of them will live? Then the reference-counted boost::shared_ptr is a good solution.
But you don't have to use smart pointers. The standard library containers do the trick too. They internally allocate the memory required to store copies of the objects you put into them, and they release the memory again when they're destroyed. So the user doesn't have to call either new or delete.
There are countless variations of this technique, changing whose responsibility it is to create the initial memory allocation, or when the deallocation should be performed.
But what they all have in common is the answer to your question: the RAII idiom, Resource Acquisition Is Initialization. Memory allocations are a kind of resource. Resources should be acquired when an object is initialized, and released by the object itself when it is destroyed.
Make the C++ scope and lifetime rules do your work for you. Never ever call delete outside of a RAII object, whether it is a container class, a smart pointer or some ad-hoc wrapper for a single allocation. Let the object handle the resource assigned to it.
If all delete calls happen automatically, there's no way you can forget them. And then there's no way you can leak memory.
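As an illustration of such an ad-hoc wrapper for a single allocation, here is a minimal sketch. IntBuffer and sum_first are hypothetical names; in real code std::vector<int> would normally do this job.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical RAII wrapper around one array allocation.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~IntBuffer() { delete[] data_; }   // the only delete in the program
    IntBuffer(const IntBuffer&) = delete;              // no copies, so the
    IntBuffer& operator=(const IntBuffer&) = delete;   // array has one owner
    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
private:
    int* data_;
    std::size_t size_;
};

int sum_first(std::size_t n) {
    IntBuffer buf(n);   // acquisition happens during initialization
    for (std::size_t i = 0; i < n; ++i) buf[i] = static_cast<int>(i);
    int total = 0;
    for (std::size_t i = 0; i < buf.size(); ++i) total += buf[i];
    return total;       // buf's destructor releases the array on every exit path
}
```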
Don't allocate memory on the heap if you don't need to. Most work can be done on the stack, so you should only do heap memory allocations when you absolutely need to.
If you need a heap-allocated object that is owned by a single other object then use std::auto_ptr.
Use standard containers, or containers from Boost instead of inventing your own.
If you have an object that is referred to by several other objects and is owned by no single one in particular then use either std::tr1::shared_ptr or std::tr1::weak_ptr -- whichever suits your use case.
If none of these things match your use case then maybe use delete. If you do end up having to manually manage memory then just use memory leak detection tools to make sure that you aren't leaking anything (and of course, just be careful). You shouldn't ever really get to this point though.
You'd do well to read up on RAII.
Replace new with shared_ptrs. Basically, RAII. Make your code exception safe. Use the STL wherever possible. If you use reference-counting pointers, make sure that they don't form cycles. BOOST_SCOPE_EXIT from Boost is also very useful.
(Easy) Never ever let a raw pointer own an object (search your code for the regexp "\= *new"). Use shared_ptr or scoped_ptr instead, or even better, use real variables instead of pointers as often as you can.
(Hard) Make sure you don't have any circular references, with shared_ptrs pointing to each other, use weak_ptr to break them.
Done!
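A minimal sketch of the "hard" part, assuming a hypothetical Node type with an owning forward link and a non-owning back link (live_nodes and live_after_cycle are helpers invented for the example):

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter of Node objects that are currently alive.
inline int& live_nodes() { static int n = 0; return n; }

struct Node {
    Node() { ++live_nodes(); }
    ~Node() { --live_nodes(); }
    std::shared_ptr<Node> next;   // owning link
    std::weak_ptr<Node> prev;     // non-owning back link breaks the cycle
};

int live_after_cycle() {
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->prev = a;   // had prev been a shared_ptr, a and b would keep
                       // each other alive after leaving this scope
    }
    return live_nodes();   // both nodes have been destroyed
}
```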
Use all kind of smart pointers.
Use a definite strategy for creating and deleting objects, for example: whoever creates an object is responsible for deleting it.
Make sure that you understand exactly how an object will be deleted every time you create one
make sure you understand who owns the pointer every time one is returned to you
make sure your error paths dispose of objects you have created appropriately
be paranoid about the above
In addition to the advice about RAII, remember to make your base class destructor virtual if there are any virtual functions.
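A short sketch of why the virtual destructor matters, using hypothetical Shape and Triangle types: deleting a derived object through a base pointer runs the derived destructor only if the base destructor is virtual (without it, the behavior is undefined).

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter of how many Triangle destructors have run.
inline int& derived_destroyed() { static int n = 0; return n; }

struct Shape {
    virtual ~Shape() {}   // virtual, so deleting via Shape* is well defined
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    ~Triangle() override { ++derived_destroyed(); }
    int sides() const override { return 3; }
};

int destroy_through_base() {
    {
        std::unique_ptr<Shape> s(new Triangle);   // owned through the base type
        assert(s->sides() == 3);
    }   // unique_ptr deletes a Shape*; ~Triangle still runs
    return derived_destroyed();
}
```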
To avoid memory leaks, what you must do is to have a clear and definite notion of who is responsible for deleting any dynamically allocated object.
C++ allows construction of objects on the stack (i.e. as a kind of local variable). This binds creation and destruction to the control flow: an object is created when program execution reaches its declaration, and the object is destroyed when execution leaves the block in which that declaration was made. Whenever your allocation needs match that pattern, use it. This will save you much trouble.
For other usages, if you can define and document a clear notion of responsibility, then this may work fine. For instance, you have a method or a function which returns a pointer to a newly allocated object, and you document that the caller becomes responsible for ultimately deleting that instance. Clear documentation coupled with good programmer discipline (something which is not easily achieved !) can solve many remaining problems of memory management.
In some situations, including undisciplined programmers and complex data structures, you may have to resort to more advanced techniques, such as reference counting. Each object is given a "counter" which is the number of other variables that point to it. Whenever a piece of code decides to no longer point to the object, the counter is decreased. When the counter reaches zero, the object is deleted. Reference counting requires strict counter handling. This can be done with so-called "smart pointers": these are objects which are functionally pointers, but which automatically adjust the counter upon their own creation and destruction.
Reference counting works quite well in many situations, but it cannot handle cyclic structures. So for the most complex situations, you have to resort to the heavy artillery, i.e. a garbage collector. The one I link to is the GC for C and C++ written by Hans Boehm, and it has been used in some rather big projects (e.g. Inkscape). The point of a garbage collector is to maintain a global view of the complete memory space, to know whether a given instance is still in use or not. This is the right tool when local-view tools, such as reference counting, are not enough. One could argue that, at that point, one should ask oneself whether C++ is the right language for the problem at hand. Garbage collection works best when the language is cooperative (this unlocks a host of optimizations which are not doable when the compiler is unaware of what happens with memory, as is the case with a typical C or C++ compiler).
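The counter behavior described above can be observed with std::shared_ptr, which is one such smart pointer. This is only a sketch; count_while_shared is a hypothetical helper, and use_count is intended mainly for debugging and illustration.

```cpp
#include <cassert>
#include <memory>

long count_while_shared() {
    auto first = std::make_shared<int>(42);    // counter is 1
    long inner = 0;
    {
        std::shared_ptr<int> second = first;   // copying raises it to 2
        inner = first.use_count();
    }                                          // second is destroyed: back to 1
    // Encode both observations in one value: 2 inside the block, 1 after it.
    return inner * 10 + first.use_count();
}
```

The int is deleted when first is destroyed and the counter reaches zero.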
Note that none of the techniques described above allows the programmer to stop thinking. Even a GC can suffer from memory leaks, because it uses reachability as an approximation of future usage (there are theoretical reasons which imply that it is not possible, in full generality, to accurately detect all objects which will not be used thereafter). You may still have to set some fields to NULL to inform the GC that you will no longer access an object through a given variable.
I start by reading the following: https://stackoverflow.com/search?q=%5Bc%2B%2B%5D+memory+leak
A very good way is to use smart pointers such as boost::shared_ptr or std::tr1::shared_ptr. The memory will be freed once the last (stack-allocated) smart pointer that refers to it goes out of scope.
You can use a leak-detection utility.
If you work on Linux, use Valgrind (it's free).
Use Deleaker on Windows.
Smart pointers.
Memory management.
Override 'new' and 'delete' or use your own macros/templates.
On x86 you can regularly use Valgrind to check your code