I'm trying to translate some projects I've made with Delphi; an object can be declared in general as:
//I have the control of the object and I MUST delete it when it's not needed anymore
male := THuman.Create();
try
// code
finally
male.Free; (get rid of the object)
end;
Reading Stroustrup's book about C++ I have understood that (in short) his language doesn't need the finally block because there are always workarounds. Now if I want to create a class I have two ways:
THuman male; in which the object is created and then goes out of scope when the block {... code ...} ends
THuman* male = new THuman I can control the object's life and destroy it with a delete
The book suggests to use the first approach (even if both are fine) but I come from a Delphi background and I want to use the second method (I have the control of the object).
Questions. I cannot understand the difference between the 2 methods that C++ have for objects and reading online I got more confusion. Is it correct if I say that method 1 allocates memory on the stack and method 2 on the heap?
In method 2 (we're in the heap) if I assigned the value NULL to the object, do I still have to call the delete?
For example Delphi allows to create instances only on the heap and the Free deletes the object (like delete in C++).
In Short
1- Objects not created with new have automatic lifetime (created in the stack as you say, but that is an implementation technique chosen by most compilers), they are automatically freed once they go out of scope.
2- The lifetime of objects created with new (created in the heap, again as an implementation technique of most compilers), need to be managed by the programmer. Notice that deleting is NOT setting the pointer to NULL, it should happen before. The simple rule is:
Each new must be matched with one an only one delete
each new[] (creation of dynamic arrays) must be matched with one an only one delete[]
p.s: matched here concerns one-vs-one occurrence in the program's runtime, not necessarily in the code itself (you have control over when and where to delete any newed object).
Is it correct if I say that method 1 allocates memory on the stack and method 2 on the heap?
Yes
In method 2 (we're in the heap) if I assigned the value NULL to the object, do I still have to call the delete?
Consider using smartpointers instead of raw pointers. Smartpointers are objects which handle pointer resources and make them more safe to use.
However, if you cannot use smart pointers, make sure to call delete on the pointer to release the previous allocated memory from the heap before you assign NULL to it. Otherwise you will have leaked resources. To check your program memory management you can run valgrind
In method 2 (we're in the heap) if I assigned the value NULL to the object, do I still have to call the delete?
if you do this before calling delete you leaked memory and if you call delete on a NULL pointer you get a seg fault(application crash).
the best way is to use stack objects, but if you need to manage object's existence use smart pointer(unique_ptr, shared_ptr) as suggested above.
Note : by "Leaked memory" I mean that this region is lost it cannot be accessed from program , nor freed by OS.
Related
I have a piece of code that does a lot of memory allocation.
I would like to know if there were a pattern that I can implement to reuse previously deleted memory (because I build a lot of temporary objects that allocate memory like int*, char*, etc.. but it can be very huge).
My purpose is optimization, so I'd like to reuse the memory and not "delete" it even when using temporary object.
It may not be clear enough, let me know so that I can post some code to show you the problem.
Delegate the creation of your temporary objects to a class.
As pointed by Dan, you need to implement a memory manager Or Pool by overloading new and delete operators in that class.
Allocate big chunk of memory when first time new is called and divide that into fixed size blocks. Keep using these blocks for temporary objects.
When delete is called, just update the allocation status of that block.
Delete the big chunk once you are done with using temporary objects.
All I do is make sure the object has a spare pointer in it.
Then instead of "delete" I just keep a linked list of previously used objects, and push it on the front of the list.
Instead of "new" I pop one off.
If the list is empty, that's when I make a truly new one.
What is the difference between
Kwadrat* k1 = new Kwadrat(1,2,3);
k1->field = 0;
Kwadrat k2(1,2,3);
k2.field = 0;
The first one is the pointer to the allocated memory, the second one is the object(where is it, on system stack?) why the second is worse? When we use first, whend the second one?
Dynamic allocation on the heap (uses new):
Kwadrat* k1 = new Kwadrat(1,2,3);
Object creation on the stack (without new):
Kwadrat k2(1,2,3);
Check out this Stack Overflow question for extended discussion about the stack and heap. Brian R. Bondy's answer does a nice job of comparing the two, while Jeff Hill's answer gives you some more nitty gritty details.
For a dangerously small summary:
you must delete objects you create with new, or else your code will suffer from memory leaks
you don't have to worry about manually deleting objects created on the stack because C++ Resource Allocation is Initialization (RAII) takes care of that for you
The stack can overflow if you attempt very large allocations.
As for which one you should use, that depends on the intended scope of the object you are creating
Using new dynamically creates an object on the heap meaning it will persist even if the pointer (k1*) goes out of scope.
This can be handy, but if it goes out of scope and you don't keep a copy of the pointer around it is permanently lost and causes a memory leak. This means you will lose the space used by that resource as long as the program executes. This is the downside of dynamically allocating memory with new, you have to track it and manually free it with the delete operator which takes extra work.
Doing it the other way creates a stack object which will be destroyed once it leaves scope, which is usually preferable.
Often when using dynamic memory creation with new, people work to gain this stack-like functionality by wrapping the dynamically created object in another object which will destroy it when it goes out of scope. This pattern is called Resource Acquisition is Initialization (RAII). This is most useful with reference counting so your objects can still be persisted out of scope, but will be destroyed when nothing refers to them any longer.
new allocates an object on the heap. Your second example allocates memory on the stack. As soon as the function where you allocate k2 returns the memory won't be valid anymore (the destructor of k2 will of course be called first). You need to use new if you want the object to live longer than the function that creates it.
new lets you specify the storage for the allocation you request. allocations via new/new[] are generally on the heap (e.g. wrapping malloc), but ultimately implementation defined. as well, the compiler could have enough information to bypass this in some cases.
where is it, on system stack?
Technically, you would typically write C++ to the specification -- of an abstract machine.
But typically, yes - the object is allocated on the thread's stack.
Why the second is worse? When we use first, when the second one?
The second should be your default because it is very clear, the compiler manages its lifetime and allocation for you, and also very fast. Some exceptions to this include:
when you have a large allocation (stack sizes are relatively small)
sometimes when you want to share the allocation with another thread
when you want to pass the allocation to another thread
when you want the object to live beyond the scope of the method, and you will typically use a smart pointer or other container which will delete the object properly.
in short, there's a lot less that could go wrong (using the second), and far less time is spent creating allocations via the general purpose system allocators.
This is a very newbie question, but something completely new to me. In my code, and everywhere I have seen it before, new objects are created as such...
MyClass x = new MyClass(factory);
However, I just saw some example code that looks like this...
MyClass x(factory);
Does that do the same thing?
Not at all.
The first example uses dynamic memory allocation, i.e., you are allocating an instance of MyClass on the heap (as opposed to the stack). You would need to call delete on that pointer manually or you end up with a memory leak. Also, operator new returns a pointer, not the object itself, so your code would not compile. It needs to change to:
MyClass* x = new MyClass(factory);
The second example allocated an instance of MyClass on the stack. This is very useful for short lived objects as they will automatically be cleaned up when the leave the current scope (and it is fast; cleaning up the stack involves nothing more than incrementing or decrementing a pointer).
This is also how you would implement the Resource Acquisition is Initialization pattern, more commonly referred to as RAII. The destructor for your type would clean up any dynamically allocated memory, so when the stack allocated variable goes out of scope any dynamically allocated memory is cleaned up for you without the need for any outside calls to delete.
No. When you use new, you create objects off the heap that you must then delete later. In addition, you really need MyClass*. The other form creates an object on the stack that will be automatically destroyed at end of scope.
Additional thanks extend to Daniel Newby for answering my memory usage question (and Martin York for explaining it a bit more). It is definitely the answer I was looking for, but more of my other questions were answered by others.
Thanks everyone
for clearing up all of my concerns. Very pleased to see things running how I expect them to run.
I've run into something that I'm not exactly sure about.
In my program, I'm not using malloc() or free(). I'm making instances of my classes with new and I've made sure each one runs it's destructor when it's delete'd, however, there are no free() calls or even setting their pointers (to things inside a global scope, or other classes) to NULL or 0.
What I mean by "I've made sure", is not that I call each destructor. I only use delete to call on the destructor to run, but I have variables that increase by 1 everytime an object is created, and everytime it's destructor is run. This is how I've made sure the amount of objects I created are equal to the amount of destructors called.
Should I be using malloc() and free() anyway? Should I be NULLing pointers to things that I still want to exist?
A second question is why, when I look at my task manager, does my process never "drop" memory? It used to never stop gaining, and then I started deleting everything properly. Or so I thought.
Wouldn't free() or delete make the memory usage go down?
What practices should I pursue about malloc'ing and free'ing memory with linked lists?
There's rarely a reason to use malloc() and free() in a C++ program. Stick with new and delete. Note that unlike languages with garbage collection, setting a pointer to NULL or 0 in C++ has nothing to do with deallocating the memory.
you should be using delete with a new and free with a malloc. delete will call the class' destructor so you don't have to explicitly call it. The purpose of the destructor is to release and resources the class might have and delete will free the memory as well.
The only time you should explicetly use the destructor is when you have initialized your object through placement new. You should put yourself in a position where the compiler generated code releases your resources -- read this article on the C++ idiom : resource acquisition is initialization.
Also setting the pointer of a class to null does nothing, there is no garbage collector in the background cleaning up your memory. If you don't free dynamic memory in C++ it will be "leaked" memory -- i.e., there are no links to the memory and it will never be reclaimed till the process exits.
p.s., once again do not mix the pairs of the memory allocation functions.
edt: don't implement linked lists, use the containers provided by the Standard template library. If you feel you need better performance use the intrusive containers from boost.
You should use new and delete in preference to malloc()/calloc()/realloc() and free().
If you're creating linked lists you should use std::list. Also, look into std::vector
As far as the apparent memory usage of your application: it's quite likely that memory is not returned to the system until the application exits.
new and delete can more or less to be considered the C++ versions of malloc and free. So stick within one pair or another, i.e. if a pointer was created with new, it should be released with delete, which ensures the destructor call you mentioned. The malloc/free pair are not C++ aware, and just allocate and release a block of memory with no attached constructor/destructor semantics.
Yes indeed I consider it good form to set pointers to NULL or zero when they've been freed or deleted. During debugging I also sometimes set them to something characteristic like 0xdeadbeef so they stand out.
It's likely that the OS "memory usage" is reflecting the entire size of your process' heap, rather than your memory manager's idea of how much memory is allocated. When the allocator discovers that it doesn't have enough heap space, it grows the heap, and this will be reflected in the "memory usage" you're looking at. In theory it would be possible to shrink the heap accordingly on memory release, but this doesn't always happen. Thus you may only see the memory usage grow, and never shrink.
"Should I be using malloc() and free() anyway?"
No, in most cases. Stick with one form of memory management only. Trying to use both means that you will inevitably screw up and delete a malloc()ed item, or free() a newed item, giving a subtle bug. Stick with one form and you fix these bugs in advance.
"Wouldn't free() or delete make the memory usage go down?"
The OS allocates memory in pages, often 4 kiB in size. As long as a single byte of the page is still in use, it will not be returned to the OS. You are probably allocating many small items, but not deallocating all of them. (There are ways to help with a custom new/delete, but they are generally not worth the effort.)
In my program, I'm not using malloc()or free(). I'm making instances of my classes with new and I've made sure each one runs it's destructor when it's delete'd,
That is scary. You should not need to make anything run its destructor. It is allocated (new) it is destroyed (delete) the constructor is run atomatically on new and the destructor is run automatically on delete.
however, there are no free() calls or even setting their pointers (to things inside a global scope, or other classes) to NULL or 0.
There is no need to use malloc/free in C++ code (there are a few situations where you are using C libs that require malloced memory but they are explicitly documented and few).
Technically there is no need to set a pointer to NULL after you delete it.
It is good technique for a variable to go out of scope just after it is deleted so it can not accidently be re-used. If for some reason the pointer variable lives (ie does not go out of scope) for a long time after you call delete then it is usefull to set it to NULL so that it is no accidently re-used.
Should I be using malloc() and free() anyway? Should I be NULLing pointers to things that I still want to exist?
No and No.
Note: C++ unlike Java does not keep track of how many pointers point at an object.
If you have more than one pointer pointing at an object you need to use smart pointers (you should be using smart pointers anyway).
A second question is why, when I look at my task manager, does my process never "drop" memory? It used to never stop gaining, and then I started deleting everything properly. Or so I thought.
The application never releases back to the OS (on most OS's in normal situations).
So the memory will never go down (until the application exits).
Internally the memory management tracks all the frees so that the memory can be re-used.
But if it runs out it will ask the OS for more and thus in the task manager the memory allocation will go up (this will not be returned to the OS).
Wouldn't free() or delete make the memory usage go down?
No.
What practices should I pursue about malloc'ing and free'ing memory with linked lists?
You should use Smart Pointers so you don't need to worry about when to delete an object.
They also make your code exception safe.
But if you using pointers. call delete to remove an element then set is value to NULL (the pointer scope is alive long after the delete is called).
Note: A boost:shared_pointer is very similar to a Java pointer. It tracks the number of pointers and deletes the object when the last reference to the object is destroyed. No need for you to do any deleting (just like Java) and all you actually need to do is call new (just like Java).
I've been using C++ for a short while, and I've been wondering about the new keyword. Simply, should I be using it, or not?
With the new keyword...
MyClass* myClass = new MyClass();
myClass->MyField = "Hello world!";
Without the new keyword...
MyClass myClass;
myClass.MyField = "Hello world!";
From an implementation perspective, they don't seem that different (but I'm sure they are)... However, my primary language is C#, and of course the 1st method is what I'm used to.
The difficulty seems to be that method 1 is harder to use with the std C++ classes.
Which method should I use?
Update 1:
I recently used the new keyword for heap memory (or free store) for a large array which was going out of scope (i.e. being returned from a function). Where before I was using the stack, which caused half of the elements to be corrupt outside of scope, switching to heap usage ensured that the elements were intact. Yay!
Update 2:
A friend of mine recently told me there's a simple rule for using the new keyword; every time you type new, type delete.
Foobar *foobar = new Foobar();
delete foobar; // TODO: Move this to the right place.
This helps to prevent memory leaks, as you always have to put the delete somewhere (i.e. when you cut and paste it to either a destructor or otherwise).
Method 1 (using new)
Allocates memory for the object on the free store (This is frequently the same thing as the heap)
Requires you to explicitly delete your object later. (If you don't delete it, you could create a memory leak)
Memory stays allocated until you delete it. (i.e. you could return an object that you created using new)
The example in the question will leak memory unless the pointer is deleted; and it should always be deleted, regardless of which control path is taken, or if exceptions are thrown.
Method 2 (not using new)
Allocates memory for the object on the stack (where all local variables go) There is generally less memory available for the stack; if you allocate too many objects, you risk stack overflow.
You won't need to delete it later.
Memory is no longer allocated when it goes out of scope. (i.e. you shouldn't return a pointer to an object on the stack)
As far as which one to use; you choose the method that works best for you, given the above constraints.
Some easy cases:
If you don't want to worry about calling delete, (and the potential to cause memory leaks) you shouldn't use new.
If you'd like to return a pointer to your object from a function, you must use new
There is an important difference between the two.
Everything not allocated with new behaves much like value types in C# (and people often say that those objects are allocated on the stack, which is probably the most common/obvious case, but not always true). More precisely, objects allocated without using new have automatic storage duration
Everything allocated with new is allocated on the heap, and a pointer to it is returned, exactly like reference types in C#.
Anything allocated on the stack has to have a constant size, determined at compile-time (the compiler has to set the stack pointer correctly, or if the object is a member of another class, it has to adjust the size of that other class). That's why arrays in C# are reference types. They have to be, because with reference types, we can decide at runtime how much memory to ask for. And the same applies here. Only arrays with constant size (a size that can be determined at compile-time) can be allocated with automatic storage duration (on the stack). Dynamically sized arrays have to be allocated on the heap, by calling new.
(And that's where any similarity to C# stops)
Now, anything allocated on the stack has "automatic" storage duration (you can actually declare a variable as auto, but this is the default if no other storage type is specified so the keyword isn't really used in practice, but this is where it comes from)
Automatic storage duration means exactly what it sounds like, the duration of the variable is handled automatically. By contrast, anything allocated on the heap has to be manually deleted by you.
Here's an example:
void foo() {
bar b;
bar* b2 = new bar();
}
This function creates three values worth considering:
On line 1, it declares a variable b of type bar on the stack (automatic duration).
On line 2, it declares a bar pointer b2 on the stack (automatic duration), and calls new, allocating a bar object on the heap. (dynamic duration)
When the function returns, the following will happen:
First, b2 goes out of scope (order of destruction is always opposite of order of construction). But b2 is just a pointer, so nothing happens, the memory it occupies is simply freed. And importantly, the memory it points to (the bar instance on the heap) is NOT touched. Only the pointer is freed, because only the pointer had automatic duration.
Second, b goes out of scope, so since it has automatic duration, its destructor is called, and the memory is freed.
And the barinstance on the heap? It's probably still there. No one bothered to delete it, so we've leaked memory.
From this example, we can see that anything with automatic duration is guaranteed to have its destructor called when it goes out of scope. That's useful. But anything allocated on the heap lasts as long as we need it to, and can be dynamically sized, as in the case of arrays. That is also useful. We can use that to manage our memory allocations. What if the Foo class allocated some memory on the heap in its constructor, and deleted that memory in its destructor. Then we could get the best of both worlds, safe memory allocations that are guaranteed to be freed again, but without the limitations of forcing everything to be on the stack.
And that is pretty much exactly how most C++ code works.
Look at the standard library's std::vector for example. That is typically allocated on the stack, but can be dynamically sized and resized. And it does this by internally allocating memory on the heap as necessary. The user of the class never sees this, so there's no chance of leaking memory, or forgetting to clean up what you allocated.
This principle is called RAII (Resource Acquisition is Initialization), and it can be extended to any resource that must be acquired and released. (network sockets, files, database connections, synchronization locks). All of them can be acquired in the constructor, and released in the destructor, so you're guaranteed that all resources you acquire will get freed again.
As a general rule, never use new/delete directly from your high level code. Always wrap it in a class that can manage the memory for you, and which will ensure it gets freed again. (Yes, there may be exceptions to this rule. In particular, smart pointers require you to call new directly, and pass the pointer to its constructor, which then takes over and ensures delete is called correctly. But this is still a very important rule of thumb)
The short answer is: if you're a beginner in C++, you should never be using new or delete yourself.
Instead, you should use smart pointers such as std::unique_ptr and std::make_unique (or less often, std::shared_ptr and std::make_shared). That way, you don't have to worry nearly as much about memory leaks. And even if you're more advanced, best practice would usually be to encapsulate the custom way you're using new and delete into a small class (such as a custom smart pointer) that is dedicated just to object lifecycle issues.
Of course, behind the scenes, these smart pointers are still performing dynamic allocation and deallocation, so code using them would still have the associated runtime overhead. Other answers here have covered these issues, and how to make design decisions on when to use smart pointers versus just creating objects on the stack or incorporating them as direct members of an object, well enough that I won't repeat them. But my executive summary would be: don't use smart pointers or dynamic allocation until something forces you to.
Which method should I use?
This is almost never determined by your typing preferences but by the context. If you need to keep the object across a few stacks or if it's too heavy for the stack you allocate it on the free store. Also, since you are allocating an object, you are also responsible for releasing the memory. Lookup the delete operator.
To ease the burden of using free-store management people have invented stuff like auto_ptr and unique_ptr. I strongly recommend you take a look at these. They might even be of help to your typing issues ;-)
If you are writing in C++ you are probably writing for performance. Using new and the free store is much slower than using the stack (especially when using threads) so only use it when you need it.
As others have said, you need new when your object needs to live outside the function or object scope, the object is really large or when you don't know the size of an array at compile time.
Also, try to avoid ever using delete. Wrap your new into a smart pointer instead. Let the smart pointer call delete for you.
There are some cases where a smart pointer isn't smart. Never store std::auto_ptr<> inside a STL container. It will delete the pointer too soon because of copy operations inside the container. Another case is when you have a really large STL container of pointers to objects. boost::shared_ptr<> will have a ton of speed overhead as it bumps the reference counts up and down. The better way to go in that case is to put the STL container into another object and give that object a destructor that will call delete on every pointer in the container.
Without the new keyword you're storing that on call stack. Storing excessively large variables on stack will lead to stack overflow.
If your variable is used only within the context of a single function, you're better off using a stack variable, i.e., Option 2. As others have said, you do not have to manage the lifetime of stack variables - they are constructed and destructed automatically. Also, allocating/deallocating a variable on the heap is slow by comparison. If your function is called often enough, you'll see a tremendous performance improvement if use stack variables versus heap variables.
That said, there are a couple of obvious instances where stack variables are insufficient.
If the stack variable has a large memory footprint, then you run the risk of overflowing the stack. By default, the stack size of each thread is 1 MB on Windows. It is unlikely that you'll create a stack variable that is 1 MB in size, but you have to keep in mind that stack utilization is cumulative. If your function calls a function which calls another function which calls another function which..., the stack variables in all of these functions take up space on the same stack. Recursive functions can run into this problem quickly, depending on how deep the recursion is. If this is a problem, you can increase the size of the stack (not recommended) or allocate the variable on the heap using the new operator (recommended).
The other, more likely condition is that your variable needs to "live" beyond the scope of your function. In this case, you'd allocate the variable on the heap so that it can be reached outside the scope of any given function.
The simple answer is yes - new() creates an object on the heap (with the unfortunate side effect that you have to manage its lifetime (by explicitly calling delete on it), whereas the second form creates an object in the stack in the current scope and that object will be destroyed when it goes out of scope.
Are you passing myClass out of a function, or expecting it to exist outside that function? As some others said, it is all about scope when you aren't allocating on the heap. When you leave the function, it goes away (eventually). One of the classic mistakes made by beginners is the attempt to create a local object of some class in a function and return it without allocating it on the heap. I can remember debugging this kind of thing back in my earlier days doing c++.
C++ Core Guidelines R.11: Avoid using new and delete explicitly.
Things have changed significantly since most answers to this question were written. Specifically, C++ has evolved as a language, and the standard library is now richer. Why does this matter? Because of a combination of two factors:
Using new and delete is potentially dangerous: Memory might leak if you don't keep a very strong discipline of delete'ing everything you've allocated when it's no longer used; and never deleteing what's not currently allocated.
The standard library now offers smart pointers which encapsulate the new and delete calls, so that you don't have to take care of managing allocations on the free store/heap yourself. So do other containers, in the standard library and elsewhere.
This has evolved into one of the C++ community's "core guidelines" for writing better C++ code, as the linked document shows. Of course, there exceptions to this rule: Somebody needs to write those encapsulating classes which do use new and delete; but that someone is rarely yourself.
Adding to #DanielSchepler's valid answer:
The second method creates the instance on the stack, along with such things as something declared int and the list of parameters that are passed into the function.
The first method makes room for a pointer on the stack, which you've set to the location in memory where a new MyClass has been allocated on the heap - or free store.
The first method also requires that you delete what you create with new, whereas in the second method, the class is automatically destructed and freed when it falls out of scope (the next closing brace, usually).
The short answer is yes the "new" keyword is incredibly important as when you use it the object data is stored on the heap as opposed to the stack, which is most important!