Memory allocation and constructor - c++

I am sorry if it has been asked before explicitly stated in the standard, but I fail to find whether the memory for objects with automatic storage is allocated in the beginning of enclosing block or immediately before executing the constructor?
I am asking this because https://en.cppreference.com/w/cpp/language/storage_duration says that.
Storage duration
All objects in a program have one of the following storage durations:
automatic storage duration. The storage for the object is allocated at the beginning of the enclosing code block and deallocated at the end. All local objects have this storage duration, except those declared static, extern or thread_local.
Now, does it mean that the storage space is allocated even where constructor is not invoked for some reason?
For example, I have something like that.
{
if(somecondition1) throw something;
MyHugeObject o{};
/// do something
}
So, there a chance that MyHugeObject does not need to be constructed, yet according to the source I've cited, the memory for it is still allocated, despite the fact that the object might never get constructed. Is it the case or it is something implementation based?

First of all, from a language standard perspective, you cannot access the object's storage outside of the lifetime of the object. Before the object is created, you do not know where the object is located, and after it has been destructed, accessing the storage yields undefined behavior. In short: A conforming C++ program cannot observe the difference of when the storage is allocated.
Automatic storage typically means "on the call-stack". I.e. allocation happens by decrementing the stack pointer, and deallocation happens be re-incrementing it. A compiler could emit code that does the stack pointer adjustments exactly where the lifetime of the object starts/ends, but this is inefficient: It would clutter the generated code with two extra instructions for each object that is used. This is especially a problem with objects that are created in a loop: The stack pointer would jump back and forth between two or more positions constantly.
To improve efficiency, compilers huddle all possible object allocations together into a single stack frame allocation: The compiler assigns an offset to each variable within the function, determines the max. size that is required to store all the variables that are present within the function, and allocates all the memory with a single stack pointer decrement instruction at the start of the function execution. Cleanup is then the respective stack pointer increment. This removes any allocation/deallocation overhead from loops as the variables in the next iteration will simply reuse the same spot within the stack frame as the previous iteration used. This is an important optimization, for many loops declare at least one variable.
The C++ standard does not care. Since use of the storage outside of an object's lifetime is UB, the compiler is free to do with the storage whatever it pleases to do. Programmers should not care as well, but they do tend to care about their programs execution times. And that's what most compilers optimize for by using stack frame allocation.

The moment at which the memory is reclaimed from the system is implementation dependant. The only thing that is mandated by the standard is the moment when the constructor is called and when the object can safely be used.
Common implementations use a stack for automatic storage duration objects, and most of the time allocate a whole frame at the beginning of a bloc and pop it at the end of the bloc. Even if stack operations are fast, it is simpler to limit their number, and the simpler is the more robust.
But anyway, even using a stack for automatic storage duration is not mandated by the standard, not speaking of the moment when frames are allocated on and popped from that stack.

The C++ standard has the following to say about it in [basic.stc] :
2 Static, thread, and automatic storage durations are associated with objects introduced by declarations (6.1) and implicitly created by the implementation (6.6.7).
This 6.6.7 reference refers to [class.temporary], which is about temporaries. Temporaries aren't quite the same concept, but that section has this to say :
2 The materialization of a temporary object is generally delayed as long as possible in order to avoid creating unnecessary temporary objects.
I haven't found anything else that would address your question, so the standard appears to give the implementation some leeway as to when storage is allocated for the object.
Note this does not apply to when the object is initialized - that happens when the declaration statement is executed, as per [stmt.dcl] :
2 Variables with automatic storage duration (6.6.5.3) are initialized each time their declaration-statement is executed. Variables with automatic storage duration declared in the block are destroyed on exit from the block (8.6).
The cppreference link you mentioned likely discusses a typical implementation, where objects with automatic storage duration are allocated on the stack. In such implementations, it makes sense to allocate storage at the start of an enclosing block (it's just a simple (in/de)crement of the stack pointer after all, and grouping them is beneficial).
If you want to avoid allocating storage for a huge object when not needed, restructuring the code is an option. On some implementations, introducing an additional block scope will achieve that :
{
if(somecondition1) throw something;
{
MyHugeObject o{};
/// do something
}
}
On other implementations, other approaches might be needed. #DanielLangr's comment below indicates implementations where the allocation happens at the start of the enclosing function, rather than at the start of the block.

Related

C++ Constructors Runtime/Compile Time

As I know we can create objects in runtime or in compile-time. For example
SomeType object1;
SomeType *object2 = new SomeType;
So I think that in the code here;
int main(){
cout << "lalalal";
SomeType object1;
}
A constructor should be called for object1 and then lalalal should appear at screen. Because compiler is allocating the memory before the program starts. So at what point I'm wrong?
As I know we can create objects in runtime or in compile-time.
Not really. In your code example, the first object is created with automatic storage duration (often described as "on the stack"), and the second with allocated dynamic storage duration (often described as "on the heap"). But these both happen at runtime.
A constructor should be called for object1 and then lalalal should appear at screen.
Statements in functions are executed from top-to-bottom (not including loops, obviously). So the object is created second.*
Because compiler is allocating the memory before the program starts.
Yes, it's possible that the memory is allocated ahead of time. But as far as observable effects are concerned, that's irrelevant.
* However, as you haven't included a newline character in your string, what you may be seeing is the effect of line-buffering; on many systems, output isn't displayed until newline characters are received, or until the program terminates.
First, there are two separate concepts in C++: storage duration,
and object lifetime. And while the storage duration cannot be
shorter than the object lifetime, the reverse is not necessarily
true. And second, both are runtime concepts, not compile time.
In this case, however, there is no real difference. Both the
storage duration and the lifetime of the object object1 start
when the definition is executed, and end when it goes out of
scope. Most compiler will, in fact, allocate all of the memory
for local variables at the top of the function, but only because
there is no way a conforming program can tell that it wasn't
allocated at the definition. Anything which affects the
observable behavior of the program, however, must occur when the
standard says it should occur.
No, object1 in your example is not 'created' at compile time, it is created in runtime just like the other object. Moreover, object1 is 'constructed' after the cout command is executed, and therefore the constructor of it is executed afterwards. The memory for it might have been allocated before that though.

Stack-allocated objects still taking memory after going out of scope?

People always talk about how objects created without the new keyword are destroyed when they go out of scope, but when I think about this, it seems like that's wrong. Perhaps the destructor is called when the variable goes out of scope, but how do we know that it is no longer taking up space in the stack? For example, consider the following:
void DoSomething()
{
{
My_Object obj;
obj.DoSomethingElse();
}
AnotherFuncCall();
}
Is it guaranteed that obj will not be saved on the stack when AnotherFuncCall is executed? Because people are always saying it, there must be some truth to what they say, so I assume that the destructor must be called when obj goes out of scope, before AnotherFuncCall. Is that a fair assumption?
You are confusing two different concepts.
Yes, your object's destructor will be called when it leaves its enclosing scope. This is guaranteed by the standard.
No, there is no guarantee that an implementation of the language uses a stack to implement automatic storage (i.e., what you refer to as "stack allocated objects".)
Since most compilers use a fixed size stack I'm not even sure what your question is. It is typically implemented as a fixed size memory region where a pointer move is all that is required to "clean up" the stack as that memory will be used again soon enough.
So, since the memory region used to implement a stack is fixed in size there is no need to set the memory your object took to 0 or something else. It can live there until it is needed again, no harm done.
I believe it depends where in the stack the object was created. If it was on the bottom (assuming stack grows down) then I think the second function may overwrite the destroyed objects space. If the object was inside the stack, then probably that space is wasted, since all further objects would have to be shifted.
Your stack is not dynamically allocated and deallocated, it's just there. Your objects constructors and destructors will get called but you don't get the memory back.
Because people are always saying it, there must be some truth to what they say, so I assume that the destructor must be called when obj goes out of scope, before AnotherFuncCall. Is that a fair assumption?
This is correct. Note that this final question says nothing about a stack". Whether an implementation uses a stack, or something else, is up to the implementation.
Objects created "on the stack" in local scope have what is called automatic storage duration. The Standard says:
C++03 3.7.2 Automatic storage duration
1/ Local objects explicitly declared auto or register or not
explicitly declared static or extern have automatic storage duration.
The storage for these objects lasts until the block in which they are
created exits.
2/ [Note: these objects are initialized and destroyed as described in
6.7. ]
On the destruction of these objects:
6.7 Declaration statement
2/ Variables with automatic storage duration (3.7.2) are initialized
each time their declaration-statement is executed. Variables with
automatic storage duration declared in the block are destroyed on exit
from the block (6.6).
Hence, according to the Standard, when object with local scope fall out of scope, the destructor is called and the storage is released.
Weather or not that storage is on a stack the Standard doesn't say. It just says the storage is released, wherever it might be.
Some architectures don't have stacks in the same sense a PC has. C++ is meant to work on any kind of programmable device. That's why it never mentions anything about stacks, heaps, etc.
On a typical PC-type platform running Windows and user-mode code, these automatic variables are stored on a stack. These stacks are fixed-size, and are created when the thread starts. As they become instantiated, they take up more of the space on the stack, and the stack pointer moves. If you allocate enough of these variables, you will overflow the stack and your program will die an ugly death.
Try running this on a Windows PC and see what happens for an example:
int main()
{
int boom[10000000];
for( int* it = &boom[0]; it != &boom[sizeof(boom)/sizeof(boom[0])]; ++it )
*it = 42;
}
What people say is indeed true. The object still remains in the memory location. However, the way stack works means that the object does not take any memory space from stack.
What usually happens when memory is allocated on the stack is that the stack pointer is decremented by sizeof(type) and when the variable goes out of scope and the object is freed, the stack pointer is incremented, thus shrinking the effective size of data allocated on the stack. Indeed, the data still resides in it's original address, it is not destroyed or deleted at all.
And just to clarify, the C++ standard says absolutely nothing about this! The C++ standard is completely unaware of anything called stack or heap in sense of memory allocation because they are platform specific implementation details.
Your local variables on stack do not take extra memory. The system provides some memory from each thread's stack, and the variables on the stack just use part of it. After running out of the scope, the compiler can reuse the same part of the stack for other variables (used later in the same function).
how do we know that it is no longer taking up space in the stack?
We don't. There are way to see whether they do or don't, but those are architecture and ABI specific. Generally, functions do pop whatever they pushed to the stack when they return control to the caller. What C/C++ guarantees is that it will call a destructor of high-level objects when they leave the scope (though some older C++ like MSVC 6 had terrible bugs at a time when they did not).
Is it guaranteed that obj will not be saved on the stack when AnotherFuncCall is executed?
No. It is up to the compiler to decide when and how to push and pop stack frames as long as that way complies with ABI requirements.
The question "Is something taking up space in the stack" is a bit of a loaded question, because in reality, there is no such thing as free space (at a hardware level.) A lot of people (myself included, at one point) thought that space on a computer is freed by actually clearing it, i.e. changing the data to zeroes. However, this is actually not the case, as doing so would be a lot of extra work. It takes less time to do nothing to memory than it does to clear it. So if you don't need to clear it, don't! This is true for the stack as well as files you delete from your computer. (Ever noticed that "emptying the recycle bin" takes less time than copying those same files to another folder? That's why - they're not actually deleted, the computer just forgets where they're stored.)
Generally speaking, most hardware stacks are implemented with a stack pointer, which tells the CPU where the next empty slot in the stack is. (Or the most recent item pushed on the stack, again, this depends on the CPU architecture.)
When you enter a function, the assembly code subtracts from the stack pointer to create enough room for your local variables, etc. Once the function ends, and you exit scope, the stack pointer is increased by the same amount it was originally decreased, before returning. This increasing of the stack pointer is what is meant by "the local variables on the stack have been freed." It's less that they've been freed and more like "the CPU is now willing to overwrite them with whatever it wants to without a second thought."
Now you may be asking, if our local variables from a previous scope still exist, why can't we use them? Reason being, there's no guarantee they'll still be there from the time you exit scope and the time you try to read them again.

Are objects stored on the stack?

Since objects are constructed via a hidden function, as opposed to primitive types, it makes perfect sense scoping variables for performance in C++, whereas in C99 it doesn't.
My question is: are the objects stored on the stack anyway?
In standard C++ there is no such thing as a stack. The standard only differentiates between the different lifetimes of objects. In that case a variable declared as T t; is said to have automatic storage duration, which means it life-time ends with the end of it's surrounding scope. Most (all?) compilers implement this through a stack. It is a reasonable assumption that all objects created that way actually live on the stack.
Automatically allocated [local] objects are located on automatic memory area ["stack"] while dynamically allocated objects are located in dynamic memory area ["heap"].
As a rule of thumb: in C++, everyting that is not using new or malloc is automatically allocated.
EDIT: Note that I use "stack" and "heap" with double quotes since the standard [AFAIK] does not specify how the data is managed in these areas, but [again AFAIK], compilers indeed tend to use stack for automatic area and heap for dynamic area.
No idea what you mean in your first sentence, but: yes, objects in local variables are generally stored on the stack.

Stack, Static, and Heap in C++

I've searched, but I've not understood very well these three concepts. When do I have to use dynamic allocation (in the heap) and what's its real advantage? What are the problems of static and stack? Could I write an entire application without allocating variables in the heap?
I heard that others languages incorporate a "garbage collector" so you don't have to worry about memory. What does the garbage collector do?
What could you do manipulating the memory by yourself that you couldn't do using this garbage collector?
Once someone said to me that with this declaration:
int * asafe=new int;
I have a "pointer to a pointer". What does it mean? It is different of:
asafe=new int;
?
A similar question was asked, but it didn't ask about statics.
Summary of what static, heap, and stack memory are:
A static variable is basically a global variable, even if you cannot access it globally. Usually there is an address for it that is in the executable itself. There is only one copy for the entire program. No matter how many times you go into a function call (or class) (and in how many threads!) the variable is referring to the same memory location.
The heap is a bunch of memory that can be used dynamically. If you want 4kb for an object then the dynamic allocator will look through its list of free space in the heap, pick out a 4kb chunk, and give it to you. Generally, the dynamic memory allocator (malloc, new, et c.) starts at the end of memory and works backwards.
Explaining how a stack grows and shrinks is a bit outside the scope of this answer, but suffice to say you always add and remove from the end only. Stacks usually start high and grow down to lower addresses. You run out of memory when the stack meets the dynamic allocator somewhere in the middle (but refer to physical versus virtual memory and fragmentation). Multiple threads will require multiple stacks (the process generally reserves a minimum size for the stack).
When you would want to use each one:
Statics/globals are useful for memory that you know you will always need and you know that you don't ever want to deallocate. (By the way, embedded environments may be thought of as having only static memory... the stack and heap are part of a known address space shared by a third memory type: the program code. Programs will often do dynamic allocation out of their static memory when they need things like linked lists. But regardless, the static memory itself (the buffer) is not itself "allocated", but rather other objects are allocated out of the memory held by the buffer for this purpose. You can do this in non-embedded as well, and console games will frequently eschew the built in dynamic memory mechanisms in favor of tightly controlling the allocation process by using buffers of preset sizes for all allocations.)
Stack variables are useful for when you know that as long as the function is in scope (on the stack somewhere), you will want the variables to remain. Stacks are nice for variables that you need for the code where they are located, but which isn't needed outside that code. They are also really nice for when you are accessing a resource, like a file, and want the resource to automatically go away when you leave that code.
Heap allocations (dynamically allocated memory) is useful when you want to be more flexible than the above. Frequently, a function gets called to respond to an event (the user clicks the "create box" button). The proper response may require allocating a new object (a new Box object) that should stick around long after the function is exited, so it can't be on the stack. But you don't know how many boxes you would want at the start of the program, so it can't be a static.
Garbage Collection
I've heard a lot lately about how great Garbage Collectors are, so maybe a bit of a dissenting voice would be helpful.
Garbage Collection is a wonderful mechanism for when performance is not a huge issue. I hear GCs are getting better and more sophisticated, but the fact is, you may be forced to accept a performance penalty (depending upon use case). And if you're lazy, it still may not work properly. At the best of times, Garbage Collectors realize that your memory goes away when it realizes that there are no more references to it (see reference counting). But, if you have an object that refers to itself (possibly by referring to another object which refers back), then reference counting alone will not indicate that the memory can be deleted. In this case, the GC needs to look at the entire reference soup and figure out if there are any islands that are only referred to by themselves. Offhand, I'd guess that to be an O(n^2) operation, but whatever it is, it can get bad if you are at all concerned with performance. (Edit: Martin B points out that it is O(n) for reasonably efficient algorithms. That is still O(n) too much if you are concerned with performance and can deallocate in constant time without garbage collection.)
Personally, when I hear people say that C++ doesn't have garbage collection, my mind tags that as a feature of C++, but I'm probably in the minority. Probably the hardest thing for people to learn about programming in C and C++ are pointers and how to correctly handle their dynamic memory allocations. Some other languages, like Python, would be horrible without GC, so I think it comes down to what you want out of a language. If you want dependable performance, then C++ without garbage collection is the only thing this side of Fortran that I can think of. If you want ease of use and training wheels (to save you from crashing without requiring that you learn "proper" memory management), pick something with a GC. Even if you know how to manage memory well, it will save you time which you can spend optimizing other code. There really isn't much of a performance penalty anymore, but if you really need dependable performance (and the ability to know exactly what is going on, when, under the covers) then I'd stick with C++. There is a reason that every major game engine that I've ever heard of is in C++ (if not C or assembly). Python, et al are fine for scripting, but not the main game engine.
The following is of course all not quite precise. Take it with a grain of salt when you read it :)
Well, the three things you refer to are automatic, static and dynamic storage duration, which has something to do with how long objects live and when they begin life.
Automatic storage duration
You use automatic storage duration for short lived and small data, that is needed only locally within some block:
if(some condition) {
int a[3]; // array a has automatic storage duration
fill_it(a);
print_it(a);
}
The lifetime ends as soon as we exit the block, and it starts as soon as the object is defined. They are the most simple kind of storage duration, and are way faster than in particular dynamic storage duration.
Static storage duration
You use static storage duration for free variables, which might be accessed by any code all times, if their scope allows such usage (namespace scope), and for local variables that need extend their lifetime across exit of their scope (local scope), and for member variables that need to be shared by all objects of their class (classs scope). Their lifetime depends on the scope they are in. They can have namespace scope and local scope and class scope. What is true about both of them is, once their life begins, lifetime ends at the end of the program. Here are two examples:
// static storage duration. in global namespace scope
string globalA;
int main() {
foo();
foo();
}
void foo() {
// static storage duration. in local scope
static string localA;
localA += "ab"
cout << localA;
}
The program prints ababab, because localA is not destroyed upon exit of its block. You can say that objects that have local scope begin lifetime when control reaches their definition. For localA, it happens when the function's body is entered. For objects in namespace scope, lifetime begins at program startup. The same is true for static objects of class scope:
class A {
static string classScopeA;
};
string A::classScopeA;
A a, b; &a.classScopeA == &b.classScopeA == &A::classScopeA;
As you see, classScopeA is not bound to particular objects of its class, but to the class itself. The address of all three names above is the same, and all denote the same object. There are special rule about when and how static objects are initialized, but let's not concern about that now. That's meant by the term static initialization order fiasco.
Dynamic storage duration
The last storage duration is dynamic. You use it if you want to have objects live on another isle, and you want to put pointers around that reference them. You also use them if your objects are big, and if you want to create arrays of size only known at runtime. Because of this flexibility, objects having dynamic storage duration are complicated and slow to manage. Objects having that dynamic duration begin lifetime when an appropriate new operator invocation happens:
int main() {
// the object that s points to has dynamic storage
// duration
string *s = new string;
// pass a pointer pointing to the object around.
// the object itself isn't touched
foo(s);
delete s;
}
void foo(string *s) {
cout << s->size();
}
Its lifetime ends only when you call delete for them. If you forget that, those objects never end lifetime. And class objects that define a user declared constructor won't have their destructors called. Objects having dynamic storage duration requires manual handling of their lifetime and associated memory resource. Libraries exist to ease use of them. Explicit garbage collection for particular objects can be established by using a smart pointer:
int main() {
shared_ptr<string> s(new string);
foo(s);
}
void foo(shared_ptr<string> s) {
cout << s->size();
}
You don't have to care about calling delete: The shared ptr does it for you, if the last pointer that references the object goes out of scope. The shared ptr itself has automatic storage duration. So its lifetime is automatically managed, allowing it to check whether it should delete the pointed to dynamic object in its destructor. For shared_ptr reference, see boost documents: http://www.boost.org/doc/libs/1_37_0/libs/smart_ptr/shared_ptr.htm
It's been said elaborately, just as "the short answer":
static variable (class)
lifetime = program runtime (1)
visibility = determined by access modifiers (private/protected/public)
static variable (global scope)
lifetime = program runtime (1)
visibility = the compilation unit it is instantiated in (2)
heap variable
lifetime = defined by you (new to delete)
visibility = defined by you (whatever you assign the pointer to)
stack variable
visibility = from declaration until scope is exited
lifetime = from declaration until declaring scope is exited
(1) more exactly: from initialization until deinitialization of the compilation unit (i.e. C / C++ file). Order of initialization of compilation units is not defined by the standard.
(2) Beware: if you instantiate a static variable in a header, each compilation unit gets its own copy.
The main difference is speed and size.
Stack
Dramatically faster to allocate. It is done in O(1), since it is allocated when setting up the stack frame, so it is essentially free. The drawback is that if you run out of stack space you are in deep trouble. You can adjust the stack size, but, IIRC, you have ~2MB to play with. Also, as soon as you exit the function everything on the stack is cleared. So, it can be problematic to refer to it later. (Pointers to stack allocated objects lead to bugs.)
Heap
Dramatically slower to allocate. But, you have GB to play with, and point to.
Garbage Collector
The garbage collector is some code that runs in the background and frees memory. When you allocate memory on the heap it is very easy to forget to free it, which is known as a memory leak. Over time, the memory your application consumes grows and grows until it crashes. Having a garbage collector periodically free the memory you no longer need helps eliminate this class of bugs. Of course, this comes at a price, as the garbage collector slows things down.
What are the problems of static and stack?
The problem with "static" allocation is that the allocation is made at compile-time: you can't use it to allocate some variable number of data, the number of which isn't known until run-time.
The problem with allocating on the "stack" is that the allocation is destroyed as soon as the subroutine which does the allocation returns.
I could write an entire application without allocate variables in the heap?
Perhaps but not a non-trivial, normal, big application (but so-called "embedded" programs might be written without the heap, using a subset of C++).
What garbage collector does ?
It keeps watching your data ("mark and sweep") to detect when your application is no longer referencing it. This is convenient for the application, because the application doesn't need to deallocate the data ... but the garbage collector might be computationally expensive.
Garbage collectors aren't a usual feature of C++ programming.
What could you do manipulating the memory by yourself that you couldn't do using this garbage collector?
Learn the C++ mechanisms for deterministic memory deallocation:
'static': never deallocated
'stack': as soon as the variable "goes out of scope"
'heap': when the pointer is deleted (explicitly deleted by the application, or implicitly deleted within some-or-other subroutine)
Stack memory allocation (function variables, local variables) can be problematic when your stack is too "deep" and you overflow the memory available to stack allocations. The heap is for objects that need to be accessed from multiple threads or throughout the program lifecycle. You can write an entire program without using the heap.
You can leak memory quite easily without a garbage collector, but you can also dictate when objects and memory is freed. I have run in to issues with Java when it runs the GC and I have a real time process, because the GC is an exclusive thread (nothing else can run). So if performance is critical and you can guarantee there are no leaked objects, not using a GC is very helpful. Otherwise it just makes you hate life when your application consumes memory and you have to track down the source of a leak.
What if your program does not know upfront how much memory to allocate (hence you cannot use stack variables). Say linked lists, the lists can grow without knowing upfront what is its size. So allocating on a heap makes sense for a linked list when you are not aware of how many elements would be inserted into it.
An advantage of GC in some situations is an annoyance in others; reliance on GC encourages not thinking much about it. In theory, waits until 'idle' period or until it absolutely must, when it will steal bandwidth and cause response latency in your app.
But you don't have to 'not think about it.' Just as with everything else in multithreaded apps, when you can yield, you can yield. So for example, in .Net, it is possible to request a GC; by doing this, instead of less frequent longer running GC, you can have more frequent shorter running GC, and spread out the latency associated with this overhead.
But this defeats the primary attraction of GC which appears to be "encouraged to not have to think much about it because it is auto-mat-ic."
If you were first exposed to programming before GC became prevalent and were comfortable with malloc/free and new/delete, then it might even be the case that you find GC a little annoying and/or are distrustful(as one might be distrustful of 'optimization,' which has had a checkered history.) Many apps tolerate random latency. But for apps that don't, where random latency is less acceptable, a common reaction is to eschew GC environments and move in the direction of purely unmanaged code (or god forbid, a long dying art, assembly language.)
I had a summer student here a while back, an intern, smart kid, who was weaned on GC; he was so adament about the superiorty of GC that even when programming in unmanaged C/C++ he refused to follow the malloc/free new/delete model because, quote, "you shouldn't have to do this in a modern programming language." And you know? For tiny, short running apps, you can indeed get away with that, but not for long running performant apps.
Stack is a memory allocated by the compiler, when ever we compiles the program, in default compiler allocates some memory from OS ( we can change the settings from compiler settings in your IDE) and OS is the one which give you the memory, its depends on many available memory on the system and many other things, and coming to stack memory is allocate when we declare a variable they copy(ref as formals) those variables are pushed on to stack they follow some naming conventions by default its CDECL in Visual studios
ex: infix notation:
c=a+b;
the stack pushing is done right to left PUSHING, b to stack, operator, a to stack and result of those i,e c to stack.
In pre fix notation:
=+cab
Here all the variables are pushed to stack 1st (right to left)and then the operation are made.
This memory allocated by compiler is fixed. So lets assume 1MB of memory is allocated to our application, lets say variables used 700kb of memory(all the local variables are pushed to stack unless they are dynamically allocated) so remaining 324kb memory is allocated to heap.
And this stack has less life time, when the scope of the function ends these stacks gets cleared.

When should I use the new keyword in C++?

I've been using C++ for a short while, and I've been wondering about the new keyword. Simply, should I be using it, or not?
With the new keyword...
MyClass* myClass = new MyClass();
myClass->MyField = "Hello world!";
Without the new keyword...
MyClass myClass;
myClass.MyField = "Hello world!";
From an implementation perspective, they don't seem that different (but I'm sure they are)... However, my primary language is C#, and of course the 1st method is what I'm used to.
The difficulty seems to be that method 1 is harder to use with the std C++ classes.
Which method should I use?
Update 1:
I recently used the new keyword for heap memory (or free store) for a large array which was going out of scope (i.e. being returned from a function). Where before I was using the stack, which caused half of the elements to be corrupt outside of scope, switching to heap usage ensured that the elements were intact. Yay!
Update 2:
A friend of mine recently told me there's a simple rule for using the new keyword; every time you type new, type delete.
Foobar *foobar = new Foobar();
delete foobar; // TODO: Move this to the right place.
This helps to prevent memory leaks, as you always have to put the delete somewhere (i.e. when you cut and paste it to either a destructor or otherwise).
Method 1 (using new)
Allocates memory for the object on the free store (This is frequently the same thing as the heap)
Requires you to explicitly delete your object later. (If you don't delete it, you could create a memory leak)
Memory stays allocated until you delete it. (i.e. you could return an object that you created using new)
The example in the question will leak memory unless the pointer is deleted; and it should always be deleted, regardless of which control path is taken, or if exceptions are thrown.
Method 2 (not using new)
Allocates memory for the object on the stack (where all local variables go) There is generally less memory available for the stack; if you allocate too many objects, you risk stack overflow.
You won't need to delete it later.
Memory is no longer allocated when it goes out of scope. (i.e. you shouldn't return a pointer to an object on the stack)
As far as which one to use; you choose the method that works best for you, given the above constraints.
Some easy cases:
If you don't want to worry about calling delete, (and the potential to cause memory leaks) you shouldn't use new.
If you'd like to return a pointer to your object from a function, you must use new
There is an important difference between the two.
Everything not allocated with new behaves much like value types in C# (and people often say that those objects are allocated on the stack, which is probably the most common/obvious case, but not always true). More precisely, objects allocated without using new have automatic storage duration
Everything allocated with new is allocated on the heap, and a pointer to it is returned, exactly like reference types in C#.
Anything allocated on the stack has to have a constant size, determined at compile-time (the compiler has to set the stack pointer correctly, or if the object is a member of another class, it has to adjust the size of that other class). That's why arrays in C# are reference types. They have to be, because with reference types, we can decide at runtime how much memory to ask for. And the same applies here. Only arrays with constant size (a size that can be determined at compile-time) can be allocated with automatic storage duration (on the stack). Dynamically sized arrays have to be allocated on the heap, by calling new.
(And that's where any similarity to C# stops)
Now, anything allocated on the stack has "automatic" storage duration (you can actually declare a variable as auto, but this is the default if no other storage type is specified so the keyword isn't really used in practice, but this is where it comes from)
Automatic storage duration means exactly what it sounds like, the duration of the variable is handled automatically. By contrast, anything allocated on the heap has to be manually deleted by you.
Here's an example:
void foo() {
bar b;
bar* b2 = new bar();
}
This function creates three values worth considering:
On line 1, it declares a variable b of type bar on the stack (automatic duration).
On line 2, it declares a bar pointer b2 on the stack (automatic duration), and calls new, allocating a bar object on the heap. (dynamic duration)
When the function returns, the following will happen:
First, b2 goes out of scope (order of destruction is always opposite of order of construction). But b2 is just a pointer, so nothing happens, the memory it occupies is simply freed. And importantly, the memory it points to (the bar instance on the heap) is NOT touched. Only the pointer is freed, because only the pointer had automatic duration.
Second, b goes out of scope, so since it has automatic duration, its destructor is called, and the memory is freed.
And the barinstance on the heap? It's probably still there. No one bothered to delete it, so we've leaked memory.
From this example, we can see that anything with automatic duration is guaranteed to have its destructor called when it goes out of scope. That's useful. But anything allocated on the heap lasts as long as we need it to, and can be dynamically sized, as in the case of arrays. That is also useful. We can use that to manage our memory allocations. What if the Foo class allocated some memory on the heap in its constructor, and deleted that memory in its destructor. Then we could get the best of both worlds, safe memory allocations that are guaranteed to be freed again, but without the limitations of forcing everything to be on the stack.
And that is pretty much exactly how most C++ code works.
Look at the standard library's std::vector for example. That is typically allocated on the stack, but can be dynamically sized and resized. And it does this by internally allocating memory on the heap as necessary. The user of the class never sees this, so there's no chance of leaking memory, or forgetting to clean up what you allocated.
This principle is called RAII (Resource Acquisition is Initialization), and it can be extended to any resource that must be acquired and released. (network sockets, files, database connections, synchronization locks). All of them can be acquired in the constructor, and released in the destructor, so you're guaranteed that all resources you acquire will get freed again.
As a general rule, never use new/delete directly from your high level code. Always wrap it in a class that can manage the memory for you, and which will ensure it gets freed again. (Yes, there may be exceptions to this rule. In particular, smart pointers require you to call new directly, and pass the pointer to its constructor, which then takes over and ensures delete is called correctly. But this is still a very important rule of thumb)
The short answer is: if you're a beginner in C++, you should never be using new or delete yourself.
Instead, you should use smart pointers such as std::unique_ptr and std::make_unique (or less often, std::shared_ptr and std::make_shared). That way, you don't have to worry nearly as much about memory leaks. And even if you're more advanced, best practice would usually be to encapsulate the custom way you're using new and delete into a small class (such as a custom smart pointer) that is dedicated just to object lifecycle issues.
Of course, behind the scenes, these smart pointers are still performing dynamic allocation and deallocation, so code using them would still have the associated runtime overhead. Other answers here have covered these issues, and how to make design decisions on when to use smart pointers versus just creating objects on the stack or incorporating them as direct members of an object, well enough that I won't repeat them. But my executive summary would be: don't use smart pointers or dynamic allocation until something forces you to.
Which method should I use?
This is almost never determined by your typing preferences but by the context. If you need to keep the object across a few stacks or if it's too heavy for the stack you allocate it on the free store. Also, since you are allocating an object, you are also responsible for releasing the memory. Lookup the delete operator.
To ease the burden of using free-store management people have invented stuff like auto_ptr and unique_ptr. I strongly recommend you take a look at these. They might even be of help to your typing issues ;-)
If you are writing in C++ you are probably writing for performance. Using new and the free store is much slower than using the stack (especially when using threads) so only use it when you need it.
As others have said, you need new when your object needs to live outside the function or object scope, the object is really large or when you don't know the size of an array at compile time.
Also, try to avoid ever using delete. Wrap your new into a smart pointer instead. Let the smart pointer call delete for you.
There are some cases where a smart pointer isn't smart. Never store std::auto_ptr<> inside a STL container. It will delete the pointer too soon because of copy operations inside the container. Another case is when you have a really large STL container of pointers to objects. boost::shared_ptr<> will have a ton of speed overhead as it bumps the reference counts up and down. The better way to go in that case is to put the STL container into another object and give that object a destructor that will call delete on every pointer in the container.
Without the new keyword you're storing that on call stack. Storing excessively large variables on stack will lead to stack overflow.
If your variable is used only within the context of a single function, you're better off using a stack variable, i.e., Option 2. As others have said, you do not have to manage the lifetime of stack variables - they are constructed and destructed automatically. Also, allocating/deallocating a variable on the heap is slow by comparison. If your function is called often enough, you'll see a tremendous performance improvement if use stack variables versus heap variables.
That said, there are a couple of obvious instances where stack variables are insufficient.
If the stack variable has a large memory footprint, then you run the risk of overflowing the stack. By default, the stack size of each thread is 1 MB on Windows. It is unlikely that you'll create a stack variable that is 1 MB in size, but you have to keep in mind that stack utilization is cumulative. If your function calls a function which calls another function which calls another function which..., the stack variables in all of these functions take up space on the same stack. Recursive functions can run into this problem quickly, depending on how deep the recursion is. If this is a problem, you can increase the size of the stack (not recommended) or allocate the variable on the heap using the new operator (recommended).
The other, more likely condition is that your variable needs to "live" beyond the scope of your function. In this case, you'd allocate the variable on the heap so that it can be reached outside the scope of any given function.
The simple answer is yes - new() creates an object on the heap (with the unfortunate side effect that you have to manage its lifetime (by explicitly calling delete on it), whereas the second form creates an object in the stack in the current scope and that object will be destroyed when it goes out of scope.
Are you passing myClass out of a function, or expecting it to exist outside that function? As some others said, it is all about scope when you aren't allocating on the heap. When you leave the function, it goes away (eventually). One of the classic mistakes made by beginners is the attempt to create a local object of some class in a function and return it without allocating it on the heap. I can remember debugging this kind of thing back in my earlier days doing c++.
C++ Core Guidelines R.11: Avoid using new and delete explicitly.
Things have changed significantly since most answers to this question were written. Specifically, C++ has evolved as a language, and the standard library is now richer. Why does this matter? Because of a combination of two factors:
Using new and delete is potentially dangerous: Memory might leak if you don't keep a very strong discipline of delete'ing everything you've allocated when it's no longer used; and never deleteing what's not currently allocated.
The standard library now offers smart pointers which encapsulate the new and delete calls, so that you don't have to take care of managing allocations on the free store/heap yourself. So do other containers, in the standard library and elsewhere.
This has evolved into one of the C++ community's "core guidelines" for writing better C++ code, as the linked document shows. Of course, there exceptions to this rule: Somebody needs to write those encapsulating classes which do use new and delete; but that someone is rarely yourself.
Adding to #DanielSchepler's valid answer:
The second method creates the instance on the stack, along with such things as something declared int and the list of parameters that are passed into the function.
The first method makes room for a pointer on the stack, which you've set to the location in memory where a new MyClass has been allocated on the heap - or free store.
The first method also requires that you delete what you create with new, whereas in the second method, the class is automatically destructed and freed when it falls out of scope (the next closing brace, usually).
The short answer is yes the "new" keyword is incredibly important as when you use it the object data is stored on the heap as opposed to the stack, which is most important!