Q1. In Java, all objects, arrays and class variables are stored on the heap? Is the same true for C++? Is data segment a part of Heap?
What about the following code in C++?
class MyClass{
private:
static int counter;
static int number;
};
MyClass::number = 100;
Q2. As far as my understanding goes, variables which are given a specific value by the compiler are stored in the data segment, and uninitialized global and static variables are stored in BSS (Block Started by Symbol). In this case, MyClass::counter, being static, is initialized to zero by the compiler and so it is stored in BSS, and MyClass::number, which is initialized to 100, is stored in the data segment. Am I correct in drawing this conclusion?
Q3. Consider following piece of codes:
void doHello(MyClass &localObj){
// 3.1 localObj is a reference parameter, where will this get stored in Heap or Stack?
// do something
}
void doHelloAgain(MyClass localObj){
// 3.2 localObj is a parameter, where will this get stored in Heap or Stack?
// do something
}
int main(){
MyClass *a = new MyClass(); // stored in heap
MyClass localObj;
// 3.3 Where is this stored in heap or stack?
doHello(localObj);
doHelloAgain(localObj);
}
I hope I have made my questions clear to all
EDIT:
Please refer to this article for some background on BSS
EDIT1: Changed the class name from MyInstance to MyClass as it was a poor name. Sincere Apologies
EDIT2: Changed the class member variable number from non-static to static
This is somewhat simplified but mostly accurate to the best of my knowledge.
In Java, all objects are allocated on the heap (including all their member variables). Most other things, such as parameters and local variables, are references; the references themselves are stored on the stack along with primitive types (int, long, etc.). String is an exception: it is an object rather than a primitive type.
In C++, if you were to allocate all objects with the "new" keyword it would be pretty much the same situation as java, but there is one unique case in C++ because you can allocate objects on the stack instead (you don't always have to use "new").
Also note that Java's heap performance is closer to C's stack performance than to C's heap performance; the garbage collector does some pretty smart stuff. It's still not quite as good as the stack, but much better than a C-style heap. This is necessary since Java can't allocate objects on the stack.
Q1
Java also stores variables on the stack but class instances are allocated on the heap. In C++ you are free to allocate your class instances either on the stack or on the heap. By using the new keyword you allocate the instance on the heap.
The data segment is not part of the heap; it is allocated when the process starts. The heap is used for dynamic memory allocations, while the data segment is static and its contents are known at compile time.
The BSS segment is simply an optimization where all the data belonging to the data segment (e.g. strings, constant numbers etc.) that is uninitialized or initialized to zero is moved to the BSS segment. The data segment has to be embedded into the executable, and by moving "all the zeros" to the end they can be removed from the executable. When the executable is loaded, the BSS segment is allocated and initialized to zero, and the compiler is still able to know the addresses of the various buffers, variables etc. inside the BSS segment.
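A minimal sketch of the data/BSS distinction, using the class from the question. The accessor and check function names are made up for illustration, and the section placement noted in the comments is what typical ELF or PE toolchains do, not something the C++ standard guarantees.

```cpp
#include <cassert>

class MyClass {
public:
    // Hypothetical accessors, added only so the private statics can be read.
    static int getCounter() { return counter; }
    static int getNumber() { return number; }

private:
    static int counter;
    static int number;
};

// Each static data member needs exactly one out-of-class definition.
int MyClass::counter;       // zero-initialized: usually ends up in .bss
int MyClass::number = 100;  // non-zero initializer: usually ends up in .data

// Checks the values both statics have before any code modifies them.
void checkStaticMembers() {
    assert(MyClass::getCounter() == 0);
    assert(MyClass::getNumber() == 100);
}
```

On Linux, running a tool such as nm or objdump on the resulting object file is one way to see which section each symbol actually landed in.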
Q2
MyClass::number is stored where the instance of MyClass class is allocated. It could be either on the heap or on the stack. Notice in Q3 how a points to an instance of MyClass allocated on the heap while localObj is allocated on the stack. Thus a->number is located on the heap while localObj.number is located on the stack.
As MyClass::number is an instance variable you cannot assign it like this:
MyClass::number = 100;
However, you can assign MyClass::counter as it is static (except that it is private):
MyClass::counter = 100;
Q3
When you call doHello the variable localObj (in main) is passed by reference. The variable localObj in doHello refers back to that variable on the stack. If you change it the changes will be stored on the stack where localObj in main is allocated.
When you call doHelloAgain the variable localObj (in main) is copied onto the stack. Inside doHelloAgain the variable localObj is allocated on the stack and only exists for the duration of the call.
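The difference between the two calls can be observed by comparing addresses. This is only a sketch: the helper names and the value member are assumptions made for the example, not part of the original code.

```cpp
#include <cassert>

struct MyClass {
    int value = 0;
};

// A reference parameter is just another name for the caller's object.
bool isCallersObjectByReference(const MyClass& param, const MyClass* callers) {
    return &param == callers;
}

// A value parameter is a fresh copy with its own storage.
bool isCallersObjectByValue(MyClass param, const MyClass* callers) {
    return &param == callers;
}

void demoParameterPassing() {
    MyClass localObj;  // automatic storage, like localObj in main
    assert(isCallersObjectByReference(localObj, &localObj));
    assert(!isCallersObjectByValue(localObj, &localObj));
}
```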
In C++, objects may be allocated on the stack...for example, localObj in your Q3 main routine.
I sense some confusion about classes versus instances. "MyInstance" makes more sense as a variable name than a class name. In your Q1 example, "number" is present in each object of type MyInstance. "counter" is shared by all instances. "MyInstance::counter = 100" is a valid assignment, but "MyInstance::number = 100" is not, because you haven't specified which object should have its "number" member assigned to.
Q1. In Java, all objects, arrays and
class variables are stored on the
heap? Is the same true for C++? Is
data segment a part of Heap?
No, the data section is separate from the heap. Basically, the data section is allocated at load time, everything there has a fixed location after that. In addition, objects can be allocated on the stack.
The only time objects are on the heap is if you use the new keyword, or if you use something from the malloc family of functions.
Q2. As far as my understanding goes,
variables which are given a specific
value by compiler are stored in data
segment, and uninitialized global and
static variables are stored in BSS
(Block started by symbol). In this
case, MyInstance::counter being static
is initialized to zero by the compiler
and so it is stored at BSS and
MyInstance::number which is
initialized to 100 is stored in the
data segment. Am I correct in making
the conclusion?
Yes, your understanding of the BSS section is correct. However, since number isn't static the code:
MyInstance::number = 100;
isn't legal, it needs to be either made static or initialized in the constructor properly. If you initialize it in the constructor, it will exist wherever the owning object is allocated. If you make it static, it will end up in the data section... if anywhere. Often static const int variables can be inlined directly into the code used such that a global variable isn't needed at all.
Q3. Consider following piece of codes: ...
void doHello(MyInstance &localObj){
localObj is a reference to the passed object. As far as the language is concerned, it has no storage of its own; it refers to wherever the variable being passed is. In reality, under the hood, a pointer may be passed on the stack to facilitate this, but the compiler may just as easily optimize that out if it can.
void doHelloAgain(MyInstance localObj){
a copy of the passed parameter is placed on the stack.
MyInstance localObj;
// 3.3 Where is this stored in heap or stack?
localObj is on the stack.
All memory areas in C++ are listed here
Got to thinking about the this pointer (from what I can tell it's not a pointer per se but rather an expression resulting in the address of the object) and started to wonder about what "this" actually refers to when an object is created and destroyed within a function scope? So not created using the "new" operator. So something like this:
void Foo()
{
SomeObject o;
}
What exactly happens when an object is created as described above and what happens with "this" when it is?
this is a pointer to that object, within the scope of its member functions.
Every object has an address, no matter how it was allocated or what its storage duration is. So, whether or not you used new is irrelevant.
You will find, though, that the address of dynamically allocated objects is numerically distant from the address of other ones, because they're typically stored in different places in virtual memory (your "heap" vs "stack" nomenclature).
C pointers are not limited to manually allocated memory; they can point to any part of memory, including regions that are not meant to hold variables, such as the code segment, where the machine instructions to be executed are stored.
You can see a pointer as a big index into the computer's RAM, and the RAM as a big array of bytes.
When you declare an object, as in your example, the compiler takes memory from somewhere. This memory has its own address (the big index I mentioned above), and we can use it like any other memory address.
So, in your case, if you declare:
SomeObject O;
...then the "this" pointer has the same value as a manually declared pointer like this:
SomeObject O;
SomeObject *MyThis = &O;
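A small sketch of the same point: inside a member function, this holds the address of the object the function was called on, however that object was allocated. The self() member is a made-up helper for the example.

```cpp
#include <cassert>

struct SomeObject {
    // Returns the value of `this` as seen inside a member function.
    const SomeObject* self() const { return this; }
};

void demoThisPointer() {
    SomeObject o;                      // automatic storage
    assert(o.self() == &o);

    SomeObject* p = new SomeObject();  // dynamic storage
    assert(p->self() == p);
    delete p;
}
```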
I just have a very simple question but I can not find it through google.
In C++, if we create an integer in a function, I believe it will be on the stack. But if we create a vector or a map, for example,
vector<int> a
will it be on the stack or the heap? I believe that's a kind of class object (similar to an object created by "new" in Java), so probably it should be on the heap?
The vector<int> object itself is created in the storage of your choice: if you declare it as a local variable, that would be in the automatic storage.
However, the vector is usually represented as a pair of pointers; the data for that vector is allocated in the dynamic storage area.
Same goes for std::map<K,V>: the object goes wherever you put it (automatic, static, or dynamic memory, based on your declaration) while the data goes into the dynamic storage area.
Starting with C++11 you can use the std::array<T, N> class for fixed-size collections. The data of this collection goes entirely in the storage where you put the collection itself. However, such collections are not resizable.
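One way to see this split, sketched under the assumption of a mainstream standard library implementation: check whether a container's element buffer lies within the bytes of the container object itself. The helper name storesElementsInline is invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// True if the container's element buffer lies inside the container object.
template <typename Container>
bool storesElementsInline(const Container& c) {
    const auto objBegin = reinterpret_cast<std::uintptr_t>(&c);
    const auto objEnd = objBegin + sizeof(c);
    const auto data = reinterpret_cast<std::uintptr_t>(c.data());
    return data >= objBegin && data < objEnd;
}

void demoContainerStorage() {
    std::vector<int> v(1000, 42);            // header here, elements elsewhere
    std::array<int, 4> a = {{1, 2, 3, 4}};   // elements are part of the object
    assert(!storesElementsInline(v));
    assert(storesElementsInline(a));
}
```

Note that sizeof(std::vector<int>) is a compile-time constant no matter how many elements the vector holds, which is another hint that the elements live outside the object.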
The data for any dynamically sized object like that will be heap allocated. If it were on the stack it would risk an overflow and a program crash if it grew too large.
The object itself (i.e. the size of the dynamic array and the pointer to the data's location in memory) will likely be stored on the stack.
Yes this will also be created on the stack.
Variables are only created on the heap when new or malloc is called.
The type doesn't really matter; what matters is how it's created.
If you're trying to decide whether to create a variable on the stack or dynamically (on the heap), you should consider the lifetime of the object. If you just need it during the scope that it's created in, then create it on the stack. Otherwise create it dynamically.
Here, the vector is stored both on the heap and on the stack. Meaning, the header is on the stack, but as you put elements into the vector, those are dynamically allocated, hence on the heap.
I am new to C++ and have a question about global variables. I see in many examples that global variables are pointers holding addresses on the heap. So the pointers are in the memory for global/static variables and the data behind the addresses is on the heap, right?
Instead of this, you can declare global (non-pointer) variables that store the data directly. So the data is stored in the memory for global/static variables and not on the heap.
Has this solution any disadvantages over the first solution with the pointers and the heap?
Edit:
First solution:
//global
Sport *sport;
//somewhere
sport = new Sport;
Second solution:
//global
Sport sport;
A disadvantage of storing your data in a global/static variable is that the size is fixed at compile time and can't be changed, as opposed to heap storage, where the size can be determined at runtime and can grow or shrink repeatedly over the run. The lifetime is also fixed: global/static variables live for the complete run of the program from start to finish, whereas heap storage can be acquired and released (even repeatedly) all through the runtime of the program. On the other hand, global and static storage management is all handled for you by the compiler, whereas heap storage has to be explicitly managed by your code. So in summary, global/static storage is easier but not as flexible as heap storage.
You are right in your hypothesis about where the objects are located. As for usage, it's horses for courses. There is no definite rule; it depends on the design and the type of functionality you want to implement. For example:
One may choose the pointer version to achieve lazy initialization or polymorphic behavior, neither of which is possible with the global non-pointer object approach.
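A rough sketch of both solutions side by side. Sport is the type from the question; the Tennis subclass, the name() member and getLazySport() are assumptions made for the example, and std::unique_ptr is swapped in for the raw pointer so that cleanup is automatic.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Sport {
    virtual ~Sport() {}
    virtual std::string name() const { return "generic"; }
};

struct Tennis : Sport {  // hypothetical derived type
    std::string name() const override { return "tennis"; }
};

// Second solution: the object itself has static storage and a fixed type.
Sport sport;

// First solution: only the pointer has static storage; the object is dynamic.
std::unique_ptr<Sport> lazySport;

// Creates the object on first use (lazy) and may pick any derived type.
Sport& getLazySport() {
    if (!lazySport) {
        lazySport.reset(new Tennis());
    }
    return *lazySport;
}

void demoGlobals() {
    assert(sport.name() == "generic");
    assert(getLazySport().name() == "tennis");
}
```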
Right. Declared variables go in the DataSegment. And they sit there for the life of the program. You cannot free them. You cannot reallocate them. In Windows, the DataSegment is a fixed size....if you put everything there you may run out of memory (at least it used to be this way).
I'm learning masm32, following some tutorials.
In one tutorial: http://win32assembly.online.fr/tut3.html
there is stated:
LOCAL directive allocates memory from the stack for local variables
used in the function. The bunch of LOCAL directives must be
immediately below the PROC directive. The LOCAL directive is
immediately followed by <the name of local variable>:<variable type>.
So LOCAL wc:WNDCLASSEX tells MASM to allocate memory from the stack
the size of WNDCLASSEX structure for the variable named wc. We can
refer to wc in our codes without any difficulty involved in stack
manipulation. That's really a godsend, I think. The downside is that
local variables cannot be used outside the function they're created
and will be automatically destroyed when the function returns to the
caller. Another drawback is that you cannot initialize local variables
automatically because they're just stack memory allocated dynamically
when the function is entered . You have to manually assign them with
desired values after LOCAL directives.
I've always been told stack memory is static, and any dynamic allocation is heap.
Can we really consider those as locals in the sense of C++ then?
When you create local variables in C++, will those variables be dynamically allocated on the stack as well?
Can we really consider those as locals in the sense of C++ then? When you create local variables in C++, will those variables be dynamically allocated on the stack as well?
In C++, local (automatic) variables live on the stack, so yes and yes.
They are allocated dynamically in the sense that they come and go as the function is entered/exited. However, as you rightly point out, this type of allocation is rather different from heap allocation.
In addition to the heap and the stack, there is a third area where variables can reside. It is the data segment. It's where global as well as function- and class-level static variables live.
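A short sketch contrasting an automatic variable with a function-level static one. The function name countCalls is made up for the example.

```cpp
#include <cassert>

// Returns how many times it has been called so far.
int countCalls() {
    static int calls = 0;  // static storage: initialized once, persists
    int local = 0;         // automatic storage: re-created on every call
    ++calls;
    ++local;
    assert(local == 1);    // the automatic variable never accumulates
    return calls;
}
```

Calling countCalls() three times returns 1, 2 and 3, because calls keeps its value between calls while local starts from zero each time the function is entered.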
I don't quite get the point of dynamically allocated memory and I am hoping you guys can make things clearer for me.
First of all, every time we allocate memory we simply get a pointer to that memory.
int * dynInt = new int;
So what is the difference between doing what I did above and:
int someInt;
int* dynInt = &someInt;
As I understand, in both cases memory is allocated for an int, and we get a pointer to that memory.
So what's the difference between the two? When is one method preferred to the other?
Furthermore, why do I need to free up memory with
delete dynInt;
in the first case, but not in the second case.
My guesses are:
When dynamically allocating memory for an object, the object doesn't get initialized, while if you do something like the second case, the object gets initialized. If this is the only difference, is there any motivation behind this apart from the fact that dynamically allocating memory is faster?
The reason we don't need to use delete in the second case is that the fact that the object was initialized creates some kind of automatic destruction routine.
Those are just guesses; I would love it if someone corrected me and clarified things for me.
The difference is in storage duration.
Objects with automatic storage duration are your "normal" objects that automatically go out of scope at the end of the block in which they're defined.
Create them like int someInt;
You may have heard of them as "stack objects", though I object to this terminology.
Objects with dynamic storage duration have something of a "manual" lifetime; you have to destroy them yourself with delete, and create them with the keyword new.
You may have heard of them as "heap objects", though I object to this, too.
The use of pointers is actually not strictly relevant to either of them. You can have a pointer to an object of automatic storage duration (your second example), and you can have a pointer to an object of dynamic storage duration (your first example).
But it's rare that you'll want a pointer to an automatic object, because:
you don't have one "by default";
the object isn't going to last very long, so there's not a lot you can do with such a pointer.
By contrast, dynamic objects are often accessed through pointers, simply because the syntax comes close to enforcing it. new returns a pointer for you to use, you have to pass a pointer to delete, and (aside from using references) there's actually no other way to access the object. It lives "out there" in a cloud of dynamicness that's not sitting in the local scope.
Because of this, the usage of pointers is sometimes confused with the usage of dynamic storage, but in fact the former is not causally related to the latter.
An object created like this:
int foo;
has automatic storage duration - the object lives until the variable foo goes out of scope. This means that in your second example, dynInt will be an invalid pointer once someInt goes out of scope (for example, at the end of a function).
An object created like this:
int* foo = new int;
has dynamic storage duration - the object lives until you explicitly call delete on it.
Initialization of the objects is an orthogonal concept; it is not directly related to which type of storage-duration you use. See here for more information on initialization.
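The two storage durations can be made visible with a type that counts its live instances. This is only an illustrative sketch; Tracker, liveTrackers and the two functions are names invented for the example.

```cpp
#include <cassert>

int liveTrackers = 0;  // number of Tracker objects currently alive

struct Tracker {
    Tracker() { ++liveTrackers; }
    ~Tracker() { --liveTrackers; }
};

// Returns how many extra Trackers are still alive after the inner block ends.
int liveAfterAutomaticScope() {
    const int before = liveTrackers;
    {
        Tracker t;  // automatic storage duration
        assert(liveTrackers == before + 1);
    }               // t is destroyed here, with no delete needed
    return liveTrackers - before;
}

// The returned object outlives this function; the caller must delete it.
Tracker* makeDynamic() {
    return new Tracker();  // dynamic storage duration
}
```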
Your program gets an initial chunk of memory at startup. This memory is called the stack. Its size is usually a few megabytes these days (for example, the default is 1 MB on Windows and often 8 MB on Linux).
Your program can ask the OS for additional memory. This is called dynamic memory allocation. This allocates memory on the free store (C++ terminology) or the heap (C terminology). You can ask for as much memory as the system is willing to give (multiple gigabytes).
The syntax for allocating a variable on the stack looks like this:
{
int a; // allocate on the stack
} // automatic cleanup on scope exit
The syntax for allocating a variable using memory from the free store looks like this:
int * a = new int; // request memory from the free store for an int
delete a; // user is responsible for deleting the object
To answer your questions:
When is one method preferred to the other.
Generally stack allocation is preferred.
Dynamic allocation is required when you need to store a polymorphic object using its base type.
Always use a smart pointer to automate deletion:
C++03: boost::scoped_ptr, boost::shared_ptr or std::auto_ptr.
C++11: std::unique_ptr or std::shared_ptr.
For example:
// stack allocation (safe)
Circle c;
// heap allocation (unsafe: leaks if an exception is thrown before delete)
Shape * rawShape = new Circle;
delete rawShape;
// heap allocation with smart pointers (safe)
std::unique_ptr<Shape> safeShape(new Circle);
Further more why do I need to free up memory in the first case, but not in the second case.
As I mentioned above stack allocated variables are automatically deallocated on scope exit.
Note that you are not allowed to delete stack memory. Doing so is undefined behavior and will most likely crash your application.
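A fuller sketch of the smart pointer case, assuming a Shape base class with a virtual destructor. The destroyedShapes counter, the radius member and the function name are assumptions added purely to make the automatic deletion observable.

```cpp
#include <cassert>
#include <memory>

int destroyedShapes = 0;  // counts destructor calls, for illustration only

struct Shape {
    virtual ~Shape() { ++destroyedShapes; }  // virtual: safe delete via base
    virtual double area() const = 0;
};

struct Circle : Shape {
    explicit Circle(double r) : radius(r) { assert(r >= 0.0); }
    double area() const override { return 3.14159 * radius * radius; }
    double radius;
};

// Allocates a Circle dynamically, uses it through the base type, and relies
// on std::unique_ptr to delete it when the function returns.
double areaOfTemporaryCircle(double radius) {
    std::unique_ptr<Shape> shape(new Circle(radius));
    return shape->area();
}  // shape goes out of scope here and deletes the Circle
```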
For a single integer it only makes sense if you need to keep the value after, for example, returning from a function. Had you declared someInt as you said, it would have been invalidated as soon as it went out of scope.
However, in general there is a greater use for dynamic allocation. There are many things that your program doesn't know before allocation and depends on input. For example, your program needs to read an image file. How big is that image file? We could say we store it in an array like this:
unsigned char data[1000000];
But that would only work if the image size was less than or equal to 1000000 bytes, and would also be wasteful for smaller images. Instead, we can dynamically allocate the memory:
unsigned char* data = new unsigned char[file_size];
Here, file_size is determined at runtime. You couldn't possibly tell this value at the time of compilation.
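In practice a std::vector is often used for this instead of a raw new[], since it sizes itself at runtime and releases the memory automatically. A minimal sketch, with makeBuffer as a made-up helper name and file_size standing in for a value obtained at runtime:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a zero-filled buffer whose size is only known at runtime.
std::vector<unsigned char> makeBuffer(std::size_t file_size) {
    std::vector<unsigned char> data(file_size);  // elements live on the heap
    assert(data.size() == file_size);
    return data;  // moved out (or elided); no manual delete[] needed
}
```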
Read more about dynamic memory allocation and also garbage collection
You really need to read a good C or C++ programming book.
Explaining in detail would take a lot of time.
The heap is the memory inside which dynamic allocation (with new in C++ or malloc in C) happens. There are system calls involved with growing and shrinking the heap. On Linux, they are mmap & munmap (used to implement malloc and new etc...).
You can call the allocation primitive many times. So you could put int *p = new int; inside a loop, and get a fresh location every time you loop!
Don't forget to release memory (with delete in C++ or free in C). Otherwise, you'll get a memory leak (a naughty kind of bug). On Linux, valgrind helps to catch them.
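A sketch of allocating in a loop and then releasing everything, to show that each new yields a distinct location while the earlier ones are still allocated. The function name is invented for the example, and the code is not exception-safe; it is kept simple on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Allocates n ints one by one, then frees them all.
// Returns the number of distinct addresses that were handed out.
std::size_t countDistinctAllocations(std::size_t n) {
    std::vector<int*> owned;
    std::set<int*> distinct;
    for (std::size_t i = 0; i < n; ++i) {
        int* p = new int(static_cast<int>(i));  // fresh location each time
        assert(*p == static_cast<int>(i));
        owned.push_back(p);
        distinct.insert(p);
    }
    for (int* p : owned) {
        delete p;  // every new needs a matching delete to avoid a leak
    }
    return distinct.size();
}
```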
Whenever you use new in C++, memory is typically allocated through malloc, which in turn calls the sbrk system call (or similar). The compiler therefore has no knowledge of the requested size; only the allocator and the OS do. So you have to use delete (which typically calls free, which may go back to sbrk) to give the memory back to the system. Otherwise you'll get a memory leak.
Now, when it comes to your second case, the compiler has knowledge about the size of the allocated memory. That is, in your case, the size of one int. Setting a pointer to the address of this int does not change anything in the knowledge of the needed memory. Or with other words: The compiler is able to take care about freeing of the memory. In the first case with new this is not possible.
In addition to that, new and malloc do not need to allocate exactly the requested size, which makes things a bit more complicated.
Edit
Two more common phrases: the second case (the plain int) is also known as static, or more precisely automatic, memory allocation (done by the compiler), while the first case (new) is dynamic memory allocation (done by the runtime system).
What happens if your program is supposed to let the user store any number of integers? Then you'll need to decide during run-time, based on the user's input, how many ints to allocate, so this must be done dynamically.
In a nutshell, a dynamically allocated object's lifetime is controlled by you and not by the language. This allows you to let it live as long as it is required (as opposed to until the end of the scope), possibly determined by a condition that can only be calculated at run-time.
Also, dynamic memory is typically much more "scalable" - i.e. you can allocate more and/or larger objects compared to stack-based allocation.
The allocation essentially "marks" a piece of memory so no other object can be allocated in the same space. De-allocation "unmarks" that piece of memory so it can be reused for later allocations. If you fail to deallocate memory after it is no longer needed, you get a condition known as a "memory leak": your program is occupying memory it no longer needs, leading to possible failure to allocate new memory (due to the lack of free memory), and just generally putting an unnecessary strain on the system.