C++ class and object - memory - c++

Whcih occupies memory, a class or an object? And, is that at compile or execution time?
Thanks.

During compilation, the layout of memory is an implementation detail--you don't need to know or care.
During runtime, however... in C++, classes define types but (unless you activate RTTI which allows limited introspection into classes) don't generally occupy any memory themselves1--they're just the frameworks for the construction and destruction of objects. Their methods, however--the constructors, destructors, instance methods, and class methods, occupy some portion of executable memory, but compilers can and do optimize away any such methods that are not used in the program.
Instances of types (that is, objects as well as primitives like int variables) occupy the bulk of memory in C++, but for their member functions they refer back to their classes. Exactly how much memory an instance of a particular class uses is entirely and utterly an implementation detail, and you should generally not need to care about it.
1 Even then, the classes themselves don't use the memory, but their associated std::typeinfo instance does. But again, this is generally implementation-y stuff and not the sort of thing even wizened programmers pay much attention to.

The object instance is the one which occupies memory at execution time since a class is the blueprint of the object.
Also, in C++ there are static variables, local variables and global variables which also occupies memory.

Static, local and global variables are stored in BBS data segment, while objects are stored either in heap or in stack.
Objects are the instances of class, while class definition is used by compiler to create an object by it's class description. Class is like an instruction of "how to build table by yourself" that occupies only the paper it's written on, while an object is the real table made by yourself according to the instruction, that occupies the real space.

Related

Moment of memory allocation of class?

Form the link below Difference between Definition and Declaration says that:
Definition of a variable says where the variable gets stored. i.e.,
memory for the variable is allocated during the definition of the
variable.
And to my knowledge, the declaration of class looks like :
class stu ;
And the definition of class looks like :
class stu{
public:
int x;
};
And so from information above , the memory allocation of this class should happen when I write the complete definition of class.However,
from this link says that :
Memory will be allocated when you create an instance of the class.
which means that the memory woudl be allocated at the moment I write
stu s;
So I would like to know the exact time that memory would allocate for thsi class, in the other word, it happens during compile time or run time?
In general: The memory holding the values for the members is allocated when they are used, which is - with some exceptions - at runtime. (assuming it is not optimized away by the compiler)
The forward declaration of a class is for the compiler to make the type known.
The definition describes the class:
its member functions are transformed into machine code. Those - depending on the target architecture - exist in a data section that is loaded into memory. So the member function takes up memory before any instance is created.
The compiler also stores some information about the memory layout, which is either part of the machine code or also exists somewhere in a data section.
This memory allocation is however about the description of the class, and not what is generally referred to when you talk about memory allocation for a type.
The memory holding the values for the members is allocated when they are used which is generally at runtime. Under certain circumstances, the values of an instance of a type can already be determined at compile-time, which might have the result that those also become part of the data section.

When and why to declare member variables on the heap C++

Ok, so I'm very new at C++ programming, and I've been looking around for a couple days for a decisive answer for this. WHEN should I declare member variables on the heap vs. the stack? Most of the answers that I've found have dealt with other issues, but I want to know when it is best to use the heap for member variables and why it is better to heap the members instead of stacking them.
There are two important concepts to grasp first:
One should avoid thinking in terms of "heap" and "stack". Those are implementation details of your compiler/platform, not of the language.1 Instead, think in terms of object lifetimes: should the object's lifetime correspond to that of its "parent", or should it outlive it? If you need the latter, then you'll need to use new (directly or indirectly) to dynamically allocate an object.
Member variables always have the same lifetime as their parent. The member variable may be a pointer, and the object it points to may well have an independent lifetime. But the pointed-to object is not a member variable.
However, there is no general answer to your question. Crudely speaking, don't dynamically allocate unless there is a good reason to. As I hinted above, these reasons usually correspond to situations where the lifetime needs to differ from its "parent".
1. Indeed, the C++ standard doesn't really talk about "heap" and "stack". They're important to consider when optimising or generally thinking about performance, but they're mostly irrelevant from a program-functionality point of view.
Member variables are members of the class itself. They are neither on
the heap nor on the stack, or rather, they are where ever the class
itself is.
There are very few reasons to add a level of indirection, and allocate a
member separately on the heap: polymorphism (if the type of the member
is not always the same) is by far the most common.
To get some terminology straight: What you call a heap and stack describe the lifetime of objects. The first means that the lifetime is dynamic, the second automatic and the third (which you don't mention) is static.
Usually you will need dynamic lifetime of an object when it should outlive the scope it was created in. Another common case is when you want it to be shared across different parent objects. Also, dynamic lifetime is also necessary when you work with a design that is heavyly object-oriented (uses a lot of polymorphism, doesn't use values), e.g. Qt.
An idiom that requires dynamic lifetimes is the pimpl-idiom.
Most generic-programming libraries are more focused towards value and value-semantics, so you won't use dynamic binding that much and automatic lifetimes become a lot more common.
There are also some examples where dynamic allocation is required for more implementation specific reasons:
dynamically sized objects (containers)
handling incomplete types (see pimpl-idiom)
easy nullability of a type
All of those are just general guidelines and it has to be decided on a case by case basis. In general, prefer automatic objects over dynamic ones.
The stack refers to the call stack. Function calls, return addresses, parameters, and local variables are kept on the call stack. You use stack memory whenever you pass a parameter or create a local variable. The stack has only temporary storage. Once the current function goes out of scope, you no longer have access to any variables for parameters.
The heap is a large pool of memory used for dynamic allocation. When you use the new operator to allocate memory, this memory is assigned from the heap. You want to allocate heap memory when you are creating objects that you don't want to lose after the current function terminates (loses scope). Objects are stored in the heap until the space is deallocated with delete or free().
Consider this example:
You implement a linked list which has a field member head of class node.
Each node has a field member next. If this member of the type Node and not Node* the size of every Node would depend on the number of the nodes after it in the chain.
For example, if you have 100 nodes in your list your head member will be huge. Because it holds the next node inside itself so it needs to have enough size to hold it and next holds the next and so on. So the head has to have enough space to hold in it 99 nodes the next 98 and so on...
You want to avoid that so in this case it's better to have pointer to to next node in each Node rather than the next node itself.

Stack vs. heap for a fixed number of objects requiring global scope

I'm aware that questions about the stack vs. the heap have been asked several times, but I'm confused about one small aspect of choosing how to declare objects in C++.
I understand that the heap--accessed with the "new" operator--is used for dynamic memory allocation. According to an answer to another question on Stack Overflow, "the heap is for storage of data where the lifetime of the storage cannot be determined ahead of time". The stack is faster than the heap, and seems to be used for variables of local scope, i.e., the variables are automatically deleted when the relevant section of code is completed. The stack also has a relatively limited amount of available space.
In my case, I know prior to runtime that I will need an array of pointers to exactly 500 objects of a particular class, and I know I will need to store the pointers and the objects throughout the duration of runtime. The heap doesn't make sense because I know beforehand how long I will need the memory and I know exactly how man objects I will need. The stack also doesn't make sense if it is limited in scope; plus, I don't know if it can actually hold all of my objects/pointers.
What would be the best way to approach this situation and why? Thanks!
Objects allocated on the stack in main() have a lifetime of the entire run of the program, so that's an option. An array of 500 pointers is either 2000 or 4000 bytes depending on whether your pointers are 32 or 64 bits wide -- if you were programming in an environment whose stack limit was that small, you would know it (such environments do exist: for instance, kernel mode stacks are often 8192 bytes or smaller in total) so I wouldn't hesitate to put the array there.
Depending on how big your objects are, it might also be reasonable to put them on the stack -- the typical stack limit in user space nowadays is order of 8 megabytes, which is not so large that you can totally ignore it, but is not peanuts, either.
If they are too big for the stack, I would seriously consider making a global variable that was an array of the objects themselves. The major downside of this is you can't control precisely when they are initialized. If the objects have nontrivial constructors this is very likely to be a problem. An alternative is to allocate storage for the objects as a global variable, initialize them at the appropriate point within main using placement new, and explicitly call their destructors on the way out. This requires care in the presence of exceptions; I'd write a one-off RAII class that encapsulated the job.
It is not a matter of stack or heap (which to be accurate do not mean what you think in c++: they are just data structures like vector, set or queue). It is a matter of storage duration.
You most likely need here static duration objects, which can be either global, or members of a class. Automatic variables declared inside the main function could also do the job, if you design a way to access them from your other code.
There is some information about the different storage durations of C++ (automatic, static, dynamic) there. The accepted answer however uses the confusing stack/heap terminology, but the explanation is correct.
the heap is for storage of data where the lifetime of the storage cannot be determined ahead of time
While that is correct, it's also incomplete.
The stack unwinds when you exit its scope, so using it for global scoped variables is unfeasible, like you said. This however is where you stop being on the right track. While you know the lifetime (or more accurately the scope since that's the most important factor here), you also know it's above the stack frame, so given only two choices, you put it on the heap.
There is a third option, an actual static variable declared at the top scope, but this will only work if your objects have default constructors.
TL;DR: use global (static) storage for either a pointer to the array (dynamic allocation) or just the actual array (static allocation).
Note: Your assumption that the stack is somehow "faster" than the heap is wrong, they're both backed in the same RAM, you just access it relative to different registers. Also I'd like to mention yet again how much I dislike the use of the terms stack and heap.
The stack, as you mention, often has size limits. That leaves you with two choices - dynamically allocate your objects, or make them global. The time cost to allocate all of your objects once at application startup is almost certainly not of significant concern. So just pick whichever method you prefer and do it.
I think you are confusing the use of the stack (storage for local variables and parameters) and unscoped data (static class variables and data allocated via new or malloc). One appropriate solution based on your description would be a static class that has your array of pointers declared as a static class member. This would be allocated on a heap like structure (maybe even the heap depending on your C++ implementation). a "quick and dirty" solution would be to declare the array as a static variable (basically a global variable), however it isn't the best approach from a maintainability perspective.
The best concept to go by is "Use the stack when you can and the heap when you must." I don't see why the stack wouldn't be able to hold all of your objects unless they're large or you're working with a limited resources system. Try the stack and if it can't handle it, the time it would take to allocate the memory on the heap can all be done early in the program's execution and wouldn't be a significant problem.
Unless speed is an issue or you can't afford the overhead, you should stick the objects in a std::vector. If copy semantics aren't defined for the objects, you should use a std::vector of std::shared_ptrs.

when do global variables allocate their memory?

It's been bothering me for a while but I didn't find any good resource about this matter. I have some global variables in my code. It's obvious that they are initialized in some order but is the memmory needed for all those objects reserved before any initialization take place?
here is the simple example of what might go wrong in my code and how I can use the answer:
I had a map<RTTI, object*> objectPool which holds samples of every class in my code, which i used to load objects from a file. To create those samples I use some global variables just to introduce class instance to objectPool. But sometimes those sample instances were initialized before ObjectPool itself. And that generated runtime error.
To fix that error I used some delayed initializer map<RTTI,object*>* lateInitializedObjectPool;. Now every instance first check if the objectPool is initialized and initilize it if not and then intoduce itself to the object pool. It seems to work fine but I'm worried if even the memmory needed for object pool pointer is not reserved before other classes begin to introduce themselves and that may cause access violation.
Variables declared at namespace scope (as opposed to in classes or functions) have the space for the objects themselves (sizeof(ObjectType)) allocated by the executable (or DLL) loader. If the object is a POD that uses aggregate initialization, then it typically gets its values set by having the linker write those values directly into the executable and the exe's loader simply blasting all of those into memory. Objects that don't use aggregate initialization get their values zeroed out initially.
After all of that, if any of these objects have constructors, then those constructors are executed before main is run. Thus, if any of those constructors dynamically allocate memory, that is when they do it. After the executable is loaded, but before main is run.
There's usually separate memory areas for variables that the compiler:
worked out initially contain all 0s - perhaps with a pre-main() constructor running to change their content
predetermined have a specific non-0 value, such that they can be written in a pre-constructed form into the executable image and page faulted in ready for use.
When I say a "separate memory area", I mean some memory the OS executable loader arranges for the process, just as per the stack or heap, but different in that these areas are of fixed pre-determined size. In UNIX, the all-0 memory region mentioned above is commonly known as the "BSS", the non-0 initialised area as "data" - see http://en.wikipedia.org/wiki/Data_segment for details.
C++ has the notion of "static storage duration". This refers to all kinds of variables that will take up a fixed amount of space during the execution of a program. These include not just globals, but also static variables at namespace, class and function level.
Note that the memory allocation in all cases can be done before main, but that the actual initialization differs. Also, some of them are zero-initialized before they're normally initialized. Precisely how all this happens is unspecified: the compiler may add a hidden function call, or the OS just happens to zero the process space anyway, etc.

C++ Etiquette about Member Variables on the Heap

Is it considered bad manners/bad practice to explicitly place object members on the heap (via new)? I would think you might want to allow the client to choose the memory region to instantiate the object. I know there might be a situation where heap members might be acceptable. If you know a situation could you describe it please?
If you have a class that's designed for copy semantics and you're allocating/deallocating a bunch of memory unnecessarily, I could see this being bad practice. In general, though, it's not. There are a lot of classes that can make use of heap storage. Just make sure you're free of memory leaks (deallocate things in the destructor, reference count, etc.) and you're fine.
If you want more flexibility, consider letting your user specify an Allocator. I'll explain.
Certain classes, e.g. std::vector, string, map, etc. need heap storage for the data structures they represent. It's not considered bad manners; when you have an automatically allocated vector, the user is expected to know that a buffer is allocated when the vector constructor gets called:
void foo() {
// user of vector knows a buffer that can hold at least 10 ints
// gets allocated here.
std::vector<int> foo(10);
}
Likewise, for std::string, you know there's an internal, heap-allocated char*. Whether there's one per string instance is usually up to the STL implementation; often times they're reference counted.
However, for nearly all of the STL classes, users do have a choice of where things are put, in that they can specify an allocator. vector is defined kind of like this:
template <typename T, typename Alloc = DefaultAllocator<T> >
class vector {
// etc.
};
Internally, vector uses Alloc (which defaults to whatever the default allocator is for T) to allocate the buffer and other heap storage it may need. If users doesn't like the default allocation strategy, they can specify one of their own:
vector<int, MyCustomAllocator> foo(10);
Now when the constructor allocates, it will use a MyCustomAllocator instead of the default. Here are some details on writing your own STL allocator.
If you're worried that it might be "bad manners" to use the heap for certain storage in your class, you might want to consider giving users of your class an option like this so that they can specify how things are to be allocated if your default strategy doesn't fit their needs.
I don't consider it bad practice at all. There are all sorts of reasons why you might want to explicitly allocate a member variable via new. Here are a few off the top of my head.
Say your class has a very large buffer, e.g., 512kb or 1MB. If this buffer is not stored on the heap, your users might potentially exceed the default stack space if they create multiple local variables of your class. In this case, it would make sense to allocate the buffer in your constructor and store it as a pointer.
If you are doing any kind of reference counting, you'll need a pointer to keep track of how many objects are actually pointing to your data.
If your member variable has a different lifetime than your class, a pointer is the way to go. A perfect example of this is lazy evaluation, where you only pay for the creation of the member if the user asks for it.
Although it is not necessarily a direct benefit to your users, compilation time is another reason to use pointers instead of objects. If you put an object in your class, you have to include the header file that defines the object in the header file for your class. If you use a pointer, you can forward declare the class and only include the header file that defines the class in the source files that need it. In large projects, using forward declarations can drastically speed up compilation time by reducing the overall size of your compilation units.
On the flip side, if your users create a lot of instances of your class for use on the stack, it would be advantageous to use objects instead of pointers for your member variables simply because heap allocations/deallocations are slow by comparison. It's more efficient to avoid the heap in this case, taking into account the first bullet above of course.
Where the class puts its members is less important than that the management of them is contained within the class; i.e. clients and subclasses shouldn't have to worry about the object's member variable.
The simplest way to do this would be to make them stack variables. But in some cases, such as if your class has a dynamic data structure like a linked list, it doesn't make sense.
But if you make sure your objects clean up after themeselves, that should be fine for most applications.
hmm, I don't really understand your question.
If you have a class :
class MyOtherClass;
class MyClass
{
MyOtherClass* m_pStruct;
};
Then, the client of MyClass does not have a real choice on how m_pStruct will be allocated.
But it will be the client's decision on how the class MyClass will itself be allocated, either on the stack or on the heap:
MyClass* pMyClass = new MyClass;
or
MyClass myClass;