Why does operator "new" require a pointer to work? - c++

I cannot wrap my head around why memory on the heap, allocated through new, can be accessed only through pointers, while memory on the stack (statically allocated) can be accessed normally.
Does it have something to do with the fact that pretty much all memory on the stack has some sort of order and the memory on the heap is somewhat random? (If what I just said is true at all.)
Dynamic memory just seems so vague and mystical to me, so anyone who could help me understand it better will be hugely appreciated.

Why does operator “new” require a pointer to work?
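Because it allocates a block of memory on the heap (the caller specifies the size) and returns the address of the beginning of that block. The allocated object has no name of its own, so that address is the only way to reach it.
Why are we using it
We use it when we need memory whose lifetime we control, so that we can release it with delete[] as soon as we are done with it.
You can also change the size by releasing the block and allocating a new one (the old contents are lost unless you copy them first).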
char arr[20]; // You need more space? Not possible to change size
// While
char * arr = new char[20];
delete[] arr;
arr = new char[50];
Disadvantage
Allocating an object with new is much more expensive than automatic allocation.
It's slower.
Memory leaks
Memory fragmentation
Has to be freed with delete (or delete[] for arrays)
Summary
The stack (automatic storage) is easier to use, faster and much harder to get wrong. But sometimes we have to use the heap, and then we should be as careful as possible.
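To make the resizing point concrete, here is a minimal sketch of growing a new[] buffer while keeping its contents. The helper name grow_buffer is hypothetical (it is not part of the answer above), and the sketch assumes the old buffer really was allocated with new char[].

```cpp
#include <algorithm>  // std::copy, std::min
#include <cstddef>    // std::size_t

// Hypothetical helper: returns a new array of new_size chars holding the old
// contents. The old array is released, so only the returned pointer is valid.
char* grow_buffer(char* old_data, std::size_t old_size, std::size_t new_size) {
    char* new_data = new char[new_size]();  // value-initialised: all zeros
    std::copy(old_data, old_data + std::min(old_size, new_size), new_data);
    delete[] old_data;                      // array form matches new[]
    return new_data;
}
```

In everyday code, std::vector<char> or std::string does this bookkeeping for you.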

On a typical implementation, the memory of a C++ program is divided into four parts:
Program code
Global variables
Stack
Heap
As the names suggest, the program code part stores your code and the global variables part stores global variables.
These two parts are very clear.
Now, our concern is Stack memory and Heap memory.
Stack memory is used for automatic (local) variables, whose size is fixed in the source code.
Heap memory is used for dynamic allocation.
The size and layout of variables in stack memory are worked out during compilation, so each one can be given a name and used directly.
Variables in heap memory, however, are allocated at run time, so the compiler has no name for them and we can't use them like normal variables.
Here we require pointers because we need something that refers to the memory acquired during dynamic allocation. This job is done by pointers.
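As a rough illustration of why a pointer is needed, the sketch below puts a named automatic variable next to an unnamed dynamic one. The function name add_named_and_unnamed is made up for this example.

```cpp
// Hypothetical illustration: a named automatic variable versus an unnamed
// dynamic object that can only be reached through the address new returns.
int add_named_and_unnamed() {
    int named = 40;            // automatic: the name `named` refers to it
    int* handle = new int(2);  // dynamic: the int itself has no name
    int& alias = *handle;      // a reference works too, but it starts from the pointer
    int total = named + alias;
    delete handle;             // the dynamic int lives until this line
    return total;
}
```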

Related

Where does the memory come from to initialize a C++ object without the new keyword?

Consider the code snippet:
ClassName* p;
p = new ClassName;
As I understand it, we are allocating memory from the heap to store *p.
But now consider:
ClassName C;
Question: If not from the heap, where does the memory for C come from?
As I understand it, we are allocating memory from the heap to store *p.
More correctly worded, the object created by new has dynamic storage duration.
If not from the heap, where does the memory for C come from?
It has automatic storage duration.
See C++ standard §3.7/1.
Talking about "stack" or "heap" takes you to the level of the compiler implementation. You are generally not interested in how the C++ compiler makes the different kinds of storage duration work, only in their semantics.
Question: If not from the heap, where does the memory for C come from?
From the stack (or in the case of a global variable or function 'static' variable, from a statically allocated region that exists independently of both the heap and the stack).
Strictly speaking, C++ has no concept of heap or stack (other than as data structures in the standard library, which are fundamentally different to "the" heap and stack that are the prevalent mechanisms used for allocation of memory by running programs); the exact mechanism for allocation and management of memory is not specified. In practice, there are two ways for a program to allocate memory at run time on most systems: from the heap (which is itself built on memory chunks obtained from the operating system), or from the stack (the same stack that is used to store the return address and save temporary values when functions are called).
Memory for the stack is usually allocated when (or just before) the program begins execution, and it usually needs to be contiguous. Memory for the heap is usually obtained dynamically and does not normally need to be contiguous (though in practice heap memory is often allocated in contiguous chunks).
Note that your first example:
ClassName* p;
p = new ClassName;
... actually embodies two allocations: one for the ClassName object (dynamic storage, allocated on the heap) and one for the pointer variable p (automatic storage, allocated on the stack).
In practice, local variables will not always require stack allocation - their values can sometimes be kept in registers (or in some cases, the storage can be optimised away altogether, especially if the value is unused or is a compile-time constant). And theoretically, heap allocations can also potentially be optimised away.
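The difference between the two storage durations can be observed with a small counting type. This is only a sketch: Tracker, Snapshot and observe_lifetimes are hypothetical names introduced for the illustration, not anything from the question.

```cpp
// Hypothetical type that counts how many instances are currently alive.
struct Tracker {
    static int live;
    Tracker() { ++live; }
    ~Tracker() { --live; }
};
int Tracker::live = 0;

// Live counts recorded at three points in observe_lifetimes().
struct Snapshot {
    int inside_block;
    int after_block;
    int after_delete;
};

Snapshot observe_lifetimes() {
    Snapshot s{};
    Tracker* p = new Tracker;            // p itself is automatic; *p is dynamic
    {
        Tracker c;                       // automatic storage duration
        s.inside_block = Tracker::live;  // both objects are alive here
    }                                    // c is destroyed here, *p is not
    s.after_block = Tracker::live;
    delete p;                            // *p is destroyed only now
    s.after_delete = Tracker::live;
    return s;
}
```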

How compiler is going to know which memory is allocated using which operator or function?

Suppose I have allocated memory for two arrays, one using the new operator and the other using the malloc function. As far as I know, both blocks of memory are allocated in the heap segment, so my question is: how is the compiler going to know which memory was allocated using which operator or function? Or is there some other concept behind this?
The compiler doesn't have to know how the memory behind a pointer was allocated; that is the responsibility of the programmer. You should always use matching allocate-deallocate functions/operators. For example, the operator new can be overloaded. In this case, when you allocate an object with new and release it with free(), you're in trouble because free() has no idea what kind of book-keeping you have there. Here's a simplified example of this situation:
#include <iostream>
#include <stdlib.h>
struct MyClass
{
    // Really dumb allocator.
    static void* operator new(size_t s)
    {
        std::cout << "Allocating MyClass " << s << " bytes.\n";
        void* res = Pool + N * sizeof(MyClass);
        ++N;
        return res;
    }
    // matching operator delete not implemented on purpose.
    static char Pool[]; // take memory from this statically allocated array.
    static unsigned N; // keep track of allocated objects.
};
char MyClass::Pool[10*sizeof(MyClass)];
unsigned MyClass::N = 0;
int main(int argc, char** argv)
{
    MyClass* p = new MyClass();
    if (argc == 1)
    {
        std::cout << "Trying to delete\n";
        delete p; // boom - non-matching deallocator used.
    }
    else
    {
        std::cout << "Trying to free\n";
        free(p); // also boom - non-matching deallocator used.
    }
}
If you mix and match the allocators and deallocators you will run into similar problems.
Internally, both allocation mechanisms may or may not finally use the same mechanism, but pairing new and free or malloc and delete would mix conceptually different things and cause undefined behaviour.
You must not use delete for malloc or free for new. Although for basic data types you might get away with it on most compilers, it is still wrong. It is not guaranteed to work. malloc and new could deal with different heaps and not the same one. Furthermore, delete will call destructors of objects whereas free will not.
Compilers don't have to keep track of which memory blocks are allocated by malloc or new. They might as a debug help, or they might not. Don't rely on that.
It does not know. It just calls a function that returns a pointer, and pointers do not carry the information of how they got to be or what kind of memory they point to. It just passes along that pointer and does not care about it any further.
However, the function you use to deallocate the memory (i.e. free/delete) might depend on information that got stored somewhere hidden by malloc/new. So if you allocate memory by malloc and try to deallocate it by using delete (or new and free), it might not work (apart from the obvious problems with constructors/destructors).
Here, "might not work" means that what happens is undefined. This is a huge bonus for compiler developers and for performance, because they simply don't have to care. On the other hand, the effort is shifted to the developers, who have to keep track of how each block of memory was allocated. The easiest way to do that is to use just one of the two methods.
new/delete is the C++ way to allocate and deallocate memory from the heap,
whereas
malloc/free and family is the C way to allocate and free memory from the heap.
I don't know why you want the compiler to know who allocated the heap memory, but
if you want to track it, there is a way.
One way relies on the fact that new initializes the allocated memory by calling a constructor: you can instrument that constructor to see when objects are created (note that the constructor also runs for objects that are not created with new).
As far as I know both of the memories are allocated in heap segment then my question is how compiler is going to know which memory is allocated using which operator or function?
What is this thing you call the "heap segment"?
There is no such thing as far as the C and C++ standards are concerned. The "heap" and "stack" are implementation-specific concepts. They are very widely used concepts, but neither standard mandates a "heap" or a "stack".
How the implementation (not the compiler!) knows where things are allocated is up to the implementation. Your best bet, and the only safe bet, is to follow what the standards say to do:
If you allocate memory using new[] you must deallocate it with delete[] (or leave it undeleted).
Any other deallocation is undefined behavior.
If you allocate memory using new you must deallocate it with delete (or leave it undeleted).
Any other deallocation is undefined behavior.
If you allocate memory using malloc or its kin you must deallocate it with free (or leave it undeleted).
Any other deallocation is undefined behavior.
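The three pairing rules above can be sketched in one function. Noisy and release_with_matching_pairs are hypothetical names used only for this illustration; the destructor counter is there to show that delete and delete[] run destructors while free does not.

```cpp
#include <cstdlib>  // std::malloc, std::free

// Hypothetical type that records how many destructors have run.
struct Noisy {
    static int destroyed;
    ~Noisy() { ++destroyed; }
};
int Noisy::destroyed = 0;

// Each allocation is released by its matching counterpart.
// Returns how many Noisy destructors ran during the call.
int release_with_matching_pairs() {
    const int before = Noisy::destroyed;

    Noisy* one = new Noisy;       // new    -> delete
    Noisy* many = new Noisy[3];   // new[]  -> delete[]
    void* raw = std::malloc(16);  // malloc -> free (no constructors or destructors)

    delete one;      // runs 1 destructor
    delete[] many;   // runs 3 destructors
    std::free(raw);  // releases the bytes only; free(nullptr) is a no-op

    return Noisy::destroyed - before;
}
```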
Not freeing allocated memory can sometimes be a serious problem. If you continuously allocate big chunks of memory and never free a single one you will run into problems. Other times, it's not a problem at all. Allocating one chunk of memory at program start and oops, you didn't free it oftentimes is not a problem because that allocated memory is released when the program terminates. It's up to you to determine whether those memory leaks truly are a problem.
The easiest way to avoid these larger issues is to have the program properly free every single byte of allocated memory before the program exits.
Note well: Doing that doesn't guarantee that you don't have a memory problem. Just because your program eventually should free every single one of the multiple terabytes allocated over the course of the program's execution doesn't necessarily mean that the program is okay memory-wise.

What exactly is dynamic memory? [duplicate]

I was reading the C++ tutorial and I don't understand why I need to declare dynamic memory, this is what the tutorial says:
Until now, in all our programs, we have only had as much memory available as we declared for our variables, having the size of all of them to be determined in the source code, before the execution of the program.
And then it says that we have to use new and delete operators to use dynamic memory.
However, I seem to be using dynamic memory when I declare a pointer, e.g. char* p, for which I have not specified the length of the array of characters. In fact, I thought that when you use a pointer you are always using dynamic memory. Isn't it true?
I just don't see the difference between declaring a variable using the new operator and not. I don't really understand what dynamic memory is. Can anyone explain this to me?
I thought that when you use a pointer you are always using dynamic memory. Isn't it true?
No it's not true, for example
int i;
int *p = &i; // a pointer to an ordinary (non-dynamic) variable; no dynamic memory involved.
However, I seem to be using dynamic memory when declare a pointer, e.g. char* p, for which I have not specified the length of the array of characters
char string[100];
char* p = &(string[0]); // Same as above, no dynamic memory.
You need dynamic memory when you can't tell how big the data structure needs to be.
Say you have to read some ints from a file and store them in memory. You have no idea how many ints you need. You could pick a figure of 100, but then your program breaks if there are 101. You could pick 100,000 hoping that's enough, but it's a waste of resources if there are only 10 in the file, and again, it breaks if there are 100,001 ints in the file.
In this scenario your program could iterate through the file, count the number of ints, then dynamically create an array of the correct size. Then you pass over the file a second time reading the ints into your new array.
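A minimal sketch of that two-pass approach follows. To keep it self-contained it reads from a string instead of a file (an assumption of this example), and the function name read_ints_two_pass is hypothetical.

```cpp
#include <cstddef>  // std::size_t
#include <sstream>  // std::istringstream
#include <string>

// Hypothetical helper: reads every int in `text` into an array whose size is
// only discovered at run time. The caller must release the result with delete[].
int* read_ints_two_pass(const std::string& text, std::size_t& count) {
    std::istringstream first(text);
    count = 0;
    for (int ignored; first >> ignored; ) ++count;  // pass 1: count the ints

    int* values = new int[count];                   // the size is known only now
    std::istringstream second(text);
    for (std::size_t i = 0; i < count; ++i) second >> values[i];  // pass 2: fill
    return values;
}
```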
Static vs. Dynamic Memory
Static memory is static because its size is fixed once the program is compiled and can't be changed afterwards. Variables you declare in functions and members declared in classes / structs have sizes fixed in this way. The compiler calculates exactly how much space each of them is going to need as each function gets called.
Dynamic memory is a "pool" of memory that can be made available to your program on demand, at run time.
The compiler only knows that the program will request some (probably unknown) amount of that memory at run time, and will later release it back to the dynamic memory pool.
Hope this helps.
P.S. Yes, there are more efficient ways to get an unknown number of items into memory, but this is the simplest to explain
When you have:
char* p;
p is a variable of type pointer to char, p is stored on the stack, and you haven't allocated any dynamic memory.
But when you do:
p = new char[100];
you have allocated a part of dynamic memory (heap) of the size 100*sizeof(char).
You are responsible for freeing the memory allocated on the heap:
delete[] p;
You don't need to clean up variables on the stack - they are removed automatically when they go out of scope. In this example, p itself will be removed from the stack when it goes out of scope.
Dynamic memory is memory which the programmer has to explicitly request, as opposed to memory that is automatically allocated on the stack.
There are many advantages to dynamic memory, such as persisting between stack frames (function calls) and being able to vary in size.
On the stack an array must be of a fixed size:
int ar[5];
However, if you need 10 elements you can't get them from that array; the solution is to dynamically allocate the memory:
size_t sz;
std::cin >> sz;
int *i_p=new int[sz];
That said, everything dynamically allocated must be freed (in C++ using delete, or delete[] for arrays):
delete[] i_p;
However, it is generally better, where possible, to use wrappers around dynamic arrays such as std::vector:
size_t sz;
std::cin >> sz;
std::vector<int> vect(sz);
This will automatically manage the memory and provide a useful interface to the array.
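When the count is not known in advance at all, std::vector can also grow as values arrive, so a single pass is enough. This is a sketch under the assumption that the input is a string of whitespace-separated integers; read_ints_one_pass is a hypothetical name.

```cpp
#include <sstream>  // std::istringstream
#include <string>
#include <vector>

// Hypothetical helper: collects every int in `text`, letting the vector
// allocate and release its own storage.
std::vector<int> read_ints_one_pass(const std::string& text) {
    std::istringstream in(text);
    std::vector<int> values;
    for (int v; in >> v; ) values.push_back(v);  // grows on demand
    return values;                               // no delete[] needed anywhere
}
```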
Let's say you want to read an unknown number of integers from a user. You could, for example, declare int numbers[100], ask the user how many numbers there are (let's say this is stored in variable n), and if they enter a number larger than 100, you would have no choice but to report an error. Alternatively, you could write int *numbers = new int[n] and allocate just enough space for all the numbers.
Dynamic memory in C++ is memory allocated on the heap, which the runtime obtains from the operating system, by using the new operator. You need dynamic memory when you have to allocate objects that are too large to fit on the stack, or when you have a multithreaded environment and need to share memory allocated in one thread with other threads. Using a pointer doesn't mean that you are using dynamic memory: a pointer can also hold the address of an object on the stack.
In fact, I thought that when you use a pointer you are always using dynamic memory. Isn't it true?
No. Here's a pointer to stack-allocated ("automatic") memory:
{
int i;
int *p = &i;
}

C++ dynamically allocated memory

I don't quite get the point of dynamically allocated memory and I am hoping you guys can make things clearer for me.
First of all, every time we allocate memory we simply get a pointer to that memory.
int * dynInt = new int;
So what is the difference between doing what I did above and:
int someInt;
int* dynInt = &someInt;
As I understand, in both cases memory is allocated for an int, and we get a pointer to that memory.
So what's the difference between the two. When is one method preferred to the other.
Further more why do I need to free up memory with
delete dynInt;
in the first case, but not in the second case.
My guesses are:
When dynamically allocating memory for an object, the object doesn't get initialized, while if you do something like in the second case, the object gets initialized. If this is the only difference, is there any motivation behind this apart from the fact that dynamically allocating memory is faster?
The reason we don't need to use delete in the second case is that the fact that the object was initialized creates some kind of automatic destruction routine.
Those are just guesses; I would love it if someone corrected me and clarified things for me.
The difference is in storage duration.
Objects with automatic storage duration are your "normal" objects that automatically go out of scope at the end of the block in which they're defined.
Create them like int someInt;
You may have heard of them as "stack objects", though I object to this terminology.
Objects with dynamic storage duration have something of a "manual" lifetime; you have to destroy them yourself with delete, and create them with the keyword new.
You may have heard of them as "heap objects", though I object to this, too.
The use of pointers is actually not strictly relevant to either of them. You can have a pointer to an object of automatic storage duration (your second example), and you can have a pointer to an object of dynamic storage duration (your first example).
But it's rare that you'll want a pointer to an automatic object, because:
you don't have one "by default";
the object isn't going to last very long, so there's not a lot you can do with such a pointer.
By contrast, dynamic objects are often accessed through pointers, simply because the syntax comes close to enforcing it. new returns a pointer for you to use, you have to pass a pointer to delete, and (aside from using references) there's actually no other way to access the object. It lives "out there" in a cloud of dynamicness that's not sitting in the local scope.
Because of this, the usage of pointers is sometimes confused with the usage of dynamic storage, but in fact the former is not causally related to the latter.
An object created like this:
int foo;
has automatic storage duration - the object lives until the variable foo goes out of scope. This means that in your second example, dynInt will be an invalid pointer once someInt goes out of scope (for example, at the end of a function).
An object created like this:
int* foo = new int;
has dynamic storage duration - the object lives until you explicitly call delete on it.
Initialization of the objects is an orthogonal concept; it is not directly related to which type of storage-duration you use. See here for more information on initialization.
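The lifetime difference matters most when a value has to survive a function return. The sketch below uses a hypothetical factory named make_answer; the commented-out line shows the variant that would hand back an invalid pointer.

```cpp
// Hypothetical factory: the returned int outlives this function because it
// has dynamic storage duration. The caller becomes responsible for delete.
int* make_answer() {
    int local = 42;         // automatic: gone when the function returns
    // return &local;       // would hand back a dangling pointer
    return new int(local);  // copies the value into a dynamic int
}
```

A caller would write int* p = make_answer(); use *p, and finally delete p;.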
Your program gets an initial chunk of memory at startup. This memory is called the stack. Its size is platform-dependent; defaults of 1MB (Windows) to 8MB (many Linux systems) are common.
Your program can ask the OS for additional memory. This is called dynamic memory allocation. This allocates memory on the free store (C++ terminology) or the heap (C terminology). You can ask for as much memory as the system is willing to give (multiple gigabytes).
The syntax for allocating a variable on the stack looks like this:
{
int a; // allocate on the stack
} // automatic cleanup on scope exit
The syntax for allocating a variable using memory from the free store looks like this:
int * a = new int; // request memory from the free store for one int
delete a; // user is responsible for deleting the object
To answer your questions:
When is one method preferred to the other.
Generally stack allocation is preferred.
Dynamic allocation is required when you need to store a polymorphic object using its base type.
Always use a smart pointer to automate deletion:
C++03: boost::scoped_ptr, boost::shared_ptr or std::auto_ptr (note that std::auto_ptr was deprecated in C++11 and removed in C++17).
C++11: std::unique_ptr or std::shared_ptr.
For example:
// stack allocation (safe)
Circle c;
// heap allocation (unsafe: nothing deletes it if you forget or an exception is thrown)
Shape * raw_shape = new Circle;
delete raw_shape;
// heap allocation with smart pointers (safe)
std::unique_ptr<Shape> shape(new Circle);
Further more why do I need to free up memory in the first case, but not in the second case.
As I mentioned above stack allocated variables are automatically deallocated on scope exit.
Note that you must not delete stack memory. Doing so is undefined behaviour and will typically crash your application.
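Here is a self-contained sketch of the polymorphic case with smart pointers. The Shape, Circle and Square definitions and the corners() member are assumptions made up for this example, since the answer does not show them.

```cpp
#include <memory>  // std::unique_ptr
#include <vector>

// Hypothetical class hierarchy for the illustration.
struct Shape {
    virtual ~Shape() = default;  // virtual, so deleting through Shape* is safe
    virtual int corners() const = 0;
};
struct Circle : Shape { int corners() const override { return 0; } };
struct Square : Shape { int corners() const override { return 4; } };

// Stores objects of different dynamic types through their base type.
int total_corners() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::unique_ptr<Shape>(new Circle));
    shapes.push_back(std::unique_ptr<Shape>(new Square));

    int total = 0;
    for (const auto& s : shapes) total += s->corners();
    return total;  // every shape is deleted when `shapes` goes out of scope
}
```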
For a single integer it only makes sense if you need to keep the value after, for example, returning from a function. Had you declared someInt as you showed, it would have been invalidated as soon as it went out of scope.
However, in general there is a greater use for dynamic allocation. There are many things that your program doesn't know before allocation and depends on input. For example, your program needs to read an image file. How big is that image file? We could say we store it in an array like this:
unsigned char data[1000000];
But that would only work if the image size was less than or equal to 1000000 bytes, and would also be wasteful for smaller images. Instead, we can dynamically allocate the memory:
unsigned char* data = new unsigned char[file_size];
Here, file_size is determined at runtime. You couldn't possibly tell this value at the time of compilation.
Read more about dynamic memory allocation and also garbage collection
You really need to read a good C or C++ programming book.
Explaining in detail would take a lot of time.
The heap is the memory inside which dynamic allocation (with new in C++ or malloc in C) happens. There are system calls involved with growing and shrinking the heap. On Linux, they are mmap & munmap (used to implement malloc and new etc...).
You can call the allocation primitive as many times as you like. So you could put int *p = new int; inside a loop, and get a fresh location every time you loop!
Don't forget to release memory (with delete in C++ or free in C). Otherwise, you'll get a memory leak (a naughty kind of bug). On Linux, valgrind helps to catch them.
Whenever you use new in C++, memory is typically allocated through malloc, which in turn obtains memory from the OS with the sbrk system call (or similar). The size of each block is recorded by the allocator at run time, not by the compiler, so the compiler cannot release it for you. You therefore have to use delete (which typically calls free, which may hand memory back via sbrk again) to give the memory back. Otherwise you'll get a memory leak.
Now, when it comes to your second case, the compiler knows the size of the allocated memory, which in your case is the size of one int. Setting a pointer to the address of this int does not change what the compiler knows about the memory needed. In other words, the compiler is able to take care of freeing the memory. In the first case, with new, this is not possible.
In addition, new and malloc do not need to allocate exactly the requested size, which makes things a bit more complicated.
Edit
Two more common phrases: the second case is also known as static memory allocation (done by the compiler), while the first case is dynamic memory allocation (done by the runtime system).
What happens if your program is supposed to let the user store any number of integers? Then you'll need to decide during run-time, based on the user's input, how many ints to allocate, so this must be done dynamically.
In a nutshell, a dynamically allocated object's lifetime is controlled by you and not by the language. This allows you to let it live as long as it is required (as opposed to until the end of the scope), possibly determined by a condition that can only be calculated at run time.
Also, dynamic memory is typically much more "scalable" - i.e. you can allocate more and/or larger objects compared to stack-based allocation.
The allocation essentially "marks" a piece of memory so no other object can be allocated in the same space. De-allocation "unmarks" that piece of memory so it can be reused for later allocations. If you fail to deallocate memory after it is no longer needed, you get a condition known as a "memory leak" - your program is occupying memory it no longer needs, leading to possible failure to allocate new memory (due to the lack of free memory), and just generally putting an unnecessary strain on the system.
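One common way to make sure the "unmark" step is never forgotten is to tie it to a destructor. The following is a minimal sketch of that idea; IntBuffer, its live_buffers counter and fill_and_sum are hypothetical names, and the counter exists only to make the absence of a leak observable.

```cpp
#include <cstddef>  // std::size_t

// Hypothetical RAII wrapper: the constructor allocates, the destructor releases.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : data_(new int[n]()), size_(n) { ++live_buffers; }
    ~IntBuffer() { delete[] data_; --live_buffers; }
    IntBuffer(const IntBuffer&) = delete;             // copying would cause a double delete
    IntBuffer& operator=(const IntBuffer&) = delete;

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

    static int live_buffers;  // for illustration only: counts unreleased buffers
private:
    int* data_;
    std::size_t size_;
};
int IntBuffer::live_buffers = 0;

int fill_and_sum(std::size_t n) {
    IntBuffer buf(n);
    int sum = 0;
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf[i] = static_cast<int>(i);
        sum += buf[i];
    }
    return sum;  // buf's destructor runs here, so nothing leaks
}
```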

Best practices of dynamic vs. static memory in terms of cleanliness and speed

I have an array, called x, whose size is 6*sizeof(float). I'm aware that declaring:
float x[6];
would allocate 6*sizeof(float) for x in the stack memory. However, if I do the following:
float *x; // in class definition
x = new float[6]; // in class constructor
delete [] x; // in class destructor
I would be allocating dynamic memory of 6*sizeof(float) to x. If the size of x does not change for the lifetime of the class, in terms of best practices for cleanliness and speed (I do vaguely recall, if not correctly, that stack memory operations are faster than dynamic memory operations), should I make sure that x is statically rather than dynamically allocated memory? Thanks in advance.
Declaring the array of fixed size will surely be faster. Each separate dynamic allocation requires finding an unoccupied block and that's not very fast.
So if you really care about speed (and have profiled), the rule is: if you don't need dynamic allocation, don't use it. If you need it, think twice about how much to allocate, since reallocating is not very fast either.
Using an array member will be cleaner (more succinct, less error prone) and faster as there is no need to call allocation and deallocation functions. You will also tend to improve 'locality of reference' for the structure being allocated.
The two main reasons for using dynamically allocated memory for such a member are where the required size is only known at run time, or where the required size is large and it is known that this will have a significant impact on the available stack space on the target platform.
TBH data on the stack generally sits in the cache and hence it is faster. However if you dynamically allocate something once and then use it regularly it will also be cached and hence pretty much as fast.
The important thing is to avoid allocating and deallocating regularly (i.e. each time a function is called). If you just avoid doing regular allocations and deallocations (i.e. allocate and deallocate once only), then stack- and heap-allocated arrays will perform pretty much as quickly as each other.
Yes, declaring the array statically will perform faster.
This is very easy to test, just write a simple wrapping loop to instantiate X number of these objects. You can also step through the machine code and see the larger number of OPCODEs required to dynamically allocate the memory.
Static allocation is faster (there is no need to request memory from the allocator), and there's no way you will forget to delete it or delete it with the wrong delete operator (delete instead of delete[]).
Construction and usage of dynamic/heap data consists of the following steps:
ask for memory to allocate the objects (by calling the new operator). If there is no memory available, the new operator will throw a bad_alloc exception.
create the objects with the default constructor (also done by new)
release the memory (done by the user, with the delete/delete[] operator) - delete will call the object's destructor. Here a user can make a lot of mistakes:
forget to call delete - this will lead to a memory leak
call the wrong delete operator (e.g. delete instead of delete[]) - bad things will happen
call delete twice - bad things can happen
When using static objects/arrays of objects, the user doesn't need to allocate and release memory. This makes code simpler and less error-prone.
So, in conclusion, if you know the size of the array at compile time and you don't mind about memory (at run time you might not use all the entries in the array), a static array is obviously the preferred one.
For dynamically allocated data it is worth looking at smart pointers.
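The allocation-failure step in the list above can be handled in two ways, sketched below. The helper names try_make_ints and can_allocate are hypothetical; std::nothrow and std::bad_alloc come from the standard <new> header.

```cpp
#include <cstddef>  // std::size_t
#include <new>      // std::nothrow, std::bad_alloc

// Hypothetical helper: returns nullptr instead of throwing when memory runs out.
// On success the ints are zero-initialised and must be released with delete[].
int* try_make_ints(std::size_t n) {
    return new (std::nothrow) int[n]();
}

// Hypothetical helper: the plain form of new throws std::bad_alloc on failure,
// which can be caught like any other exception.
bool can_allocate(std::size_t n) {
    try {
        int* p = new int[n];
        delete[] p;  // array form matches new[]
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```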
Don't confuse the following cases:
int global_x[6]; // an array with static storage duration
struct Foo {
int *pointer_x; // a pointer member in instance data
int member_x[6]; // an array in instance data
Foo() {
pointer_x = new int[6]; // a heap-allocated array
}
~Foo() { delete[] pointer_x; }
};
int main() {
int auto_x[6]; // an array on the stack (automatic variable)
Foo auto_f; // a Foo on the stack
Foo *dyn_f = new Foo(); // a heap-allocated Foo.
}
Now:
auto_f.member_x is on the stack, because auto_f is on the stack.
(*dyn_f).member_x is on the heap, because *dyn_f is on the heap.
For both Foo objects, pointer_x points to a heap-allocated array.
global_x is in some data section which the OS or runtime creates each time the program is run. This may or may not be from the same heap as dynamic allocations, it doesn't usually matter.
So regardless of whether it's on the heap or not, member_x is a better bet than pointer_x in the case where the length is always 6, because:
It's less code and less error-prone.
Your object only needs a single allocation if the object is heap-allocated, instead of 2.
Your object requires no heap allocations if the object is on the stack.
It uses less memory in total, because of fewer allocations, and also because there's no need for storage for the pointer value.
Reasons to prefer pointer_x:
If you need to reallocate during the lifetime of the object.
If different objects will need a different size array (perhaps based on constructor parameters).
If Foo objects will be placed on the stack, but the array is so large that it won't fit on the stack. For instance if you've got 1MB of stack, then you can't use automatic variables which contain an int[262144].
Composition is more efficient: it is faster, has lower memory overhead and causes less memory fragmentation.
You could do something like this:
template <int SZ = 6>
class Whatever {
...
float floats[SZ];
};
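To show what composition buys, this sketch contrasts an array stored inside the object with one reached through a pointer. InlineFloats and PointerFloats are hypothetical names, and the fixed length of 6 in PointerFloats is taken from the question.

```cpp
#include <cstddef>  // std::size_t

// The array is part of the object: no separate allocation, no destructor needed.
template <std::size_t N>
struct InlineFloats {
    float values[N];
};

// The object holds only an address; the floats live in a second allocation.
struct PointerFloats {
    float* values;
    PointerFloats() : values(new float[6]()) {}
    ~PointerFloats() { delete[] values; }
    PointerFloats(const PointerFloats&) = delete;  // avoid double delete on copy
    PointerFloats& operator=(const PointerFloats&) = delete;
};

static_assert(sizeof(InlineFloats<6>) >= 6 * sizeof(float),
              "the six floats are stored inside the object");
static_assert(sizeof(PointerFloats) >= sizeof(float*),
              "the object stores the pointer, not the floats");
```

An InlineFloats<6> placed on the stack needs no heap allocation at all, while a PointerFloats always needs one for its array.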
Use the stack allocated memory whenever possible. It will save you from the headaches of deallocating the memory, fragmentation of your virtual address space etc. Also, it is faster compared to the dynamic memory allocation.
There are more variables at play here:
The size of the array vs. the size of the stack: stack sizes are quite small compared to the free store (e.g. 1MB up to 30MB). Large chunks on the stack will cause a stack overflow.
The number of arrays you need: large number of small arrays
The lifetime of the array: if it's only needed locally inside a function, the stack is very convenient. If you need it after the function has exited, you must allocate it on the heap.
Garbage collection: if you allocate it on the heap, you need to clean it up manually, or have some flavour of smart pointers do the work for you.
As mentioned in another reply, large objects can not be allocated on the stack because you are not sure what is the stack size. In interests of portability, large objects or objects with variable sizes should always be allocated on the heap.
There has been a lot of development in the malloc/new routines now provided by the operating system (for example, Solaris's libumem). Dynamic memory allocation is often not a bottleneck.
If you allocate the array statically, there will only be one instance of it. The point of using a class is that you want multiple instances. There is no need to allocate the array dynamically at all:
class A {
...
private:
float x[6];
};
is what you want.