In C++, what happens when the delete operator is called?

In C++, I understand that the delete operator, when used with an array, 'destroys' it, freeing the memory it used. But what actually happens when this is done?
I figured my program would just mark the relevant part of the heap as free for reuse and continue on.
But I also noticed that the first element of the array is set to null, while the other elements are left unchanged. What purpose does this serve?
#include <iostream>
using namespace std;

int main() {
    int* nums = new int[3];
    nums[0] = 1;
    nums[1] = 2;
    cout << "nums[0]: " << *nums << endl;
    cout << "nums[1]: " << *(nums + 1) << endl;
    delete[] nums;
    // Reading freed memory is undefined behavior; done here only to
    // observe what this particular implementation leaves behind.
    cout << "nums[0]: " << *nums << endl;
    cout << "nums[1]: " << *(nums + 1) << endl;
}

Two things happen when delete[] is called:
If the array elements are of a type that has a non-trivial destructor, the destructor is called for each element of the array, in reverse order
The memory occupied by the array is released
Accessing the memory that the array occupied after calling delete[] results in undefined behavior (that is, anything could happen: the data might still be there, your program might crash when you try to read it, or something else far worse might happen).
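To see the first point in action, here is a minimal illustrative sketch (not from the original question) with a type that logs its destruction:

#include <iostream>

struct Tracer {
    int id = 0;
    ~Tracer() { std::cout << "destroying element " << id << '\n'; }
};

int main() {
    Tracer* arr = new Tracer[3];
    for (int i = 0; i < 3; ++i)
        arr[i].id = i;
    delete[] arr;   // prints 2, then 1, then 0: reverse order
}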

The reasons for it being NULL are up to the heap implementation.
One possible reason is that the allocator is using the space for its free-space tracking. It might be using it as a pointer to the next free block. It might be using it to record the size of the free block. It might be writing in some serial number for new/delete debug tracking.
It could just be writing NULL because it feels like it.
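As an illustration only, here is one plausible layout a free-list allocator might overlay on a freed block. This is an assumption about a hypothetical allocator, not a description of any particular heap:

#include <cstddef>

// Hypothetical free-list node reusing the freed block's own bytes:
struct FreeBlock {
    FreeBlock*  next;  // may overlay the bytes that held nums[0]
    std::size_t size;  // may overlay nums[1] and beyond
};

If the freed block is the last one on the free list, the allocator may write a null next pointer exactly where nums[0] used to live, which would read back as 0 through the dangling int*.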

Whenever someone says int* nums = new int[3], the runtime system is required to store the number of objects, 3, in a place that can be retrieved knowing only the pointer, nums. The compiler can use any technique it wants to use, but there are two popular ones.
The code generated by nums = new int[3] might store the number 3 in a static associative array, where the pointer nums is used as the lookup key and the number 3 is the associated value. The code generated by delete[] nums would look up the pointer in the associative array, would extract the associated size_t, then would remove the entry from the associative array.
The code generated by nums = new int[3] might allocate an extra sizeof(size_t) bytes of memory (possibly plus some alignment bytes) and put the value 3 just before the first int object. Then delete[] nums would find 3 by looking at the fixed offset before the first int object (that is, before *nums) and would deallocate the memory starting at the beginning of the allocation (that is, the block of memory beginning the fixed offset before *nums).
Neither technique is perfect. Here are a few of the tradeoffs.
The associative array technique is slower but safer: if someone forgets the [] when deallocating an array of things, (a) the entry in the associative array would be a leak, and (b) only the first object in the array would be destructed. This may or may not be a serious problem, but at least it might not crash the application.
The overallocation technique is faster but more dangerous: if someone says delete nums where they should have said delete[] nums, the address that is passed to operator delete(void* nums) would not be a valid heap allocation—it would be at least sizeof(size_t) bytes after a valid heap allocation. This would probably corrupt the heap. - C++ FAQs
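For illustration, here is a minimal sketch of the over-allocation technique. The helper names are made up, and it assumes a size_t prefix satisfies the element alignment (true for int):

#include <cstdlib>
#include <new>

void* allocate_array(std::size_t count, std::size_t elem_size) {
    // Reserve room for the count plus the elements themselves.
    auto* raw = static_cast<std::size_t*>(
        std::malloc(sizeof(std::size_t) + count * elem_size));
    if (raw == nullptr) throw std::bad_alloc{};
    raw[0] = count;    // the bookkeeping word sits just before the elements
    return raw + 1;    // callers see only the element storage
}

void deallocate_array(void* p) {
    auto* raw = static_cast<std::size_t*>(p) - 1;  // step back to the count
    // std::size_t n = raw[0];  // would drive per-element destructor calls
    std::free(raw);             // free the whole block, prefix included
}

Passing p itself (rather than raw) to std::free is exactly the delete-without-brackets mistake the FAQ warns about: the address handed to the heap would be sizeof(size_t) bytes past the real allocation.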

Related

Why does reallocating memory pre-allocated with allocator::allocate using realloc preserve the old starting address?

The main problem was reallocating memory while expanding it, while preserving both the data and the original starting address, which is used by many other parts of the program (as a fixed base address).
This doesn't work with realloc, because it deallocates the previously allocated memory and returns a new starting address:
#include <cstdlib>
#include <iostream>
using namespace std;

int main() {
    int* t = static_cast<int*>(malloc(2 * sizeof(int)));
    cout << "address " << t << endl;
    t = static_cast<int*>(realloc(t, 10 * sizeof(int)));
    cout << "address " << t << endl;
}
=========================
// both of the addresses are different
address 0x55c454fc5180
address 0x55c454fc55b0
After testing many solutions (even direct access to the memory via system calls), I found this one:
allocator<int> alloc;
int* t = alloc.allocate(2 * sizeof(int));
cout << "address " << t << endl;
// reallocating memory using realloc
t = static_cast<int*>(realloc(t, 10 * sizeof(int)));
cout << "address " << t << endl;
=========================
// now the addresses are the same
address 0x55c454fc5180
address 0x55c454fc5180
I tried to work out how this is possible, but I couldn't reconcile the two behaviours, and I want to know why and how it works.
Using realloc on an address allocated with std::allocator, new or anything similar has undefined behavior. It can only be used when the address comes from the malloc/calloc/realloc family of allocation functions. Never mix them.
In general realloc does not guarantee that the address of the allocation remains unchanged. There is no guarantee that realloc will be able to expand the allocation in place (e.g. there might not be enough memory free after the current allocation). It is the defined behavior of realloc to copy the memory block to a new allocation where sufficient space is free in such a situation. This also means that realloc can only be used with trivially-copyable types in C++.
If your program depends on the address of the allocation remaining unchanged, then it can't expand the allocation. You can have one of these, not both.
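To make that concrete, here is a minimal sketch of the defined realloc pattern (malloc family only), showing why the caller must be prepared for the address to change:

#include <cstdio>
#include <cstdlib>

int main() {
    int* t = static_cast<int*>(std::malloc(2 * sizeof(int)));
    if (t == nullptr) return 1;
    // Keep the old pointer until realloc succeeds; the block may move.
    int* grown = static_cast<int*>(std::realloc(t, 10 * sizeof(int)));
    if (grown == nullptr) {
        std::free(t);   // t is still valid if realloc fails
        return 1;
    }
    t = grown;          // t may now hold a different address
    std::printf("address %p\n", static_cast<void*>(t));
    std::free(t);
}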
Also, you are using std::allocator<int> wrong. The argument to .allocate should be the number of elements of the array to allocate, not the number of bytes. Then afterwards you are supposed to call std::allocator_traits<std::allocator<int>>::construct or std::construct_at or a placement-new on each element of the array to construct and start the lifetime of the array elements.
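For reference, a minimal sketch of the full allocate/construct/destroy/deallocate protocol, assuming C++17:

#include <memory>

int main() {
    std::allocator<int> alloc;
    int* t = alloc.allocate(10);   // 10 elements, not 10 bytes
    for (int i = 0; i < 10; ++i)   // construct each element
        std::allocator_traits<std::allocator<int>>::construct(alloc, t + i, 0);
    // ... use t[0] through t[9] ...
    for (int i = 0; i < 10; ++i)   // destroy before deallocating
        std::allocator_traits<std::allocator<int>>::destroy(alloc, t + i);
    alloc.deallocate(t, 10);       // count must match the allocate call
}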
I am not sure why you are trying to use std::allocator here, but it is unlikely that you need it. If you just intend to create an array of int, use new int[n], not std::allocator. Or, better yet, don't bother with manual memory management at all and just use std::vector<int>.

C++ programming: dynamic memory is not working properly using malloc and calloc

I have just started learning C++ and I ran into a problem I couldn't find anything about on the Internet, so I hope you can help me with it.
This is my code:
#include <cstdlib>
#include <iostream>
using namespace std;

int main() {
    int* a;
    int* b;
    a = (int*)calloc(1, sizeof(int));
    b = (int*)calloc(5, sizeof(int));
    cout << sizeof(a) << endl;
    cout << sizeof(b) << endl;
}
What this prints is: 8, 8.
If I use:
cout << sizeof(*a) << endl;
cout << sizeof(*b) << endl;
It prints 4, 4.
The same thing happens with malloc.
What am I doing wrong? Why isn't the size of b 20, since b is 5 times bigger and an int is 4 bytes long?
Thanks!
sizeof(*a) and sizeof(*b) are always going to be 4 here. It seems you expect them to return the size of the arrays, but you need to understand that a and b are not arrays. They are pointers to int. If sizeof(int) is 4, then sizeof(*a) is also going to be 4, and this is already known at compile time.
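For contrast, here is a quick illustration (an assumed example; sizes are for a typical 64-bit platform where int is 4 bytes):

#include <iostream>

int main() {
    int arr[5];                       // a real array
    int* p = arr;                     // a pointer to its first element
    std::cout << sizeof(arr) << '\n'; // 20: 5 * sizeof(int)
    std::cout << sizeof(p) << '\n';   // 8: size of a pointer
    std::cout << sizeof(*p) << '\n';  // 4: size of one int
}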
With that being said, you do not need to use the C library functions malloc() and calloc() in C++. If you need manual memory allocation, use new and delete:
a = new int;
b = new int[5];
If you need zero-initialization like calloc does, just use () to value-initialize the allocated integers:
a = new int();
b = new int[5]();
Instead of free(), use delete or delete[], depending on how new was called previously:
delete a; // Note: no '[]'
delete[] b; // Needs '[]'
However, you do not need manual memory allocation here. Just use std::vector<int>:
#include <vector>
// ...
std::vector<int> a(5); // 5 int elements, zero-initialized.
std::cout << a.size() << '\n'; // Will print '5'.
As a rule of thumb, your C++ code should not have any calls to new, delete, malloc(), calloc() or free(). Doing manual memory management requires more code and is error-prone. Use containers like vector and smart pointers like shared_ptr and unique_ptr instead to reduce the chance of memory and other resource leaks. These safer types are also more convenient. With vector for example, you do not have to remember the size of your allocated memory yourself. The vector keeps track of its size for you. You can also copy vectors easily by just assigning them directly. You also don't need to delete or free() vectors manually. They are automatically deleted when they go out of scope.
As a side-note, I recommend getting rid of the habit of using endl for printing newlines. endl flushes the stream, it doesn't just print a newline. If you use it, you will be constantly flushing the output stream, which is a slow operation. You only rarely need to flush a stream, in which case you can just do so manually with << flush if the need ever arises.
You are taking the sizeof a pointer in the first case and the size of the pointed-to int in the second. *a is, for all intents and purposes, the same as a[0]. The size of a pointer is architecture-dependent, and the size of an int here is 4.
The sizeof value is evaluated at compile time, while dynamic memory allocation occurs at runtime. To find out the amount allocated at run time you could look at overloading the new operator (not recommended) or use the containers, as the comments have suggested.
sizeof(a) is the size of the pointer (normally 8 on a 64-bit architecture), while sizeof(*a) is the size of the pointed-to element (an int value). Nothing the sizeof operator returns is dynamic in nature; it cannot report, for example, the number of elements you asked calloc() for.
By the way, calloc() is strongly discouraged in C++. Its use is reserved for cases where you have to pass pointers to C code, and for legacy code. Use the operators new and new[] (the latter in this case). But none of these will change things: the sizeof operator will continue returning the values you got. If you want to know the size of the allocated array, check the arguments you passed to these operators.

C++ new[] operator creates array of length = length + 1?

Why does the new[] operator in C++ actually create an array of length + 1? For example, see this code:
#include <iostream>

int main()
{
    std::cout << "Enter a positive integer: ";
    int length;
    std::cin >> length;

    int* array = new int[length]; // use array new. Note that length does not need to be constant!
    //int *array;

    std::cout << "I just allocated an array of integers of length " << length << '\n';

    for (int n = 0; n <= length + 1; n++)
    {
        array[n] = 1; // set element n to value 1
    }

    std::cout << "array[0] " << array[0] << '\n';
    std::cout << "array[length-1] " << array[length - 1] << '\n';
    std::cout << "array[length] " << array[length] << '\n';
    std::cout << "array[length+1] " << array[length + 1] << '\n';

    delete[] array; // use array delete to deallocate array
    array = 0; // use nullptr instead of 0 in C++11
    return 0;
}
We dynamically create an array of length length, but we are able to assign a value at index length + 1. If we try length + 2, we get an error.
Why is this? Why does C++ make the length equal to length + 1?
It doesn’t. You’re allowed to calculate the address array + n, for the purpose of checking that another address is less than it. Trying to access the element array[n] is undefined behavior, which means the program becomes meaningless and the compiler is allowed to do anything whatsoever. Literally anything; one old version of GCC, if it saw a #pragma directive, started a roguelike game on the terminal. (Thanks, Revolver_Ocelot, for reminding me: that was technically implementation-defined behavior, a different category.) Even calculating the address array + n + 1 is undefined behavior.
Because it can do anything, the particular compiler you tried that on decided to let you shoot yourself in the foot. If, for example, the next two words after the array were the header of another block in the heap, you might get a memory-corruption bug. Or maybe the compiler stored the array at the top of your memory space, the address &array[n+1] is a null pointer, and trying to dereference it causes a segmentation fault. Or maybe the next page of memory is not readable or writable and trying to access it crashes the program with a protection fault. Or maybe the implementation bounds-checks your array accesses at runtime and crashes the program. Maybe the runtime stuck a canary value after the array and checks later to see if it was overwritten. Or maybe it happens, by accident, to work.
In practice, you really want the compiler to catch those bugs for you instead of trying to track down the bugs that buffer overruns cause later. It would be better to use a std::vector than a dynamic array. If you must use an array, you want to check that all your accesses are in-bounds yourself, because you cannot rely on the compiler to do that for you and skipping them is a major cause of bugs.
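As a sketch of that advice (an assumed rework of the question's program, not the original), std::vector's at() turns the silent overrun into a catchable error:

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::cout << "Enter a positive integer: ";
    int length;
    std::cin >> length;
    std::vector<int> array(length, 1);   // length elements, all set to 1
    try {
        array.at(length) = 1;            // out of bounds: throws
    } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << '\n';
    }
}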
If you write or read beyond the end of an array or other object you create with new, your program's behaviour is no longer defined by the C++ standard.
Anything can happen and the compiler and program remain standard compliant.
The most likely thing to happen in this case is that you are corrupting memory in the heap. In a small program this "seems to work" because the section of the heap you use isn't being used by any other code; in a larger one you will crash or behave randomly elsewhere, in a seemingly unrelated bit of code.
But arbitrary things could happen. The compiler could prove that a branch leads to an access beyond the end of an array and dead-code-eliminate paths that lead to it (UB that travels back in time), or it could hit a protected memory region and crash, or it could corrupt heap-management data and cause a future new/delete to crash, or nasal demons, or whatever else.
In the for loop you are assigning elements beyond the bounds of the array, and remember that C++ does not do bounds checking.
So when you initialize the array you are writing beyond its bounds. Say the user enters 3 for length: you assign 1 to array[0] through array[4], because the loop condition is n <= length + 1.
The behavior is unpredictable when you go beyond an array's bounds, but most likely your program will crash. In this case you go 2 elements beyond its bounds because you used <= in the condition together with length + 1.
There is no requirement that the new [] operator allocate more memory than requested.
What is happening is that your code is running past the end of the allocated array. It therefore has undefined behaviour.
Undefined behaviour means that the C++ standard imposes no requirements on what happens. Therefore, your implementation (compiler and standard library, in this case) will be equally correct if your program SEEMS to work properly (as it does in your case), produces a run time error, trashes your system drive, or anything else.
In practice, all that is happening is that your code is writing to memory, and later reading from that memory, past the end of the allocated memory block. What happens depends on what is actually in that memory location. In your case, whatever happens to be in that memory location is able to be modified (in the loop) or read (in order to print to std::cout).
Conclusion: the explanation is not that new[] over-allocates. It is that your code has undefined behaviour, so can seem to work anyway.

Trouble with listing elements in a pointer

I am working on a program in C++ in which the user can add phone numbers to a list. For this assignment, we have to use pointers while dynamically allocating the memory needed. The code below works fine, except that when the program lists the elements, random numbers are printed. I'm new to C++, so any hints pointing me in the right direction are greatly appreciated.
int* FirstArray = new int(size);
int* SecondArray = new int(size + 1);

if (size == 0) {
    cout << "Please enter the number which you would like to add";
    cin >> FirstArray[size];
    for (int x = 0; x <= size; x++) {
        cout << x << ". " << FirstArray[x] << endl;
    }
    for (int x = 0; x <= size; x++) {
        FirstArray[x] = SecondArray[x];
    }
    SecondArray = FirstArray;
    delete (FirstArray);
}
else {
    cout << "Please enter the number which you would like to add";
    cin >> SecondArray[size];
    for (int x = 0; x <= size; x++) {
        cout << x + 1 << ". " << SecondArray[x] << endl;
    }
}
size++;
Apart from the fact that a std::vector would really be the better choice for such an application, I think learning about pointers is a good starting point for understanding why using the std containers is better.
The whole if (size == 0) block in your code snippet is unsafe, and consequently so is the else block, because FirstArray[x] reads from memory which is not allocated for any x > 0.
So-called segmentation faults are very likely in such cases, though they may be deferred because of a debugger-friendly memory layout or for other reasons.
Besides the fact that you never really had a list, just two values referred to by two single-element allocations (or rather, plain pointers), it should now be clear why you only get random numbers from the memory the pointers point to.
A pointer in C (or C++) does not restrict access to the elements behind the first element.
This means that pointers can be used both for single values (which is exactly the same as an array with size == 1) and for arrays with more than one element.
Some more issues...
Use new int[] rather than new int(), because in this context the parentheses () are understood by the compiler as the argument list to the compiler-generated 'constructor' of the type int, which in the case of int() just sets the value. C++ consistently applies its type paradigms to primitive types as well, not only to classes. See another SO article on this topic.
Using new int[size] instead does what you want: it allocates memory for an integer array with size elements and returns a pointer to the first element.
I think you do not need a SecondArray at all. A statement like SecondArray = FirstArray does not copy the elements anyway; it copies the pointer and leaves the memory previously allocated to SecondArray behind as a memory leak.
Deleting FirstArray afterwards with delete (FirstArray) makes it even worse, because you then effectively delete FirstArray and SecondArray at once (both point to the same memory location), and any further access through SecondArray would be dangerous (segfault etc.).
Incrementing size++ at the end is likewise in vain (if I got your idea right), because the size has to be known before you allocate and access the memory, not afterwards.
Resizing the array when size changes means allocating a new, larger block, copying the elements over and deleting the old one (essentially what the std containers do internally; a sketch follows below), or switching to the ANSI C style with malloc() for the initial allocation, realloc() for resizing, memcpy() for copying and free() for deallocation. Using the C style in a C++ context is fine, as long as you never mix the two families on the same allocation. By the way, in most standard C++ implementations the new and delete operators call malloc() and free() behind the scenes.
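Here is a minimal sketch of that grow-copy-delete pattern; the function name grow is made up for illustration, and the caller must store the returned pointer:

#include <algorithm>

int* grow(int* old_data, int old_size, int new_size) {
    int* bigger = new int[new_size]{};                // new, zero-initialized block
    std::copy(old_data, old_data + old_size, bigger); // preserve the existing values
    delete[] old_data;                                // release the old block
    return bigger;                                    // caller must update its pointer
}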
At the end of the day, using std::vector<> can make life MUCH easier ;-)

What if I delete an array once in C++, but allocate it multiple times?

Suppose I have the following snippet.
#include <cstring>
#include <iostream>
using namespace std;

int main()
{
    int num;
    int* cost;
    while (cin >> num)
    {
        int sum = 0;
        if (num == 0)
            break;
        // Dynamically allocate the array and set to all zeros
        cost = new int[num];
        memset(cost, 0, num);
        for (int i = 0; i < num; i++)
        {
            cin >> cost[i];
            sum += cost[i];
        }
        cout << sum / num;
    }
    delete[] cost;
    return 0;
}
Although I could move the delete statement inside the while loop, for understanding purposes I want to know what happens with the code as it's written. Does C++ allocate different memory spaces each time I use operator new?
Does operator delete only delete the last allocated cost array?
Does C++ allocate different memory spaces each time I use operator new?
Yes.
Does operator delete only delete the last allocated cost array?
Yes.
You've lost the only pointers to the others, so they are irrevocably leaked. To avoid this problem, don't juggle pointers, but use RAII to manage dynamic resources automatically. std::vector would be perfect here (if you actually needed an array at all; your example could just keep reading and re-using a single int).
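For illustration, here is a vector-based rework of the snippet (assumed code, not the asker's): no new, no delete, no leak.

#include <iostream>
#include <vector>

int main() {
    int num;
    while (std::cin >> num && num != 0) {
        std::vector<int> cost(num);   // zero-initialized, sized per iteration
        int sum = 0;
        for (int& c : cost) {
            std::cin >> c;
            sum += c;
        }
        std::cout << sum / num << '\n';
    }   // the vector frees its memory here automatically, every iteration
}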
I strongly advise you not to use "C idioms" in a C++ program. Let the std library work for you: that's why it's there. If you want "an array (vector) of n integers," then that's what std::vector is all about, and it "comes with batteries included." You don't have to monkey-around with things such as "setting a maximum size" or "setting it to zero." You simply work with "this thing," whose inner workings you do not [have to ...] care about, knowing that it has already been thoroughly designed and tested.
Furthermore, when you do this, you're working within C++'s existing framework for memory-management. In particular, you're not doing anything "out-of-band" within your own application that the standard library doesn't know about, and which might well break it.
C++ gives you a very comprehensive library of fast, efficient, robust, well-tested functionality. Leverage it.
There is no cost array in your code. In your code cost is a pointer, not an array.
The actual arrays in your code are created by repetitive new int [num] calls. Each call to new creates a new, independent, nameless array object that lives somewhere in dynamic memory. The new array, once created by new[], is accessible through cost pointer. Since the array is nameless, that cost pointer is the only link you have that leads to that nameless array created by new[]. You have no other means to access that nameless array.
And every time you do that cost = new int [num] in your loop, you are creating a completely new, different array, breaking the link from cost to the previous array and making cost point to the new one.
Since cost was your only link to the old array, that old array becomes inaccessible. Access to it is lost forever: it becomes a memory leak.
As you correctly stated yourself, your delete[] expression only deallocates the last array, the one cost ends up pointing to in the end. Of course, this is only true if your code ever executes the cost = new int [num] line. Note that your loop might terminate without doing a single allocation, in which case you will apply delete[] to an uninitialized (garbage) pointer.
Yes. So you get a memory leak for each iteration of the loop except the last one.
When you use new, you allocate a new chunk of memory. Assigning the result of the new to a pointer just changes what this pointer points at. It doesn't automatically release the memory this pointer was referencing before (if there was any).
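A minimal sketch of the fix these answers describe (an assumed rework): release each block before the pointer is reused, and value-initialize with new int[num]() instead of memset.

#include <iostream>

int main() {
    int num;
    while (std::cin >> num && num != 0) {
        int* cost = new int[num]();   // value-initialized to all zeros
        int sum = 0;
        for (int i = 0; i < num; ++i) {
            std::cin >> cost[i];
            sum += cost[i];
        }
        std::cout << sum / num << '\n';
        delete[] cost;                // free before the pointer is reused
    }
    return 0;
}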
First off this line is wrong:
memset(cost, 0, num);
It assumes an int is only one char long. More typically it's four. You should use something like this if you want to use memset to initialise the array:
memset(cost, 0, num*sizeof(*cost));
Or better yet dump the memset and use this when you allocate the memory:
cost = new int[num]();
As others have pointed out the delete is incorrectly placed and will leak all memory allocated by its corresponding new except for the last. Move it into the loop.
Every time you allocate new memory for the array, the previously allocated memory is leaked. As a rule of thumb, you need to free memory exactly as many times as you have allocated it.