Dynamically allocating an array of size 0 [duplicate] - c++

A simple test app:
cout << new int[0] << endl;
outputs:
0x876c0b8
So it looks like it works. What does the standard say about this? Is it always legal to "allocate" empty block of memory?

From 5.3.4/7
When the value of the expression in a direct-new-declarator is zero, the allocation function is called to allocate an array with no elements.
From 3.7.3.1/2
The effect of dereferencing a pointer returned as a request for zero size is undefined.
Also
Even if the size of the space requested [by new] is zero, the request can fail.
That means you can do it, but you can not legally (in a well defined manner across all platforms) dereference the memory that you get - you can only pass it to array delete - and you should delete it.
Here is an interesting foot-note (i.e not a normative part of the standard, but included for expository purposes) attached to the sentence from 3.7.3.1/2
[32. The intent is to have operator new() implementable by calling malloc() or calloc(), so the rules are substantially the same. C++ differs from C in requiring a zero request to return a non-null pointer.]

Yes, it is legal to allocate a zero-sized array like this. But you must also delete it.

What does the standard say about this? Is it always legal to "allocate" empty block of memory?
Every object has a unique identity, i.e. a unique address, which implies a non-zero length (the actual amount of memory will be silently increased, if you ask for zero bytes).
If you allocated more than one of these objects then you'd find they have different addresses.

Yes it is completely legal to allocate a 0 sized block with new. You simply can't do anything useful with it since there is no valid data for you to access. int[0] = 5; is illegal.
However, I believe that the standard allows for things like malloc(0) to return NULL.
You will still need to delete [] whatever pointer you get back from the allocation as well.

Curiously, C++ requires that operator new return a legitimate pointer
even when zero bytes are requested. (Requiring this odd-sounding
behavior simplifies things elsewhere in the language.)
I found Effective C++ Third Edition said like this in "Item 51: Adhere to convention when writing new and delete".

I guarantee you that new int[0] costs you extra space since I have tested it.
For example,
the memory usage of
int **arr = new int*[1000000000];
is significantly smaller than
int **arr = new int*[1000000000];
for(int i =0; i < 1000000000; i++) {
arr[i]=new int[0];
}
The memory usage of the second code snippet minus that of the first code snippet is the memory used for the numerous new int[0].

Related

How does the compiler/program deduces the size of memory to be deleted(released) in case of delete[] arr; [duplicate]

Foo* set = new Foo[100];
// ...
delete [] set;
You don't pass the array's boundaries to delete[]. But where is that information stored? Is it standardised?
When you allocate memory on the heap, your allocator will keep track of how much memory you have allocated. This is usually stored in a "head" segment just before the memory that you get allocated. That way when it's time to free the memory, the de-allocator knows exactly how much memory to free.
ONE OF THE approaches for compilers is to allocate a little more memory and to store a count of elements in a head element.
Example how it could be done:
Here
int* i = new int[4];
compiler will allocate sizeof(int)*5 bytes.
int *temp = malloc(sizeof(int)*5)
Will store "4" in the first sizeof(int) bytes
*temp = 4;
and set i
i = temp + 1;
So i will points to an array of 4 elements, not 5.
And deletion
delete[] i;
will be processed in the following way:
int *temp = i - 1;
int numbers_of_element = *temp; // = 4
... call destructor for numbers_of_element elements
... that are stored in temp + 1, temp + 2, ... temp + 4 if needed
free (temp)
The information is not standardised. However in the platforms that I have worked on this information is stored in memory just before the first element. Therefore you could theoretically access it and inspect it, however it's not worth it.
Also this is why you must use delete [] when you allocated memory with new [], as the array version of delete knows that (and where) it needs to look to free the right amount of memory - and call the appropriate number of destructors for the objects.
It's defined in the C++ standard to be compiler specific. Which means compiler magic. It can break with non-trivial alignment restrictions on at least one major platform.
You can think about possible implementations by realizing that delete[] is only defined for pointers returned by new[], which may not be the same pointer as returned by operator new[]. One implementation in the wild is to store the array count in the first int returned by operator new[], and have new[] return a pointer offset past that. (This is why non-trivial alignments can break new[].)
Keep in mind that operator new[]/operator delete[]!=new[]/delete[].
Plus, this is orthogonal to how C knows the size of memory allocated by malloc.
Basically its arranged in memory as:
[info][mem you asked for...]
Where info is the structure used by your compiler to store the amount of memory allocated, and what not.
This is implementation dependent though.
This isn't something that's in the spec -- it's implementation dependent.
Because the array to be 'deleted' should have been created with a single use of the 'new' operator. The 'new' operation should have put that information on the heap. Otherwise, how would additional uses of new know where the heap ends?
This is a more interesting problem than you might think at first. This reply is about one possible implementation.
Firstly, while at some level your system has to know how to 'free' the memory block, the underlying malloc/free (which new/delete/new[]/delete[] generally call) don't always remember exactly how much memory you ask for, it can get rounded up (for example, once you are above 4K it is often rounded up to the next 4K-sized block).
Therefore, even if could get the size of the memory block, that doesn't tell us how many values are in the new[]ed memory, as it can be smaller. Therefore, we do have to store an extra integer telling us how many values there are.
EXCEPT, if the type being constructed doesn't have a destructor, then delete[] doesn't have to do anything except free the memory block, and therefore doesn't have to store anything!
It is not standardized. In Microsoft's runtime the new operator uses malloc() and the delete operator uses free(). So, in this setting your question is equivalent to the following: How does free() know the size of the block?
There is some bookkeeping going on behind the scenes, i.e. in the C runtime.

Creating an array using new without declaring size [duplicate]

This question already has answers here:
Accessing an array out of bounds gives no error, why?
(18 answers)
Closed 3 years ago.
This has been bugging me for quite some time. I have a pointer. I declare an array of type int.
int* data;
data = new int[5];
I believe this creates an array of int with size 5. So I'll be able to store values from data[0] to data[4].
Now I create an array the same way, but without size.
int* data;
data = new int;
I am still able to store values in data[2] or data[3]. But I created an array of size 1. How is this possible?
I understand that data is a pointer pointing to the first element of the array. Though I haven't allocated memory for the next elements, I still able to access them. How?
Thanks.
Normally, there is no need to allocate an array "manually" with new. It is just much more convenient and also much safer to use std::vector<int> instead. And leave the correct implementation of dynamic memory management to the authors of the standard library.
std::vector<int> optionally provides element access with bounds checking, via the at() method.
Example:
#include <vector>
int main() {
// create resizable array of integers and resize as desired
std::vector<int> data;
data.resize(5);
// element access without bounds checking
data[3] = 10;
// optionally: element access with bounds checking
// attempts to access out-of-range elements trigger runtime exception
data.at(10) = 0;
}
The default mode in C++ is usually to allow to shoot yourself in the foot with undefined behavior as you have seen in your case.
For reference:
https://en.cppreference.com/w/cpp/container/vector
https://en.cppreference.com/w/cpp/container/vector/at
https://en.cppreference.com/w/cpp/language/ub
Undefined, unspecified and implementation-defined behavior
What are all the common undefined behaviours that a C++ programmer should know about?
Also, in the second case you don't allocate an array at all, but a single object. Note that you must use the matching delete operator too.
int main() {
// allocate and deallocate an array
int *arr = new int[5];
delete[] arr;
// allocate and deallocate a single object
int *p = new int;
delete p;
}
For reference:
https://en.cppreference.com/w/cpp/language/new
https://en.cppreference.com/w/cpp/language/delete
How does delete[] know it's an array?
When you used new int then accessing data[i] where i!=0 has undefined behaviour.
But that doesn't mean the operation will fail immediately (or every time or even ever).
On most architectures its very likely that the memory addresses just beyond the end of the block you asked for are mapped to your process and you can access them.
If you're not writing to them it's no surprise you can access them (though you shouldn't).
Even if you write to them most memory allocators have a minimum allocation and behind the scenes you may well have been allocated space for more (4 is realistic) integers even though the code only requests 1.
You may also be overwriting some area of memory but never get tripped up. A common consequence of writing beyond the end of an array is to corrupt the free-memory store itself. The consequence may be catastrophe but may only exhibit itself in a later allocation possibly of a similar sized object.
It's a dreadful idea to rely on such behaviour but it's not very surprising that it appears to work.
C++ doesn't (typically or by default) perform strict range checking and accessing invalid array elements may work or at least appear to work initially.
This is why C and C++ can be plagued with bizarre and intermittent errors. Not all code that provokes undefined behaviour fails catastrophically in every execution.
Going outside the bounds of an array in C++ is undefined behavior, so anything can happen, including things that appear to work "correctly".
In practical implementation terms on common systems, you can think of "virtual" memory as a large "flat" space from 0 up to the size of a pointer, and pointers are into this space.
The "virtual" memory for a process is mapped to physical memory, page file, etc. Now, if you access an address that is not mapped, or try to write a read-only part, you will get an error, such as an access violation or segfault.
But this mapping is done for fairly large chunks for efficiency, such as for 4KiB "pages". The allocators in a process, such as new and delete (or the stack) will further split up these pages as required. So accessing other parts of a valid page are unlikely to raise an error.
This has the unfortunate result that it can be hard to detect such out of bounds access, use after free, etc. In many cases writes will succeed, only to corrupt some other seemingly unrelated object, which may cause a crash later, or incorrect program output, so best to be very careful about C and C++ memory management.
data = new int; // will be some virtual address
data[1000] = 5; // possibly the start of a 4K page potentially allowing a great deal beyond it
other_int = new int[5];
other_int[10] = 10;
data[10000] = 42; // with further pages beyond, so you can really make a mess of your programs memory
other_int[10] == 42; // perfectly possible to overwrite other things in unexpected ways
C++ provides many tools to help, such as std::string, std::vector and std::unique_ptr, and it is generally best to try and avoid manual new and delete entirely.
new int allocates 1 integer only. If you access offsets larger than 0, e.g. data[1] you override the memory.
int * is a pointer to something that's probably an int. When you allocate using new int , you're allocating one int and storing the address to the pointer. In reality, int * is just a pointer to some memory.
We can treat an int * as a pointer to a scalar element (i.e. new int) or an array of elements -- the language has no way of telling you what your pointer is really pointing to; a very good argument to stop using pointers and only using scalar values and std::vector.
When you say a[2], you well access the memory sizeof(int) after the value pointed to by a. If a is pointing to a scalar value, anything could be after a and reading it causes undefined behaviour (your program might actually crash -- this is an actual risk). Writing to that adress will most likley cause problems; it is not merely a risk, but something you should actively guard against -- i.e. use std::vector if you need an array and int or int& if you don't.
The expression a[b], where one of the operands is a pointer, is another way to write *(a+b). Let's for the sake of sanity assume that a is the pointer here (but since addition is commutative it can be the other way around! try it!); then the address in a is incremented by b times sizeof(*a), resulting in the address of the bth object after *a.
The resulting pointer is dereferenced, resulting in a "name" for the object whose address is a+b.
Note that a does not have to be an array; if it is one, it "decays" to a pointer before the operator [] is applied. The operation is taking place on a typed pointer. If that pointer is invalid, or if the memory at a+b does not in fact hold an object of the type of *a, or even if that object is unrelated to *a (e.g., because it is not in the same array or structure), the behavior is undefined.
In the real world, "normal" programs do not do any bounds checking but simply add the offset to the pointer and access that memory location. (Accessing out-of-bounds memory is, of course, one of the more common bugs in C and C++, and one of the reasons these languages are not without restrictions recommended for high-security applications.)
If the index b is small, the memory is probably accessible by your program. For plain old data like int the most likely result is then that you simply read or write the memory in that location. This is what happened to you.
Since you overwrite unrelated data (which may in fact be used by other variables in your program) the results are often surprising in more complex programs. Such errors can be hard to find, and there are tools out there to detect such out-of-bounds access.
For larger indices you'll at some point end up in memory which is not assigned to your program, leading to an immediate crash on modern systems like Windows NT and up, and unpredictable results on architectures without memory management.
I am still able to store values in data[2] or data[3]. But I created an array of size 1. How is this possible?
The behaviour of the program is undefined.
Also, you didn't create an array of size 1, but a single non-array object instead. The difference is subtle.

zero array size not allowed in c++ [duplicate]

A simple test app:
cout << new int[0] << endl;
outputs:
0x876c0b8
So it looks like it works. What does the standard say about this? Is it always legal to "allocate" empty block of memory?
From 5.3.4/7
When the value of the expression in a direct-new-declarator is zero, the allocation function is called to allocate an array with no elements.
From 3.7.3.1/2
The effect of dereferencing a pointer returned as a request for zero size is undefined.
Also
Even if the size of the space requested [by new] is zero, the request can fail.
That means you can do it, but you can not legally (in a well defined manner across all platforms) dereference the memory that you get - you can only pass it to array delete - and you should delete it.
Here is an interesting foot-note (i.e not a normative part of the standard, but included for expository purposes) attached to the sentence from 3.7.3.1/2
[32. The intent is to have operator new() implementable by calling malloc() or calloc(), so the rules are substantially the same. C++ differs from C in requiring a zero request to return a non-null pointer.]
Yes, it is legal to allocate a zero-sized array like this. But you must also delete it.
What does the standard say about this? Is it always legal to "allocate" empty block of memory?
Every object has a unique identity, i.e. a unique address, which implies a non-zero length (the actual amount of memory will be silently increased, if you ask for zero bytes).
If you allocated more than one of these objects then you'd find they have different addresses.
Yes it is completely legal to allocate a 0 sized block with new. You simply can't do anything useful with it since there is no valid data for you to access. int[0] = 5; is illegal.
However, I believe that the standard allows for things like malloc(0) to return NULL.
You will still need to delete [] whatever pointer you get back from the allocation as well.
Curiously, C++ requires that operator new return a legitimate pointer
even when zero bytes are requested. (Requiring this odd-sounding
behavior simplifies things elsewhere in the language.)
I found Effective C++ Third Edition said like this in "Item 51: Adhere to convention when writing new and delete".
I guarantee you that new int[0] costs you extra space since I have tested it.
For example,
the memory usage of
int **arr = new int*[1000000000];
is significantly smaller than
int **arr = new int*[1000000000];
for(int i =0; i < 1000000000; i++) {
arr[i]=new int[0];
}
The memory usage of the second code snippet minus that of the first code snippet is the memory used for the numerous new int[0].

Why the allocation succeeds for size zero bytes?

This is similar to What does zero-sized array allocation do/mean?
I have following code
int *p = new int[0];
delete []p;
p gets an address and gets deleted properly.
My question is: Why allocation of zero bytes is allowed by c++ Standard in the first place?
Why doesn't it throw bad_alloc or some special exception ?
I think, It is just postponing the catastrophic failure, making programmer's life difficult. Because if size to be allocated is calculated at run time and if programmer assumes its allocated properly and tries to write something to that memory, ends up corrupting memory !!! and Crash may happen some where else in the code.
EDIT: How much memory it allocates upon zero size request ?
Why would you want it to fail? If the programmer tries to read/write to non-existent elements, then that is an error. The initial allocation is not (this is no different to e.g. int *p = new int[1]; p[1] = 5;).
3.7.3.1/2:
[32. The intent is to have operator new() implementable by calling malloc() or calloc(), so the rules are substantially the same. C++ differs from C in requiring a zero request to return a non-null pointer.]
Compare dynamically allocated array to std::vector for example. You can have a vector of size 0, so why not allow the same for the array? And it is always an error to access past the end of the array whether its size is 0 or not.
Long time ago, before using exceptions, the malloc function returned a NULL pointer if the allocation failed.
If allocating zero bytes would also return a NULL pointer, it would be hard to make the distinction between a failed allocation and a succeeding-zero-bytes allocation.
On the other hand if the allocation of zero bytes would return a non-NULL pointer, you end up with a situation in which two different allocations of zero bytes can have the same pointer.
Therefore, to keep things simple, the malloc function of zero bytes allocates 1 byte.
The same can be said for int[N] where N>0:
Because if size to be allocated is calculated at run time and if programmer assumes its allocated properly and tries to write something past end of that memory, ends up corrupting memory !!! and Crash may happen some where else in the code.
Zero sized array allocation is covered in the ISO C++ Standard under 5.3.4, paragrahp 7
When the value of the expression in a direct-new-declarator is zero, the allocation function is called to allocate an array with no elements.
This makes code that performs dnaymic array allocation easier.
In general: If someone calls a function and asks it to return an array with n (0 in your case) elements, the code shouldn't be trying to read the returned array past the n-nth element anyway.
So, I don't really see the catastrophic failure, since the code would have been faulty to begin with for any n.
As you say:
Because if size to be allocated is calculated at run time and if programmer assumes its allocated properly
The calculated size would be "0", if he tries to access more than his calculated size then, well.. I am repeating myself ;)

C++ new int[0] -- will it allocate memory?

A simple test app:
cout << new int[0] << endl;
outputs:
0x876c0b8
So it looks like it works. What does the standard say about this? Is it always legal to "allocate" empty block of memory?
From 5.3.4/7
When the value of the expression in a direct-new-declarator is zero, the allocation function is called to allocate an array with no elements.
From 3.7.3.1/2
The effect of dereferencing a pointer returned as a request for zero size is undefined.
Also
Even if the size of the space requested [by new] is zero, the request can fail.
That means you can do it, but you can not legally (in a well defined manner across all platforms) dereference the memory that you get - you can only pass it to array delete - and you should delete it.
Here is an interesting foot-note (i.e not a normative part of the standard, but included for expository purposes) attached to the sentence from 3.7.3.1/2
[32. The intent is to have operator new() implementable by calling malloc() or calloc(), so the rules are substantially the same. C++ differs from C in requiring a zero request to return a non-null pointer.]
Yes, it is legal to allocate a zero-sized array like this. But you must also delete it.
What does the standard say about this? Is it always legal to "allocate" empty block of memory?
Every object has a unique identity, i.e. a unique address, which implies a non-zero length (the actual amount of memory will be silently increased, if you ask for zero bytes).
If you allocated more than one of these objects then you'd find they have different addresses.
Yes it is completely legal to allocate a 0 sized block with new. You simply can't do anything useful with it since there is no valid data for you to access. int[0] = 5; is illegal.
However, I believe that the standard allows for things like malloc(0) to return NULL.
You will still need to delete [] whatever pointer you get back from the allocation as well.
Curiously, C++ requires that operator new return a legitimate pointer
even when zero bytes are requested. (Requiring this odd-sounding
behavior simplifies things elsewhere in the language.)
I found Effective C++ Third Edition said like this in "Item 51: Adhere to convention when writing new and delete".
I guarantee you that new int[0] costs you extra space since I have tested it.
For example,
the memory usage of
int **arr = new int*[1000000000];
is significantly smaller than
int **arr = new int*[1000000000];
for(int i =0; i < 1000000000; i++) {
arr[i]=new int[0];
}
The memory usage of the second code snippet minus that of the first code snippet is the memory used for the numerous new int[0].