Dynamic sized array in C - c++

Here is a code snippet I was working with:
int *a;
int p = 10;
*(a+0) = 10;
*(a+1) = 11;
printf("%d\n", a[0]);
printf("%d\n", a[1]);
Now, I expect it to print
10
11
However, a window appears that says program.exe has stopped working.
The if I comment out the second line of code int p = 10; and then tun the code again it works.
Why is this happening? (What I wanted to do was create an array of dynamic size.)

There are probably at least 50 duplicates of this, but finding them may be non-trivial.
Anyway, you're defining a pointer, but no memory for it to point at. You're writing to whatever random address the pointer happened to contain at startup, producing undefined behavior.
Also, your code won't compile, because int *a, int p = 10; isn't syntactically correct -- the comma needs to become a semicolon (or you can get rid of the second int, but I wouldn't really recommend that).
In C, you probably want to use an array instead of a pointer, unless you need to allocate the space dynamically (oops, rereading, you apparently do want to -- so you need to use malloc to allocate the space, like a = malloc(2); -- but you also want to check the return value to before you use it -- at least in theory, malloc can return a null pointer). In C++, you probably want to use a std::vector instead of an array or pointer (it'll manage dynamic allocation for you).

No memory is being allocated for a, it's just an uninitialized pointer to an int (so there are two problems).
Therefore when data is stored in that location, the behavior is undefined. That means you may sometimes not even get a segmentation fault/program crash, or you may -> undefined. (Since C doesn't do any bounds checking, it won't alert you to these sort of problems. Unfortunately, one of the strength of C is also one of its major weaknesses, it will happily do what you ask of it)

You're not even allocating memory so you're accessing invalid memory...
Use malloc to allocate enough memory for your array:
int* a = (int*) malloc(sizeof(int)*arraySize);
//Now you can change the contents of the array

You will need to use malloc to assign memory that array.
If you want the size to by dynamic you will need to use realloc every time you wish to increase the size of the array without destroying the data that is already there

You have to allocate storage for the array. Use malloc if you're in C, new if it's C++, or use a C++ std::vector<int> if you really need the array size to be dynamic.

First a has not been initialized. What does it point to? Nothing, hopefully zero, but you do not know.
Then you are adding 1 to it and accessing that byte. IF a were 0, a+1 would be 1. What is in memory location 1?
Also you are bumping the address by one addressable memory unit. This may or may not be the size of an integer on that machine.

Related

In C++, is the length information of a dynamic array associated with the pointer or the address of the first element?

For example:
int *a, *b;
a = new int[10];
b = new int(12);
b = a; // I know there's memory leak, but let's ignore it first
delete [] b; // line L
What will happen? Will the entire array be deleted successfully?
What if line L is replaced by this:
b = a + 1;
delete [] b;
Or by this:
a++;
delete [] a;
And, lastly, if the length of an dynamic array is associated with the starting address, or in other words, associated with the array itself, do we have any way to get the length of it without using another variable to store the length?
Thanks a lot!
The memory block size and array length information is associated with the address of the object, which is the address of the first item of the array.
I.e. your delete[] is safe in the given code
int *a, *b;
a = new int[10];
b = new int(12);
b = a; // I know there's memory leak, but let's ignore it first
delete [] b; // line L
However, there is no portable way to 1access the associated information. It's an implementation detail. Use a std::vector if you need such info.
To understand why the info can't be accessed, note first that memory block size, needed for deallocation, can be larger than the array length times size of array item. So we're dealing here with two separate values. And for an array of POD item type, where item destructor calls are not necessary, there needs not be an explicitly stored array length value.
Even for an array where destructor calls are necessary, there needs not be an explicitly stored array length value associated with array. For example, in code of the pattern p = new int[n]; whatever(); delete[] p;, the compiler can in principle choose to put the array length in some unrelated place where it can easily be accessed by the code generated for the delete expression. E.g. it could be put in a processor register.
Depending on how smart the compiler is, there needs not necessarily be an explicitly stored memory block size either. For example, the compiler can note that in some function f, three arrays a, b and c are allocated, with their sizes known at the first allocation, and all three deallocated at the end. So the compiler can replace the three allocations with a single one, and ditto replace the three deallocations with a single one (this optimization is explicitly permitted by 2C++14 §5.3.4/10).
1 Except, in C++14 and later, in a deallocation function, which is a bit late.
2 C++14 §5.3.4/10: “An implementation is allowed to omit a call to a replaceable global allocation function (18.6.1.1, 18.6.1.2).
When it does so, the storage is instead provided by the implementation or provided by extending the
allocation of another new-expression. The implementation may extend the allocation of a new-expression e1
to provide storage for a new-expression e2 if …”
int *a, *b;
a = new int[10];
b = new int(12);
b = a; // I know there's memory leak, but let's ignore it first
delete [] b; // line L
Will the entire array be deleted successfully?
Yes, memory will successfully be freed since the pointer a was set to point to an array of 10 integers
a = new int[10];
then the same memory location address is stored into b
b = a;
therefore the delete[] line correctly deallocates the array chunk
delete [] b;
As you stated the code also leaks the integer variable 12 that was originally allocated and pointed to by b.
As a sidenote calling delete[] to free memory not associated with an array (specifically with a static type mismatch - see [expr.delete]/p3) would have triggered undefined behavior.
Now for some internals on where is the size information stored when allocating memory.
Compilers have some degree of freedom (cfr. §5.3.4/10) when it comes to memory allocation requests (e.g. they can "group" memory allocation requests).
When you call new without a supplied allocator, whether it will call malloc or not, it is implementation defined
[new.delete.single]
Executes a loop: Within the loop, the function first attempts to allocate the requested storage.
Whether the attempt involves a call to the Standard C library function malloc is unspecified.
Assuming it calls malloc() (on recent Windows systems it calls the UCRT), attempting to allocate space means doing book-keeping with paged virtual memory, and that's what malloc usually does.
Your size information gets passed along together with other data on how to allocate the memory you requested.
C libraries like glibc take care of alignment, memory guards and allocation of a new page (if necessary) so this might end up in the kernel via a system call.
There is no "standard" way to get that info back from the address only since it's an implementation detail. As many have suggested, if you were to need such an info you could
store the size somewhere
use sizeof() with an array type
even better for C++: use a std::vector<>
That info is furthermore associated with a memory allocation, both of your secondary cases (i.e. substituting the L line with
b = a + 1; delete [] b;
or
a++; delete [] a;
will end in tears since those addresses aren't associated with a valid allocation.
The c++ standard just says that using delete[] on a pointer allocated with new, and using delete on a pointer allocated with new[] is undefined behavior.
So doing this may work, or may not, depending on implementations. It may set your house on fire.
For instance, just suppose that these functions are based on an underlying buffer using malloc() and free().
new and delete will, in most implementations, use a buffer which has exactly the size and address of the item (here an int)
new[] and delete[] are more complex. They must store in the buffer not only size items, but also the value of size. An implementation could store size before the actual items. This would mean that the pointer of the underlying buffer is not the same as the pointer to the first item, which is the value returned by new[]
Mixing array and non array versions would then call free() on invalid pointers, ie pointers that were never returned by malloc. This will crash or trigger an exception
Of course, all this is implementation defined.

Is pointer to variable the same as a pointer to an array of one element, or of zero elements?

Array with size 0 Has good explanations of zero-length arrays and is certainly worthwhile and pertinent. I am not seeing it compare zero-length with single-element arrays and with pointer-to-variable.
When I asked before (Is c++ delete equivalent to delete[1]?) I did not express myself well. My question seemed to be the same or included in more general answers about new, new[], delete, and delete[]. Some understood that I was asking only about a single element. All answers in comments seemed correct and consistent.
There is a question that looks like the same as this question. But the body is about using C++ and Java together. Here, we are talking only about C++.
Checking my understanding
I will present pairs of proposed equivalent statements. The statements are declarations or instantiations of a pointer to a single variable followed by a pointer to an array of one element. Then I will state why I would think they are equivalent or not.
Are these pairs equivalent?
int one = 1;
// Sample 1. Same in the sense of pointing to an address somewhere
// whose contents equal one. Also same in the sense of not being able to
// change to point to a different address:
int * const p_int = &one;
int a_int[1] = {1};
// Sample 2.
int * p_int1 = new int;
int * a_int1 = new int[1];
// Sample 3.
delete p_int1;
delete[] a_int1;
// Sample 4. If Sample 3 is an equivalent pair, then (given Sample 2)
// we can write
delete[] p_int1;
delete a_int1;
Granted, Sample 4 is bad practice.
I am thinking: "delete" will call the destructor of the object. delete[] will call the destructor for each element of the array, and then call the destructor for the array. new in Sample 2 would malloc (so to speak) the variable. new[] would malloc an array of one element, then malloc the one element. And then that one element would be set equal to 1. So, I'm thinking THAT'S why I need to call delete[] and not delete when I have an array of even one element. Am I understanding?
And if I am understanding, then calling delete instead of delete[] to free an array of one element, then I will certainly have a memory leak. A memory leak is the specific "bad thing" that will happen.
However, what about this:
int * a_int0 = new int[0];
delete a_int0;
Would THAT result in a memory leak?
I invite corrections of my misuse of terminology and anything else.
Sample 1:
int const * p_int = &one;
int a_int[1] = {1};
NO, these are not equivalent. A pointer is not the same thing as an array. They are not equivalent for the same reason that 1 is not the same as std::vector<int>{1}: a range of one element is not the same thing as one element.
Sample 2:
int * p_int1 = new int;
int * a_int1 = new int[1];
These are sort of equivalent. You have to delete them differently, but otherwise the way you would use p_int1 and a_int1 is the same. You could treat either as a range (ending at p_int1+1 and a_int1+1, respectively).
Sample 3:
delete p_int1;
delete[] a_int1;
These are I suppose equivalent in the sense that both correctly deallocate the respective memory of the two variables.
Sample 4:
delete[] p_int1;
delete a_int1;
These are I suppose equivalent in the sense that both incorrectly deallocate the respective memory of the two variables.
int const * p_int = &one;
int a_int[1] = {1};
They are not equivalent, the first is a pointer to another variable, the other a mere array of size one, initialized straight away with the value one.
Understand this: pointers are entities in themselves, distinct from what they point to. Which is to say, the memory address of your p_int there, is entirely different from the memory address of your variable one. What's now stored in your p_int however, is the memory address of your variable one.
// Sample 2.
int * p_int1 = new int;
int * a_int1 = new int[1];
Here though, they are effectively the same thing in terms of memory allocation. In both cases you create a single int on the heap(new means heap space), and both pointers are immediately assigned the address of those heap allocated ints. The latter shouldn't be used in practice though, even though it's technically not doing anything outright wrong, it's confusing to read for humans as it conveys the notion that there is an array of objects, when there is in reality just one object, ie no arrayment has taken place.
// Sample 3.
delete p_int1;
delete[] a_int1;
Personally, I've never used the "array" delete, but yeah, what you're saying is the right idea: delete[] essentially means "call delete on every element in the array", while regular delete means "delete that one object the pointer points to".
// Sample 4. If Sample 3 is an equivalent pair, then (given Sample 2)
// we can write
delete[] p_int1;
delete a_int1;
Delete on array is undefined according to the standard, which means we don't really know what'll happen. Some compilers may be smart enough to see what you mean is a regular delete and vice versa, or they may not, in any case this is risky behavior, and as you say bad practice.
Edit: I missed your last point. In short, I don't know.
new int[0];
Basically means "allocate space for 0 objects of type int". That of course means 0 * 4, ie 0 bytes. As for memory leaks, what seems intuitive is no, as no memory has been allocated, so if the pointer goes out of scope, there is nothing on the heap anyway. I can't give a more in dept answer than that. My guess is that this is an example of undefined behavior, that doesn't have any consequences at all.
However, memory leaks happen when you do this:
void foo()
{
int* ptr = new int;
}
Notice how no delete is called as the function returns. The pointer ptr itself gets deallocated automatically, since it's located on the stack(IE automatic memory), while the int itself is not. Since heap memory isn't automatically deallocated, that tiny int will no longer be addressable, since you got no pointer to it anymore in your stack. Basically, those 4 bytes of heap memory will be marked as "in use" by the operating system, for as long as the process runs, so it won't be allocated a second time.
Edit2: I need to improve my reading comprehension, didn't notice you did delete the variable. Doesn't change much though: You never have memory leaks when you remember to delete, memory leaks arise when you forget to call delete on heap allocated objects(ie new) prior to return from the scope the pointer to heap memory was located. My example is the simplest one I could think of.
This is a syntax-only answer. I checked this in a debugger with the following code:
// Equivalence Test 1.
*p_int = 2;
p_int[0] = 3;
*a_int = 2;
a_int[0] = 3;
Because I can access and manipulate the declarations as an array or as a pointer to variable, I think the declarations are syntactically equivalent, at least approximately.
I have to apologize
(1) I did not think to define my terms clearly. And it is very hard to talk about anything without my defining my terms. I should have realized and stated that I was thinking syntax and what a typical compiler would probably do. (2) I should have thought of checking in a debugger much earlier.
I think the previous answers are correct semantically. And of course good programming practice would declare an array when an array is the meaning, etc. for a pointer to variable.
I appreciate the attention that has been given to my question. And I hope you can be accepting of my slowness to figure out what I am trying to say and ask.
I figure that the other declarations can be checked out similarly in a debugger to see whether they are syntactically equivalent.
A Compiler's generating the same assembly would show syntactic equivalence. But if the assembly generated differs, then the assembly needs to be studied to see whether each does the same thing or not.

Windows C++ Integer Array Heap Corruption

I have been working on sorting algorithms for school and have come across a strange issue. When ever I create an integer array bigger than six elements large I get breaks in "free.c" and heap corruption errors.
The code I have narrowed it down to is as follows.
#include <iostream>
using namespace std;
int main(){
int * pie = new int(7);
pie[6] = 1;
cout << pie[6];
return 0;
}
Sometimes you need to assign more than just the last value, however I can get this error on Visual Studio 2012 and 2010 on multiple computers, in Linux this code works perfectly fine however.
Is this an issue with Windows, have I been doing dynamic int arrays wrong forever or what?
Note:After running this several times, sometimes the output in VS will say something about adding a heap protection shunt which seems to resolve the test throwing the exception but still doesn't solve the issue in larger applications (and I would feel bad having to need such protection applied to my code).
Thanks!
In this case you are allocating a single integer which has the value 7 but treating it like an array of 7 elements. You need to do an actual array allocation
int* pie = new int[7];
Also wouldn't hurt to free the memory at the end of main :)
delete[] pie;
new int(7) allocates a single int, value 7. It doesn't allocate space for 7 int values.
To solve your issue, you need to use:
int* pie = new int[7];
Otherwise, you only allocate one int.
Executing the code below, will overwrite whatever is beyond the array boundaries.
int * pie = new int(7);
pie[6] = 1;
This can lead to hard-to-track random bugs. For instance if it overwrites a pointer…
More informations in: C++ Dynamic Memory allocations
Operators new and new[]
In order to request dynamic memory we use the operator new. new is followed by a data type specifier and -if a
sequence of more than one element is required- the number of these
within brackets []. It returns a pointer to the beginning of the new
block of memory allocated. Its form is:
pointer = new type pointer = new type [number_of_elements]
The first expression is used to allocate memory to contain one single element of type type. The second one is used to assign a
block (an array) of elements of type type, where number_of_elements
is an integer value representing the amount of these. For example:
int * bobby;
bobby = new int [5];
In this case, the system dynamically assigns space for five elements
of type int and returns a pointer to the first element of the
sequence, which is assigned to bobby. Therefore, now, bobby points to
a valid block of memory with space for five elements of type int.
 
Operators delete and delete[]
Since the necessity of dynamic memory is usually limited to specific
moments within a program, once it is no longer needed it should be
freed so that the memory becomes available again for other requests of
dynamic memory. This is the purpose of the operator delete, whose
format is:
delete pointer;
delete [] pointer;
The first expression should be used to delete memory allocated for a
single element, and the second one for memory allocated for arrays of
elements.
The value passed as argument to delete must be either a pointer to a
memory block previously allocated with new, or a null pointer (in the
case of a null pointer, delete produces no effect).

Is this a memory leak, if not is it guaranteed by the standard?

Related to a recent question, I wrote the following code:
int main()
{
char* x = new char[33];
int* sz = (int*)x;
sz--;
sz--;
sz--;
sz--;
int szn = *sz; //szn is 33 :)
}
I do know it's not safe and would never use it, but it brings to mind a question:
Is the following safe? Is it a memory leak?
char* allocate()
{
return new char[20];
}
int main()
{
char* x = allocate();
delete[] x;
}
If it's safe, doesn't that mean we can actually find the size of the array? Granted, not in a standard way, but is the compiler required to store information about the size of the array?
I am not using or plan on using this code. I know it is undefined behavior. I know it isn't guaranteed by anything. It's just a theoretical question!
Is the following safe?
Yes, of course that's safe. First snippet has UB however.
If it's safe, doesn't that mean we can actually find the size of the array? Granted, not in a standard way, but is the compiler required to store information about the size of the array?
Yes, generally extra data is stored before the first element. This is used to call the correct number of destructors. It's UB to access this.
required to store information about the size of the array?
No. It only requires delete[] work as expected. new int[10] could simply be a plain malloc call, which would not necessarily store the requested size 10.
This is safe, and is not a memory leak. The standards require that delete[] handle the freeing of memory by any array allocation.
If it's safe, doesn't that mean we can actually find the size of the array?
The standards don't put specific requirements on where and how the allocated size is stored. This could be discoverable as shown above, but different compilers/platforms could also use a completely different methodology. As such, it's not safe to rely on this technique to discover the size.
I know that in c, the size of any malloc on the heap resides before the pointer. The code for free relies on this. This is documented in K&R.
But you should not rely on this always being there or always being in the same position.
If you want to know the array length then I would suggest you create a class or struct to record capcity along side the actual array, and pass that around your program where you would previously just pass a char*.
int main()
{
char* x = new char[33];
int* sz = (int*)x;
sz--;
sz--;
sz--;
sz--;
int szn = *sz; //szn is 33 :)
}
This is an undefined behavior, because you access the memory location that you didn't allocate.
is the compiler required to store information about the size of the array?
No.
If it's safe, doesn't that mean we can actually find the size of the array?
You do not do anything special in the 2nd code snipet, therefore it's safe. But there are no ways to get the size of the array.
I am not sure it the delete must know the size of the array when the array is allocated with basic types (that doesn't demand a call to the destructor). In visual studio compilers, the value is stored only for user defined objects (in this case, delete[] must know the size of the array, as it must call their destructors).
Where in the memory the size is allocated is undefined (in visual studio it is in the same place of the gcc).
http://www.parashift.com/c++-faq-lite/freestore-mgmt.html#faq-16.14
There are two ways to destroy an array, depending on how it was created. In both cases the compiler is required to call a destructor for each element of the array, so the number of elements in the array must be known.
If the array is an automatic variable on the stack, the number of elements is known at compile time. The compiler can hard-code the number of elements in the code it emits for destroying the array.
If the array is dynamically allocated on the heap, there must be another mechanism for knowing the element count. That mechanism is not specified by the standard, nor is it exposed in any other fashion. I think that putting the count at an offset from the front of the array is a common implementation, but it's certainly not the only way, and the actual offset is just a private implementation detail.
Since the compiler must know how many elements are in the array, you'd think it would be possible for the standard to mandate a way of making that count available to programs. Unfortunately this is not possible because the count is only known at destruction time. Imagine that the standard included a function count_of that could access that hidden information:
MyClass array1[33];
MyClass * array2 = new MyClass[33];
cout << count_of(array1) << count_of(array2); // outputs 33 33
Foo(array1);
Foo(array2);
MyClass * not_array = new MyClass;
Foo(not_array);
void Foo(MyClass * ptr)
{
for (int i = 0; i < count_of(ptr); ++i) // how can count_of work here?
...
}
Since the pointer passed to Foo has lost all its context, there's no consistent way for the compiler to know how many elements are in the array, or even if it's an array at all.

Why use array size 1 instead of pointer?

In one C++ open source project, I see this.
struct SomeClass {
...
size_t data_length;
char data[1];
...
}
What are the advantages of doing so rather than using a pointer?
struct SomeClass {
...
size_t data_length;
char* data;
...
}
The only thing I can think of is with the size 1 array version, users aren't expected to see NULL. Is there anything else?
With this, you don't have to allocate the memory elsewhere and make the pointer point to that.
No extra memory management
Accesses to the memory will hit the memory cache (much) more likely
The trick is to allocate more memory than sizeof (SomeClass), and make a SomeClass* point to it. Then the initial memory will be used by your SomeClass object, and the remaining memory can be used by the data. That is, you can say p->data[0] but also p->data[1] and so on up until you hit the end of memory you allocated.
Points can be made that this use results in undefined behavior though, because you declared your array to only have one element, but access it as if it contained more. But real compilers do allow this with the expected meaning because C++ has no alternative syntax to formulate these means (C99 has, it's called "flexible array member" there).
This is usually a quick(and dirty?) way of avoiding multiple memory allocations and deallocations, though it's more C stylish than C++.
That is, instead of this:
struct SomeClass *foo = malloc(sizeof *foo);
foo->data = malloc(data_len);
memcpy(foo->data,data,data_len);
....
free(foo->data);
free(foo);
You do something like this:
struct SomeClass *foo = malloc(sizeof *foo + data_len);
memcpy(foo->data,data,data_len);
...
free(foo);
In addition to saving (de)allocation calls, this can also save a bit of memory as there's no space for a pointer and you could even use space that otherwise could have been struct padding.
Usually you see this as the final member of a structure. Then whoever mallocs the structure, will allocate all the data bytes consecutively in memory as one block to "follow" the structure.
So if you need 16 bytes of data, you'd allocate an instance like this:
SomeClass * pObj = malloc(sizeof(SomeClass) + (16 - 1));
Then you can access the data as if it were an array:
pObj->data[12] = 0xAB;
And you can free all the stuff with one call, of course, as well.
The data member is a single-item array by convention because older C compilers (and apparently the current C++ standard) doesn't allow a zero-sized array. Nice further discussion here: http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
They are semantically different in your example.
char data[1] is a valid array of char with one uninitialized element allocated on the stack. You could write data[0] = 'w' and your program would be correct.
char* data; simply declares a pointer that is invalid until initialized to point to a valid address.
The structure can be simply allocated as a single block of memory instead of multiple allocations that must be freed.
It actually uses less memory because it doesn't need to store the pointer itself.
There may also be performance advantages with caching due to the memory being contiguous.
The idea behind this particular thing is that the rest of data fits in memory directly after the struct. Of course, you could just do that anyway.