Declaring a dynamic size array - C++

If I want to declare a dynamic size array in the main function, I can do:-
int m;
cin>>m;
int *arr= new int[m];
The following cannot be done, because at compile time the compiler has to know the size of every symbol unless it is an external symbol:-
int m;
cin>>m;
int arr[m];
My questions are:
Why does the compiler have to know the size of arr in the above code? It is a local symbol which is not defined in the symbol table. At runtime, the stack takes care of it (the same way as m). Is it because the compiler has to ascertain the size of main() (a global symbol), which is equal to the size of all objects defined in it?
If I have a function:
int func(int m)
Could I define int arr[m] inside the function, or would I still have to do
int *a= new int[m]

For instance :
int MyArray[5]; // correct
or
const int ARRAY_SIZE = 6;
int MyArray[ARRAY_SIZE]; // correct
but
int ArraySize = 5;
int MyArray[ArraySize]; // incorrect
Here is also what is explained in The C++ Programming Language, by Bjarne Stroustrup :
The number of elements of the array, the array bound, must be a constant expression (§C.5). If you need variable bounds, use a vector (§3.7.1, §16.3). For example:

To answer your questions:
1) Q: Why does the compiler have to know the size of arr in the above code?
A: If you generate assembly output, you'll notice a "subtract" of some fixed value to allocate your array on the stack
2) Q: Could I define int arr[m] i ... inside the function?
A: Only as a compiler extension (GCC and Clang accept it); standard C++ does not allow it. Either way, the array becomes invalid the moment you exit the function ;)
Basically, you don't want an "array". A C++ "vector" would be a good alternative:
std::vector<A> v(5, A(2));
Here are a couple of links you might enjoy:
http://www.parashift.com/c++-faq/arrays-are-evil.html
http://blogs.msdn.com/b/ericlippert/archive/2008/09/22/arrays-considered-somewhat-harmful.aspx
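The vector suggestion above can be sketched for the runtime-size case from the question. This is only an illustration: make_sequence is a hypothetical helper name, and the fill values are arbitrary.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds an array of m ints, where m is only known at runtime.
std::vector<int> make_sequence(std::size_t m) {
    std::vector<int> arr(m);              // m zero-initialized elements on the heap
    for (std::size_t i = 0; i < arr.size(); ++i) {
        arr[i] = static_cast<int>(i);     // fill with 0, 1, 2, ...
    }
    return arr;                           // safe to return: the vector owns its storage
}
```

Unlike a local array, the returned vector stays valid after the function exits, and its memory is released automatically when the caller's copy goes out of scope.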

Related

Normal array declaration vs. dynamic array declaration

I just started learning C++. I learned the easy way of declaring arrays and now I'm confused about the usage of
int* foo = new int[n];
and how it is different from
int foo [n];
I tried testing with code but couldn't find any difference. I read from sources that using "new" requires me to manually de-allocate the memory after I don't need it anymore. In that case, there is no advantage in using "new" or dynamic memory allocation at all. Am I missing something here?
I tried running this:
#include <iostream>
int main() {
int n;
std::cout << "array size" ;
std::cin >> n ;
std::cout << n ;
int foo [n]; //line A
// int* foo = new int[n]; //line B
foo[6] = 30;
std::cout<<foo[6]<<std::endl;
}
Commenting out line B to run line A, or vice versa, gave the exact same result.
There are several ways these are different. First, let's talk about this one:
int n = 10;
int array[n];
This is not part of the ANSI C++ standard and may not be supported by all compilers. You shouldn't count on it. Imagine this code:
int n = 10;
int array[n];
n = 20;
How big is array?
Now, this is the way you can do it (but it's still problematic):
int n = 10;
int * array = new int[n];
Now, that is legal. But you have to remember later to:
delete [] array;
array = nullptr;
Now, there are two other differences. The first one allocates space on the stack. The second allocates space on the heap, and it's persistent until you delete it (give it back). So the second one could return the array from a function, but the first one can't, as it disappears when the function exits.
HOWEVER... You are strongly, strongly discouraged from doing either. You should instead use a container class. std::array needs a size that is a compile-time constant, so for a size known only at runtime use std::vector:
#include <vector>
std::vector<int> array(n);
The advantages of this:
It's standard and you can count on it
You don't have to remember to free it
It gives you range checking (through at())
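As a rough sketch of the range-checking point, the following uses std::vector (chosen here because the size is a runtime value) and its at() member. The helper name access_is_rejected is made up for illustration, and the value 30 mirrors the test program in the question.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if writing element `index` of a vector
// holding `n` ints is rejected by the bounds check in at().
bool access_is_rejected(std::size_t n, std::size_t index) {
    std::vector<int> values(n);      // runtime size; freed automatically at scope exit
    try {
        values.at(index) = 30;       // at() throws std::out_of_range for a bad index
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With a raw array or operator[], the same out-of-range write is undefined behavior and may appear to work, which is why the two variants in the question seemed to give the same result.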

Why can I declare a 2D array with both dimensions sized variable but not new one?

As the problem stated, this is doable:
#include <iostream>
int main(int argc, char *argv[])
{
unsigned short int i;
std::cin >> i;
unsigned long long int k[i][i];
}
Here I declared an array that is sized i by i, both dimensions are variables.
But not this:
#include <iostream>
int main(int argc, char *argv[])
{
unsigned short int i;
std::cin >> i;
unsigned long long int** k = new int[i][i];
delete[] k;
}
I got a compiler message telling me that
error: only the first dimension of an allocated array may have dynamic size
I am forced to do this:
#include <iostream>
int main(int argc, char *argv[])
{
unsigned short int i;
std::cin >> i;
unsigned long long int** k = new unsigned long long int*[i];
for ( unsigned short int idx = 0 ; idx < i ; ++ idx )
k[idx] = new unsigned long long int[i];
for ( unsigned short int idx = 0 ; idx < i ; ++ idx )
delete[] k[idx];
delete[] k;
}
To my understanding, new and delete are used to allocate something on the heap, not on the stack; such an allocation isn't destroyed when it goes out of scope, and is useful for passing data across functions and objects, etc.
What I don't understand is what happens when I declare k in the first example. I am told that a declared array should (and could) only have constant dimensions, and that when an array of unknown size is needed, one should always consider new & delete or vectors.
Are there any pros and cons to those two solutions that I'm not getting, or is it just what it is?
I'm using Apple's LLVM compiler by the way.
Neither form is C++ standard compliant, because the standard does not support variable-length arrays (VLAs) (interestingly, C99 does - but C is not C++). However, several compilers have an extension to support this, including your compiler:
From Clang's Manual:
Clang supports such variable length arrays in very limited circumstances for compatibility with GNU C and C99 programs:
The element type of a variable length array must be a POD ("plain old data") type, which means that it cannot have any user-declared constructors or destructors, any base classes, or any members of non-POD type. All C types are POD types.
Variable length arrays cannot be used as the type of a non-type template parameter.
But given that the extension is in place, why doesn't your second snippet work? That's because VLA only applies to automatic variables - that is, arguments or local variables. k is automatic but it's just a pointer - the array itself is defined by new int[i][i], which allocates on the heap and is decidedly not an automatic variable.
You can read more about this on the relevant GCC manual section.
I'm sure you can find an implementation of 2D array functionality easily, but you can make your own class too. The simplest way is to use std::vector to hold the data and have an index-mapping function that takes your two coordinates and returns a single index into the vector.
The client code will look a little different: instead of arr[x][y] you have arr.at(x,y), but otherwise it does the same. You do not have to fiddle with memory management as that is done by std::vector; just use v.resize(N*N) in the constructor or in a dimension-setting function.
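A minimal sketch of such a class might look like the following. The name SquareGrid and the int element type are assumptions for illustration, and the storage is sized in the constructor's initializer list rather than through resize.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class: a square n x n grid backed by a single std::vector.
class SquareGrid {
public:
    explicit SquareGrid(std::size_t n) : n_(n), data_(n * n) {}

    // Maps (x, y) to one index. vector::at() checks the flat index only,
    // so a too-large y can still land inside the storage.
    int& at(std::size_t x, std::size_t y) { return data_.at(x * n_ + y); }
    const int& at(std::size_t x, std::size_t y) const { return data_.at(x * n_ + y); }

    std::size_t size() const { return n_; }

private:
    std::size_t n_;           // side length
    std::vector<int> data_;   // n_ * n_ elements, zero-initialized
};
```

Client code would then write grid.at(1, 2) = 7 where it previously wrote arr[1][2] = 7, with no new or delete anywhere.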
Essentially what compilers generally do with two-dimensional arrays (fixed or variable) is this:
int arr[x][y] ---> int arr[x*y];
arr[2][4]= something ---> arr[2*y+4]= something;
Basically they are just a nicer notation for a one-dimensional array (on the stack). Most compilers require fixed sizes, so the compiler has an easy way of telling what the dimensions are (and thus what to multiply by). It appears your compiler can keep track of the dimensions (and multipliers) even if you use variables.
Of course you can mimic that with new[] yourself too, but it's not supported by the compiler per se.
Probably for the same reason, i.e. because it would be even harder keeping track of the dimensions, especially when moving the pointers around.
E.g. with a new-pointer you could later write:
newarr= someotherarray;
and someotherarray could be something with entirely different dimensions. If the compiler did a 2-dim -> 1-dim translation, it would have to track all possible size transitions.
With the stack-allocated arr above, this isn't necessary, because once the compiler has created it, it stays that size.
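The two-dimensional to one-dimensional translation described in this answer can be checked with a small sketch. C++ stores arrays in row-major order, so element [r][c] of an array with `cols` columns sits at offset r*cols + c. The dimensions and function names below are assumptions for illustration, and comparing addresses through std::uintptr_t relies on the usual flat address mapping of mainstream platforms.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kRows = 3;   // assumed dimensions for illustration
constexpr std::size_t kCols = 5;

// Row-major offset of element [r][c] in an array with `cols` columns.
constexpr std::size_t flat_index(std::size_t r, std::size_t c, std::size_t cols) {
    return r * cols + c;
}

// Returns true if arr[r][c] lies exactly flat_index(r, c) ints past arr[0][0].
// The caller must keep r < kRows and c < kCols.
bool matches_row_major(std::size_t r, std::size_t c) {
    int arr[kRows][kCols] = {};
    const auto base = reinterpret_cast<std::uintptr_t>(&arr[0][0]);
    const auto elem = reinterpret_cast<std::uintptr_t>(&arr[r][c]);
    return elem - base == flat_index(r, c, kCols) * sizeof(int);
}
```

The multiplier is the number of columns, which is why the compiler must know every dimension except the first in order to index the array.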

Multiple arrays in a class and XCode

I am trying to use XCode for my project and have this code in my .h:
class FileReader
{
private:
int numberOfNodes;
int startingNode;
int numberOfTerminalNodes;
int terminalNode[];
int numberOfTransitions;
int transitions[];
public:
FileReader();
~FileReader();
};
I get a "Field has incomplete type int[]" error on the terminalNode line... but not on the transitions line. What could be going on? I'm SURE that's the correct syntax?
Strictly speaking the size of an array is part of its type, and an array must have a (greater than zero) size.
There's an extension that allows an array of indeterminate size as the last element of a class. This is used to conveniently access a variable sized array as the last element of a struct.
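#include <stdlib.h>
typedef struct S {
    int size;
    int data[];   /* flexible array member: must be the last member */
} S;
S *make_s(int size) {
    /* one allocation holds the header plus `size` trailing ints */
    S *s = (S*)malloc(sizeof(S) + sizeof(int)*size);
    s->size = size;
    return s;
}
int main() {
    S *s = make_s(4);
    for (int i=0;i<s->size;++i)
        s->data[i] = i;
    free(s);
}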
This code is unfortunately not valid C++, but it is valid C (C99 or C11). If you've inherited this from some C project, you may be surprised that this works there but not in C++. But the truth of the matter is that you can't have zero-length arrays (which is what the incomplete array int transitions[] is in this context) in C++.
Use a std::vector<int> instead. Or a std::unique_ptr<int[]>.
(Or, if you're really really really fussy about not having two separate memory allocations, you can write your own wrapper class which allocates one single piece of memory and in-place constructs both the preamble and the array. But that's excessive.)
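A possible shape for the class with std::vector members is sketched below. The constructor signature and the accessor names are hypothetical, since the question does not show how the data is loaded or used.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class FileReader {
public:
    // Hypothetical constructor: takes the already-parsed data by value and moves it in.
    FileReader(int startingNode, std::vector<int> terminalNodes, std::vector<int> transitions)
        : startingNode_(startingNode),
          terminalNodes_(std::move(terminalNodes)),
          transitions_(std::move(transitions)) {}

    // The counts come from the vectors, so separate numberOf... fields are not needed.
    std::size_t numberOfTerminalNodes() const { return terminalNodes_.size(); }
    std::size_t numberOfTransitions() const { return transitions_.size(); }
    int startingNode() const { return startingNode_; }

private:
    int startingNode_;
    std::vector<int> terminalNodes_;   // replaces int terminalNode[]
    std::vector<int> transitions_;     // replaces int transitions[]
};
```

Because each vector manages its own storage, the class has a known size and needs no user-written destructor.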
The original C use would have been something like:
FileReader * p = malloc(sizeof(FileReader) + N * sizeof(int));
Then you could have used p->transitions[i], for i in [0, N).
Such a construction obviously doesn't make sense in the object model of C++ (think constructors and exceptions).
You can't put an unbound array length in a header -- there is no way for the compiler to know the class size, thus it can never be instantiated.
It's likely that the lack of an error on the transitions line is a result of how the compiler handles the first error. That is, if you comment out terminalNode, transitions should give the error.
It isn't. If you're inside a struct definition, the compiler needs to know the size of the struct, so it also needs to know the size of all its elements. Because int [] means an array of ints of any length, its size is unknown. Either use a fixed-size array (int field[128];) or a pointer that you'll use to malloc memory (int *field;).

Why can't I watch the expression a[1][1] after declaring it as a[n][n] in C++?

My code:
#include <iostream>
using namespace std;
int main() {
int n=5;
int a[n][n];
a[1][1]=5;
return 0;
}
I got this error when trying to watch the expression a[1][1] in Eclipse on line 6:
Failed to execute MI command:
-data-evaluate-expression a[1][1] Error message from debugger back end:
Cannot perform pointer math on
incomplete types, try casting to a
known type, or void *.
I guess it's returned from gdb? However, I don't know why I can't watch that value. Isn't "a" a normal multi-dimensional array?
For some odd reasons this isn't valid C++ unless you make it
const int n = 5;
Otherwise the array size is formally unknown until runtime.
C++ doesn't support variable length arrays (VLAs), so your code is not standard-conformant.
It will not compile if you compile it with g++ -pedantic. The array size must be a constant expression, but in your code it's not.
So write:
const int n=5; //now this becomes constant!
int a[n][n]; //the size should be constant expression.
Let's try the above code, as it's completely standard-conformant now.
Why not use a dynamic 2D array instead? In that case you do not have to make n constant, and you can determine the size dynamically.
int n = 5; // may also be a value read at runtime
int **arr = new int*[n]; // allocate the 1st dimension; each location will hold one array
for (int i = 0; i < n; i++)
{
arr[i] = new int[n]; // allocate one n-element array for the 2nd dimension
// and assign it to the location allocated above
}
Now you can access the array as arr[i][j].
To free it, do the reverse:
for (int i = 0; i < n; i++)
{
delete [] arr[i]; // first delete every 2nd-dimension array (arr[i])
}
delete [] arr; // then delete the array of pointers that held their addresses (arr)
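If the manual new[]/delete[] pairs are not a requirement, the same arr[i][j] access can be had from nested vectors. This is only a sketch of one alternative; make_table is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds an n x n table of zeros; n may be a runtime value.
std::vector<std::vector<int>> make_table(std::size_t n) {
    return std::vector<std::vector<int>>(n, std::vector<int>(n, 0));
}
```

Every row is released automatically when the table goes out of scope, so no cleanup loop is needed.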

Dealing with array size

I happened to ask myself a question about arrays in c++.
Well, we all know that arrays are fixed collections of something, I say fixed because it is necessary to declare array length when defining arrays.
Well, let's consider an example:
char myarray[10] = {'\0'};
int sz = sizeof(myarray); // It is supposed to be 10
Well, it is correct, 10 is the number returned by sizeof. This can be done by the compiler because it knows how much space it reserved for that variable.
Now consider what happens in this situation:
void dosome(mystruct* arr) {
int elements = sizeof(arr)/sizeof(mystruct);
for (int i = 0; i < elements; i++) {
// Do something hoping no overflow will ever occur
}
}
Nice... but I suppose it can be overflow prone. If I pass to this function an array I created in a "normal" way, everything should be fine:
mystruct array[20];
dosome(array);
No problem. But if I do this:
mystruct* array = (mystruct*)malloc(80*sizeof(mystruct));
dosome(array);
WHAT HAPPENS???????????????????
I would like to understand how sizeof behaves. This operator is evaluated at compile time, right? OK, so what happens when I use not an array but something more cumbersome, like a malloc'd block of data such as that one? Furthermore, I could reallocate it with another call to malloc and ask dosome to process that data block again. Will it work?
I could just try it, but I would like an exact answer about the behavior of sizeof.
Thank you.
It's wrong starting from the mystruct array[20] example. Because the function receives a pointer type, not an array type, it cannot deduce the number of elements in the array. You are actually getting the size of a mystruct* when you perform sizeof(arr).
You can use templates to write functions which take arrays as parameters, but the suggested way in C++ is to use vectors, if I am not wrong.
The "way" to receive arrays as parameters would be to write something like:
template <int N> void somefunction(int (&v)[N]);
EDIT corrected the function declaration. oops.
void dosome(mystruct* arr) {
int elements = sizeof(arr)/sizeof(mystruct);
for (int i = 0; i < elements; i++) {
// Do something hoping no overflow will ever occur
}
}
What type does arr have in this example? mystruct*! And its size is most likely 4 or 8. If you want to pass statically/automatically allocated arrays (not new'd) to functions while preserving the size so that your trick works, pass by REFERENCE!
template <int N>
void dosome(mystruct (& arr) [N]) {
for (int i = 0; i < N; i++) {
// Do something . No overflow will occur
}
}
Also note this
int a[20];
sizeof a; //equals 20*sizeof(int)
int* b = new int [20];
sizeof b; //equals the size of a pointer, most likely 4 or 8
sizeof is a compile-time operator, and here it computes only the size of a pointer.
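The difference between the two cases can be put side by side in a short sketch. The mystruct definition is a stand-in, since the question does not show it, and the function names are made up for illustration.

```cpp
#include <cstddef>

struct mystruct { int value; };   // stand-in definition; not shown in the question

// Deduces the element count N from a real array passed by reference.
template <std::size_t N>
std::size_t count_elements(mystruct (&)[N]) {
    return N;
}

// Takes a pointer: sizeof(arr) here is the size of the pointer itself,
// no matter how many elements it points to.
std::size_t size_seen_through_pointer(mystruct* arr) {
    return sizeof(arr);
}
```

The template version only accepts true arrays; a pointer obtained from malloc or new[] will not compile against it, so for such blocks the element count has to be passed along separately.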