Passing a 2-D array the column is mandatory - c++

When passing a 2-dimensional array to a function, we have to specify the number of columns.
For example:
void function1(int a[]) // works
{
}
void function2(int a[][4]) // works
{
}
void function3(int a[][]) // doesn't work
{
}
What could be the reason that function3 is considered an incorrect definition?
Is there a different way to define function3 so that both the row and the column count can be left blank?
Reading some replies:
Can you explain how x[n] and x[] are different? I guess the former represents a specific array position and the latter an array of unspecified size. More explanation would be deeply appreciated.

You cannot pass a 2D array without specifying the second dimension. The parameter "a" decays to a pointer to the first row, and the compiler needs to know how long each row is in order to calculate offsets (a 2D array is stored as a 1D sequence in memory). The compiler must therefore know the size of *a, which requires the second dimension to be given. You can use a vector of vectors in place of the 2D array.

With void function2(a[][4]) it knows that there are 4 elements in each row. With void function3(a[][]) it doesn't know, so it can't calculate where a[i] should be.
Use a vector, since it's C++.

C-style arrays don't work the way you think. Think of them as a block of memory, where the dimensions tell the compiler how far to offset from the original address.
int a[] is basically a pointer where every element is an int, which means a[1] is equivalent to *(a + 1), where each 1 is sizeof(int) bytes. There's no limit or end (simplistically speaking) to the a array. You could write a[999999] and the compiler won't care.
int a[][4] is similar, but now the compiler knows that each row is 4*sizeof(int) bytes. So a[2][1] is the int located 2*4 + 1 elements from the start of a.
int a[][], on the other hand, has an incomplete element type, so to the compiler a[2][1] would be 2*?? + 1 elements from the start, and who knows what ?? means.
Don't use int **a; that is a pointer to a pointer (used with an array of row pointers), which is most likely not what you want.
As some have said, with the STL, use vectors instead. It's pretty safe to use std::vector<std::vector<int> > a. You'll still be able to write a[2][1].
And while you're at it, pass by reference instead: const std::vector<std::vector<int> > &a. That way, you're not copying the whole array with each function call.
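On the question of leaving both dimensions blank: one option the replies above don't mention is to pass the array by reference and let a template deduce the dimensions. The following is only a sketch; sum_array and sum_vectors are names invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// Both dimensions are deduced from the argument. Because the array is
// passed by reference, it does not decay to a pointer.
template <std::size_t Rows, std::size_t Cols>
int sum_array(const int (&a)[Rows][Cols]) {
    int total = 0;
    for (std::size_t r = 0; r < Rows; ++r)
        for (std::size_t c = 0; c < Cols; ++c)
            total += a[r][c];
    return total;
}

// The vector-of-vectors alternative: the sizes travel with the object,
// and the const reference avoids copying the data on each call.
int sum_vectors(const std::vector<std::vector<int>>& a) {
    int total = 0;
    for (const auto& row : a)
        for (int value : row)
            total += value;
    return total;
}
```

The template version only accepts true arrays whose sizes are known at compile time; for sizes chosen at run time the vector version is the one that applies.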

How does the compiler calculate the address of a[x][y]?
Well, simply:
address_of_a+(x*SECOND_SIZE+y)
Imagine now that you want
a[2][3]
The compiler has to compute:
address_of_a+(2*SECOND_SIZE+3)
If the compiler doesn't know SECOND_SIZE, how can it compute this?
You have to give it explicitly. You are using a[2][1], a[100][13] in your code, so the compiler has to know how to compute the addresses of these objects.
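The formula can be checked against what the compiler actually does. This is a small sketch, assuming SECOND_SIZE is 4; element_offset is a hypothetical helper that measures how many ints lie between the start of the array and a[x][y].

```cpp
#include <cstddef>

constexpr std::size_t SECOND_SIZE = 4;  // number of columns (assumed for this sketch)

// Returns the distance, in elements, from the start of `a` to a[x][y].
// Byte addresses are subtracted so that no int* arithmetic crosses a row.
std::size_t element_offset(const int (*a)[SECOND_SIZE], std::size_t x, std::size_t y) {
    const char* base = reinterpret_cast<const char*>(a);
    const char* elem = reinterpret_cast<const char*>(&a[x][y]);
    return static_cast<std::size_t>(elem - base) / sizeof(int);
}
```

For an array declared as int a[3][SECOND_SIZE], element_offset(a, 2, 3) should equal 2*SECOND_SIZE + 3, which is the expression given above.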

Related

Understanding syntax of function parameter: vector<vector<int>> A[]

I'm solving a question where a function is defined as following:
vector <int> func(int a, vector<vector<int>> B[]){
// Some stuff
}
I'm confused about why the second parameter is not simply vector<vector<int>> B. Why the extra [] part in the parameter? Can someone clarify the meaning of this?
The vector, B, is populated by a similar code snippet:
vector<vector<int>> B[10];
while(num--){
cin>>u>>v>>w;
vector<int> t1,t2;
t1.push_back(v);
t1.push_back(w);
B[u].push_back(t1);
t2.push_back(u);
t2.push_back(w);
B[v].push_back(t2);
}
Just as int foo(char str[]) is a function that takes a (c-style) array of characters, so int foo(vector<vector<int>> B[]) takes an array of vectors of vectors ... of integers. This means that it's three-dimensional data, requiring 3 indices to access the elements (fundamental data type; in this case, int), like B[i][j][k] = 5. Without the extra [] in the API it'd be two-dimensional data: a vector of vectors.
Note that int foo(char str[]) is equivalent to int foo(char str[5]) which is equivalent to int foo(char * str).
In C we usually add the [] to a function declaration to imply that we expect to receive an array of those elements; while * is often used when we expect at most one element. Likewise, adding the number [5] is basically just a comment to the user of the code that they expect 5 elements, but the compiler won't enforce this. These conventions carry over to C++ when we use these c-style arrays ... which is rare.
With c-style arrays there's either going to be a maximum array size in the comments somewhere; or, more commonly, it's provided as an input. That may be what the first argument of the function is supposed to represent.
I agree with KungPhoo here that this API looks suspiciously bad. I'd expect bugs/bad performance just because the choices seem very amateurish. The c-style array means the function can't know where the end of the c-style array is - but the vectors mean that we give up some of the (niche) benefits of c-style simplicity (especially because they're nested!). It seems to be getting the worst of both worlds. But, perhaps, there may be a very niche justification for the API.
B is a static (C-style) array of 10 elements [0 .. 9]. It's not safe and this code is a terrible mess.
Better use std::array<std::vector<std::vector<int>>, 10> B; instead, and access it with B.at(i) to get index checking (operator[] on std::array is not checked).

How to pass dynamic and static 2d arrays as void pointer?

For a project using TensorFlow's C API I have to pass a void pointer (void*) to a TensorFlow function. In the examples the void* points to a 2D array, which also worked for me. However, I now have array dimensions that are too large for the stack, which is why I have to use a dynamic array or a vector.
I managed to create a dynamic array with the same entries like this:
float** normalizedInputs;
normalizedInputs = new float* [noCellsPatches];
for(int i = 0; i < noCellsPatches; ++i)
{
normalizedInputs[i] = new float[no_input_sizes];
}
for(int i=0;i<noCellsPatches;i++)
{
for(int j=0;j<no_input_sizes;j++)
{
normalizedInputs[i][j]=inVals.at(no_input_sizes*i+j);
//normalizedInputs[i][j]=(inVals.at(no_input_sizes*i+j)-inputMeanValues.at(j))/inputVarValues.at(j);
}
}
The function call needing the void* looks like this:
TF_Tensor* input_value = TF_NewTensor(TF_FLOAT,in_dims_arr,2,normalizedInputs,num_bytes_in,&Deallocator, 0);
In argument 4 you see the "normalizedInputs" array. When I run my program now, the calculated results are totally wrong. When I go back to the static array they are right again. What do I have to change?
Greets and thanks in advance!
Edit: I also noticed that the TF_Tensor* input_value holds totally different values in the two cases (for the dynamic array it has many 0 and nan entries). Is there a way to solve this by using a std::vector<std::vector<float>>?
In other words: is there any valid way to pass a contiguous dynamic 2D data structure to a function as void*?
In argument 4 you see the "normalizedInputs" array. When I run my program now, the calculated results are totally wrong.
This doesn't work because you are passing the array of pointers as if it were the data. To pass the data itself you would have to use normalizedInputs[0] or the equivalent, more explicit expression &normalizedInputs[0][0]. However, there is another, bigger problem with this code.
Since you are using new inside a loop you won't have contiguous data which TF_NewTensor expects. There are several solutions to this.
If you really need a 2d-array you can get away with two allocations. One for the pointers and one for the data. Then set the pointers into the data array appropriately.
float **normalizedInputs = new float* [noCellsPatches]; // allocate pointers
normalizedInputs[0] = new float [noCellsPatches*no_input_sizes]; // allocate data
// set pointers
for (int i = 1; i < noCellsPatches; ++i) {
normalizedInputs[i] = &normalizedInputs[i-1][no_input_sizes];
}
Then you can use normalizedInputs[i][j] as normal in C++ and the normalizedInputs[0] or &normalizedInputs[0][0] expression for your TF_NewTensor call.
Here is a mechanically simpler solution: just use a flat 1D array.
float * normalizedInputs = new float [noCellsPatches*no_input_sizes];
You access the i,j-th element by normalizedInputs[i*no_input_sizes+j] and you can use it directly in the TF_NewTensor call without worrying about any addresses.
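The same flat layout also works with a std::vector<float>, which addresses the std::vector part of the question without manual delete[]. A minimal sketch follows; consume_buffer is a stand-in for any API that takes a raw buffer (it is not a TensorFlow function), and make_matrix is likewise a made-up helper.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for an API that receives a raw buffer (hypothetical):
// reads `count` floats from `data` and returns their sum.
float consume_buffer(const void* data, std::size_t count) {
    const float* values = static_cast<const float*>(data);
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

// Builds a rows x cols matrix in one contiguous block.
// Element (i, j) lives at index i * cols + j.
std::vector<float> make_matrix(std::size_t rows, std::size_t cols) {
    std::vector<float> m(rows * cols);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            m[i * cols + j] = static_cast<float>(i * cols + j);
    return m;
}
```

Here m.data() yields a float* to contiguous storage, which converts implicitly to void*. The vector has to outlive every use of that pointer, and whether the receiving API expects to take ownership of the buffer is something to check in its own documentation.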
The C++ standard does its best to discourage programmers from using raw arrays, especially multi-dimensional ones.
From your comment, your statically declared array is declared as:
float normalizedInputs[noCellsPatches][no_input_sizes];
If noCellsPatches and no_input_sizes are both compile-time constants, you have a correct program declaring a true 2D array. If they are not constants, you are declaring a 2D variable-length array, which does not exist in the C++ standard. gcc and clang accept it as an extension, but MSVC does not.
If you want to declare a dynamic 2D array with non-constant rows and columns, and you use gcc, you can do this:
int (*arr0)[cols] = (int (*) [cols]) new int [rows*cols];
(the naive int (*arr0)[cols] = new int [rows][cols]; was rejected by my gcc 5.4.0)
It is definitely not correct C++, but it is accepted by gcc and does what is expected.
The trick is that we all know that the size of an array of n elements is n times the size of one element. A 2D array of rows rows and cols columns is then rows times the size of one row, which is cols when measured in underlying elements (here int). So we ask gcc to allocate a 1D array of the size of the 2D array and take enough liberties with the strict aliasing rule to process it as the 2D array we wanted. As previously said, it violates the strict aliasing rule and uses a VLA in C++, but gcc accepts it.
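If only the number of rows varies at run time and the column count is a compile-time constant, the "naive" form is standard C++ and needs no cast. A minimal sketch, where kCols and fill_and_sum are names made up for the example:

```cpp
#include <cstddef>

constexpr std::size_t kCols = 4;  // column count must be a constant expression

// Allocates a rows x kCols array in one contiguous block, fills it with
// 0, 1, 2, ... in row-major order, and returns the sum of all elements.
int fill_and_sum(std::size_t rows) {
    int (*grid)[kCols] = new int[rows][kCols];  // pointer to the first row
    int total = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < kCols; ++c) {
            grid[r][c] = static_cast<int>(r * kCols + c);
            total += grid[r][c];
        }
    }
    delete[] grid;  // one delete[] matches the single new[]
    return total;
}
```

When both dimensions are only known at run time, this form is not available, which is where the flat 1D array or the gcc-specific cast comes in.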

Strange parameter recuperation of an array

I am trying to understand some code that passes a multi dimension array to a function. But the prototype of this function intrigues me.
The program creates this "tab" variable:
#define N 8
float tab[N][N];
tab[0][0] = 2; tab[0][1] = 3; tab[0][2] = -1;
tab[1][0] = 3; tab[1][1] = 1; tab[1][2] = -4;
tab[2][0] = 1; tab[2][1] = 2; tab[2][2] = 3;
hello(tab);
And we have this function:
function hello(float mat[][N]) {
I don't understand why the hello function receives the tab variable with an empty [] followed by [N]. What does that change? Why not tab[][]?
The code seems to have been written by a good developer, so I don't think that N is there for no reason.
If you can explain this to me, thanks for your time!
The original code
float tab[N][N];
defines an array of N by N. I'm not going to use row or column because how the array is oriented in the program logic may have no bearing on how the array is represented in memory. Just know that it will be a block of memory N*N elements long that can be accessed with tab[0..N-1][0..N-1]. The sizes are known and constant. When you define an array, its size must be known and that size cannot change. If you do not know the size, use a std::vector or a std::vector<std::vector<YOUR TYPE HERE>>
float tab[][];
is illegal because the size of the array is unknown. The compiler has no clue how much storage to allocate to the array and cannot produce a functional (even if flawed) program.
When you pass an array to a function such as
function hello(float mat[][N])
the array decays into a pointer. More info: What is array decaying? Once the array has decayed, the size of the first dimension is lost. To safely use the array you must either already know the size of the array or provide it as another parameter. Example:
function hello(float mat[][N], size_t matLen)
In the question, the size is given as N. You know it's N and you can safely call
hello(tab);
without providing any sizing and simply use N inside the function as the bounds. N is not a devious magic number, but it could be given a more descriptive name.
You can also be totally explicit and write
function hello(float mat[N][N])
which documents that an N by N array is expected. Be aware that the compiler still adjusts this parameter to a pointer to rows of N floats, so the first N is not enforced; it is a hint to the reader rather than a check.
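How the compiler treats the first dimension of an array parameter can be checked with type traits. In this sketch the alias names are invented and a void return type is assumed, since the question does not show the real one.

```cpp
#include <type_traits>

constexpr int N = 8;

// Three spellings of the same function type: an array parameter is
// adjusted to a pointer to its element type, here a row of N floats.
using WithEmptyFirst = void(float mat[][N]);
using WithBothSizes  = void(float mat[N][N]);
using WithPointer    = void(float (*mat)[N]);

constexpr bool same_signature =
    std::is_same<WithEmptyFirst, WithBothSizes>::value &&
    std::is_same<WithEmptyFirst, WithPointer>::value;

static_assert(same_signature, "the first dimension is not part of the parameter type");
```

The second dimension, by contrast, is part of the type: replacing N with another value in one of the aliases would make the static_assert fail.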
Let me explain a little bit "untechnically", but hopefully comprehensibly:
Think of float tab[ROW][COL] as a two-dimensional array of floats, where "ROW" stands for the rows and "COL" stands for the columns, and think of the array as mapped to memory one complete row after the other, i.e.
r0c0,r0c1,r0c2
r1c0,r1c1,r1c2
r2c0,r2c1,r2c2
for ROW=3 and COL=3. Then, if the compiler would have to find out where to write tab[2][1], it would have to take 2 times the size of a row + 1 (where row size is actually the number of columns COL). So for addressing a cell, it is relevant to know the size of the row, whereas within a row one has just to add the column index. Hence, a declaration like tab[][N] is sufficient, as N defines the number of columns - i.e. the size of a row - and lets the compiler address each cell correctly.
Hope it helps somehow.

Pointer to 2D Array(why does it work)

I have the following function ( I want to print all elements from a given row)
void print_row(int j, int row_dimension, int *p)
{
p = p + (j * row_dimension);
for(int i = 0; i< row_dimension; i++)
cout<<*(p+i)<< " ";
}
Creating an array
int j[3][3]={{1,2,3},
{4,5,6},
{7,8,9} };
What I do not understand is why can I call the function in the following way :
print_row(i, 3, *j);
Why can I give as a parameter "*j" ? Shouldn't an address be passed? Why can I use the indirection operator?
int j[3][3] =
{{1,2,3},
{4,5,6},
{7,8,9}}; // 2d array
auto t1 = j; // int (*t1)[3]
auto t2 = *j; // int *t2
So what is happening is that *j produces j[0], which is an int[3] which then decays to an int*.
j is in fact an array of arrays. As such, *j is an array of three integers, and when used as an rvalue it decays to a pointer to its first element; said differently, it decays to &j[0][0].
Then in print_row you compute the starting address of the first element of each subarray. That's the less nice part, and I'll come back to it later. Then you correctly use *(p+i), the equivalent of p[i], to access each element of the subarray.
The remaining part of the answer is my interpretation of a strict reading of the C standard.
I said that computing the starting address of each subarray was the less nice part. It works because we all know that a 2D array of size NxM has the same representation in memory as a linear array of size N*M, and we alias those representations. But if we respect the standard strictly, p, as an int pointer obtained from *j, points to the first element of an array of three elements. As such, when you add the size of a row, you point past the end of that array, which leads to undefined behaviour if you later dereference this address. Of course it works with all common compilers, but on an old question of mine, @HansPassant gave me a reference to experimental compilers able to enforce checks on array sizes. Those compilers could detect the access past the end of the array and raise a run-time error... but it would break a lot of existing code!
To be strictly standard conformant, you should use a pointer to arrays of 3 integers. If the row length is not a compile-time constant, this requires the use of Variable Length Arrays, which is an optional feature but is fully standard conformant for systems supporting it. Alternatively, you can go down to the byte representation of the 2D array, get its initial address, and from there compute as byte addresses the starting point of each subarray. It is a lot of boilerplate address computation, but it fully respects the ##!%$ strict aliasing rule...
TL/DR: this code works with all common compilers, and will probably work with a lot of future versions of them, but it is not correct in a strict interpretation of the standard.
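For the 3-column array from the question, the pointer-to-array version mentioned above could be sketched as follows. format_row is a made-up name, and it returns a string instead of printing so that the result is easy to check.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Formats one row of a matrix that has 3 columns. The parameter is a
// pointer to rows (arrays of 3 ints), so m[row] selects a whole row and
// the loop never steps outside of it.
std::string format_row(const int (*m)[3], std::size_t row) {
    std::ostringstream out;
    for (std::size_t i = 0; i < 3; ++i)
        out << m[row][i] << ' ';
    return out.str();
}
```

With the array j from the question, the call is simply format_row(j, 1), because j itself decays to a pointer to its first row; no *j is needed.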
Your code works because *j is a pointer which has the same value as j or j[0]. This behavior is caused by the mechanics of how two-dimensional arrays are handled by the compiler.
When you declare 2D array:
int j[3][3]={{1,2,3},
{4,5,6},
{7,8,9}};
compiler actually puts all values sequentially in memory, so the following declaration will have the same footprint:
int j[9]={1,2,3,4,5,6,7,8,9};
So in your case the pointers j, *j and j[0] all point to the same place in memory.
Memory isn't multidimensional, so even if it's a 2D array, its data will be placed sequentially. If you get a pointer to that array (which is implicitly a pointer to its first element) and start reading the elements sequentially, you will iterate through all elements of the 2D array, reading the elements of each subsequent row just after the last element of the previous one.

2D array vs array of arrays

What is the difference between a 2D array and an array of arrays?
I have read comments, such as @Dave's, that seem to differentiate between the two.
This breaks if he's using 2d arrays, or pointer-to-array types, rather than an array of arrays. – Dave
I always thought that both referred to:
int arr_arr[][];
EDIT: @FutureReader, you may wish to see How do I use arrays in C++?
There are four different concepts here.
The two-dimensional array: int arr[][]. It cannot be resized in any direction, and is contiguous. Indexing it is the same as ((int*)arr)[y*w + x]. Must be allocated statically.
The pointer-to-array: int (*arr)[]. It can be resized only to add more rows, and is contiguous. Indexing it is the same as ((int*)arr)[y*w + x]. Must be allocated dynamically, but can be freed with free(arr);
The pointer-to-pointer: int **arr. It can be resized in any direction, and isn't necessarily square. Usually allocated dynamically, not necessarily contiguous, and freeing is dependent on its construction. Indexing is the same as *(*(arr+y)+x).
The array-of-pointers: int *arr[]. It can be resized only to add more columns, and isn't necessarily square. Resizing and freeing also depends on construction. Indexing is the same as *(*(arr+y)+x).
Every one of these can be used arr[y][x], leading to the confusion.
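The four forms can be placed side by side. In this sketch the dimensions H and W and all variable names are chosen for the example; the three pointer-based forms simply refer to the storage of the first array rather than owning memory of their own.

```cpp
#include <cstddef>

constexpr std::size_t H = 2;  // rows
constexpr std::size_t W = 3;  // columns

// Reads every element through each of the four forms using the same
// arr[y][x] syntax and returns the grand total.
int sum_through_all_four() {
    int arr2d[H][W] = {{1, 2, 3}, {4, 5, 6}};      // two-dimensional array
    int (*ptr_to_array)[W] = arr2d;                // pointer-to-array (first row)
    int* array_of_ptrs[H] = {arr2d[0], arr2d[1]};  // array-of-pointers (one per row)
    int** ptr_to_ptr = array_of_ptrs;              // pointer-to-pointer

    int total = 0;
    for (std::size_t y = 0; y < H; ++y)
        for (std::size_t x = 0; x < W; ++x)
            total += arr2d[y][x] + ptr_to_array[y][x]
                   + array_of_ptrs[y][x] + ptr_to_ptr[y][x];
    return total;
}
```

The syntax at the point of use is identical, but the first two compute y*W + x inside one block of memory, while the last two first load a row pointer and then index into whatever it points to.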
A 2 dimensional array is by definition an array of arrays.
What Dave was saying is that in that context, there are different semantics between the definition of a 2D array like this:
int x[][];
this:
int *x[];
or this:
int **x;
The answer here is a little more subtle.
An array of arrays is defined as such:
int array2[][];
The pointer-to-array types are defined as:
int (*array2)[];
The array-of-pointer types are defined as:
int* array2[];
The compiler treats both of these a little differently, and indeed there is one more option:
int** array2;
A lot of people are taught that these are identical, but if you know more about compilers you will know that the difference is small, but it is there. A lot of programs will run if you substitute one for another, but at the compiler and ASM level things are NOT the same. A textbook on C compilers should provide a much more in-depth answer.
Also, if one is interested in the implementation of a 2D array, there are multiple methods that vary in efficiency, depending on the situation. You can map a 2D array to a 1D array, which ensures spatial locality when dealing with linearized data. You can use the array of arrays if you want ease of programming, and if you need to manipulate the rows/columns separately. There are certain blocked types and other fancy designs that are cache-smart, but you rarely need to know the implementation if you are the user.
Hope I helped!
The following is a 2D array that can be called an array of arrays:
int AoA[10][10];
The following is a pointer to a pointer that has been set up to function as a 2D array:
int **P2P = malloc(10 * sizeof *P2P);
if(!P2P) exit(1);
for(size_t i = 0; i < 10; i++)
{
P2P[i] = malloc(10 * sizeof **P2P);
if(!P2P[i])
{
/* allocation failed: release the rows allocated so far, then stop */
for(; i > 0; i--)
free(P2P[i - 1]);
free(P2P);
exit(1);
}
}
Both can be accessed via AoA[x][y] or P2P[x][y], but the two are incompatible. In particular, P2P = AoA is something that newbies sometimes expect to work, but will not - P2P expects to point to pointers, but when AoA decays into a pointer, it is a pointer to an array, specifically int (*)[10], which is not the int ** that P2P is supposed to be.
2d array can include this:
int a[width * height]; // access: a[x + y * width];
From Wikipedia:
For a two-dimensional array, the element with indices i,j would have
address B + c · i + d · j, where the coefficients c and d are the row
and column address increments, respectively.