Pointer to 2D Array(why does it work) - c++

I have the following function ( I want to print all elements from a given row)
void print_row(int j, int row_dimension, int *p)
{
    p = p + (j * row_dimension);
    for(int i = 0; i < row_dimension; i++)
        cout << *(p+i) << " ";
}
Creating an array
int j[3][3]={{1,2,3},
{4,5,6},
{7,8,9} };
What I do not understand is why can I call the function in the following way :
print_row(i, 3, *j);
Why can I give as a parameter "*j" ? Shouldn't an address be passed? Why can I use the indirection operator?

int j[3][3] =
{{1,2,3},
{4,5,6},
{7,8,9}}; // 2d array
auto t1 = j; // int (*t1)[3]
auto t2 = *j; // int *t2
So what is happening is that *j produces j[0], which is an int[3] which then decays to an int*.
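The two steps described above (dereference, then decay) can be checked at compile time. This is a minimal sketch; the helper name first_row_as_pointer is made up for illustration and is not part of the question's code.

```cpp
#include <type_traits>

int j[3][3] = {{1, 2, 3},
               {4, 5, 6},
               {7, 8, 9}};

// *j is the first row: an lvalue of type int[3] (decltype reports it as a reference).
static_assert(std::is_same<decltype(*j), int (&)[3]>::value,
              "*j names an array of three ints");

// When that array is used as a value it decays to a pointer to its first element.
static_assert(std::is_same<std::decay<decltype(*j)>::type, int*>::value,
              "int[3] decays to int*");

// Hypothetical helper: returns the pointer that *j decays to.
int* first_row_as_pointer() {
    int* p = *j;  // array-to-pointer conversion, same address as &j[0][0]
    return p;
}
```

So print_row(i, 3, *j) does pass an address: the int* produced by the decay of the first row.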

j is in fact an array of arrays. As such, *j is an array of three integers, and when used as an rvalue it decays to a pointer to its first element. Said differently, it decays to &j[0][0].
Then in print_row you compute the starting address of the first element of each subarray. That's the less nice part, and I'll come back to it later. Then you correctly use *(p+i), the equivalent of p[i], to access each element of the subarray.
The remaining part of the answer is my interpretation of a strict reading of C standard
I said that computing the starting address of each subarray was the less nice part. It works because we all know that a 2D array of size NxM has the same representation in memory as a linear array of size N*M, and we alias those representations. But if we respect the standard strictly, the int pointer passed in (*j, that is &j[0][0]) points to the first element of an array of three elements. As such, when you add the size of a row, you point past the end of that array, which leads to undefined behaviour if you later dereference this address. Of course it works with all common compilers, but on an old question of mine, #HansPassant gave me a reference on experimental compilers able to enforce controls on array sizes. Those compilers could detect the access past the end of the array and raise a run time error... but it would break a lot of existing code!
To be strictly standard conformant, you should use a pointer to arrays of 3 integers. Because row_dimension is only known at run time here, that requires the use of Variable Length Arrays, which is an optional feature but is fully standard conformant for systems supporting it. Alternatively, you can go down to the byte representation of the 2D array, get its initial address, and from there compute as byte addresses the starting point of each subarray. It is a lot of boilerplate address computation, but it fully respects the ##!%$ strict aliasing rule...
TL/DR: this code works with all common compilers, and will probably work with a lot of future versions of them, but it is not correct in a strict interpretation of the standard.
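When the row length is a compile-time constant, C++ can express the "pointer to arrays of 3 integers" idea without VLAs. The following is a sketch under that assumption; row_to_string is a hypothetical name, and it returns a string instead of printing so that the result is easy to inspect.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Columns is deduced from the argument type, so indexing never walks a
// pointer past the end of the row it started in.
template <std::size_t Columns>
std::string row_to_string(std::size_t row, int (*rows)[Columns]) {
    std::ostringstream out;
    for (std::size_t i = 0; i < Columns; ++i)
        out << rows[row][i] << ' ';  // rows[row] is a whole int[Columns]
    return out.str();
}
```

With the array from the question, the call would be row_to_string(1, j): j decays to int (*)[3], not to int*, so no dereference is needed at the call site.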

Your code works because *j, once it decays to a pointer, holds the same address as j or j[0]. This behavior comes from the way the compiler lays out two-dimensional arrays.
When you declare 2D array:
int j[3][3]={{1,2,3},
{4,5,6},
{7,8,9}};
the compiler actually puts all values sequentially in memory, so the following declaration will have the same footprint:
int j[9]={1,2,3,4,5,6,7,8,9};
So in your case j, *j and j[0] all refer to the same place in memory; only their types differ.
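A quick way to convince yourself of this is to compare the addresses once the type information is erased. This is an illustrative sketch; grid and all_same_address are names invented for the example.

```cpp
int grid[3][3] = {{1, 2, 3},
                  {4, 5, 6},
                  {7, 8, 9}};

// Arrays contain no padding between elements, so the 2D array occupies
// exactly as many bytes as nine ints.
static_assert(sizeof(grid) == 9 * sizeof(int), "same footprint as int[9]");

// Converting to const void* drops the (different) pointer types and
// leaves only the address to compare.
bool all_same_address() {
    const void* whole      = &grid;        // int (*)[3][3]
    const void* rows       = grid;         // decays to int (*)[3]
    const void* first_row  = *grid;        // decays to int*
    const void* first_elem = &grid[0][0];  // int*
    return whole == rows && rows == first_row && first_row == first_elem;
}
```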

Memory isn't multidimensional, so even if it's a 2D array, its data will be placed sequentially. If you get a pointer to that array (which is implicitly a pointer to its first element) and start reading the elements sequentially, you will iterate through all elements of this 2D array, reading the elements of each subsequent row just after the last element of the previous one.

Related

How to convert between flat and multidimensional arrays without copying data?

I've got some data structured as a multi-dimensional array, i.e. double[][], and I need to pass it to a function that expects a single linear array of double[] along with dimensional metadata for the multi-dimensional representation.
For example, I might have a 3 x 5 multidimensional array, which I need to pass as a 15-element flat array along with height and width parameters so that the function knows it is a 3x5 array rather than a 5x3 array.
The function will then return a flat array and size metadata, which I need to use to convert the data back into a multidimensional type.
I believe the data layout in memory is exactly the same for both the flat and multi-dimensional representations; the only difference is how the indexing operations are performed. So I'd like to do the "conversion" with typecasting rather than copying the array values.
What's the most correct and readable way to typecast between multidimensional and flat arrays of the same total size?
I actually know what the dimensions of the multi-dimensional array will be at compile time. The array sizes aren't dynamic.
The most correct way has been given by #Maxim Egorushkin and #ypnos: double *flat = &multi[0][0];. It will work fine with any decent compiler. Unfortunately, using that pointer to reach the other rows is not strictly valid C++ and invokes Undefined Behaviour.
The problem is that for an array double multi[N][M]; (N and M being compile time constant expressions), &multi[0][0] is the address of the first element of an array of size M. So it is legal to do pointer arithmetic only up to M. See this other question of mine for more details.
What's the most correct and readable way to typecast between multidimensional and flat arrays of the same total size?
The address of the first array element coincides with the address of the array. You can pass around the address of the first element, no casting is necessary.
I would assume the most popular way to do it is:
double *flat = &multi[0][0];
This is how it is done in C, and you do operate with simple C arrays.
You could also have a look at std::array in your use case (dimensions known at compile time), but that one is not multi-dimensional, so if you cascade it, the contiguous layout is no longer guaranteed.
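One way to sidestep the conversion question is to keep the data flat from the start and do the 2D indexing by hand. The sketch below assumes the dimensions are compile-time constants, as stated in the question; FlatMatrix and its members are hypothetical names, not an existing library type.

```cpp
#include <array>
#include <cstddef>

// A fixed-size matrix stored as one flat array in row-major order.
template <std::size_t Rows, std::size_t Cols>
struct FlatMatrix {
    std::array<double, Rows * Cols> cells{};  // zero-initialised

    // 2D view: element (r, c) lives at flat index r * Cols + c.
    double& at(std::size_t r, std::size_t c) { return cells[r * Cols + c]; }
    const double& at(std::size_t r, std::size_t c) const { return cells[r * Cols + c]; }

    // Flat view: what a function expecting double[] plus dimensions would take.
    double* data() { return cells.data(); }
    static constexpr std::size_t rows() { return Rows; }
    static constexpr std::size_t cols() { return Cols; }
};
```

A 3 x 5 matrix would then be FlatMatrix<3, 5> m;, filled through m.at(i, j) and handed over as m.data() together with m.rows() and m.cols(). No cast is involved, and all pointer arithmetic stays inside a single 15-element array.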
You can cast to a reference to an array. This requires some fancy C++ type syntax, but in return it lets you use all the features that work on arrays, like the range-based for loop.
#include <cstddef>
#include <iostream>
using namespace std;
int main()
{
    static constexpr size_t x = 5, y = 3;
    unsigned multiArray[x][y];
    for (size_t i = 0; i != x; ++i)
        for (size_t j = 0; j != y; ++j)
            multiArray[i][j] = i * j;
    static constexpr size_t z = x * y;
    // reinterpret the first element as the start of a flat array of z elements
    unsigned (&singleArray)[z] = (unsigned (&)[z])multiArray[0][0];
    for (const unsigned value : singleArray)
        cout << value << ' ';
    cout << endl;
    return 0;
}
Take into account that this and other methods based on casts work only with real multi-dimensional arrays. If it is a pointer-to-pointer structure (like unsigned **multiArray;), it isn't allocated in one contiguous block of memory and a cast cannot get around that.

How to pass dynamic and static 2d arrays as void pointer?

For a project using Tensorflow's C API I have to pass a void pointer (void*) to a method of Tensorflow. In the examples the void* points to a 2d array, which also worked for me. However, I now have array dimensions that are too large for the stack, which is why I have to use a dynamic array or a vector.
I managed to create a dynamic array with the same entries like this:
float** normalizedInputs;
normalizedInputs = new float* [noCellsPatches];
for(int i = 0; i < noCellsPatches; ++i)
{
    normalizedInputs[i] = new float[no_input_sizes];
}
for(int i=0;i<noCellsPatches;i++)
{
    for(int j=0;j<no_input_sizes;j++)
    {
        normalizedInputs[i][j]=inVals.at(no_input_sizes*i+j);
        //normalizedInputs[i][j]=(inVals.at(no_input_sizes*i+j)-inputMeanValues.at(j))/inputVarValues.at(j);
    }
}
The function call needing the void* looks like this:
TF_Tensor* input_value = TF_NewTensor(TF_FLOAT,in_dims_arr,2,normalizedInputs,num_bytes_in,&Deallocator, 0);
In argument 4 you see the "normalizedInputs" array. When I run my program now, the calculated results are totally wrong. When I go back to the static array they are right again. What do I have to change?
Greets and thanks in advance!
Edit: I also noted that the TF_Tensor* input_value holds totally different values for both cases (for dynamic it has many 0 and nan entries). Is there a way to solve this by using a std::vector<std::vector<float>>?
Respectively: is there any valid way pass a consecutive dynamic 2d data structure to a function as void*?
In argument 4 you see the "normalizedInputs" array. When I run my program now, the calculated results are totally wrong.
The reason this doesn't work is that you are passing the array of pointers as the data. In this case you would have to use normalizedInputs[0] or the equivalent, more explicit expression &normalizedInputs[0][0]. However, there is another, bigger problem with this code.
Since you are using new inside a loop you won't have contiguous data which TF_NewTensor expects. There are several solutions to this.
If you really need a 2d-array you can get away with two allocations. One for the pointers and one for the data. Then set the pointers into the data array appropriately.
float **normalizedInputs = new float* [noCellsPatches]; // allocate pointers
normalizedInputs[0] = new float [noCellsPatches*no_input_sizes]; // allocate data
// set pointers
for (int i = 1; i < noCellsPatches; ++i) {
normalizedInputs[i] = &normalizedInputs[i-1][no_input_sizes];
}
Then you can use normalizedInputs[i][j] as normal in C++ and the normalizedInputs[0] or &normalizedInputs[0][0] expression for your TF_NewTensor call.
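Wrapped into two helper functions, the two-allocation scheme could look like the sketch below. The names make_rows and free_rows are invented for the example, the row count is assumed to be at least 1, and the data block is zero-initialised here for convenience.

```cpp
#include <cstddef>

// Allocates row_count row pointers that all point into ONE contiguous block.
// Assumes row_count >= 1.
float** make_rows(std::size_t row_count, std::size_t col_count) {
    float** rows = new float*[row_count];            // allocate pointers
    rows[0] = new float[row_count * col_count]();    // allocate data (zeroed)
    for (std::size_t i = 1; i < row_count; ++i)
        rows[i] = rows[0] + i * col_count;           // row i starts i rows into the block
    return rows;
}

// Releases both allocations: the data block first, then the pointer table.
void free_rows(float** rows) {
    delete[] rows[0];
    delete[] rows;
}
```

rows[0] is then the address of the contiguous data, which is the pointer to hand to an API that expects a flat buffer.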
Here is a mechanically simpler solution: just use a flat 1d array.
float * normalizedInputs = new float [noCellsPatches*no_input_sizes];
You access the i,j-th element by normalizedInputs[i*no_input_sizes+j] and you can use it directly in the TF_NewTensor call without worrying about any addresses.
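The same flat layout also answers the std::vector part of the question: a single std::vector<float> is contiguous, and its data() pointer converts implicitly to void*. In this sketch sum_from_raw is a hypothetical stand-in for a C-style API that receives a raw buffer and a byte count; it is not TensorFlow's function, and make_flat is likewise an illustrative name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C API that takes raw bytes (NOT a real library call).
float sum_from_raw(const void* data, std::size_t num_bytes) {
    const float* values = static_cast<const float*>(data);
    float total = 0.0f;
    for (std::size_t i = 0; i < num_bytes / sizeof(float); ++i)
        total += values[i];
    return total;
}

// Builds a rows x cols matrix in one contiguous vector, row-major.
std::vector<float> make_flat(std::size_t rows, std::size_t cols) {
    std::vector<float> flat(rows * cols);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            flat[i * cols + j] = static_cast<float>(i * 10 + j);  // sample values
    return flat;
}
```

The call site would pass flat.data() and flat.size() * sizeof(float). A std::vector<std::vector<float>>, by contrast, has the same problem as the float** version: each inner vector owns a separate allocation.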
The C++ standard does its best to discourage programmers from using raw arrays, specifically multi-dimensional ones.
From your comment, your statically declared array is declared as:
float normalizedInputs[noCellsPatches][no_input_sizes];
If noCellsPatches and no_input_sizes are both compile time constants, you have a correct program declaring a true 2D array. If they are not constants, you are declaring a 2D Variable Length Array... which does not exist in the C++ standard. Fortunately, gcc and clang allow it as an extension, but MSVC does not.
If you want to declare a dynamic 2D array with non constant rows and columns, and use gcc, you can do this:
int (*arr0)[cols] = (int (*) [cols]) new int [rows*cols];
(the naive int (*arr0)[cols] = new int [rows][cols]; was rejected by my gcc 5.4.0)
It is definitely not correct C++ but is accepted by gcc and does what is expected.
The trick is that we all know that the size of an array of n elements is n times the size of one element. A 2D array of rows rows and cols columns is then rows times the size of one row, which is cols when measured in underlying elements (here int). So we ask gcc to allocate a 1D array of the size of the 2D array and take enough liberties with the strict aliasing rule to process it as the 2D array we wanted. As previously said, it violates the strict aliasing rule and uses a VLA in C++, but gcc accepts it.

Passing a 2-D array the column is mandatory

While passing a 2-dimensional array we have to specify the column size.
e.g.:
void function1(int a[]) // works
{
}
void function2(int a[][4]) // works
{
}
void function3(int a[][]) // doesn't work
{
}
What could be the possible reasons that function3 is considered an incorrect definition?
Is there a different way to define function3 so that we can leave both the row and the column blank?
Reading some replies:
Can you explain how x[n] and x[] are different?. I guess the former represents a specific array position and the latter is unspecified array. More explanation will be deeply appreciated.
You cannot pass a 2D array without specifying the second dimension. The parameter "a" decays to a pointer to a row, and the compiler needs to know how long each row is to calculate the offsets (because a 2D array is stored as 1D in memory). Therefore, the compiler must know the size of *a, which requires that the second dimension be given. You can use a vector of vectors to replace the 2D array.
with void function2(a[][4]) it knows that there are 4 elements in each row. With void function3(a[][]) it doesn't know, so it can't calculate where a[i] should be.
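If the goal is simply not to write the sizes by hand, one option in C++ is to let the compiler deduce both dimensions from the argument by taking a reference to the array. This is a sketch with an invented function name (sum_all); it only works when the caller has a real 2D array, not a pointer.

```cpp
#include <cstddef>

// Rows and Cols are deduced from the array that is passed in,
// so the compiler still knows the row length when it computes a[r][c].
template <std::size_t Rows, std::size_t Cols>
int sum_all(int (&a)[Rows][Cols]) {
    int total = 0;
    for (std::size_t r = 0; r < Rows; ++r)
        for (std::size_t c = 0; c < Cols; ++c)
            total += a[r][c];
    return total;
}
```

Given int m[2][4], the call is just sum_all(m); a separate function is instantiated for each distinct pair of dimensions.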
Use a vector, since it's c++
C style arrays don't work the way you think. Think of them as a block of memory, and the dimensions tell the compiler how far to offset from the original address.
int a[] as a parameter is basically a pointer to int, which means a[1] is equivalent to *(a + 1), where each step of 1 advances sizeof(int) bytes. There's no limit or end (simplistically speaking) of the a array. You could use a[999999] and the compiler won't care.
int a[][4] is similar, but now the compiler knows that each row is 4*sizeof(int) bytes. So a[2][1] is the int at offset 2*4 + 1 from the start, i.e. *((int*)a + 2*4 + 1).
int a[][], on the other hand, is an incomplete type, so to the compiler a[2][1] is *((int*)a + 2*?? + 1), and who knows what ?? means.
Don't use int **a; that means a pointer to pointers, which is most likely not what you want.
As some have said, with STL, use vectors instead. It's pretty safe to use std::vector<std::vector<int> > a. You'll still be able to get a[2][1].
And while you're at it, use references instead, const std::vector<std::vector<int> > &a. That way, you're not copying the whole array with each function call.
How does the compiler calculate the address of a[x][y]?
Simply (counting in elements):
address_of_a+(x*SECOND_SIZE+y)
Imagine now that you want
a[2][3]
The compiler has to compute:
address_of_a+(2*SECOND_SIZE+3)
If the compiler doesn't know SECOND_SIZE, how can it compute this?
You have to give it explicitly. You are using a[2][1], a[100][13] in your code, so the compiler has to know how to compute the addresses of these objects.
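The formula can be compared against what the compiler actually does. In this sketch the array a, the constant SECOND_SIZE = 4 and the two function names are assumptions made for the example; offsets are measured in bytes, so the element count from the formula is multiplied by sizeof(int).

```cpp
#include <cstddef>

const std::size_t SECOND_SIZE = 4;
int a[3][SECOND_SIZE];

// Distance in bytes between a[x][y] and the start of the array,
// taken from the addresses the compiler generates.
std::ptrdiff_t measured_offset(std::size_t x, std::size_t y) {
    return reinterpret_cast<const char*>(&a[x][y]) -
           reinterpret_cast<const char*>(&a[0][0]);
}

// The same distance from the formula: (x * SECOND_SIZE + y) elements.
std::ptrdiff_t computed_offset(std::size_t x, std::size_t y) {
    return static_cast<std::ptrdiff_t>((x * SECOND_SIZE + y) * sizeof(int));
}
```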

2D array vs array of arrays

What is the difference between a 2D array and an array of arrays?
I have read comments, such as #Dave's, that seem to differentiate between the two.
This breaks if he's using 2d arrays, or pointer-to-array types, rather than an array of arrays. – Dave
I always thought that both referred to:
int arr_arr[][];
EDIT: #FutureReader, you may wish to see How do I use arrays in C++?
There are four different concepts here.
The two-dimensional array: int arr[][]. It cannot be resized in any direction, and is contiguous. Indexing it is the same as ((int*)arr)[y*w + x]. Must be allocated statically.
The pointer-to-array: int (*arr)[]. It can be resized only to add more rows, and is contiguous. Indexing it is the same as ((int*)arr)[y*w + x]. Must be allocated dynamically, but can be freed with a single free(arr);
The pointer-to-pointer: int **arr. It can be resized in any direction, and isn't necessarily square. Usually allocated dynamically, not necessarily contiguous, and freeing is dependent on its construction. Indexing is the same as *(*(arr+y)+x).
The array-of-pointers: int *arr[]. It can be resized only to add more columns, and isn't necessarily square. Resizing and freeing also depends on construction. Indexing is the same as *(*(arr+y)+x).
Every one of these can be used arr[y][x], leading to the confusion.
A 2 dimensional array is by definition an array of arrays.
What Dave was saying is that in that context, there are different semantics between the definition of a 2D array like this:
int x[][];
this:
int *x[];
or this:
int **x;
The answer here is a little more subtle.
An array of arrays is defined as such:
int array2[][];
The pointer-to-array types are defined as:
int (*array2)[];
The array-of-pointer types are defined as:
int* array2[];
The compiler treats both of these a little differently, and indeed there is one more option:
int** array2;
A lot of people are taught that these three are identical, but if you know more about compilers you will surely know that difference is small, but it is there. A lot of programs will run if you substitute one for another, but at the compiler and ASM level things are NOT the same. A textbook on C compilers should provide a much more in depth answer.
Also, if one is interested in the implementation of a 2D array, there are multiple methods that vary in efficiency depending on the situation. You can map a 2D array to a 1D array, which ensures spatial locality when dealing with linearized data. You can use the array of arrays if you want ease of programming, or if you need to manipulate the rows/columns separately. There are certain blocked types and other fancy designs that are cache-smart, but you rarely need to know the implementation if you are just the user.
Hope I helped!
The following is a 2D array that can be called an array of arrays:
int AoA[10][10];
The following is a pointer to a pointer that has been set up to function as a 2D array:
int **P2P = malloc(10 * sizeof *P2P);
if(!P2P) exit(1);
for(size_t i = 0; i < 10; i++)
{
    P2P[i] = malloc(10 * sizeof **P2P);
    if(!P2P[i])
    {
        /* allocation failed: release the rows allocated so far, then stop */
        for(; i > 0; i--)
            free(P2P[i - 1]);
        free(P2P);
        exit(1);
    }
}
Both can be accessed via AoA[x][y] or P2P[x][y], but the two are incompatible. In particular, P2P = AoA is something that newbies sometimes expect to work, but will not - P2P expects to point to pointers, but when AoA decays into a pointer, it is a pointer to an array, specifically int (*)[10], which is not the int ** that P2P is supposed to be.
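The incompatibility can be stated as compile-time checks. This is a small sketch written in C++ (the snippet above is C); Row and rows_of_AoA are names introduced only for the illustration.

```cpp
#include <type_traits>

int AoA[10][10];

// A 2D array decays to a pointer to its first ROW, not to a pointer to pointer.
static_assert(std::is_same<std::decay<decltype(AoA)>::type, int (*)[10]>::value,
              "AoA decays to int (*)[10]");

// There is no implicit conversion from that type to int**,
// which is why P2P = AoA does not compile.
static_assert(!std::is_convertible<int (*)[10], int**>::value,
              "int (*)[10] is not convertible to int**");

using Row = int[10];

// The correct pointer type for "AoA as a pointer".
Row* rows_of_AoA() { return AoA; }
```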
A 2D array can also be represented like this:
int arr[width * height]; // access: arr[x + y * width];
From Wikipedia:
For a two-dimensional array, the element with indices i,j would have
address B + c · i + d · j, where the coefficients c and d are the row
and column address increments, respectively.

How to use (2d) arrays with negative index?

In 1D you can simulate a signed x-coordinate in such a way:
int temp[1000];
int *x = temp+500;
How can we have a grid now? (Something like a[10][-13].)
You can easily convert -ve and +ve integers into just +ve integers as an index into an array as you are unable to use -ve indexes.
Here is how
if (index < 0)
then index = -index * 2 - 1
else index = index * 2
i.e. -ve indexes use the odd numbers, +ve use the even numbers. 0 stays at 0.
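Written as a function, the mapping above could look like this. The name to_storage_index is made up for the sketch; the arithmetic is done in long long so that negating the most negative int does not overflow.

```cpp
#include <cstddef>

// Maps a signed index onto a non-negative storage index:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...
std::size_t to_storage_index(int index) {
    long long wide = index;  // widen first so -wide cannot overflow
    return static_cast<std::size_t>(wide < 0 ? -wide * 2 - 1 : wide * 2);
}
```

For a grid, the same function would be applied to each coordinate separately before indexing an ordinary 0-based 2D array.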
Don't confuse mathematics with array dimensions in C/C++, those are different things. If you have a mathematical matrix with indices -500 to 500, you use a C array with indices 0 to 1000 to store it in.
However you can access an array by using a negative index, as long as you make sure you aren't accessing the array out of bounds. For example:
int arr[1000];
int* ptr = &arr[499];
printf("%d", ptr[-100]);
2D arrays work in the very same way, although strictly speaking you still cannot access a sub array out of bounds and expect to end up in an adjacent array; this is undefined behavior in C/C++. But in real world implementations static 2D arrays are always allocated using adjacent memory cells, so one can often safely assume they are, no matter what the C standard says.
You just have to calculate the offsets yourself, for instance
int grid[400]; // twenty by twenty grid, origin at (10, 10)
int get_grid_value(int x, int y)
{
return grid[20*(x + 10) + (y + 10)];
}
Of course in real code you shouldn't use so many magic numbers.
First of all, this only works if the memory allocated for the array is contiguous. Then you can find out the "middle point" of the array by
int temp[5][5];
int *a = temp[2] + 2;
Or, in more general terms
int len = 5; // example side length
int *temp = (int *)malloc(len * len * sizeof(int));
int *a = temp + (len/2)*len + len/2;
If you want to simulate geometry using arrays, you could do something like this:
have a variable with the maximum number of points and assign a pointer to the middle value. With that pointer you can use negative indices.
A sample program.
#include <iostream>
using namespace std;
int main() {
    int c[10000];
    int *a = &c[5000];   // a[0] is the middle of c
    for(int i=-5000;i<5000;i++)
        a[i] = i;
    for(int i=-5000;i<5000;i++)
        cout<<a[i]<<" ";
    cout<<endl;
    return 0;
}
Hope this was helpful ..
To use it in a more proper way, you could have a class which internally manages this. Or you could have your template.
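One possible shape for such a class is sketched below. SignedGrid and its members are hypothetical names; the grid is assumed to be square and symmetric around the origin, and storage is a single flat std::vector so that no access ever leaves the underlying array.

```cpp
#include <cstddef>
#include <vector>

// A square grid whose coordinates run from -half_extent to +half_extent
// on both axes. The offset arithmetic is hidden inside index().
class SignedGrid {
public:
    explicit SignedGrid(int half_extent)
        : half_(half_extent),
          side_(static_cast<std::size_t>(2 * half_extent + 1)),
          cells_(side_ * side_, 0) {}

    int& at(int x, int y) { return cells_[index(x, y)]; }
    const int& at(int x, int y) const { return cells_[index(x, y)]; }

private:
    // Shift both coordinates so the smallest one maps to 0, then flatten.
    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y + half_) * side_ +
               static_cast<std::size_t>(x + half_);
    }

    int half_;
    std::size_t side_;
    std::vector<int> cells_;
};
```

With SignedGrid g(13); the expression g.at(10, -13) plays the role of a[10][-13] from the question. Bounds checking is left out to keep the sketch short.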
I'm not sure you can do that with a simple 2-D array without invoking the gremlins of undefined behavior, but you could set it up as an array of pointers. Create an array of pointers to int, then set a pointer to point into the middle of the array; that gives you signed indices for the first dimension. Then set each element of the pointer array to point to an array of int, and advance each to point to the middle of that array; that gives you signed indices for the second dimension. You can use the same arr[x][y] syntax you'd use for an actual 2-D array, but the second [] applies to an actual pointer, not an array that decayed to a pointer.
If any of these arrays are allocated with malloc(), you must pass the original pointer to free().
If there's sufficient interest, I'll try to post some code later.
BTW, I'm not at all convinced this would be worth the effort. You could easily fake all this with ordinary 0-based arrays, at the cost of a little syntactic sugar.
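For readers who want to see the array-of-pointers idea in code, here is one way it might be set up. This is my own sketch, not the answerer's promised code: OffsetRows and origin are invented names, and std::vector is used for storage instead of malloc(), so there is no original pointer to keep around for free().

```cpp
#include <cstddef>
#include <vector>

// A square grid addressed as origin()[x][y], with x and y both running
// from -half_extent to +half_extent.
class OffsetRows {
public:
    explicit OffsetRows(int half_extent)
        : half_(half_extent),
          side_(static_cast<std::size_t>(2 * half_extent + 1)),
          cells_(side_ * side_, 0),
          row_ptrs_(side_) {
        // Each row pointer is advanced to the MIDDLE of its row:
        // that gives signed indices for the second dimension.
        for (std::size_t r = 0; r < side_; ++r)
            row_ptrs_[r] = cells_.data() + r * side_ + half_;
        // A pointer into the middle of the pointer table gives signed
        // indices for the first dimension.
        middle_ = row_ptrs_.data() + half_;
    }

    // Copying would leave the copy's pointers aimed at the original's storage.
    OffsetRows(const OffsetRows&) = delete;
    OffsetRows& operator=(const OffsetRows&) = delete;

    int** origin() { return middle_; }

private:
    int half_;
    std::size_t side_;
    std::vector<int> cells_;      // all elements, one contiguous block
    std::vector<int*> row_ptrs_;  // one pointer per row
    int** middle_;
};
```

After OffsetRows g(13); int** a = g.origin(); the expression a[10][-13] is valid and uses the ordinary arr[x][y] syntax, with the second [] applied to a real pointer as described above.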