Pointer to C++ 2D Array - c++

I'm porting some C++ code to Actionscript 3 and just had a small question I'm confused about.
In one function, one of the parameters is int* myPtr. myPtr is the address of an element of a 2D-Array, &my2DArray[x][y]. x and y are also parameters of the function. I'm just a little bit confused with what is being accessed when the code accesses, for example, myPtr[1]. I think this would be the next element in my2dArray, but I'm not sure if this would be my2DArray[x+1][y] or my2DArray[x][y+1]. Thanks for any help.
Additional info:
my2DArray is created by:
//initPtr is a int*, auxPtr is a int*, as is temp1
initPtr = (unsigned int *)NewPtr(
sizeof(unsigned int) *
X * Y);
}
auxPtr = initPtr ;
for (i = 0; i < X; i++) {
temp1 = auxPtr + i * Y;
my2DArray[i] = (short *)temp1;
}
unsigned char* NewPtr(
int size)
{
return ((unsigned char*)calloc(size, sizeof(unsigned char)));
}

It would be my2DArray[x][y+1]. C++ uses row-major order for multidimensional array indices. That means that large jumps in memory are changes in the left-most index, and single element jumps are changes in the right-most index.
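A small check of this, as a sketch only: next_after and element_after are hypothetical helpers (not part of the original code), and X and Y are illustrative sizes. The sketch assumes (x, y) is not the last cell of its row, so the read stays inside row x.

```cpp
#include <cstddef>

// Hypothetical dimensions, chosen only for illustration.
const std::size_t X = 3;
const std::size_t Y = 4;

// Returns myPtr[1], i.e. the int stored directly after *myPtr in memory.
int next_after(const int* myPtr) {
    return myPtr[1];
}

// Fills a true 2D array so that each cell encodes its own indices
// (cell [i][j] holds 10 * i + j), then reads the element after [x][y].
// Assumes y + 1 < Y, so the read stays inside row x.
int element_after(std::size_t x, std::size_t y) {
    int my2DArray[X][Y];
    for (std::size_t i = 0; i < X; ++i)
        for (std::size_t j = 0; j < Y; ++j)
            my2DArray[i][j] = static_cast<int>(10 * i + j);
    return next_after(&my2DArray[x][y]);
}
```

Calling element_after(1, 2) yields the value stored at [1][3], not the one at [2][2].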

If myPtr truly is a pointer to an arbitrary cell in the array, then, unless you know the layout of the 2D array, you can't say precisely. It is probably the next one in the row (assuming a standard C++ 2D array), but it might be the first cell of the next row (if it were at the end of the row). Or, if the array is implemented as a 1D array of 1D arrays, then it might not even be a legal address (were it at the end of the row).

Related

Swap rows in a 2D array with std::swap. How does it work?

I'm mostly just documenting this question as someone may stumble upon it and find it useful. I'm also very curious how std::swap works on a 2D array like Arr[10][10].
My question arose because, to my understanding, an array like this is just a 1D array with some reindexing.
For reference:
How are 2-Dimensional Arrays stored in memory?
int main()
{
const int x = 10;
const int y = 10;
int Arr[y][x];
// fill the array with some elements...
for (int i = 0; i < x*y; i++)
{
Arr[i / x][i % x] = i;
}
// swap 'row 5 & 2'
// ??? how does swap know how many elements to swap?
// if it is in fact stored in a 1D array, just the
// compiler will reindex it for us
std::swap(Arr[5], Arr[2]);
return 0;
}
I could understand swapping two 'rows' if our data type is, say a pointer to a pointer like int** Arr2D then swap with std::swap(Arr2D[2], Arr2D[5]) as we do not need to know the length here, we just need to swap the two pointers, pointing to '1D arrays'.
But how does std::swap work with Arr[y][x]?
Is it using a loop maybe, to swap all elements within x length?
std::swap has an overload for arrays that effectively swaps each two elements, again, using std::swap.
As for the size information, it is embedded within the array type (Arr[i] is int[x]), so the compiler knows to deduce T2 as int and N as 10.
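A sketch of what that overload effectively does. swap_rows and rows_swapped are hypothetical names used for illustration; the real std::swap overload has the same shape but this is not its actual source.

```cpp
#include <cstddef>
#include <utility>

// Sketch of what the array overload of std::swap effectively does.
// T and N are deduced from the argument type, so for int Arr[10][10]
// and the argument Arr[5], T is int and N is 10.
template <class T, std::size_t N>
void swap_rows(T (&a)[N], T (&b)[N]) {
    for (std::size_t i = 0; i < N; ++i)
        std::swap(a[i], b[i]);  // element-wise swap
}

// Swaps rows 2 and 5 of a 10x10 array and reports whether the first and
// last elements of each row ended up in the other row.
bool rows_swapped() {
    int Arr[10][10];
    for (int i = 0; i < 100; ++i)
        Arr[i / 10][i % 10] = i;
    swap_rows(Arr[5], Arr[2]);
    return Arr[2][0] == 50 && Arr[5][0] == 20 &&
           Arr[2][9] == 59 && Arr[5][9] == 29;
}
```

Because the length is part of the type int[10], no separate size argument is needed.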
OT: Why aren't variable-length arrays part of the C++ standard? (but this particular case is OK)

How to convert between flat and multidimensional arrays without copying data?

I've got some data structured as a multi-dimensional array, i.e. double[][], and I need to pass it to a function that expects a single linear array of double[] along with dimensional metadata for the multi-dimensional representation.
For example, I might have a 3 x 5 multidimensional array, which I need to pass as a 15-element flat array along with height and width parameters so that the function knows it is a 3x5 array rather than a 5x3 array.
The function will then return a flat array and size metadata, which I need to use to convert the data back into a multidimensional type.
I believe the data layout in memory is exactly the same for both the flat and multi-dimensional representations; the only difference is how the indexing operations are performed. So I'd like to do the "conversion" with typecasting rather than copying the array values.
What's the most correct and readable way to typecast between multidimensional and flat arrays of the same total size?
I actually know what the dimensions of the multi-dimensional array will be at compile time. The array sizes aren't dynamic.
The most correct way has been given by @Maxim Egorushkin and @ypnos: double *flat = &multi[0][0];. And it will work fine with any decent compiler. But unfortunately it is not strictly valid C++ code and invokes Undefined Behaviour once you index past the first row.
The problem is that for an array double multi[N][M]; (N and M being compile-time constant expressions), &multi[0][0] is the address of the first element of an array of size M. So it is legal to do pointer arithmetic only up to M. See this other question of mine for more details.
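One way to sidestep the question, sketched under the assumption that you control how the data is stored: keep the flat array as the only storage and compute the 2D index yourself. Grid, at, flat, sum_flat and demo_sum are hypothetical names, and sum_flat merely stands in for the external function mentioned in the question.

```cpp
#include <array>
#include <cstddef>

// Flat storage with a 2D accessor. N rows, M columns, row-major.
template <std::size_t N, std::size_t M>
struct Grid {
    std::array<double, N * M> data{};

    double& at(std::size_t row, std::size_t col) { return data[row * M + col]; }

    // Pointer suitable for a function expecting double[] plus dimensions.
    double* flat() { return data.data(); }
};

// Stand-in for the external function: sums a flat height x width array.
double sum_flat(const double* values, std::size_t height, std::size_t width) {
    double total = 0.0;
    for (std::size_t i = 0; i < height * width; ++i)
        total += values[i];
    return total;
}

// Fills a 3 x 5 grid with 0..14 through the 2D accessor, then hands the
// same storage to the flat interface without copying.
double demo_sum() {
    Grid<3, 5> g;
    for (std::size_t r = 0; r < 3; ++r)
        for (std::size_t c = 0; c < 5; ++c)
            g.at(r, c) = static_cast<double>(r * 5 + c);
    return sum_flat(g.flat(), 3, 5);
}
```

Here all pointer arithmetic happens inside a single 15-element array, so the per-row limit described above does not come into play.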
What's the most correct and readable way to typecast between multidimensional and flat arrays of the same total size?
The address of the first array element coincides with the address of the array. You can pass around the address of the first element, no casting is necessary.
I would assume the most popular way to do it is:
double *flat = &multi[0][0];
This is how it is done in C, and you do operate with simple C arrays.
You could also have a look at std::array for your use case (dimensions known at compile time). It is not multi-dimensional itself; if you nest it, the elements stay contiguous in practice, although the standard does not strictly rule out padding between the inner arrays.
You can cast to a reference to an array. This requires some fancy C++ type syntax, but in return it lets you use all the features that work on arrays, like the range-based for loop.
#include <iostream>
using namespace std;
int main()
{
static constexpr size_t x = 5, y = 3;
unsigned multiArray[x][y];
for (size_t i = 0; i != x; ++i)
for (size_t j = 0; j != y; ++j)
multiArray[i][j] = i * j;
static constexpr size_t z = x * y;
unsigned (&singleArray)[z] = (unsigned (&)[z])multiArray[0][0];
for (const unsigned value : singleArray)
cout << value << ' ';
cout << endl;
return 0;
}
Take into account that this and other cast-based methods work only with real multi-dimensional arrays. If it is an array of pointers to arrays (like unsigned **multiArray;), it isn't allocated in one contiguous block of memory and a cast cannot bypass that.

2D-array as argument to function

Why can't you declare a 2D array argument in a function as you do with a normal array?
void F(int bar[]){} //Ok
void Fo(int bar[][]) //Not ok
void Foo(int bar[][SIZE]) //Ok
Why is it needed to declare the size for the column?
Static Arrays:
You seem not to have got the point completely. I thought to try to explain it somewhat. As some of the above answers describe, a 2D Array in C++ is stored in memory as a 1D Array.
int arr[3][4] ; //consider numbers starting from zero are stored in it
Looks somewhat like this in Memory.
1000 //ignore this for some moments         1011
 ^                                           ^
 ^                                           ^
 0   1   2   3   4   5   6   7   8   9   10  11
|-------------| |-------------| |--------------|
  First Array    Second Array     Third Array
|----------------------------------------------|
                Larger 2D Array
Consider that here, the Bigger 2D Array is stored as contiguous memory units. It consists of total 12 elements, from 0 to 11. Rows are 3 and columns are 4. If you want to access the third array, you need to skip the whole first and second arrays. That is, you need to skip elements equal to the number of your cols multiplied by how many arrays you want skip. It comes out to be cols * 2.
Now when you specify the dimensions to access any single index of the array, you need to tell the compiler beforehand exactly how many elements to skip. So you give it the exact number of cols to perform the rest of the calculation.
So how does it perform the calculation? C++ stores arrays in row-major order, so to skip whole rows it needs to know the number of columns in each row. When you specify one element of this array as...
arr[i][j] ;
Compiler performs this calculation automatically.
Base Address + (i * cols + j) ;
Let us try the formula for one index to test its veracity. We want to access the 3rd element of the 2nd Array. We would do it like this...
arr[1][2] ; //access third element of second array
We put it in the formula...
1000 + ( 1 * 4 + 2 )
= 1000 + ( 6 )
= 1006 //destination address
And we arrive at address 1006, where 6 is located. (Addresses here count elements, not bytes.)
In a nutshell, we need to tell the compiler the number of cols for this calculation. So we send it as a parameter in a function.
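The formula can be checked against real addresses. This is a sketch: ROWS, COLS, measured_offset, formula_offset and formula_holds are illustrative names, and the offset is measured in elements (byte distance divided by sizeof(int)) to match the simplified addresses used above.

```cpp
#include <cstddef>

const std::size_t ROWS = 3;
const std::size_t COLS = 4;

// Distance in elements from arr[0][0] to arr[i][j], measured from the
// actual addresses (byte distance divided by the element size).
std::size_t measured_offset(int (&arr)[ROWS][COLS], std::size_t i, std::size_t j) {
    const char* base = reinterpret_cast<const char*>(&arr[0][0]);
    const char* elem = reinterpret_cast<const char*>(&arr[i][j]);
    return static_cast<std::size_t>(elem - base) / sizeof(int);
}

// The offset predicted by the formula in the text: i * cols + j.
std::size_t formula_offset(std::size_t i, std::size_t j) {
    return i * COLS + j;
}

// Compares both offsets for every cell of a 3 x 4 array.
bool formula_holds() {
    int arr[ROWS][COLS] = {};
    for (std::size_t i = 0; i < ROWS; ++i)
        for (std::size_t j = 0; j < COLS; ++j)
            if (measured_offset(arr, i, j) != formula_offset(i, j))
                return false;
    return true;
}
```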
If we are working on a 3D Array, like this...
int arr[ROWS][COLS][HEIGHT] ;
We would have to send it the last two dimensions of the array in a function.
void myFunction (int arr[][COLS][HEIGHT]) ;
The formula now would become this..
Base Address + ( (i * cols * height) + (j * height) + k ) ;
To access it like this...
arr[i][j][k] ;
COLS * HEIGHT tells the compiler how many elements to skip for each 2D array, and HEIGHT tells it how many to skip for each 1D array.
And so on and so forth for any dimension.
Dynamic Arrays:
As you ask about different behavior in case of dynamic arrays which are declared thus..
int ** arr ;
Compiler treats them differently, because each index of a Dynamic 2D Array consists of an address to another 1D Array. They may or may not be present on contiguous locations on heap. Their elements are accessed by their respective pointers. The dynamic counterpart of our static array above would look somewhat like this.
1000 //2D Pointer
 ^
 ^
2000     2001     2002
 ^        ^        ^
 ^        ^        ^
 0        4        8
 1        5        9
 2        6        10
 3        7        11
1st ptr  2nd ptr  3rd ptr
Suppose this is the situation. Here the 2D pointer (or array) is at location 1000. It holds the address 2000, which in turn holds the address of a memory location. The compiler does the pointer arithmetic that finds the correct location of an element.
To allocate memory for the 2D pointer, we do this..
arr = new int *[3] ;
And to allocate memory for each of its row pointers, this way..
for (auto i = 0 ; i < 3 ; ++i)
arr[i] = new int [4] ;
At the end, each ptr of the 2D Array is itself an array. To access an element you do...
arr[i][j] ;
Compiler does this...
*( *(arr + i) + j ) ;
   |--------|
    1st step
|-----------------|
     2nd step
In the first step, the 2D Array gets dereferenced to its appropriate 1D Array and in the second step, the 1D Array gets dereferenced to reach the appropriate index.
That is the reason why Dynamic 2D Arrays are sent to the function without any mention of their row or column.
Note:
Many details have been ignored and many things supposed in the description, especially the memory mapping just to give you an idea.
You can't write void Foo(int bar[][]), because bar decays to a pointer. Imagine following code:
void Foo(int bar[][]) // pseudocode
{
bar++; // compiler can't know by how much increase the pointer
// as it doesn't know size of *bar
}
So, compiler must know size of *bar, therefore size of rightmost array must be provided.
Because when you pass an array, it decays to a pointer, so excluding the outer-most dimension is ok and that's the only dimension you can exclude.
void Foo(int bar[][SIZE])
is equivalent to:
void Foo(int (*bar)[SIZE])
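The equivalence can be checked at compile time. A minimal sketch, assuming a hypothetical SIZE of 4; sum and sum_example are illustrative names:

```cpp
#include <cstddef>
#include <type_traits>

const std::size_t SIZE = 4;  // hypothetical column count

// The two parameter spellings declare the same function type.
static_assert(std::is_same<void(int[][SIZE]), void(int (*)[SIZE])>::value,
              "int bar[][SIZE] is adjusted to int (*bar)[SIZE]");

// Sums a 2D array passed as a pointer to its first row.
int sum(int (*bar)[SIZE], std::size_t rows) {
    int total = 0;
    for (std::size_t i = 0; i < rows; ++i)      // bar[i] skips i * SIZE ints
        for (std::size_t j = 0; j < SIZE; ++j)
            total += bar[i][j];
    return total;
}

// grid decays to int (*)[SIZE] when passed to sum.
int sum_example() {
    int grid[2][SIZE] = {{1, 2, 3, 4}, {5, 6, 7, 8}};
    return sum(grid, 2);
}
```

The row count is not part of the parameter type, which is why it is passed separately here.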
The compiler needs to know how long the second dimension is to calculate the offsets. A 2D array is in fact stored as a 1D array.
If you want to send an array with no known dimensions, consider using pointer to pointers and some sort of way to know the dimension yourself.
This is different from e.g. Java, because in Java the datatype also contains the dimension.
Since static 2D arrays are like 1D arrays with some sugar to better access data, you have to think about the arithmetic of pointers.
When the compiler tries to access element array[x][y], it has to calculate the memory address of the element, that is array+x*NUM_COLS+y. So it needs to know the length of a row (how many elements it contains).
If you need more information I suggest this link.
there are basically three ways to allocate a 2d array in C/C++
allocate on heap as a 2d array
you can allocate a 2d array on the heap using malloc such as:
const int row = 5;
const int col = 10;
int **bar = (int**)malloc(row * sizeof(int*));
for (size_t i = 0; i < row; ++i)
{
bar[i] = (int*)malloc(col * sizeof(int));
}
this is actually stored as an array of pointers to arrays, therefore it isn't necessarily
contiguous in memory. note that this also means there will be a pointer for
each array, costing you extra memory (5 pointers in this example, 10
pointers if you allocate it the other way around). you can pass this array to
a function with the signature:
void foo(int **baz)
allocate on heap as 1d array
for various reasons (cache optimizations, memory usage etc.) it may be
desirable to store the 2d array as a 1d array:
const int row = 5;
const int col = 10;
int *bar = (int*)malloc(row * col * sizeof(int));
knowing second dimension you can access the elements using:
bar[1 + 2 * col] // corresponds semantically to bar[2][1]
some people use preprocessor magic (or method overloading of () in C++) to
handle this automatically such as:
#define BAR(i,j) bar[(j) + (i) * col]
..
BAR(2,1) // is actually bar[1 + 2 * col]
you need to have the function signature:
void foo(int *baz)
in order to pass this array to a function.
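a sketch of the operator() variant mentioned above. Flat2D, raw and flat_demo are hypothetical names, and std::vector is used for the storage instead of malloc so that the memory is released automatically:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper around a flat buffer; operator() plays the role
// of the BAR(i, j) macro.
class Flat2D {
public:
    Flat2D(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0) {}

    // Element (i, j) lives at offset j + i * cols in the flat buffer.
    int& operator()(std::size_t i, std::size_t j) { return data_[j + i * cols_]; }

    // Usable as the int *baz argument of foo.
    int* raw() { return data_.data(); }

private:
    std::size_t cols_;
    std::vector<int> data_;
};

// Writes through the 2D accessor and reads back through the flat pointer.
int flat_demo() {
    Flat2D bar(5, 10);
    bar(2, 1) = 42;
    return bar.raw()[1 + 2 * 10];
}
```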
allocate on stack
you can allocate a 2d array on stack using something like:
int bar[5][10];
this is allocated as a 1d array on the stack therefore compiler needs to know
the second dimension to reach the element you need just like we did in the
second example, therefore the following is also true:
bar[2][1] == (*bar)[1 + 2 * 10]
function signature for this array should be:
void foo(int baz[][10])
you need to provide the second dimension so that compiler would know where to reach in memory. you don't have to give the first dimension since C/C++ is not a safe language in this respect.
let me know if I mixed up rows and columns somewhere..

How to use (2d) arrays with negative index?

In 1D you can simulate x-coordinate in such a way:
int temp[1000];
int *x = temp+500;
How can we have a grid now? (Something like a[10][-13].)
You can easily convert -ve and +ve integers into just +ve integers as an index into an array as you are unable to use -ve indexes.
Here is how
if (index < 0)
    index = -index * 2 - 1;
else
    index = index * 2;
i.e. -ve indexes use the odd numbers, +ve use the even numbers. 0 stays at 0.
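As a function, the mapping could look like this. to_array_index is a hypothetical name, and the sketch assumes index is not the most negative int (negating that value would overflow).

```cpp
#include <cstddef>

// Maps a signed index onto a non-negative one:
// 0, -1, 1, -2, 2, ...  ->  0, 1, 2, 3, 4, ...
// Assumes index > INT_MIN so that -index is representable.
std::size_t to_array_index(int index) {
    if (index < 0)
        return static_cast<std::size_t>(-index) * 2 - 1;  // odd slots
    return static_cast<std::size_t>(index) * 2;           // even slots
}
```

To cover indexes from -n to n, the backing array needs 2 * n + 1 slots.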
Don't confuse mathematics with array dimensions in C/C++, those are different things. If you have a mathematical matrix with indices -500 to 500, you use a C array with indices 0 to 1000 to store it in.
However you can access an array by using a negative index, as long as you make sure you aren't accessing the array out of bounds. For example:
int arr[1000] = {0};
int* ptr = &arr[499];
printf("%d", ptr[-100]);
2D arrays work in the very same way, although strictly speaking you can still not access a sub array out of bounds and expect to end up in an adjacent array, this is undefined behavior in C/C++. But in real world implementations static 2D arrays are always allocated using adjacent memory cells, so one can often safely assume they are, no matter what the C standard says.
You just have to calculate the offsets yourself, for instance
int grid[400]; // twenty by twenty grid, origin at (10, 10)
int get_grid_value(int x, int y)
{
return grid[20*(x + 10) + (y + 10)];
}
Of course in real code you shouldn't use so many magic numbers.
First of all, this only works if the memory allocated for the array is contiguous. Then you can find out the "middle point" of the array by
int temp[5][5];
int *a = temp[2] + 2;
Or, in more general terms
int len = 5; // example size
int *temp = (int *)malloc(len * len * sizeof(int));
int *a = temp + (len/2)*len + len/2;
If you want to simulate geometry using arrays, you could have a variable with the maximum number of points and assign a pointer to the middle value. With that pointer you can use negative indices.
A sample program.
#include <iostream>
using namespace std;
int main() {
int c[10000];
int *a = &c[5000];
for(int i=-5000;i<5000;i++)
a[i] = i;
for(int i=-5000;i<5000;i++)
cout<<a[i]<<" ";
cout<<endl;
return 0;
}
Hope this was helpful ..
To use it in a more proper way, you could have a class which internally manages this. Or you could have your template.
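A minimal sketch of such a class, assuming a square grid whose coordinates run from -half to half on both axes. SignedGrid, at and signed_grid_demo are hypothetical names, and no bounds checking is done.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical grid whose valid coordinates are -half .. half on both axes.
class SignedGrid {
public:
    explicit SignedGrid(int half)
        : half_(half), side_(2 * half + 1),
          cells_(static_cast<std::size_t>(side_) * side_, 0) {}

    // Shifts both coordinates by half so they become 0-based, then
    // indexes the flat storage in row-major order.
    int& at(int x, int y) {
        return cells_[static_cast<std::size_t>(x + half_) * side_ + (y + half_)];
    }

private:
    int half_;
    int side_;
    std::vector<int> cells_;
};

// Writes to two cells with negative coordinates and reads them back.
int signed_grid_demo() {
    SignedGrid a(20);
    a.at(10, -13) = 7;
    a.at(-20, 20) = 3;
    return a.at(10, -13) + a.at(-20, 20);
}
```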
I'm not sure you can do that with a simple 2-D array without invoking the gremlins of undefined behavior, but you could set it up as an array of pointers. Create an array of pointers to int, then set a pointer to point into the middle of the array; that gives you signed indices for the first dimension. Then set each element of the pointer array to point to an array of int, and advance each to point to the middle of that array; that gives you signed indices for the second dimension. You can use the same arr[x][y] syntax you'd use for an actual 2-D array, but the second [] applies to an actual pointer, not an array that decayed to a pointer.
If any of these arrays are allocated with malloc(), you must pass the original pointer to free().
If there's sufficient interest, I'll try to post some code later.
BTW, I'm not at all convinced this would be worth the effort. You could easily fake all this with ordinary 0-based arrays, at the cost of a little syntactic sugar.
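The pointer-array setup described above might be sketched like this. SignedView, view and signed_view_demo are hypothetical names. As a simplification, the rows live in one std::vector rather than in separately malloc'ed arrays, so there is nothing to free by hand.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: owns the storage and exposes a pointer-to-pointer
// view that accepts indices in -half .. half for both dimensions.
class SignedView {
public:
    explicit SignedView(int half)
        : side_(2 * half + 1),
          cells_(static_cast<std::size_t>(side_) * side_, 0),
          rows_(side_) {
        // Each row pointer is advanced to the middle of its own row.
        for (int i = 0; i < side_; ++i)
            rows_[i] = cells_.data() + static_cast<std::size_t>(i) * side_ + half;
        // The outer pointer is advanced to the middle of the pointer array.
        middle_ = rows_.data() + half;
    }

    // Copying would leave middle_ pointing into the source object.
    SignedView(const SignedView&) = delete;
    SignedView& operator=(const SignedView&) = delete;

    int** view() { return middle_; }

private:
    int side_;
    std::vector<int> cells_;
    std::vector<int*> rows_;
    int** middle_;
};

// Uses the ordinary a[x][y] syntax with negative indices.
int signed_view_demo() {
    SignedView storage(20);
    int** a = storage.view();
    a[10][-13] = 5;
    a[-20][-20] = 1;  // first cell of the underlying storage
    a[20][20] = 2;    // last cell of the underlying storage
    return a[10][-13] + a[-20][-20] + a[20][20];
}
```

Every access stays inside the row that its pointer was derived from, which is what keeps this clear of the out-of-bounds concern.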

Multi-dimensional array and pointers in C++?

int *x = new int[5]();
With the above mentality, how should the code be written for a 2-dimensional array - int[][]?
int **x = new int[5][5] () //cannot convert from 'int (*)[5]' to 'int **'
In the first statement I can use:
x[0]= 1;
But the second is more complex and I could not figure it out.
Should I use something like:
x[0][1] = 1;
Or, calculate the real position then get the value
for the fourth row and column 1
x[4*5+1] = 1;
I prefer doing it this way:
int *i = new int[5*5];
and then I just index the array by 5 * row + col.
You can do the initializations separately:
int **x = new int*[5];
for(unsigned int i = 0; i < 5; i++)
x[i] = new int[5];
There is no new[][] operator in C++. You will first have to allocate an array of pointers to int:
int **x = new int*[5];
Then iterate over that array. For each element, allocate an array of ints:
for (std::size_t i = 0; i < 5; ++i)
x[i] = new int[5];
Of course, this means you will have to do the inverse when deallocating: delete[] each element, then delete[] the larger array as a whole.
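Put together, allocation and the matching cleanup could look like the following sketch. make_2d, free_2d and make_2d_demo are hypothetical helper names.

```cpp
#include <cstddef>

// Allocates rows x cols ints as an array of row pointers, zero-initialized.
int** make_2d(std::size_t rows, std::size_t cols) {
    int** x = new int*[rows];
    for (std::size_t i = 0; i < rows; ++i)
        x[i] = new int[cols]();
    return x;
}

// Releases in the reverse order: each row first, then the pointer array.
void free_2d(int** x, std::size_t rows) {
    for (std::size_t i = 0; i < rows; ++i)
        delete[] x[i];
    delete[] x;
}

// Writes two cells, reads them plus one untouched (zero) cell, then frees.
int make_2d_demo() {
    int** x = make_2d(5, 5);
    x[0][1] = 1;
    x[4][1] = 2;
    int result = x[0][1] + x[4][1] + x[3][3];
    free_2d(x, 5);
    return result;
}
```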
This is how you do it:
int (*x)[5] = new int[7][5] ;
I made the two dimensions different so that you can see which one you have to use on the lhs.
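A short sketch of using and releasing such an allocation. row_stride is a hypothetical name; it measures how far apart consecutive rows are, in ints.

```cpp
#include <cstddef>

// Allocates 7 rows of 5 ints in one block and returns the distance, in
// ints, between the start of row 0 and the start of row 1.
std::size_t row_stride() {
    int (*x)[5] = new int[7][5]();  // one allocation, zero-initialized
    x[6][4] = 1;                    // last valid element
    std::size_t stride = static_cast<std::size_t>(
        reinterpret_cast<char*>(x + 1) - reinterpret_cast<char*>(x)) / sizeof(int);
    delete[] x;                     // a single delete[] releases everything
    return stride;
}
```

The stride equals the inner dimension, the one that appears in the pointer type on the left-hand side.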
If the array has a predefined size you can simply write:
int x[5][5];
It compiles.
If not, why not use a vector?
There are several ways to accomplish this:
Using the language's built-in support for flat multidimensional arrays (TonyK's answer, the most relevant to the question IMO). Note that you must preserve the bounds in the array's type everywhere you use it (i.e. all the array sizes, except possibly the first one), and that includes functions that you call, because the produced code will assume a single array. The allocation new int[7][5] causes a single array to be allocated in memory, indexed by the compiler (you can easily write a little program and print the addresses of the slots to convince yourself).
Using arrays of pointers to arrays. The problem with that approach is having to allocate all the inner arrays manually (in loops).
Some people will suggest using std::vector's of std::vectors, but this is inefficient, due to the memory allocation and copying that has to occur when the vectors resize.
Boost has a more efficient version of vectors of vectors in its multi_array lib.
In any case, this question is better answered here:
How do I use arrays in C++?