Swap rows in a 2D array with std::swap. How does it work? - c++

I'm mostly documenting this question because someone may stumble upon it and find it useful. I'm also very curious how std::swap works on a 2D array like Arr[10][10].
My question arose because, to my understanding, an array like this is just a 1D array with some reindexing.
For reference:
How are 2-Dimensional Arrays stored in memory?
#include <utility> // std::swap
int main()
{
const int x = 10;
const int y = 10;
int Arr[y][x];
// fill the array with some elements...
for (int i = 0; i < x*y; i++)
{
Arr[i / x][i % x] = i; // row = i / x, column = i % x
}
// swap 'row 5 & 2'
// ??? how does swap know how many elements to swap?
// if it is in fact stored in a 1D array, just the
// compiler will reindex it for us
std::swap(Arr[5], Arr[2]);
return 0;
}
I could understand swapping two 'rows' if our data type is, say a pointer to a pointer like int** Arr2D then swap with std::swap(Arr2D[2], Arr2D[5]) as we do not need to know the length here, we just need to swap the two pointers, pointing to '1D arrays'.
But how does std::swap work with Arr[y][x]?
Is it using a loop maybe, to swap all elements within x length?

std::swap has an overload for arrays that swaps the two arrays element by element, again using std::swap on each pair of elements.
As for the size information, it is embedded in the array type (Arr[i] is int[x]), so the compiler deduces T2 as int and N as 10.
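To make the deduction concrete, here is a minimal sketch of what such an array overload could look like. The names swap_rows and demo_swap are hypothetical and only illustrate the idea; a real standard library implementation may differ in detail.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the array overload of std::swap:
// N is deduced from the array type, so no length argument is needed.
template <typename T, std::size_t N>
void swap_rows(T (&a)[N], T (&b)[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        std::swap(a[i], b[i]);  // element-wise swap
    }
}

// Fills a 10x10 array with 0..99, swaps rows 5 and 2, returns Arr[2][0].
int demo_swap() {
    const int x = 10, y = 10;
    int Arr[y][x];
    for (int i = 0; i < x * y; ++i) Arr[i / x][i % x] = i;
    swap_rows(Arr[5], Arr[2]);  // T = int, N = 10
    return Arr[2][0];           // the value that used to be Arr[5][0]
}
```

Because Arr[5] and Arr[2] are lvalues of type int[10], they bind to the reference-to-array parameters without decaying to pointers, which is how the length survives.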
OT: Why aren't variable-length arrays part of the C++ standard? (but this particular case is OK)

Related

How to convert between flat and multidimensional arrays without copying data?

I've got some data structured as a multi-dimensional array, i.e. double[][], and I need to pass it to a function that expects a single linear array of double[] along with dimensional metadata for the multi-dimensional representation.
For example, I might have a 3 x 5 multidimensional array, which I need to pass as a 15-element flat array along with height and width parameters so that the function knows it is a 3x5 array rather than a 5x3 array.
The function will then return a flat array and size metadata, which I need to use to convert the data back into a multidimensional type.
I believe the data layout in memory is exactly the same for both the flat and multi-dimensional representations; the only difference is how the indexing operations are performed. So I'd like to do the "conversion" with typecasting rather than copying the array values.
What's the most correct and readable way to typecast between multidimensional and flat arrays of the same total size?
I actually know what the dimensions of the multi-dimensional array will be at compile time. The array sizes aren't dynamic.
The most correct way has been given by @Maxim Egorushkin and @ypnos: double *flat = &multi[0][0];. It will work fine with any decent compiler. Unfortunately, indexing through that pointer past the first row is not valid C++ and invokes undefined behaviour.
The problem is that for an array double multi[N][M]; (N and M being compile-time constant expressions), &multi[0][0] is the address of the first element of an array of size M, so pointer arithmetic is legal only up to M. See this other question of mine for more details.
What's the most correct and readable way to typecast between multidimensional and flat arrays of the same total size?
The address of the first array element coincides with the address of the array. You can pass around the address of the first element, no casting is necessary.
I would assume the most popular way to do it is:
double *flat = &multi[0][0];
This is how it is done in C, and you do operate with simple C arrays.
You could also have a look at std::array in your use case (dimensions known at compile time), but that one is not multi-dimensional, so if you would cascade it, you would lose the contiguous layout.
You can cast to a reference to an array. This requires some fancy C++ type syntax, but in return it lets you use all the features that work on arrays, such as the range-based for loop.
#include <iostream>
using namespace std;
int main()
{
static constexpr size_t x = 5, y = 3;
unsigned multiArray[x][y];
for (size_t i = 0; i != x; ++i)
for (size_t j = 0; j != y; ++j)
multiArray[i][j] = i * j;
static constexpr size_t z = x * y;
unsigned (&singleArray)[z] = (unsigned (&)[z])multiArray[0][0];
for (const unsigned value : singleArray)
cout << value << ' ';
cout << endl;
return 0;
}
Take into account that this and other cast-based methods work only with real multi-dimensional arrays. If it is an array of pointers to separately allocated rows (like unsigned **multiArray;), it isn't allocated in one contiguous block of memory, and a cast cannot bypass that.

Replace vector of vector with flat memory structure

I have the following type:
std::vector<std::vector<int>> indicies
where the size of the inner vector is always 2. The problem is, that vectors are non-contiguous in memory. I would like to replace the inner vector with something contiguous so that I can cast the flattened array:
int *array_a = (int *) &(a[0][0])
It would be nice if the new type has the [] operator, so that I don't have to change the whole code. (I could also implement it myself if necessary). My ideas are either:
std::vector<std::array<int, 2>>
or
std::vector<std::pair<int, int>>
How do these look in memory? I wrote a small test:
#include <iostream>
#include <array>
#include <vector>
int main(int argc, char *argv[])
{
using namespace std;
vector<array<int, 2>> a(100);
cout << sizeof(array<int, 2>) << endl;
for(auto i = 0; i < 10; i++){
for(auto j = 0; j < 2; j++){
cout << "a[" << i << "][" << j << "] "
<<&(a[i][j]) << endl;
}
}
return 0;
}
which results in:
8
a[0][0] 0x1b72c20
a[0][1] 0x1b72c24
a[1][0] 0x1b72c28
a[1][1] 0x1b72c2c
a[2][0] 0x1b72c30
a[2][1] 0x1b72c34
a[3][0] 0x1b72c38
a[3][1] 0x1b72c3c
a[4][0] 0x1b72c40
a[4][1] 0x1b72c44
a[5][0] 0x1b72c48
a[5][1] 0x1b72c4c
a[6][0] 0x1b72c50
a[6][1] 0x1b72c54
a[7][0] 0x1b72c58
a[7][1] 0x1b72c5c
a[8][0] 0x1b72c60
a[8][1] 0x1b72c64
a[9][0] 0x1b72c68
a[9][1] 0x1b72c6c
It seems to work in this case. Is this behavior in the standard or just a lucky coincidence? Is there a better way to do this?
An array<int,2> is going to be a struct containing an array int[2]; the standard does not directly mandate it, but there really is no other sane and practical way to do it.
See 23.3.7 [array] within the standard. There is nothing in the standard I can find that requires sizeof(std::array<char, 10>)==1024 to be false. It would be a ridiculous QOI (quality of implementation); every implementation I have seen has sizeof(std::array<T,N>) == N*sizeof(T), and anything else I would consider hostile.
Arrays must be contiguous containers which are aggregates that can be initialized by up to N arguments of types convertible to T.
The standard permits padding after such an array. I am aware of 0 compilers who insert such padding.
A buffer of contiguous std::array<int,2> is not guaranteed to be safely accessed as a flat buffer of int. In fact, aliasing rules almost certainly ban such access as undefined behaviour. You cannot even do this with a int[3][7]! See this SO question and answer, and here, and here.
Most compilers will make what you describe work, but the optimizer might decide that access through an int* and through the array<int,2>* cannot access the same memory, and generate insane results. It does not seem worth it.
A standards compliant approach would be to write an array view type (that takes two pointers and forms an iterable range with [] overloaded). Then write a 2d view of a flat buffer, with the lower dimension either a runtime or compile time value. Its [] would then return an array view.
There is going to be code in boost and other "standard extension" libraries to do this for you.
Merge the 2d view with a type owning a vector, and you get your 2d vector.
The only behaviour difference is that where the old vector-of-vector code copies the lower dimension (like auto inner=outer[i]), it copies data; afterwards it will instead create a view.
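As a rough sketch of the view idea described above (RowView and Grid2D are hypothetical names, and the element type is fixed to int for brevity), a flat vector can own the data while operator[] hands out a pointer-pair view of one row:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of one row: two pointers plus operator[].
struct RowView {
    int* first;
    int* last;
    int& operator[](std::size_t j) const {
        assert(first + j < last);  // stay inside the row
        return first[j];
    }
    int* begin() const { return first; }
    int* end() const { return last; }
};

// Hypothetical owning 2D type: a flat vector plus a runtime row width.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols) {}
    RowView operator[](std::size_t i) {
        int* p = data_.data() + i * cols_;
        return RowView{p, p + cols_};
    }
    int* flat() { return data_.data(); }  // contiguous buffer, no cast needed
    std::size_t size() const { return data_.size(); }
private:
    std::size_t cols_;
    std::vector<int> data_;
};
```

Existing code that writes g[i][j] keeps working, and g.flat() gives the int* that the cast in the question was trying to obtain. Note that auto inner = g[i] now copies the view, not the row data.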
Is there a better way to do this?
I recently finished yet-another-version of Game-of-Life.
The game board is 2d, and yes, the vector of vectors has wasted space in it.
In my recent effort I chose to try a 1d vector for the 2d game board.
typedef std::vector<Cell_t*> GameBoard_t;
Then I created a simple indexing function, for when use of row/col added to the code's readability:
inline size_t gbIndx(int row, int col)
{ return ((row * MAXCOL) + col); }
Example: accessing row 27, col 33:
Cell_t* cell = gameBoard[ gbIndx(27, 33) ];
All the Cell_t* in gameBoard are now packed back to back (definition of vector) and trivial to access (initialize, display, etc) in row/col order using gbIndx().
In addition, I could use the simple index for various efforts:
void setAliveRandom(const GameBoard_t& gameBoard)
{
GameBoard_t myVec(m_gameBoard); // copy cell vector
time_t seed = std::chrono::system_clock::
now().time_since_epoch().count();
// randomize copy's element order
std::shuffle (myVec.begin(), myVec.end(), std::default_random_engine(seed));
int count = 0;
for ( auto it : myVec )
{
if (count & 1) it->setAlive(); // touch odd elements
count += 1;
}
}
I was surprised by how often I did not need row/col indexing.
As far as I know, the elements of a std::vector are contiguous in memory. Take a look at these questions:
Why is std::vector contiguous?,
Are std::vector elements guaranteed to be contiguous?
If you have to resize an inner vector, the whole structure will not be contiguous, but the elements of each inner vector still will be. With a vector of vectors (and I edit here, sorry I misunderstood your question), only the inner vector objects, that is, the pointers to the inner data, are stored contiguously in the outer vector; the data of different inner vectors is not.
If you want to implement a structure that is always contiguous, from the first element of the first vector to the last element of the last vector, you can implement it as a custom class that has a vector<int> and elems_per_vector that indicates the number of elements in each inner vector.
Then, you can overload the operator(), so to access to a(i,j) you are actually accessing a.vector[a.elems_per_vector*i+j]. To insert new elements, though, and in order to keep the inner vectors at constant size between them, you'll have to make as many inserts as inner vectors you have.
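A minimal sketch of such a class, under the assumption that the element type is int and that elems_per_vector is never zero. FlatRows and add_column are hypothetical names; the member names vector and elems_per_vector follow the description above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper: one flat vector<int> plus the row length.
struct FlatRows {
    std::vector<int> vector;
    std::size_t elems_per_vector;

    // a(i, j) maps to the flat index elems_per_vector * i + j.
    int& operator()(std::size_t i, std::size_t j) {
        return vector[elems_per_vector * i + j];
    }

    // Appends one element to every row: one insert per row, done from the
    // last row to the first so earlier insert positions are not shifted.
    void add_column(int value) {
        std::size_t rows = vector.size() / elems_per_vector;
        for (std::size_t r = rows; r-- > 0;) {
            vector.insert(vector.begin() + (r + 1) * elems_per_vector, value);
        }
        ++elems_per_vector;
    }
};
```

Each insert shifts the tail of the vector, so growing the row length this way is linear in the total size per row; it keeps the layout contiguous at the cost of more expensive structural changes.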

Designating a pointer to a 2D array

If I declare a 2D array
int A[sz][sz];
How can I create a pointer to this object?
I ask because I want to return an array via pointer to a pointer, int**, from a function but I want to build the array without knowing the size beforehand. The size will be passed as an argument. I want to know if there is a way to do this without using dynamic allocation.
The problem is if I do something like int** A inside the function this gives A no information about the size I want.
How can I create the array and then assign a pointer to this array, if it's a 2D array.
I should be more clear. I want return a pointer to a pointer so it wouldn't be a pointer to the 2D array but a something like int**.
Your problem is, that a 2D array in the form int** requires an array of int* for the two step dereferencing, which simply does not exist when you declare an array with int A[sz][sz];.
You can build it yourself like this:
int* pointers[sz];
for(size_t i = sz; i--; ) pointers[i] = A[i];
This might seem absurd, but it is rooted in the way C handles arrays: A[i] is of type int[sz], which is the subarray of row i. But when you use that array in the assignment, it decays to a pointer to the first element of that subarray, which is of type int*. After the loop, A and pointers are two very different things (the type of A is int[sz][sz]).
Sidenote: You say that you want to return this from a function. If your array is allocated on the stack, you must not return a pointer to its data, it will disappear the moment your function returns. You can only return pointers/references to objects that have either static storage or are part of another existing object. If you fail to comply with this, you are likely to get stack corruption.
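A small sketch of the row-pointer table in use, with the lifetime rule respected: the int** is only passed down into a callee while both arrays are still alive, never returned. The function names sum_all and demo_row_pointers are hypothetical.

```cpp
#include <cstddef>

// Hypothetical consumer that expects the int** form.
int sum_all(int** rows, std::size_t n_rows, std::size_t n_cols) {
    int total = 0;
    for (std::size_t i = 0; i < n_rows; ++i)
        for (std::size_t j = 0; j < n_cols; ++j)
            total += rows[i][j];
    return total;
}

int demo_row_pointers() {
    const std::size_t sz = 3;
    int A[sz][sz] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    int* pointers[sz];
    for (std::size_t i = 0; i < sz; ++i) {
        pointers[i] = A[i];  // A[i] decays to int*
    }
    // A and pointers both live until this function returns, so passing
    // the table down is fine; returning it would leave it dangling.
    return sum_all(pointers, sz, sz);
}
```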
Edit:
A little known fact about C is, that you can actually pass around pointers to real C arrays, not just the pointer types that an array decays to. Here is a small program to demonstrate this:
#include <stddef.h>
#include <stdio.h>
int (*foo(int size, int (*bar)[size][size], int y))[] {
return &(*bar)[y];
}
int main() {
int mySize = 30;
int baz[mySize][mySize];
int (*result)[mySize];
result = foo(mySize, &baz, 15);
printf("%zu\n", (size_t)result - (size_t)baz);
}
The expected output of this example program is 1800 (15 rows of 30 ints, assuming a 4-byte int). The important thing is that the actual size of the array must be known, either by being a compile-time constant, or by being passed along with the array pointer (and if it's passed along with the array pointer, the size argument must appear before the array pointer does).
Let me flesh out your question a little bit. You mention:
I ask because I want to return an array [...] from a function but I
want to build the array without knowing the size beforehand. The size
will be passed as an argument. I want to know if there is a way to do
this without using dynamic allocation.
For the "I want to return an array from a function [...] size passed as an argument" part, it seems reasonable to me that you can use std::vector everywhere, and call its .data() method when you need access to the underlying array (which is guaranteed to be contiguous). For example:
std::vector<double> myfun(size_t N) {
std::vector<double> r(N);
// fill r[0], r[1], ..., r[N-1]
return r;
}
// later on:
r.data(); // gives you a pointer to the underlying double[N]
And for the "I want to do this without dynamic allocation" part: that is not possible unless you know the size at compile time. If that is the case, then do exactly as before but use std::array, which can implement optimizations based on the known compile-time size:
std::array<double, N> myfun() {
std::array<double, N> r;
// fill r[0], r[1], ..., r[N-1]
return r;
}
// later on:
r.data(); // gives you a pointer to the underlying double[N]
And to be generic, I would actually use a template function capable of working with arbitrary containers:
template<typename T>
void myfun(T& data) {
for(std::size_t k=0; k<data.size(); k++) {
// do stuff to data[k]
}
}
// call as, for example:
std::vector<double> data(10);
myfun(data);
// or equally valid:
std::array<double, 10> data;
myfun(data);
Finally, if you are working with two-dimensional data, please remember that when you store the Matrix in row-major order that is:
Matrix [1, 2; 3 4] is stored as [1 2 3 4]
then you can refer to element (i, j) of the matrix by calling data[i * ncols + j]. For example: consider a three by four matrix:
a b c d
e f g h
i j k l
The element (2, 2) (that is: third row, third column because we assume zero-based C-type indexing) is calculated as: M[2][2] = M[2 * 4 + 2] = M[10] = k. This is the case because it was stored as:
[a b c d e f g h i j k l]
[0 1 2 3 4 5 6 7 8 9 10 11]
and k is the element with index 10.
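The arithmetic above can be checked at compile time. This is only a sketch; flat_index is a hypothetical helper name, and M holds the 3x4 example matrix from the text.

```cpp
#include <cstddef>

// Row-major offset of element (i, j) in a matrix with ncols columns.
constexpr std::size_t flat_index(std::size_t i, std::size_t j,
                                 std::size_t ncols) {
    return i * ncols + j;
}

// The 3x4 example matrix, stored row by row.
constexpr char M[] = {'a', 'b', 'c', 'd',
                      'e', 'f', 'g', 'h',
                      'i', 'j', 'k', 'l'};

static_assert(flat_index(2, 2, 4) == 10, "element (2, 2) is at offset 10");
static_assert(M[flat_index(2, 2, 4)] == 'k', "and that element is k");
```

Only the number of columns enters the formula; the number of rows is needed for bounds checking but not for addressing.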
The responses to your question are weird. Just do this:
int A[2][2];
int *row = A[0]; // A[0] decays to a pointer to its first element
int **p = &row; // **p == A[0][0], *(*p + 1) == A[0][1]

Passing array with unknown size to function

Let's say I have a function called MyFunction(int myArray[][]) that does some array manipulations.
If I write the parameter list like that, the compiler will complain that it needs to know the size of the array at compile time. Is there a way to rewrite the parameter list so that I can pass an array with any size to the function?
My array's size is defined by two static const ints in a class, but the compiler won't accept something like MyFunction(int myArray[Board::ROWS][Board::COLS]).
What if I could convert the array to a vector and then pass the vector to MyFunction? Is there a one-line conversion that I can use or do I have to do the conversion manually?
In C++ language, multidimensional array declarations must always include all sizes except possibly the first one. So, what you are trying to do is not possible. You cannot declare a parameter of built-in multidimensional array type without explicitly specifying the sizes.
If you need to pass a run-time sized multidimensional array to a function, you can forget about using built-in multidimensional array type. One possible workaround here is to use a "simulated" multidimensional array (1D array of pointers to other 1D arrays; or a plain 1D array that simulates multidimensional array through index recalculation).
In C++ use std::vector to model arrays unless you have a specific reason for using an array.
Example of a 3x2 vector filled with 0's called "myArray" being initialized:
vector< vector<int> > myArray(3, vector<int>(2,0));
Passing this construct around is trivial, and you don't need to screw around with passing length (because it keeps track):
void myFunction(vector< vector<int> > &myArray) {
for(size_t x = 0;x < myArray.size();++x){
for(size_t y = 0;y < myArray[x].size();++y){
cout << myArray[x][y] << " ";
}
cout << endl;
}
}
Alternatively you can iterate over it with iterators:
void myFunction(vector< vector<int> > &myArray) {
for(vector< vector<int> >::iterator x = myArray.begin();x != myArray.end();++x){
for(vector<int>::iterator y = x->begin();y != x->end();++y){
cout << *y << " ";
}
cout << endl;
}
}
In C++0x you can use the auto keyword to clean up the vector iterator solution:
void myFunction(vector< vector<int> > &myArray) {
for(auto x = myArray.begin();x != myArray.end();++x){
for(auto y = x->begin();y != x->end();++y){
cout << *y << " ";
}
cout << endl;
}
}
And in c++0x for_each becomes viable with lambdas
void myFunction(vector< vector<int> > &myArray) {
for_each(myArray.begin(), myArray.end(), [](const vector<int> &x){
for_each(x.begin(), x.end(), [](int value){
cout << value << " ";
});
cout << endl;
});
}
Or a range based for loop in c++0x:
void myFunction(vector< vector<int> > &myArray) {
for(const auto &x : myArray){
for(auto y : x){
cout << y << " ";
}
cout << endl;
}
}
*I am not near a compiler right now and have not tested these, please feel free to correct my examples.
If you know the size of the array at compile time you can do the following (assuming the size is [x][10]):
MyFunction(int myArray[][10])
If you need to pass in a variable length array (dynamically allocated or possibly just a function which needs to take different sizes of arrays) then you need to deal with pointers.
And as the comments to this answer state:
boost::multi_array may be appropriate since it more efficiently models a multidimensional array. A vector of vectors can have performance implications in critical path code, but in typical cases you will probably not notice an issue.
Pass it as a pointer, and take the dimension(s) as an argument.
void foo(int *array, int width, int height) {
// initialize xPos and yPos
assert(xPos >= 0 && xPos < width);
assert(yPos >= 0 && yPos < height);
int value = array[yPos * width + xPos];
}
This is assuming you have a simple two-dimensional array, like int x[50][50].
There are already a set of answers with the most of the common suggestions: using std::vector, implementing a matrix class, providing the size of the array in the function argument... I am only going to add yet another solution based on native arrays --note that if possible you should use a higher level abstraction.
At any rate:
template <std::size_t rows, std::size_t cols>
void function( int (&array)[rows][cols] )
{
// ...
}
This solution uses a reference to the array (note the & and the set of parentheses around array) instead of the pass-by-value syntax. This forces the compiler not to decay the array into a pointer. The two sizes (which could have been provided as compile-time constants) can then be defined as template arguments, and the compiler will deduce the sizes for you.
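A short sketch of that deduction in action. The names element_count and demo_deduction are hypothetical, and the 6x7 size is only an example.

```cpp
#include <cstddef>

// Returns rows * cols, both deduced from the argument's array type.
template <std::size_t rows, std::size_t cols>
std::size_t element_count(int (&array)[rows][cols]) {
    (void)array;  // only the type is needed here
    return rows * cols;
}

std::size_t demo_deduction() {
    int board[6][7] = {};
    return element_count(board);  // rows = 6, cols = 7
}
```

The trade-off is that each distinct pair of sizes instantiates a separate copy of the function, and the sizes must still be known at compile time at the call site.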
NOTE: You mention in the question that the sizes are actually static constants; you should be able to use them in the function signature if you provide the value in the class declaration:
struct test {
static const int rows = 25;
static const int cols = 80;
};
void function( int (*array)[test::cols], int rows ) { // pointer to rows of test::cols ints
// ...
}
Notice that in the signature I prefer to change the double dimension array for a pointer to an array. The reason is that this is what the compiler interprets either way, and this way it is clear that there is no guarantee that the caller of the function will pass an array of exactly 25 lines (the compiler will not enforce it), and it is thus apparent the need for the second integer argument where the caller passes the number of rows.
You can't pass an arbitrary size like that; the compiler doesn't know how to generate the pointer arithmetic. You could do something like:
MyFunction(int myArray[][N])
or you could do:
MyFunction(int *p, int M, int N)
but you'll have to take the address of the first element when you call it (i.e. MyFunction(&arr[0][0], M, N)).
You can get round all of these problems in C++ by using a container class; std::vector would be a good place to start.
The compiler is complaining because it needs to know the size of all but the first dimension to be able to address an element in the array. For instance, in the following code:
int array[M][N];
// ...
array[i][j] = 0;
To address the element, the compiler generates something like the following:
*(array+(i*N+j)) = 0;
Therefore, you need to re-write your signature like this:
MyFunction(int array[][N])
in which case you will be stuck with a fixed dimension, or go with a more general solution such as a (custom) dynamic 2D array class or a vector<vector<int> >.
Use a vector<vector<int> > (this would be cheating if underlying storage was not guaranteed to be contiguous).
Use a pointer to element-of-array (int*) and a size (M*N) parameter. Here be dragons.
First, let's see why the compiler is complaining.
If an array is defined as int arr[ ROWS ][ COLS ]; then any array notation arr[ i ][ j ] can be translated to pointer notation as
*( arr + i * COLS + j )
Observe that the expression requires only COLS, it does not require ROWS. So, the array definition can be written equivalently as
int arr [][ COLS ];
But omitting the second dimension is not acceptable. For a little more detail, read here.
Now, on your question:
Is there a way to rewrite the
parameter list so that I can pass an
array with any size to the function?
Yes, perhaps you can use a pointer, e.g. MyFunction( int * arr );. But, think about it, how would MyFunction() know where to stop accessing the array? To solve that you would need another parameter for the length of the array, e.g. MyFunction( int * arr, size_t arrSize );
Yes: MyFunction(int **myArray);
Careful, though. You'd better know what you're doing. This will only accept an array of int pointers.
Since you're trying to pass an array of arrays, you'll need a constant expression as one of the dimensions:
MyFunction(int myArray[][COLS]);
You'll need to have COLS at compile time.
I suggest using a vector instead.
Pass a pointer and do the indexing yourself or use a Matrix class instead.
yes - just pass it as pointer(s):
MyFunction(int** someArray)
The downside is that you'll probably need to pass the array's lengths as well.
Use MyFunction(int *myArray[])
If you use MyFunction(int **myArray) and pass int someArray[X][Y], the program will crash.
EDIT: Don't use the first line, it's explained in comments.
I don't know about C++, but the C99 standard introduced variable length arrays.
So this would work in a compiler that supports C99:
#include <stdio.h>
void func(int rows, int cols, double matrix[rows][cols]) {
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
printf("%f", matrix[r][c]);
}
}
}
Note that the size arguments come before the array. Really, only the number of columns has to be known, so this would be valid as well:
void func(int rows, int cols, double matrix[][cols])
For three or more dimensions, all but the first dimension must have known sizes. The answer ArunSaha linked to explains why.
Honestly, I don't know whether C++ supports variable-length arrays, so this may or may not work. In either case, you may also consider encapsulating your array in some sort of matrix class.
EDIT: From your edit, it looks like C++ may not support this feature. A matrix class is probably the way to go. (Or std::vector if you don't mind that the memory may not be allocated contiguously.)
Don't pass an array, which is an implementation detail. Pass the Board
MyFunction(Board theBoard)
{
...
}
in reality my array's size is defined by two static const ints in a class, but the compiler won't accept something like MyFunction(int myArray[Board::ROWS][Board::COLS]).
That's strange, it works perfectly fine for me:
struct Board
{
static const int ROWS = 6;
static const int COLS = 7;
};
void MyFunction(int myArray[Board::ROWS][Board::COLS])
{
}
Maybe ROWS and COLS are private? Can you show us some code?
In C++, using the inbuilt array types is instant fail. You could use a boost::/std:: array of arrays or vector of arrays. Primitive arrays are not up to any sort of real use
In C++0x, you can use std::initializer_list<...> to accomplish this:
MyFunction(std::initializer_list<std::initializer_list<int>> myArray);
and use it (I presume) like this (with the range based for syntax):
for (const std::initializer_list<int> &subArray: myArray)
{
for (int value: subArray)
{
// fun with value!
}
}

Multi-dimensional array and pointers in C++?

int *x = new int[5]();
With the above mentality, how should the code be written for a 2-dimensional array - int[][]?
int **x = new int[5][5] () //cannot convert from 'int (*)[5]' to 'int **'
In the first statement I can use:
x[0]= 1;
But the second is more complex and I could not figure it out.
Should I use something like:
x[0][1] = 1;
Or, calculate the real position then get the value
for the fourth row and column 1
x[4*5+1] = 1;
I prefer doing it this way:
int *i = new int[5*5];
and then I just index the array by 5 * row + col.
You can do the initializations separately:
int **x = new int*[5];
for(unsigned int i = 0; i < 5; i++)
x[i] = new int[5];
There is no new[][] operator in C++. You will first have to allocate an array of pointers to int:
int **x = new int*[5];
Then iterate over that array. For each element, allocate an array of ints:
for (std::size_t i = 0; i < 5; ++i)
x[i] = new int[5];
Of course, this means you will have to do the inverse when deallocating: delete[] each element, then delete[] the larger array as a whole.
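Put together, allocation and its inverse might look like the following sketch. The helper names make_rows and free_rows are hypothetical; in new code a std::vector would avoid the manual cleanup entirely.

```cpp
#include <cstddef>

// Allocates each row separately; every row is value-initialized to zero.
int** make_rows(std::size_t n_rows, std::size_t n_cols) {
    int** x = new int*[n_rows];
    for (std::size_t i = 0; i < n_rows; ++i) {
        x[i] = new int[n_cols]();
    }
    return x;
}

// The inverse: delete[] each row first, then the array of pointers.
void free_rows(int** x, std::size_t n_rows) {
    for (std::size_t i = 0; i < n_rows; ++i) {
        delete[] x[i];
    }
    delete[] x;
}
```

With this layout x[0][1] = 1; works as asked in the question, but the rows are separate allocations and are not contiguous with one another.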
This is how you do it:
int (*x)[5] = new int[7][5] ;
I made the two dimensions different so that you can see which one you have to use on the lhs.
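A minimal sketch of using and releasing such a block (demo_single_block is a hypothetical name). Only the first dimension may be a runtime value; the 5 is part of the pointer's type.

```cpp
// Allocates a 7x5 block in one piece, writes one element, and frees it.
int demo_single_block() {
    int (*x)[5] = new int[7][5]();  // 7 rows of int[5], zero-initialized
    x[4][1] = 1;                    // row index 4, column index 1
    int value = x[4][1];
    delete[] x;                     // one delete[] releases the whole block
    return value;
}
```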
If the array has a predefined size, you can simply write:
int x[5][5];
It compiles.
If not, why not use a vector?
There are several ways to accomplish this:
Using the built-in support for flat multidimensional arrays (TonyK's answer, the most relevant to the question IMO). Note that you must preserve the bounds in the array's type everywhere you use it (i.e. all the array sizes, except possibly the first one), and that includes functions that you call, because the generated code will assume a single array. The allocation new int[7][5] causes a single array to be allocated in memory, indexed by the compiler (you can easily write a little program and print the addresses of the slots to convince yourself).
Using arrays of pointers to arrays. The problem with that approach is having to allocate all the inner arrays manually (in loops).
Some people will suggest using std::vectors of std::vectors, but this is inefficient, due to the memory allocation and copying that has to occur when the vectors resize.
Boost has a more efficient version of vectors of vectors in its multi_array lib.
In any case, this question is better answered here:
How do I use arrays in C++?