C++ copy constructor for a class with an array attribute - c++

I am creating a matrix template and ran into a problem writing the copy constructor. While data appears to be copied correctly from within the constructor, the object returned to the main program does not have the correct values (it looks like it's pointing to a different memory address). In my attempts to debug this I tried creating a minimal example, though strangely this did not produce the same error. I get the sense that this issue is either beyond my understanding of C++ or caused by a typo I've somehow missed. Can anyone spot what I've done wrong?
matlib.h
#ifndef MATLIB_H
#define MATLIB_H
#include <iostream>
namespace matlib{
template <typename T>
struct Matrix {
unsigned int rows; //number of rows
unsigned int cols; //number of columns
unsigned int length; //number of elements
T data[]; //contents of matrix
/* Constructors */
Matrix(unsigned int m, unsigned int n) : rows(m), cols(n) {
length = m*n;
T data[m*n];
//::std::cout << "Hello from the null constructor!" << ::std::endl;
//::std::cout << "rows = " << rows << ", cols = " << cols << ", length = " << length << ::std::endl;
}
Matrix(const Matrix<T> &mat) {
rows = mat.rows;
cols = mat.cols;
length = mat.length;
T data[mat.length];
::std::cout << "Hello from the copy constructor!" << ::std::endl;
for (int i = 0; i < length; ++i) {
data[i] = mat.data[i];
::std::cout << "data[" << i << "] = " << data[i] << ", mat.data[" << i << "] = " << mat.data[i] << ::std::endl;
}
}
//Single element indexing and assigment
T& operator() (int i, int j) {
return data[ i*this->cols + j ];
}
T& operator() (unsigned int i, unsigned int j) {
return data[ i*this->cols + j ];
}
//Single element indexing and assigment
T& operator() (int i) {
return data[i];
}
T& operator() (unsigned int i) {
return data[i];
}
};
}
#endif
testscript.cpp
#include <iostream>
#include "matlib.h"
int main() {
float arr[7] = {4, 1, 6, 6, 8, 4, 2};
matlib::Matrix<float> mat1(1,7);
//Assign values and print
std::cout << "mat1 = ";
for (int i = 0; i < mat1.length; ++i) {
mat1(i) = arr[i];
std::cout << mat1(i) << " ";
}
std::cout << "\n" << std::endl;
//Copy the object
matlib::Matrix<float> mat2 = mat1;
//Print the copied values
std::cout << "mat2 = ";
for (int i = 0; i < mat2.length; ++i) {
std::cout << mat2(i) << " ";
}
std::cout << std::endl;
return 0;
}
Console output:
mat1 = 4 1 6 6 8 4 2
Hello from the copy constructor!
data[0] = 4, mat.data[0] = 4
data[1] = 1, mat.data[1] = 1
data[2] = 6, mat.data[2] = 6
data[3] = 6, mat.data[3] = 6
data[4] = 8, mat.data[4] = 8
data[5] = 4, mat.data[5] = 4
data[6] = 2, mat.data[6] = 2
mat2 = 9.80909e-45 1.4013e-45 9.80909e-45 9.80909e-45 4 1 6
I'm sure many people will suggest solutions involving 'std::vector' though this is mostly a learning exercise with HPC in mind. I will likely add bound checking once this is a bit more developed.

Short Version: Just use std::vector. It will make your life easier, and has far fewer pitfalls than a manual approach.
Long Version: You have two major problems in your code:
You are using Flexible-Arrays incorrectly (and this is a compiler extension, not standard C++), and
You are using Variable-Length Arrays incorrectly (and this is also a compiler extension, not standard C++)
1. Flexible Array Members
The first compiler extension you use is a feature from C called flexible array members:
struct Matrix {
...
T data[];
// ^~~~~~~~~
C allows an array of unspecified size at the end of a struct to denote objects that may be given dynamic sizes at runtime when allocated by malloc. This is not, however, valid standard C++, and it is ill-advised to use it since it does not fit into C++'s allocator model at all.
This should be changed out for something more coherent.
2. Variable Length Arrays
The second extension you are using is also from C, called variable-length arrays:
Matrix(unsigned int m, unsigned int n) : rows(m), cols(n) {
...
T data[m*n];
This is also not valid standard C++. You cannot construct an array whose size is a runtime value in standard C++, full stop. Array sizes must be known at compile time.
Additionally, and this is where you are experiencing the problem, T data[m*n] creates a new VLA named data that shadows the flexible array member also named data. So in each function where you write T data[m*n] or T data[mat.length], you are actually creating a new local array, writing to it, and then doing nothing with it. This is why you are seeing different addresses.
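The shadowing effect can be reproduced in standard C++ without either compiler extension. The following is a minimal sketch using hypothetical types (Shadowed, Fixed, and the helper shadowing_demo are illustrative names, not part of the question's code): a local variable declared in the constructor body hides the member of the same name, so the member is never written.

```cpp
#include <cassert>

// Hypothetical type: the constructor body declares a NEW local named
// `value`, which hides the member, so the member keeps its default.
struct Shadowed {
    int value = 0;
    explicit Shadowed(int v) {
        int value = v;  // local variable, not this->value
        (void)value;    // suppress the unused-variable warning
    }
};

// Hypothetical fix: initialize the member itself in the initializer list.
struct Fixed {
    int value = 0;
    explicit Fixed(int v) : value(v) {}
};

// Small self-check showing the difference.
inline void shadowing_demo() {
    assert(Shadowed(7).value == 0);  // member was never touched
    assert(Fixed(7).value == 7);     // member holds the argument
}
```

The same thing happens in the question: the local T data[...] arrays receive the values, and the member is left alone.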
Suggested Fixes
Use heap memory, perhaps with std::unique_ptr to manage things for you. Allocate the size on construction, clone it on copy.
// Construction
Matrix(unsigned int m, unsigned int n) : rows(m), cols(n), data(std::make_unique<T[]>(m * n))
// where 'data' is std::unique_ptr<T[]>
{ ... }
This will then require a custom copy constructor:
Matrix(const Matrix& other) : rows(other.rows), cols(other.cols), data(std::make_unique<T[]>(rows * cols)){
// Copy all elements from 'other' to 'data'
std::copy_n(other.data.get(), rows * cols, data.get());
}
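Put together, the unique_ptr approach might look roughly like the sketch below. It assumes C++14 (for std::make_unique), and the name UniqueMatrix as well as the copy-assignment strategy are illustrative choices rather than anything prescribed by the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical name; a sketch of a matrix that owns its elements
// through std::unique_ptr<T[]>.
template <typename T>
struct UniqueMatrix {
    std::size_t rows;
    std::size_t cols;
    std::unique_ptr<T[]> data;

    // make_unique<T[]> value-initializes the elements (zeros for float).
    UniqueMatrix(std::size_t m, std::size_t n)
        : rows(m), cols(n), data(std::make_unique<T[]>(m * n)) {}

    // Deep copy: allocate a fresh buffer, then copy every element.
    UniqueMatrix(const UniqueMatrix& other)
        : rows(other.rows), cols(other.cols),
          data(std::make_unique<T[]>(other.rows * other.cols)) {
        std::copy_n(other.data.get(), rows * cols, data.get());
    }

    // Copy assignment: build a temporary copy, then move it in.
    UniqueMatrix& operator=(const UniqueMatrix& other) {
        if (this != &other) {
            UniqueMatrix tmp(other);
            *this = std::move(tmp);
        }
        return *this;
    }

    UniqueMatrix(UniqueMatrix&&) noexcept = default;
    UniqueMatrix& operator=(UniqueMatrix&&) noexcept = default;

    // Row-major element access with a debug-only bounds check.
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }
};
```

No destructor is needed, because the unique_ptr releases the buffer on its own.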
Or, better yet:
Use std::vector. It already knows how to manage lifetime and saves you from a number of pitfalls. If you already know the max size of the vector, you can just use resize or reserve+push_back, which saves reallocation costs.
Matrix(unsigned int m, unsigned int n) : rows(m), cols(n), data(m * n)
// where 'data' is std::vector<T>
{ ... }
Using std::vector you can just do:
Matrix(const Matrix& other) = default;
in your class declaration, and it will use std::vector's underlying copy constructor. This is a far better approach IMO.
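As a rough illustration of the std::vector variant, here is a sketch with a hypothetical VecMatrix type that mirrors the question's interface. It declares no copy constructor at all and relies on the compiler-generated one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical name; the vector member owns the elements.
template <typename T>
struct VecMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<T> data;  // m * n elements, value-initialized to T()

    VecMatrix(std::size_t m, std::size_t n) : rows(m), cols(n), data(m * n) {}

    // No copy constructor, assignment operator, or destructor declared:
    // the implicitly generated ones copy `data` element by element.

    // Two-index access, row-major.
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }

    // Single-index access into the flat storage.
    T& operator()(std::size_t i) {
        assert(i < data.size());
        return data[i];
    }

    std::size_t length() const { return data.size(); }
};
```

With this layout, matlib::Matrix<float> mat2 = mat1; from the question's test script would produce an independent copy of all seven values.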
A separate note on "High-Performance Computing"
I encourage you to not shy away from containers like std::vector purely for the purpose of HPC.
To be blunt, developers are notoriously bad at determining what is, and is not, good for performance. The underlying hardware plays the biggest role with factors like speculative execution, branch prediction, instruction pipelining, and cache locality. Heap memory and a few extra byte-copies are usually the least of your worries unless you are repeatedly growing the container in a very tight loop.
On the contrary, heap memory is easy to move around (e.g. move constructors), since it's a pointer-copy, whereas buffer storage would be copied in totality even for moves. Additionally, C++17 introduces polymorphic allocators with different options for where memory resources come from, allowing for far faster allocation options (e.g. a virtual memory resource that allocates full pages for std::vector).
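The difference between moving heap storage and moving inline storage can be observed directly. The sketch below uses two hypothetical helper functions: the first checks that a moved-to std::vector takes over the original heap buffer, the second that a std::array, whose elements live inside the object, ends up with its own copy of every element.

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <vector>

// Returns true if move-constructing a vector handed over the original
// heap buffer (a pointer transfer, no element copies).
inline bool vector_move_reuses_buffer() {
    std::vector<int> source(1000, 7);
    const int* buffer = source.data();
    std::vector<int> target = std::move(source);
    return target.data() == buffer && target.size() == 1000;
}

// Returns true if "moving" a std::array produced separate storage:
// the elements are stored inline, so each one is copied.
inline bool array_move_copies_elements() {
    std::array<int, 1000> source{};
    source.fill(7);
    std::array<int, 1000> target = std::move(source);
    return target.data() != source.data() && target[999] == 7;
}
```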
Even where performance matters: Try a solution and profile before trying to optimize it. Don't waste effort up front, because the results may surprise you. Sometimes doing more work can result in faster code in the right conditions.

I'd suggest moving your dynamic (de)allocation code into dedicated protected methods. It will help you avoid memory leaks, double frees, and useless reallocations, and keep your constructors more readable.
template <typename T>
struct Matrix {
std::size_t rows{0}, cols{0};
size_t capacity{0};
T* data{nullptr};
Matrix() = default;
Matrix(size_t rows, size_t cols): Matrix()
{
this->allocate(rows * cols);
this->rows = rows;
this->cols = cols;
}
Matrix(const Matrix<T>& other): Matrix()
{
*this = other;
}
Matrix& operator=(const Matrix<T>& other)
{
if (this != &other) {
this->allocate(other.length());
std::copy_n(other.data, other.length(), data);
this->rows = other.rows;
this->cols = other.cols;
}
return *this;
}
~Matrix()
{
this->release();
}
size_t length() const { return rows * cols; }
// Access
T& operator()(size_t row, size_t col) { return data[row * cols + col]; }
const T& operator()(size_t row, size_t col) const { return data[row * cols + col]; }
protected:
void allocate(size_t reqLength)
{
if (data && capacity >= reqLength) return;
this->release();
data = new T [reqLength];
capacity = reqLength;
}
void release()
{
if (data) delete [] data;
data = nullptr;
capacity = 0;
}
};

Related

cannot use push_back to insert an integer to a 1D/2D vector

I am trying to write a function to extract a slice from a given matrix, where the input is 1D and the slice can be 1D or 2D.
I am trying to use the push_back function for this purpose but for some reasons the push_back does not work.
I receive an error in my line OutPut.push_back(DumyValue);
Can anyone help me why I am receiving this error?
Also, it would be appreciated if you can tell me how to solve this issue.
Also, if the first part becomes clear, can anyone tell me how I should use the push_back for inserting an integer in a specific location so I can use it for extracting a 2D slice?
If you remove the line OutPut.push_back(DumyValue); the code should work.
#include<iostream>
#include<vector>
using namespace std;
int MatrixSlice(vector<vector<int>> Input, int Row1, int Row2, int Col1, int Col2) {
//define the slice size, if it is iD or 2D
if (abs(Row1-Row2)>1 && abs(Col1-Col2)>1){
vector<vector<int>> OutPut;
}else{
vector<int> OutPut;
}
int i2;
int j2;
for (int i = Row1; i <= Row2; i++) {
i2=0;
for (int j = Col1; j <= Col2; j++) {
int DumyValue=Input[i][j];
OutPut.push_back(DumyValue);
i2++;
//cout << Input[i][j] << endl;
}
j2++;
}
return 0;
}
int main() {
//Define a matrix for test:
vector<vector<int>> Matrix2(4, vector<int>(5, 1));
int R = 4;
int C = 4;
vector<vector<int>> MatrixInput(R, vector<int>(C, 1));;
for (int i = 0; i < MatrixInput.size(); i++) {
for (int j = 0; j < MatrixInput[0].size(); j++) {
int temp;
temp = i^2+j^2;
MatrixInput[i][j] = temp;
}
}
MatrixSlice(MatrixInput, 0, 3, 1, 1);
printf("\n");
return 0;
}
Matrix slice has a couple problems:
It is impossible to define a variable with two possible types and have both active in the same scope.
The return type of int makes little sense. The matrix is sliced up, but then what? It can't be handed back to the caller to do anything with it.
This can be fixed with a union, but yikes! The bookkeeping on that will be a Smurfing nightmare. Don't do it!
The next is to always use a vector of vectors, but I don't like that idea for a couple reasons I'll get into below.
Instead I pitch a simple wrapper object around a single vector. This is done for two reasons:
It preserves the ability to back a 1 dimensional matrix with a 1 dimensional container. If you have many rows of one column, all of the row data remains contiguous and cache friendly.
It tends to be much faster. The data of one vector is contiguous in memory and reaps the rewards of cache friendliness. A vector of vectors is basically a list of pointers to arrays of data, sending the poor CPU on an odyssey of pointer-chasing through memory to find the columns. If the columns are short, this can really, really hurt performance.
Here we go:
template<class TYPE>
class Matrix
{
private:
size_t mNrRows; // note size_t. This is unsigned because there is no reason
// for a matrix with a negative size. size_t is also guaranteed
// to fit anything you can throw at it.
size_t mNrColumns;
std::vector<TYPE> mVec;
public:
// make a default-initialized matrix
Matrix(size_t nrRows, size_t nrColumns) :
mNrRows(nrRows), mNrColumns(nrColumns), mVec(mNrRows * mNrColumns)
{
}
// make a def-initialized matrix
Matrix(size_t nrRows, size_t nrColumns, TYPE def) :
mNrRows(nrRows), mNrColumns(nrColumns), mVec(mNrRows * mNrColumns,
def)
{
}
// gimme a value and allow it to be changed
TYPE & operator()(size_t row, size_t column)
{
// could check for out of bounds and throw an exception here
return mVec[row * mNrColumns + column];
}
//gimme a value and do not allow it to be changed
TYPE operator()(size_t row, size_t column) const
{
return mVec[row * mNrColumns + column];
}
// gimme the number of rows
size_t getRows() const
{
return mNrRows;
}
// gimmie the number of columns.
size_t getColumns() const
{
return mNrColumns;
}
// printing convenience
friend std::ostream & operator<<(std::ostream & out, const Matrix & mat)
{
int count = 0;
for (TYPE val: mat.mVec)
{
out << val;
if (++count == mat.mNrColumns)
{
out << '\n';
count = 0;
}
else
{
out << ' ';
}
}
return out;
}
};
The vector member handles all of the heavy lifting so the Rule of Zero recommends leaving the copy and move constructors, assignment operators, and destructor up to the compiler.
What does this do to MatrixSlice? Well, first it now receives and returns a Matrix instead of vector<vector> and int. The insides use Matrix and the confusion about 1D or 2D is just plain gone, resulting in a simpler function.
Matrix<int> MatrixSlice(const Matrix<int> & Input,
int Row1,
int Row2,
int Col1,
int Col2)
{
Matrix<int> OutPut(Row2-Row1 + 1,
Col2-Col1 + 1); // but what if Row1 > Row2?
int i2;
int j2= 0; // definitely need to initialize this sucker.
for (int i = Row1; i <= Row2; i++) // logical problem here: What if Row2 >= input.getRows()?
{
i2 = 0;
for (int j = Col1; j <= Col2; j++) // similar problem here
{
int DumyValue = Input(i, j);
OutPut(j2, i2) = DumyValue;
i2++;
}
j2++;
}
return OutPut;
}
Note that this completely ignores the very logical option of making slice a Matrix method. While it makes sense, it doesn't need to be a method and the stock recommendation is to prefer a free function. One good improvement is to make the function a template so that it can handle all sorts of Matrix in addition to Matrix<int>.
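A templated version of the slice function could look something like the sketch below. The cut-down Matrix is repeated only so that the snippet compiles on its own. The switch to size_t indices and the assert-based range checks are assumptions about how the open questions in the comments (Row1 > Row2, Row2 past the end) might be handled, not the only reasonable answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for the Matrix wrapper, reduced to what the slice needs.
template <class T>
class Matrix {
    std::size_t mNrRows, mNrColumns;
    std::vector<T> mVec;
public:
    Matrix(std::size_t r, std::size_t c) : mNrRows(r), mNrColumns(c), mVec(r * c) {}
    T& operator()(std::size_t r, std::size_t c) { return mVec[r * mNrColumns + c]; }
    const T& operator()(std::size_t r, std::size_t c) const { return mVec[r * mNrColumns + c]; }
    std::size_t getRows() const { return mNrRows; }
    std::size_t getColumns() const { return mNrColumns; }
};

// Copies rows row1..row2 and columns col1..col2 (both inclusive).
template <class T>
Matrix<T> MatrixSlice(const Matrix<T>& input,
                      std::size_t row1, std::size_t row2,
                      std::size_t col1, std::size_t col2)
{
    // Assumed policy: reject reversed or out-of-range bounds in debug builds.
    assert(row1 <= row2 && row2 < input.getRows());
    assert(col1 <= col2 && col2 < input.getColumns());

    Matrix<T> output(row2 - row1 + 1, col2 - col1 + 1);
    for (std::size_t i = row1; i <= row2; ++i) {
        for (std::size_t j = col1; j <= col2; ++j) {
            output(i - row1, j - col1) = input(i, j);
        }
    }
    return output;
}
```

Computing the destination indices as i - row1 and j - col1 removes the need for the separate i2 and j2 counters.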
And finally, what happens to main?
int main()
{
//Define a matrix for test:
Matrix<int> Matrix2(4, 5, 1); // initialize matrix to all 1s
int R = 4;
int C = 4;
Matrix<int> MatrixInput(R, C); // default initialize the matrix
for (int i = 0; i < MatrixInput.getRows(); i++)
{
for (int j = 0; j < MatrixInput.getColumns(); j++)
{
int temp;
temp = i ^ 2 + j ^ 2;
// WARNING: ^ is XOR, not exponent. Maybe OP wants i XOR 2, but not
// likely. But if XOR is the desired operation, there is a lurking
// order of operation bug that needs to be addressed
MatrixInput(i, j) = temp;
}
}
std::cout << MatrixInput << '\n';
std::cout << MatrixSlice(MatrixInput, 0, 3, 1, 1);
return 0;
}
In your code
if (abs(Row1-Row2)>1 && abs(Col1-Col2)>1){
vector<vector<int> > OutPut;
// OutPut dies here
}else{
vector<int> OutPut;
// OutPut dies here
}
// here is no OutPut
OutPut lives only until the end of the block of the if statement in which it is declared.
Either declare it before the if statement, or move all of the code that uses it inside the same block.
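For the 1D case, declaring the vector at function scope is enough to make push_back work. The sketch below assumes the slice should be flattened row by row into a single vector<int> and returned to the caller. FlatSlice is a hypothetical name, and the const reference parameter is a change from the original signature to avoid copying the input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: collects the requested slice into one flat vector.
std::vector<int> FlatSlice(const std::vector<std::vector<int>>& input,
                           std::size_t row1, std::size_t row2,
                           std::size_t col1, std::size_t col2)
{
    assert(row1 <= row2 && row2 < input.size());
    std::vector<int> output;  // declared at function scope, visible in the loops
    for (std::size_t i = row1; i <= row2; ++i) {
        assert(col1 <= col2 && col2 < input[i].size());
        for (std::size_t j = col1; j <= col2; ++j) {
            output.push_back(input[i][j]);
        }
    }
    return output;  // handed back to the caller instead of returning 0
}
```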

Selective shallow copy from one array to another

Assuming I have two arrays of different sizes, i.e.
int arr[] = {0,1,2,3,4,5,6,7,8,9};
int *arr2 = new int[5];
I want to shallow copy some of them,
Deep copy equivalent would be
int j =0;
if(!(i%2))
{
arr2[j]=arr[i];
j++;
}
Right now a print of arr2 will output : 0, 2, 4, 6 ,8
The reason I want to shallow copy is because I want arr2 to update with any changes to arr.
That is if I loop and square all the elements in arr
I want arr2 to output : 0, 4, 16, 36 ,64
These two arrays are part of the same class; one is my polygonal information, and the other is data driven. arr is actually 4000+ elements in size, and arr2 is close to 3000. At the moment my algorithm works great with a deep copy, but because I need to deep copy 3000 elements per update frame, I am wasting resources and was wondering if I could somehow do this via a shallow copy so I don't have to update arr2 every frame. The way my code needs it to work, arr actually has repeated values of arr2. arr2 is a list of points that is animated. The data is then duplicated to arr, which holds the positional data for vertices. This is because arr contains multiple Bezier patches, some of which share one or more edges with another patch, but I want that to be ignored when animating, or else there are breaks in the surface.
It is important that the copy involves indices like
arr2[j]=arr[i];
because that is how my code is setup.
And that the operation be low load.
You will need an array of integer pointers for that.
int *arr2[5];
for (int i = 0, j = 0; i < 10; i++) {
if (!(i%2)) {
arr2[j]= &arr[i];
j++;
}
}
So you need to set each element of arr2 to point to the corresponding element in arr with arr2[j] = &arr[i];
When you need to access an element through arr2, you write something like: int a = *arr2[j];
Later on, let's say you change arr[0] to 10 with arr[0] = 10; then int a = *arr2[0]; will give you 10.
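To tie this back to the squaring example from the question, here is a small sketch. The helper even_element_pointers and the demo function are hypothetical names. It builds the pointer array once, squares arr in place, and then reads the updated values through the pointers without copying anything again.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one pointer per even-indexed element of `arr`.
// The pointers are valid only while `arr` is alive and stays in place.
inline std::vector<int*> even_element_pointers(int* arr, std::size_t length) {
    std::vector<int*> pointers;
    for (std::size_t i = 0; i < length; i += 2) {
        pointers.push_back(&arr[i]);
    }
    return pointers;
}

inline void squares_demo() {
    int arr[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    std::vector<int*> arr2 = even_element_pointers(arr, 10);

    for (int& value : arr) value *= value;  // update the source array only

    // Reading through arr2 now gives 0, 4, 16, 36, 64.
    assert(*arr2[0] == 0 && *arr2[1] == 4 && *arr2[2] == 16);
    assert(*arr2[3] == 36 && *arr2[4] == 64);
}
```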
As an alternative to the pointer array approach, here's a crude C++03 example of how to do this programmatically. Which one is better depends on how complex the operator[] here needs to be in the real use case, and how much smaller the second array is (i.e. how much extra memory it needs, causing cache misses, etc.).
#include <iostream>
class c_array_view {
public:
c_array_view(int *array) : array_(array) {}
int& operator[](size_t index) { return array_[index*2]; }
static size_t convert_length(size_t original) { return original / 2; }
private:
int *array_;
};
int main()
{
int arr[] = {0,1,2,3,4,5,6,7,8,9};
size_t arr_len = sizeof arr / sizeof arr[0];
c_array_view arr2(arr);
size_t arr2_len = arr2.convert_length(arr_len);
for(unsigned i = 0; i < arr_len; ++i) {
std::cout << "arr: " << i << " = " << arr[i] << std::endl;
}
std::cout << std::endl;
for(unsigned j = 0; j < arr2_len; ++j) {
std::cout << "arr2: " << j << " = " << arr2[j] << std::endl;
}
std::cout << std::endl;
arr2[2] = 42;
std::cout << "modified arr2[2] to 42, now arr[4] = " << arr[4] << std::endl;
return 0;
}
The c_array_view could be turned into a template, a nice general purpose class which would take the mapping function as a C++11 lambda, etc, this just demonstrates the principle.
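One possible shape for that generalization is sketched below. It assumes C++11, and the names mapped_view, make_mapped_view and mapped_view_demo are made up for illustration. The view stores a base pointer plus any callable that maps a view index to an index in the underlying array.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical generic view: `IndexMap` is any callable taking a view
// index and returning the corresponding index in the underlying array.
template <class T, class IndexMap>
class mapped_view {
public:
    mapped_view(T* base, IndexMap map) : base_(base), map_(map) {}
    T& operator[](std::size_t index) { return base_[map_(index)]; }
private:
    T* base_;
    IndexMap map_;
};

// Factory function so the lambda's type can be deduced.
template <class T, class IndexMap>
mapped_view<T, IndexMap> make_mapped_view(T* base, IndexMap map) {
    return mapped_view<T, IndexMap>(base, map);
}

inline void mapped_view_demo() {
    int arr[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    // Same mapping as c_array_view: view index j refers to arr[2 * j].
    auto evens = make_mapped_view(arr, [](std::size_t j) { return j * 2; });
    evens[2] = 42;
    assert(arr[4] == 42);
}
```

The view performs no bounds checking, so the caller has to make sure the mapped index stays inside the underlying array.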
If you want squares then you should not do arr2[j]=arr[i]. The correct answer would be
arr2[j]=arr[i]*arr[i];

Declaring and allocating a 2d array in C++

I am a Fortran user and do not know C++ well enough. I need to make some additions into an existing C++ code. I need to create a 2d matrix (say A) of type double whose size (say m x n) is known only during the run. With Fortran this can be done as follows
real*8, allocatable :: A(:,:)
integer :: m, n
read(*,*) m
read(*,*) n
allocate(a(m,n))
A(:,:) = 0.0d0
How do I create a matrix A(m,n), in C++, when m and n are not known at the time of compilation? I believe the operator new in C++ can be useful but I am not sure how to implement it with doubles. Also, when I use the following in C++
int * x;
x = new int [10];
and check the size of x using sizeof(x)/sizeof(x[0]), I do not have 10, any comments why?
To dynamically allocate a structure similar to a 2D array, use the following pattern.
#include <iostream>
int main()
{
int m, n;
std::cout << "Enter the number of rows: ";
std::cin >> m;
std::cout << "Enter the number of columns: ";
std::cin >> n;
double **a = new double * [m];
for ( int i = 0; i < m; i++ ) a[i] = new double[n]();
//...
for ( int i = 0; i < m; i++ ) delete []a[i];
delete []a;
}
Also you can use class std::vector instead of the manually allocated pointers.
#include <iostream>
#include <vector>
int main()
{
int m, n;
std::cout << "Enter the number of rows: ";
std::cin >> m;
std::cout << "Enter the number of columns: ";
std::cin >> n;
std::vector<std::vector<double>> v( m, std::vector<double>( n ) );
//...
}
As for this code snippet
int * x;
x = new int [10];
then x has type int * and x[0] has type int. So if the size of the pointer is equal to 4 and the size of an object of type int is also equal to 4, then sizeof( x ) / sizeof( x[0] ) will yield 1. Pointers do not keep the information whether they point to only a single object or to the first object of some sequence of objects.
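The contrast between a true array and a pointer can be checked with a few lines. In this sketch, array_length and sizeof_demo are hypothetical helpers. The sizeof idiom works for the array because its type includes the element count, while for the pointer sizeof only reports the size of the pointer itself.

```cpp
#include <cassert>
#include <cstddef>

// Deduces the element count of a true array from its type.
// It does not compile when handed a plain pointer.
template <class T, std::size_t N>
constexpr std::size_t array_length(T (&)[N]) { return N; }

inline void sizeof_demo() {
    int fixed[10] = {};
    int* dynamic = new int[10]();

    // The array type carries its size, so both approaches give 10.
    assert(sizeof(fixed) / sizeof(fixed[0]) == 10);
    assert(array_length(fixed) == 10);

    // The pointer only knows its own size, whatever it points to.
    assert(sizeof(dynamic) == sizeof(int*));

    delete[] dynamic;
}
```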
I would recommend using std::vector and avoid all the headache of manually allocating and deallocating memory.
Here's an example program:
#include <iostream>
#include <vector>
typedef std::vector<double> Row;
typedef std::vector<Row> Matrix;
void testMatrix(int M, int N)
{
// Create a row with all elements set to 0.0
Row row(N, 0.0);
// Create a matrix with all elements set to 0.0
Matrix matrix(M, row);
// Test accessing the matrix.
for ( int i = 0; i < M; ++i )
{
for ( int j = 0; j < N; ++j )
{
matrix[i][j] = i+j;
std::cout << matrix[i][j] << " ";
}
std::cout << std::endl;
}
}
int main()
{
testMatrix(10, 20);
}
The formal C++ way of doing it would be this:
std::vector<std::vector<int>> a;
This creates a container which contains a zero-size set of sub-containers. C++11 provides std::array for fixed-size containers, but you specified runtime sizing.
We now have to impart our dimensions on this, and unfortunately that takes more than one step. Let's size the top level:
a.resize(10);
(you can also push or insert elements)
What we now have is a vector of 10 vectors. Unfortunately, they are all independent, so you would need to:
for (size_t i = 0; i < a.size(); ++i) {
a[i].resize(10);
}
We now have a 10x10. We can also use vectors constructor:
std::vector<std::vector<int>> a(xSize, std::vector<int>(ySize)); // assuming you want a[x][y]
Note that vectors are fully dynamic, so we can resize elements as we need:
a[1].push_back(10); // push value '10' onto a[1], creating an 11th element in a[1]
a[2].erase(a[2].begin() + 2); // remove element 2 from a[2], reducing a[2]s size to 9
To get the size of a particular slot:
a.size(); // returns 10
a[1].size(); // returns 11 after the above
a[2].size(); // returns 9 after the above.
Unfortunately C++ doesn't provide a strong, first-class way to allocate an array that retains size information. But you can always create a simple C-style array on the stack:
int a[10][10];
std::cout << "sizeof a is " << sizeof(a) <<'\n';
But using an allocator, that is placing the data onto the heap, requires /you/ to track size.
int* pointer = new int[10];
At this point, "pointer" is a numeric value: the location in memory where the first of your 10 consecutive integer storage spaces is located. (If not enough memory is available, plain new throws std::bad_alloc instead of returning zero; only the std::nothrow form returns a null pointer.)
The use of the pointer decorator syntax tells the compiler that this integer value will be used as a pointer to store addresses and so allow pointer operations via the variable.
The important thing here is that all we have is an address, and the original C standard didn't specify how the memory allocator would track size information, and so there is no way to retrieve the size information. (OK, technically there is, but it requires using compiler/os/implementation specific information that is subject to frequent change)
These integers must be treated as a single object when interfacing with the memory allocation system -- you can't, for example:
delete pointer + 5;
to delete the 5th integer. They are a single allocation unit; this notion allows the system to track blocks rather than individual elements.
To delete an array, the C++ syntax is
delete[] pointer;
To allocate a 2-dimensional array, you will need to either:
Flatten the array and handle sizing/offsets yourself:
static const size_t x = 10, y = 10;
int* pointer = new int[x * y];
pointer[0] = 0; // position 0, the 1st element.
pointer[x * 1] = 0; // pointer[1][0]
or you could use
int access_2d_array_element(int* pointer, const size_t xSize, const size_t ySize, size_t x, size_t y)
{
assert(x < xSize && y < ySize);
return pointer[y * xSize + x];
}
That's kind of a pain, so you would probably be steered towards encapsulation:
class Array2D
{
int* m_pointer;
const size_t m_xSize, m_ySize;
public:
Array2D(size_t xSize, size_t ySize)
: m_pointer(new int[xSize * ySize])
, m_xSize(xSize)
, m_ySize(ySize)
{}
int& at(size_t x, size_t y)
{
assert(x < m_xSize && y < m_ySize);
return m_pointer[y * m_xSize + x];
}
// total number of elements.
size_t arrsizeof() const
{
return m_xSize * m_ySize;
}
// total size of all data elements, in bytes.
// (named sizeInBytes because sizeof is a keyword and cannot be a method name)
size_t sizeInBytes() const
{
// this sizeof syntax makes the code more generic.
return arrsizeof() * sizeof(*m_pointer);
}
~Array2D()
{
delete[] m_pointer;
}
};
Array2D a(10, 10);
a.at(1, 3) = 13;
int x = a.at(1, 3);
Or,
For each dimension except the last, allocate an array of pointers, allocating actual ints only for the final dimension.
const size_t xSize = 10, ySize = 10;
int** pointer = new int*[xSize]; // the first level of indirection.
for (size_t i = 0; i < xSize; ++i) {
pointer[i] = new int[ySize];
}
pointer[0][0] = 0;
for (size_t i = 0; i < xSize; ++i) {
delete[] pointer[i];
}
delete[] pointer;
This last is more-or-less doing the same work, it just creates more memory fragmentation than the former.
-----------EDIT-----------
To answer the question "why do I not have 10": "x" is a pointer-to-int, not an array of 10 ints, so sizeof(x) is the size of one pointer. You're probably compiling in 64-bit mode, where pointers are 64 bits long while ints are 32 bits, so sizeof(x)/sizeof(x[0]) gives 2.
The C++ equivalent of your Fortran code is:
int cols, rows;
if ( !(std::cin >> cols >> rows) ) {
// error handling...
}
std::vector<double> A(cols * rows);
To access an element of this array you would need to write A[r * cols + c] (or you could do it in a column-major fashion, that's up to you).
The element access is a bit clunky, so you could write a class that wraps up holding this vector and provides a 2-D accessor method.
In fact your best bet is to find a free library that already does this, instead of reinventing the wheel. There isn't a standard Matrix class in C++, because somebody would always want a different option (e.g. some would want row-major storage, some column-major, particular operations provided, etc. etc.)
Someone suggested boost::multi_array; that stores all its data contiguously in row-major order and is probably suitable. If you want standard matrix operations consider something like Eigen, again there are a lot of alternatives out there.
If you want to roll your own then it could look like:
struct FortranArray2D // actually easily extensible to any number of dimensions
{
FortranArray2D(size_t n_cols, size_t n_rows)
: n_cols(n_cols), n_rows(n_rows), content(n_cols * n_rows) { }
double &operator()(size_t col, size_t row)
{ return content.at(col * n_rows + row); } // column-major, as in Fortran
void resize(size_t new_cols, size_t new_rows)
{
FortranArray2D temp(new_cols, new_rows);
// insert some logic to move values from old to new...
*this = std::move(temp);
}
private:
size_t n_rows, n_cols;
std::vector<double> content;
};
Note in particular that by avoiding new you avoid the thousand and one headaches that come with manual memory management. Your class is copyable and movable by default. You could add further methods to replicate any functionality that the Fortran array has which you need.
int ** x;
x = new int* [10];
for(int i = 0; i < 10; i++)
x[i] = new int[5];
Unfortunately you'll have to store the size of matrix somewhere else.
C/C++ won't do it for you. sizeof() works only when the compiler knows the size, which is not the case for dynamic arrays.
And if you want to achieve it with something safer than dynamic arrays:
#include <vector>
// ...
std::vector<std::vector<int>> vect(10, std::vector<int>(5));
vect[3][2] = 1;

Infinite array C++ resizing the array with two new values in one expression

I have started implementing infinite array using templates in C++. Adding integers works well except one particular situation where I add two new items in one expression which required two resizes one after another (cf. below).
#include <iostream>
#include <cstddef>
#include <new>
#include <string.h>
template <typename T>
struct infinite_array {
infinite_array();
auto operator[](unsigned long long idx) -> T&;
auto size() const -> unsigned long long;
void resize(unsigned long long idx);
private:
T *data;
unsigned long long array_length;
};
template <typename T>
void infinite_array<T>::resize(unsigned long long idx)
{
std::cout << "Resize with idx " << idx << std::endl;
T* temp = new T[idx];
memset(temp, 0, sizeof(T) * idx);
for (int i = 0; i < array_length; ++i) {
temp[i] = data[i];
std::cout << temp[i] << " ";
}
std::cout << std::endl;
//std::copy(data, data+size(), temp);
delete [] data;
data = temp;
array_length = idx;
}
template <typename T>
infinite_array<T>::infinite_array()
{
data = NULL;
array_length = 0;
}
template <typename T>
auto infinite_array<T>::size() const -> unsigned long long {
//array_length = sizeof(data)/sizeof(T);
return array_length;
}
template <typename T>
auto infinite_array<T>::operator[](unsigned long long idx) -> T& {
//std::cout << "Accessing element at idx " << idx << std::endl;
if (idx+1 > size()) {
resize(idx+1);
}
return data[idx];
}
int main() {
infinite_array<int> ar;
for (int i = 0; i < 10; ++i) {
ar[i] = i;
}
// PROBLEM: ONLY ar[31] is initialized successfully to 10
ar[30] = ar[31] = 10;
for (int i = 0; i < ar.size(); ++i)
std::cout << ar[i] << ' ';
std::cout << std::endl;
return 0;
}
I'm afraid, there is no way to fix your problem, because you cannot control the order in which the calls to operator[] are made. And this is the problem:
Your compiler chooses to evaluate ar[30] first, which will resize your array and return a reference to one of its elements.
After that, ar[31] is evaluated, the array is resized again, and another reference to one of the new array's elements is returned. The old reference still points to the element in the old array (which has been deleted!).
Finally, your compiler performs the assignments, assigning 10 to both elements. But since one of these elements lives in the old, deleted array, you don't see it in the new array.
The simple truth is that you must not chain calls to your operator[] like this: before C++17 the compiler is allowed to evaluate the operands of an assignment in any order, and you can't work around that. (C++17 changed this: the right operand of an assignment is now sequenced before the left one. Relying on that with a resizing operator[] is still fragile.)
Aside: It is generally a bad idea to resize any buffer on a one by one basis, the complexity of this is quadratic. Typical code uses increments of at least a factor 2. The precise factor is not so relevant, relevant is that you use a factor, because then you cut the complexity down to O(n). The value 2 is just a good tradeoff between space and time overhead.
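The effect of the growth factor is easy to see by counting reallocations. The sketch below uses two hypothetical helpers and a doubling policy chosen for illustration. It simulates appending n elements one at a time, once growing the buffer by exactly one slot each time and once growing it geometrically.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy: keep doubling until `required` elements fit.
inline std::size_t grown_capacity(std::size_t current, std::size_t required) {
    std::size_t capacity = current == 0 ? 1 : current;
    while (capacity < required) capacity *= 2;
    return capacity;
}

// Counts how many reallocations are needed to append `n` elements one
// at a time, with either geometric or one-by-one growth.
inline std::size_t count_reallocations(std::size_t n, bool geometric) {
    std::size_t capacity = 0;
    std::size_t reallocations = 0;
    for (std::size_t size = 1; size <= n; ++size) {
        if (size > capacity) {
            capacity = geometric ? grown_capacity(capacity, size) : size;
            ++reallocations;
        }
    }
    return reallocations;
}
```

For 1000 appended elements, the one-by-one policy reallocates 1000 times (each time copying everything stored so far), while doubling reallocates 11 times.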
Typically, you will not want to have your accessor methods (at() or operator[]) resize the array as it would violate separation of concerns (each function should do 1 thing - having it resize would require it to do 2).
The way the standard library implements std::vector: if you use at() and supply an index that is out of bounds, it throws an exception (if you use operator[], it is UB).
The problem you are running into with
ar[30] = ar[31] = 10;
If the array must be resized, then both calls are going to have to resize it. It is very similar to what happens with i = i++ + ++i; (which is also UB). When you resize to size 31, you have a temp buffer and set the value of the new element to 10. When you resize to 32, you have a (different) temp buffer and set the value of the new element to 10. When the later one returns, it does not have the value of the former, so only one value is written. To fix it, separate your operations into individual statements.
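Separating the operations could look like the sketch below. To keep it short, it uses a hypothetical growing_array that stores its elements in a std::vector instead of the question's hand-written buffer. The important part is that each statement contains at most one call that can trigger a resize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's infinite_array, backed by
// std::vector so that copying and cleanup are handled automatically.
template <typename T>
class growing_array {
public:
    // Grows on demand. Any reference returned earlier may dangle
    // after a later call that resizes the storage.
    T& operator[](std::size_t idx) {
        if (idx >= data_.size()) data_.resize(idx + 1);  // new elements are T()
        return data_[idx];
    }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<T> data_;
};

inline void separated_assignment_demo() {
    growing_array<int> ar;
    ar[31] = 10;  // one statement: grow to 32 elements, then assign
    ar[30] = 10;  // next statement: index already valid, no resize
    assert(ar.size() == 32);
    assert(ar[30] == 10 && ar[31] == 10 && ar[0] == 0);
}
```

Touching the highest index first also means the storage grows only once.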

How do I best handle dynamic multi-dimensional arrays in C/C++?

What is the accepted/most commonly used way to manipulate dynamic (with all dimensions not known until runtime) multi-dimensional arrays in C and/or C++.
I'm trying to find the cleanest way to accomplish what this Java code does:
public static void main(String[] args){
Scanner sc=new Scanner(System.in);
int rows=sc.nextInt();
int cols=sc.nextInt();
int[][] data=new int[rows][cols];
manipulate(data);
}
public static void manipulate(int[][] data){
for(int i=0;i<data.length;i++)
for(int j=0;j<data[0].length;j++){
System.out.print(data[i][j]);
}
}
(reads from std_in just to clarify that dimensions aren't known until runtime).
Edit: I noticed that this question is pretty popular even though it's pretty old. I don't actually agree with the top voted answer. I think the best choice for C is to use a single-dimensional array, as Guge said below: "You can alloc rows*cols*sizeof(int) and access it by table[row*cols+col]."
There are a number of choices with C++. If you really like boost or the STL then the answers below might be preferable, but the simplest and probably fastest choice is to use a single-dimensional array as in C.
Another viable choice in C and C++, if you want the [][] syntax, is lillq's answer down at the bottom, which manually builds the array with lots of mallocs.
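For reference, the single-dimensional approach might be sketched in C++ as follows. It uses std::vector for the storage rather than a raw allocation, which is an assumption about what "as in C" would translate to. The function mirrors the Java manipulate method, and the stream parameter and manipulate_demo are additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Prints a rows x cols table that is stored row-major in one flat vector.
// The dimensions are passed along because the vector only knows its
// total element count.
inline void manipulate(const std::vector<int>& data,
                       std::size_t rows, std::size_t cols,
                       std::ostream& out) {
    assert(data.size() == rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            out << data[i * cols + j];
        }
    }
}

inline void manipulate_demo() {
    std::size_t rows = 2, cols = 3;      // would be read from std::cin at runtime
    std::vector<int> data(rows * cols);  // zero-initialized
    data[1 * cols + 2] = 7;              // element at row 1, column 2
    std::ostringstream out;
    manipulate(data, rows, cols, out);
    assert(out.str() == "000007");
}
```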
Use boost::multi_array.
As in your example, the only thing you need to know at compile time is the number of dimensions. Here is the first example in the documentation :
#include "boost/multi_array.hpp"
#include <cassert>
int
main () {
// Create a 3D array that is 3 x 4 x 2
typedef boost::multi_array<double, 3> array_type;
typedef array_type::index index;
array_type A(boost::extents[3][4][2]);
// Assign values to the elements
int values = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
A[i][j][k] = values++;
// Verify values
int verify = 0;
for(index i = 0; i != 3; ++i)
for(index j = 0; j != 4; ++j)
for(index k = 0; k != 2; ++k)
assert(A[i][j][k] == verify++);
return 0;
}
Edit: As suggested in the comments, here is a "simple" example application that lets you define the multi-dimensional array size at runtime, reading it from console input.
Here is sample output from this application (compiled with the dimension-count constant set to 3):
Multi-Array test!
Please enter the size of the dimension 0 : 4
Please enter the size of the dimension 1 : 6
Please enter the size of the dimension 2 : 2
Text matrix with 3 dimensions of size (4,6,2) have been created.
Ready!
Type 'help' for the command list.
>read 0.0.0
Text at (0,0,0) :
""
>write 0.0.0 "This is a nice test!"
Text "This is a nice test!" written at position (0,0,0)
>read 0.0.0
Text at (0,0,0) :
"This is a nice test!"
>write 0,0,1 "What a nice day!"
Text "What a nice day!" written at position (0,0,1)
>read 0.0.0
Text at (0,0,0) :
"This is a nice test!"
>read 0.0.1
Text at (0,0,1) :
"What a nice day!"
>write 3,5,1 "This is the last text!"
Text "This is the last text!" written at position (3,5,1)
>read 3,5,1
Text at (3,5,1) :
"This is the last text!"
>exit
The important parts in the code are the main function where we get the dimensions from the user and create the array with :
const unsigned int DIMENSION_COUNT = 3; // dimension count for this test application, change it at will :)
// here is the type of the multi-dimensional (DIMENSION_COUNT dimensions here) array we want to use
// for this example, it holds text
typedef boost::multi_array< std::string , DIMENSION_COUNT > TextMatrix;
// this provides a size/index based position for a TextMatrix entry.
typedef std::tr1::array<TextMatrix::index, DIMENSION_COUNT> Position; // note that it can be a boost::array or a simple array
/* This function will allow the user to manipulate the created array
by handling their commands.
Returns true if the exit command has been called.
*/
bool process_command( const std::string& entry, TextMatrix& text_matrix );
/* Print the position values in the standard output. */
void display_position( const Position& position );
int main()
{
std::cout << "Multi-Array test!" << std::endl;
// get the dimension information from the user
Position dimensions; // this array will hold the size of each dimension
for( int dimension_idx = 0; dimension_idx < DIMENSION_COUNT; ++dimension_idx )
{
std::cout << "Please enter the size of the dimension "<< dimension_idx <<" : ";
// note that here we should check the type of the entry, but it's a simple example so lets assume we take good numbers
std::cin >> dimensions[dimension_idx];
std::cout << std::endl;
}
// now create the multi-dimensional array with the previously collected information
TextMatrix text_matrix( dimensions );
std::cout << "Text matrix with " << DIMENSION_COUNT << " dimensions of size ";
display_position( dimensions );
std::cout << " have been created."<< std::endl;
std::cout << std::endl;
std::cout << "Ready!" << std::endl;
std::cout << "Type 'help' for the command list." << std::endl;
std::cin.sync();
// we can now play with it as long as we want
bool wants_to_exit = false;
while( !wants_to_exit )
{
std::cout << std::endl << ">" ;
std::tr1::array< char, 256 > entry_buffer;
std::cin.getline(entry_buffer.data(), entry_buffer.size());
const std::string entry( entry_buffer.data() );
wants_to_exit = process_command( entry, text_matrix );
}
return 0;
}
And you can see that accessing an element in the array is really easy: you just use operator() as in the following functions:
void write_in_text_matrix( TextMatrix& text_matrix, const Position& position, const std::string& text )
{
text_matrix( position ) = text;
std::cout << "Text \"" << text << "\" written at position ";
display_position( position );
std::cout << std::endl;
}
void read_from_text_matrix( const TextMatrix& text_matrix, const Position& position )
{
const std::string& text = text_matrix( position );
std::cout << "Text at ";
display_position(position);
std::cout << " : "<< std::endl;
std::cout << " \"" << text << "\"" << std::endl;
}
Note : I compiled this application in VC9 + SP1 - got just some forgettable warnings.
There are two ways to represent a 2-dimensional array in C++, one being more flexible than the other.
Array of arrays
First make an array of pointers, then initialize each pointer with another array.
// First dimension
int** array = new int*[3];
for( int i = 0; i < 3; ++i )
{
// Second dimension
array[i] = new int[4];
}
// You can then access your array data with
for( int i = 0; i < 3; ++i )
{
for( int j = 0; j < 4; ++j )
{
std::cout << array[i][j];
}
}
The problem with this method is that your second dimension is allocated as many separate arrays, which makes more work for the memory allocator. Your memory is likely to be fragmented, resulting in poorer performance. It provides more flexibility, though, since each array in the second dimension could have a different size.
Big array to hold all values
The trick here is to create one large array to hold all the data you need. The hard part is that you still need the first array of pointers if you want to be able to access the data using the array[i][j] syntax.
int* buffer = new int[3*4];
int** array = new int*[3];
for( int i = 0; i < 3; ++i )
{
array[i] = buffer + i * 4; // row i starts i * 4 elements into the buffer
}
The int** array is not mandatory, as you could access your data directly in buffer by computing the index in the buffer from the 2-dimensional coordinates of the value.
// You can then access your array data with
for( int i = 0; i < 3; ++i )
{
for( int j = 0; j < 4; ++j )
{
const int index = i * 4 + j;
std::cout << buffer[index];
}
}
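A minimal sketch of the single-buffer layout described above, with std::vector owning both the buffer and the table of row pointers so that nothing has to be freed by hand. The name RowTable and its members are hypothetical, introduced only for illustration; copying is disabled because a copied pointer table would still point into the original object's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one contiguous buffer plus a table of row pointers.
struct RowTable {
    std::vector<int> buffer;  // row_count * col_count values, stored row by row
    std::vector<int*> rows;   // rows[i] points at the first element of row i

    RowTable(std::size_t row_count, std::size_t col_count)
        : buffer(row_count * col_count, 0), rows(row_count) {
        for (std::size_t i = 0; i < row_count; ++i) {
            rows[i] = buffer.data() + i * col_count;
        }
    }

    // A copy would keep row pointers aimed at the source object's buffer.
    RowTable(const RowTable&) = delete;
    RowTable& operator=(const RowTable&) = delete;

    // Returns the start of row i, so table[i][j] works like a built-in 2D array.
    int* operator[](std::size_t i) { return rows[i]; }
};
```

With this, RowTable t(3, 4); t[1][2] = 7; writes to the same element as t.buffer[1 * 4 + 2].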
The RULE to keep in mind
Computer memory is linear and will still be for a long time. Keep in mind that 2-dimensional arrays are not natively supported on a computer, so the only way is to "linearize" the array into a 1-dimensional array.
You can alloc rows*cols*sizeof(int) and access it by table[row*cols+col].
Here is the easy way to do this in C:
#include <stdio.h>
#include <stdlib.h>
void manipulate(int rows, int cols, int (*data)[cols]) {
for(int i=0; i < rows; i++) {
for(int j=0; j < cols; j++) {
printf("%d ", data[i][j]);
}
printf("\n");
}
}
int main() {
int rows = ...;
int cols = ...;
int (*data)[cols] = malloc(rows*sizeof(*data));
manipulate(rows, cols, data);
free(data);
}
This has been perfectly valid since C99; however, it is not valid in any C++ standard: C++ requires the sizes of array types to be compile-time constants. In that respect, C++ is now fifteen years behind C. And this situation is not going to change any time soon (the variable length array proposal for C++17 does not come close to the functionality of C99 variable length arrays).
The standard way without using boost is to use std::vector :
std::vector< std::vector<int> > v;
v.resize(rows, std::vector<int>(cols, 42)); // init value is 42
v[row][col] = ...;
That takes care of allocating and freeing the memory you need automatically. But it's rather slow, since std::vector is not primarily designed to be used like that (nesting std::vectors inside each other). For example, the memory is not all allocated in one block, but separately for each row. Also, the rows don't all have to be of the same width. A faster approach is to use a single flat vector and do an index calculation like col_count * row + col to get at a certain row and col:
std::vector<int> v(col_count * row_count, 42);
v[col_count * row + col] = ...;
But this will lose the capability to index the vector using [x][y]. You also have to store the number of rows and cols somewhere, whereas with the nested solution you can get the number of rows using v.size() and the number of cols using v[0].size().
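One way to keep the dimensions together with the flat vector is a small wrapper class. This is only a sketch: the name Grid, its operator()(row, col) interface, and the rows()/cols() accessors are assumptions made for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: a flat std::vector plus the dimensions it was built with.
class Grid {
public:
    Grid(std::size_t row_count, std::size_t col_count, int init = 0)
        : row_count_(row_count), col_count_(col_count),
          data_(row_count * col_count, init) {}

    // Element access using the col_count * row + col index calculation.
    int& operator()(std::size_t row, std::size_t col) {
        return data_[col_count_ * row + col];
    }
    const int& operator()(std::size_t row, std::size_t col) const {
        return data_[col_count_ * row + col];
    }

    std::size_t rows() const { return row_count_; }
    std::size_t cols() const { return col_count_; }

private:
    std::size_t row_count_;
    std::size_t col_count_;
    std::vector<int> data_;  // all elements in one block, row by row
};
```

Usage would look like Grid g(rows, cols, 42); g(row, col) = ...; which trades the [x][y] syntax for (x, y) while keeping a single allocation.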
Using boost, you can use boost::multi_array, which does exactly what you want (see the other answer).
There is also the raw way using native C++ arrays. This involves quite some work and is in no way better than the nested vector solution:
int ** rows = new int*[row_count];
for(std::size_t i = 0; i < row_count; i++) {
rows[i] = new int[cols_count];
std::fill(rows[i], rows[i] + cols_count, 42);
}
// use it... rows[row][col] then free it...
for(std::size_t i = 0; i < row_count; i++) {
delete[] rows[i];
}
delete[] rows;
You have to store the number of columns and rows you created somewhere, since you can't retrieve them from the pointer.
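As a rough sketch of what "storing them somewhere" could mean, the pointer and both counts can live in one struct whose destructor does the cleanup. RawRows is a hypothetical name; the sketch disables copying and is not exception-safe if an allocation fails partway through the constructor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical bundle: the row pointers together with the counts
// needed to index and free them.
struct RawRows {
    int** rows;
    std::size_t row_count;
    std::size_t cols_count;

    RawRows(std::size_t r, std::size_t c, int init)
        : rows(new int*[r]), row_count(r), cols_count(c) {
        for (std::size_t i = 0; i < row_count; i++) {
            rows[i] = new int[cols_count];
            std::fill(rows[i], rows[i] + cols_count, init);
        }
    }

    // Free every row, then the array of row pointers.
    ~RawRows() {
        for (std::size_t i = 0; i < row_count; i++) {
            delete[] rows[i];
        }
        delete[] rows;
    }

    // Copying would lead to a double delete, so it is disabled in this sketch.
    RawRows(const RawRows&) = delete;
    RawRows& operator=(const RawRows&) = delete;
};
```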
2D C-style arrays in C and C++ are a block of memory of size rows * columns * sizeof(datatype) bytes.
The actual [row][column] dimensions exist only statically at compile time. There's nothing there dynamically at runtime!
So, as others have mentioned, you can implement
int array [ rows ] [ columns ];
As:
int array [ rows * columns ];
Or as:
int * array = malloc ( rows * columns * sizeof(int) );
Next: Declaring a variably sized array. In C this is possible:
int main( int argc, char ** argv )
{
assert( argc > 2 );
int rows = atoi( argv[1] );
int columns = atoi( argv[2] );
assert(rows > 0 && columns > 0);
int data [ rows ] [ columns ]; // Yes, legal!
memset( &data, 0, sizeof(data) );
print( rows, columns, data );
manipulate( rows, columns, data );
print( rows, columns, data );
}
In C you can just pass the variably-sized array around the same as a non-variably-sized array:
void manipulate( int theRows, int theColumns, int theData[theRows][theColumns] )
{
for ( int r = 0; r < theRows; r ++ )
for ( int c = 0; c < theColumns; c ++ )
theData[r][c] = r*10 + c;
}
However, in C++ that is not possible. You need to allocate the array using dynamic allocation, e.g.:
int *array = new int[rows * cols]();
or preferably (with automated memory management)
std::vector<int> array(rows * cols);
Then the functions must be modified to accept 1-dimensional data:
void manipulate( int theRows, int theColumns, int *theData )
{
for ( int r = 0; r < theRows; r ++ )
for ( int c = 0; c < theColumns; c ++ )
theData[r * theColumns + c] = r*10 + c;
}
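Putting the two pieces together, the std::vector can own the storage while the 1-dimensional manipulate receives its buffer through data(). The helper name make_filled is hypothetical and only serves to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same 1-dimensional manipulate as in the text:
// element (r, c) lives at index r * theColumns + c.
void manipulate(int theRows, int theColumns, int* theData) {
    for (int r = 0; r < theRows; r++)
        for (int c = 0; c < theColumns; c++)
            theData[r * theColumns + c] = r * 10 + c;
}

// Hypothetical helper: allocate rows * cols zero-initialized ints and
// pass the vector's underlying buffer to manipulate.
std::vector<int> make_filled(int rows, int cols) {
    std::vector<int> array(static_cast<std::size_t>(rows) * cols);
    manipulate(rows, cols, array.data());
    return array;
}
```

The vector releases its memory automatically when it goes out of scope, so no delete[] is needed.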
If you're using C instead of C++ you might want to look at the Array_T abstraction in Dave Hanson's library C Interfaces and Implementations. It's exceptionally clean and well designed. I have my students do a two-dimensional version as an exercise. You could do that or simply write an additional function that does an index mapping, e.g.,
void *Array_get_2d(Array_T a, int width, int height, int i, int j) {
return Array_get(a, j * width + i);
}
It is a bit cleaner to have a separate structure where you store the width, the height, and a pointer to the elements.
I recently came across a similar problem. I did not have Boost available. Vectors of vectors turned out to be pretty slow in comparison to plain arrays. Having an array of pointers makes the initialization a lot more laborious, because you have to iterate through every dimension and initialize the pointers, possibly having some pretty unwieldy, cascaded types in the process, possibly with lots of typedefs.
DISCLAIMER: I was not sure if I should post this as an answer, because it only answers part of your question. My apologies for the following:
I did not cover how to read the dimensions from standard input, as other commentators had remarked.
This is primarily for C++.
I have only coded this solution for two dimensions.
I decided to post this anyway, because I see vectors of vectors brought up frequently in reply to questions about multi-dimensional arrays in C++, without anyone mentioning the performance aspects of it (if you care about it).
I also interpreted the core issue of this question to be about how to get dynamic multi-dimensional arrays that can be used with the same ease as the Java example from the question, i.e. without the hassle of having to calculate the indices with a pseudo-multi-dimensional one-dimensional array.
I didn't see compiler extensions mentioned in the other answers, like the ones provided by GCC/G++ to declare multi-dimensional arrays with dynamic bounds the same way you do with static bounds. From what I understand, the question does not restrict the answers to standard C/C++. ISO C99 apparently does support them, but in C++ and prior versions of C they appear to be compiler-specific extensions. See this question: Dynamic arrays in C without malloc?
I came up with a way that people might like for C++, because it's little code, has the ease of use of the built-in static multi-dimensional arrays, and is just as fast.
template <typename T>
class Array2D {
private:
std::unique_ptr<T[]> managed_array_; // T[] so that delete[] is used on destruction
T* array_;
size_t x_, y_;
public:
Array2D(size_t x, size_t y) {
managed_array_.reset(new T[x * y]);
array_ = managed_array_.get();
x_ = x;
y_ = y;
}
// Returns a pointer to the start of row x, so a[x][y] indexes within that row
T* operator[](size_t x) const {
return &array_[x * y_];
}
};
You can use it like this. The dimensions do not have to be known at compile time.
auto a = Array2D<int>(x, y);
a[xi][yi] = 42;
You can add an assertion, at least to all but the last dimension, and extend the idea to more than two dimensions. I have made a post on my blog about alternative ways to get multi-dimensional arrays. I am also much more specific on the relative performance and coding effort there.
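The assertion mentioned above could look roughly like this. CheckedArray2D and its rows()/cols() accessors are hypothetical names, not part of the original class, and value-initializing the storage is an extra assumption of this sketch. Only the first index is checked, because the second [] is plain pointer indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical variant of Array2D that asserts on the first index.
template <typename T>
class CheckedArray2D {
public:
    CheckedArray2D(std::size_t x, std::size_t y)
        : array_(new T[x * y]()), x_(x), y_(y) {}  // () value-initializes the elements

    // Returns a pointer to row x; the caller's second [] is unchecked.
    T* operator[](std::size_t x) const {
        assert(x < x_ && "first index out of range");
        return array_.get() + x * y_;
    }

    std::size_t rows() const { return x_; }
    std::size_t cols() const { return y_; }

private:
    std::unique_ptr<T[]> array_;  // owns the single contiguous block
    std::size_t x_, y_;
};
```

Because assert compiles away when NDEBUG is defined, the check costs nothing in release builds.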
Performance of Dynamic Multi-Dimensional Arrays in C++
You could use malloc to accomplish this and still have it accessible through the normal array[][] syntax, versus the array[row * cols + col] method.
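#include <stdlib.h>
int main(void)
{
int i;
int rows; /* must be set at runtime before the allocations below */
int cols; /* must be set at runtime before the allocations below */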
int **array = NULL;
array = malloc(sizeof(int*) * rows);
if (array == NULL)
return 0; // check for malloc fail
for (i = 0; i < rows; i++)
{
array[i] = malloc(sizeof(int) * cols);
if (array[i] == NULL)
return 0; // check for malloc fail
}
// and now you have a dynamically sized array
}
There is no way to determine the length of a dynamically allocated array from a pointer to it in C++. The best way would probably be to pass in the length of each dimension of the array, and use that instead of the .length property that the Java code reads from the array itself.