I need to work on a variable number of fixed-size arrays. More specifically, N points in a K-dimensional space, where I know K beforehand, but I don't know N at compile time.
So I want to use a pointer to the fixed-size array, and allocate space for N K-dimensional points at runtime.
In C, I can allocate memory for this pointer with malloc. Example test.c below, where the dimension is 3 for simplicity:
#include <stdlib.h>
#include <stdio.h>
#define DIMENSIONS 3
typedef float PointKDimensions[DIMENSIONS];
void do_stuff( int num_points){
PointKDimensions *points;
points = malloc(num_points * sizeof(PointKDimensions));
points[5][0] = 0; // set value to 6th point, first dimension
points[5][1] = 1.0; // to second dimension
points[5][2] = 3.14; // to third dimension
return;
}
int main(){
do_stuff(10); // at run-time I find out I have 10 points to handle
return 0;
}
I can compile this with gcc test.c without errors, and run without segmentation faults.
However, if I try to achieve the same behavior with C++ (mv test.c test.cpp, followed by g++ test.cpp), I get:
test.cpp: In function ‘void do_stuff(int)’:
test.cpp:10:18: error: invalid conversion from ‘void*’ to ‘float (*)[3]’ [-fpermissive]
10 | points = malloc(num_points * sizeof(float) * DIMENSIONS);
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| void*
After searching, I found that C++ does not implicitly convert the void* returned by malloc, so I changed the malloc to:
points = (float*) malloc(num_points * sizeof(float) * DIMENSIONS);
And then the error becomes:
test.cpp: In function ‘void do_stuff(int)’:
test.cpp:10:12: error: cannot convert ‘float*’ to ‘float (*)[3]’ in assignment
10 | points = (float*) malloc(num_points * sizeof(float) * DIMENSIONS);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| float*
But I could not find a way to do the appropriate cast/conversion to solve this error. E.g., (float**) is also not the same as float(*)[3].
Any suggestions on how to allocate space for a pointer to fixed-sized arrays in C++?
You need to cast the result of malloc as PointKDimensions* not as a float*:
typedef float PointKDimensions[DIMENSIONS];
void do_stuff( int num_points){
PointKDimensions *points;
points = (PointKDimensions*)malloc(num_points * sizeof(PointKDimensions));
points[5][0] = 0; // set value to 6th point, first dimension
points[5][1] = 1.0; // to second dimension
points[5][2] = 3.14; // to third dimension
return;
}
Or better, use C++'s built-in container for dynamically sized arrays, std::vector:
#include <vector>
std::vector<std::vector<float>> points;
void do_stuff( int num_points){
points.resize(num_points, std::vector<float>(DIMENSIONS)); // DIMENSIONS = number of dimensions, as in the question
points[5][0] = 0; // set value to 6th point, first dimension
points[5][1] = 1.0; // to second dimension
points[5][2] = 3.14; // to third dimension
return;
}
You could use std::array for that.
e.g.
#include <array>
#include <cstddef>
inline constexpr std::size_t DIMENSIONS = 3;
using PointKDimensions = std::array<float, DIMENSIONS>;
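To round out the std::array suggestion, here is a minimal sketch that assumes the points are kept in a std::vector sized at run time. The helper name make_points is hypothetical and not from the question; the indexing points[5][2] works the same way as in the C version.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t DIMENSIONS = 3;
using PointKDimensions = std::array<float, DIMENSIONS>;

// Hypothetical helper: allocates num_points zero-initialized points at run time.
std::vector<PointKDimensions> make_points(std::size_t num_points) {
    assert(num_points > 5);  // the writes below touch index 5
    std::vector<PointKDimensions> points(num_points);  // value-initialized: all zeros
    points[5][0] = 0.0f;   // 6th point, first dimension
    points[5][1] = 1.0f;   // second dimension
    points[5][2] = 3.14f;  // third dimension
    return points;
}
```

Calling make_points(10) mirrors do_stuff(10) from the question; the vector releases its memory automatically, so no free or delete is needed.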
Related
I have the following array, which I am passing as a pointer. BTW, I am new to C++ and have just started with pointers.
int arr[3][4]= {{2,3,4,8},{5,7,9,12},{1, 0, 6, 10}};
//int *a = &arr[0][0];
BuildStringFromMatrix((int *)arr, 3, 4);
I have the following function, with which I want to access the elements of the passed array.
void BuildStringFromMatrix(int *a, int height, int width);
My implementation for accessing the elements is as follows:
for(int i=0; i<height; i++){
for(int j=0; j<width; j++){
int x = *(*(a+i) + j);
std::cout<<x;
}
}
While using this implementation I am getting an error:
invalid type argument of unary ‘*’ (have ‘int’)
int x = *(*(a+i) + j);
How can I fix this issue?
P.S - I wanna implement this using single pointer.
Your function receives a pointer to int. It doesn't "remember" that it is actually pointing to an int that is in a 2-D array (let alone what the array dimensions are).
So if you want to access a certain element of the 2-D array you must perform a calculation to find how many units to offset from the pointer to get to the intended element. (This is sometimes called "flattening an array").
Typically row * row_length + column is the offset for a particular row-column entry, so in your case int x = a[i*width + j]; is the right statement to use.
If this is still unclear I suggest printing out the value of i*width+j at each iteration and seeing how it iterates over the array (or follow in your debugger).
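The offset calculation above can be sketched as follows. This is only an illustration: it keeps the asker's function name but, as an assumption, returns the elements as a string (separated by spaces) instead of printing them, and takes a const pointer.

```cpp
#include <cassert>
#include <string>

// Hypothetical body for the asker's function: collects the elements into one
// string, reading the 2-D data through a single int pointer.
std::string BuildStringFromMatrix(const int *a, int height, int width) {
    assert(a != nullptr && height >= 0 && width >= 0);
    std::string out;
    for (int i = 0; i < height; i++) {
        for (int j = 0; j < width; j++) {
            int x = a[i * width + j];  // row i, column j in the flattened array
            out += std::to_string(x);
            out += ' ';
        }
    }
    return out;
}
```

With the array from the question, the call would be BuildStringFromMatrix(&arr[0][0], 3, 4).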
I have a function that takes in a void* buffer parameter. This function is provided by HDF. From my understanding, it reads info from a dataset into the buffer. I have this working, but only if I create a 3D int array using constant values. I need to be able to do this using values passed in by the user.
Here is the start of that function:
void* getDataTest(int countX, int countY)
{
int NX = countX;
int NY = countY;
int NZ = 1;
int data_out[NX][NY][NZ]; //I know this doesn't work, just posting it for reference
//.
//. more code here...
//.
// Read function is eventually called...
h5Dataset.read(data_out, H5::PredType::NATIVE_INT, memspace, h5Dataspace);
}
This constantly fails on me. However, my previous implementation that used const int values when creating the data_out array worked fine:
void* getDataTest(int countX, int countY)
{
const int NX = 5;
const int NY = 5;
const int NZ = 1;
int data_out[NX][NY][NZ];
//.
//. more code here...
//.
// Read function is eventually called...
h5Dataset.read(data_out, H5::PredType::NATIVE_INT, memspace, h5Dataspace);
}
This works fine. From my understanding, this function (which I have no control over) requires dataspaces of the same dimensionality (e.g. a 3D array will only work with a 3D array while a 2D array will only work with a 2D array when copying over the data to the buffer).
So, my key problem here is that I can't seem to figure out how to create a 3D int array that the read function is happy with (the function parameter is a void* but I can't seem to get anything other than a 3d int array to work). I've tried a 3D int array represented as an array of arrays of arrays using:
int*** data_out = new int**[NX];
but this failed as well. Any ideas on how I can create a 3D int array of the form int arrayName[non-constant value][non-constant value][non-constant value]? I know you can't create an array using non-constant values, but I added them in an attempt to clarify my goal. Should there be a way in C++ to use function parameters as values for instantiating an array?
I think the easiest is to do this:
int* data_out = new int[NX * NY * NZ];
You can then access this 1D array as a 3D array like that:
int value = data_out[z * NX * NY + y * NX + x];
In a more C++11 style, you can use an std::vector:
std::vector<int> data_out;
data_out.resize(NX * NY * NZ);
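And call the function like this (data() returns a pointer to the vector's contiguous storage, whereas begin() returns an iterator that does not convert to void*):
h5Dataset.read(data_out.data(), H5::PredType::NATIVE_INT, memspace, h5Dataspace);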
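One detail worth checking is the index order. The formula in this answer makes x the fastest-varying index, which is a valid flattening, but a buffer that has to match the memory layout of int data_out[NX][NY][NZ] has the last index varying fastest. The sketch below shows that row-major mapping and how a vector's storage is handed to a function taking void*. The function fill_buffer is a hypothetical stand-in and is not the HDF5 API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (x, y, z) in a buffer laid out like int a[NX][NY][NZ]:
// the last index varies fastest.
std::size_t flat_index(std::size_t x, std::size_t y, std::size_t z,
                       std::size_t NY, std::size_t NZ) {
    return (x * NY + y) * NZ + z;
}

// Stand-in for a library call that fills a caller-provided void* buffer
// (hypothetical; not the HDF5 API). It writes 0, 1, 2, ... in memory order.
void fill_buffer(void *buffer, std::size_t count) {
    int *out = static_cast<int *>(buffer);
    for (std::size_t i = 0; i < count; ++i) out[i] = static_cast<int>(i);
}

std::vector<int> read_block(std::size_t NX, std::size_t NY, std::size_t NZ) {
    assert(NX > 0 && NY > 0 && NZ > 0);
    std::vector<int> data_out(NX * NY * NZ);
    fill_buffer(data_out.data(), data_out.size());  // data() is an int*, usable as void*
    return data_out;
}
```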
Do it like this:
std::vector<int> array;
array.resize(Nx*Ny*Nz);
array[z*Ny*Nx + y*Nx + x] = value;
It's nice to have the array[z][y][x] syntax, but supporting it is more trouble than it is worth.
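If the index arithmetic becomes repetitive, a thin wrapper can hide it without supporting the full [z][y][x] syntax. This is a sketch only; the class name Grid3D and its members are hypothetical, and it uses the same index formula as this answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: a flat std::vector<int> with (x, y, z) access.
class Grid3D {
public:
    Grid3D(std::size_t nx, std::size_t ny, std::size_t nz)
        : nx_(nx), ny_(ny), nz_(nz), cells_(nx * ny * nz) {}

    int &operator()(std::size_t x, std::size_t y, std::size_t z) {
        assert(x < nx_ && y < ny_ && z < nz_);
        return cells_[z * ny_ * nx_ + y * nx_ + x];  // same formula as the answer
    }

    int *data() { return cells_.data(); }  // pointer to the contiguous storage
    std::size_t size() const { return cells_.size(); }

private:
    std::size_t nx_, ny_, nz_;
    std::vector<int> cells_;
};
```

Element access then reads grid(x, y, z) = value, and grid.data() can be passed wherever a raw buffer is expected.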
I have an algorithm that I want to run that uses a potentially long double array. Because the array can be millions in length, I'm putting it on the GPU, so I need to export the array from a CPP file to a CU file. However, I'm prototyping it in CPP only for now because it doesn't work in either case.
In my CPU prototype I get errors when I try to set the members of the double array with my for loop. For example, any operation including cout will give error C2109: subscript requires array or pointer type in the CPP file
or, if the same code is run from a CU file, error: expression must have a pointer-to-object type
const int size = 100000;
double inputMeshPts_PROXY[size][4];
inputMeshPts.get(inputMeshPts_PROXY);
int lengthPts = inputMeshPts.length();
if (useCUDA == 1)
{
double *inputMeshPts_CUDA = &inputMeshPts_PROXY[size][4];
myArray(lengthPts, inputMeshPts_CUDA);
}
MStatus abjBlendShape::myArray(int length_CUDA, float weight_CUDA, double *inputMeshPts_CUDA)
{
for (int i = 0; i < length_CUDA; i++)
{
for (int j = 0; j < 3; j++)
{
cout << inputMeshPts_CUDA[i][j] << endl;
// inputMeshPts_CUDA[i][j] += (sculptedMeshPts_PROXY[i][j] - inputMeshPts_CUDA[i][j]); // WHAT I WANT, EVENTUALLY
}
}
}
When you are writing:
double *inputMeshPts_CUDA = &inputMeshPts_PROXY[size][4];
The variable inputMeshPts_CUDA is a pure pointer. You cannot use 2-dimensional indexing [][] as before. The right way to access it is now to linearize the indexes:
inputMeshPts_CUDA[i*4+j]
Alternatively, you could declare your pointer with the correct type:
double (*inputMeshPts_CUDA)[4] = inputMeshPts_PROXY;
which allows you to use the 2-dimensional indexing again.
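Both options can be compared side by side. The sketch below uses hypothetical function names and a small const array in place of inputMeshPts_PROXY. Note also that &inputMeshPts_PROXY[size][4] takes an address past the end of the array; the first element is &inputMeshPts_PROXY[0][0].

```cpp
#include <cassert>
#include <cstddef>

// Sums the first three components of each point through a pointer to
// array-of-4: the row size is part of the type, so pts[i][j] works.
double sum_xyz_2d(const double (*pts)[4], std::size_t count) {
    assert(pts != nullptr);
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            total += pts[i][j];
    return total;
}

// The same sum through a plain pointer to double, with a linearized index.
double sum_xyz_flat(const double *pts, std::size_t count) {
    assert(pts != nullptr);
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            total += pts[i * 4 + j];  // 4 doubles per row
    return total;
}
```

For an array declared as const double pts[2][4], the calls would be sum_xyz_2d(pts, 2) and sum_xyz_flat(&pts[0][0], 2).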
MStatus abjBlendShape::myArray(int length_CUDA, float weight_CUDA, double *inputMeshPts_CUDA)
{
inputMeshPts_CUDA is just a pointer, the compiler has lost all the dimension information. It needs that dimension information for inputMeshPts_CUDA[i][j], which gets converted to an access to address (byte arithmetic, not C++ pointer arithmetic)
inputMeshPts_CUDA + i * sizeof (double) * num_columns + j * sizeof (double)
You can either provide the missing information yourself and do the arithmetic like Angew suggests, or have the compiler pass the dimension information through:
template<size_t M, size_t N>
MStatus abjBlendShape::myArray(int length_CUDA, float weight_CUDA, double (&inputMeshPts_CUDA)[M][N])
Of course, this only works when the size is known at compile-time.
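A minimal free-function sketch of the template approach, with hypothetical names (sum_all, row_count) rather than the member function from the question, shows how the dimensions are deduced from the argument:

```cpp
#include <cstddef>

// The array is passed by reference, so M and N are deduced from its type.
template <std::size_t M, std::size_t N>
double sum_all(const double (&pts)[M][N]) {
    double total = 0.0;
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t j = 0; j < N; ++j)
            total += pts[i][j];  // 2-D indexing works: both dimensions are known
    return total;
}

// Returns the deduced number of rows, to make the deduction visible.
template <std::size_t M, std::size_t N>
constexpr std::size_t row_count(const double (&)[M][N]) {
    return M;
}
```

A separate instantiation is generated for every distinct pair of dimensions, which is one reason this fits small fixed-size arrays better than data sized at run time.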
inputMeshPts_CUDA is a pointer to double - that is, it can represent a 1D array. You're accessing it as a 2D array: inputMeshPts_CUDA[i][j]. That doesn't make sense - you're effectively applying [j] to the double object stored at inputMeshPts_CUDA[i].
I believe you were looking for inputMeshPts_CUDA[i * 4 + j] - you have to compute the 2D addressing yourself.
I have written a program which makes a 2D array and then sets its numbers.
In the second step, when I want to shift rows and columns, I run into a problem on this line: nmatrix[i*c+j] = 0;
The error is: error: incompatible types in assignment of 'int' to 'int [(((sizetype)(((ssizetype)(c + shiftc)) + -1)) + 1)]'
Here is the code:
void shiftMatrix(int *matrix, int r,int c ,int shiftr,int shiftc){
int nmatrix [r+shiftr][c+shiftc];
for(int i = 0; i< shiftr; i++)
{
for(int j = 0; j<shiftc;j++)
{
nmatrix[i*c+j] = 0;
}
}
for(int i = shiftr; i< r; i++)
{
for(int j = shiftc; j<c;j++)
{
nmatrix[i*c+j] = matrix[i*c+j];
}
}
}
Any help please??
thanks in advance
int nmatrix [r+shiftr][c+shiftc];
First of all, you are using an array with non-constant bounds (a variable-length array), which is not part of standard C++; some compilers, such as GCC, accept it as an extension.
In addition, here you are declaring a two-dimensional array nmatrix, but your other matrix (matrix) is a pointer to int (or a one-dimensional array, if you like to look at it this way). This is a recipe for confusion.
You can easily declare nmatrix ("new matrix"?) as a one-dimensional array:
int nmatrix[(r+shiftr) * (c+shiftc)];
Or (presumably better)
std::vector<int> nmatrix((r+shiftr) * (c+shiftc));
Then, your code nmatrix[i*c+j] = 0 will work (however, you have to change c to c+shiftc whenever you work with nmatrix).
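Putting this together, here is a sketch of the function with a flat std::vector. The intended meaning of "shift" is an assumption here: the input is moved down by shiftr rows and right by shiftc columns, and the new cells are zero. Returning the vector, rather than void as in the question, is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed intent: return an (r+shiftr) x (c+shiftc) matrix with the input
// moved down by shiftr rows and right by shiftc columns, zeros elsewhere.
std::vector<int> shiftMatrix(const int *matrix, int r, int c, int shiftr, int shiftc) {
    assert(matrix != nullptr && r >= 0 && c >= 0 && shiftr >= 0 && shiftc >= 0);
    const std::size_t new_c = static_cast<std::size_t>(c + shiftc);  // row length of nmatrix
    std::vector<int> nmatrix(static_cast<std::size_t>(r + shiftr) * new_c, 0);
    for (int i = 0; i < r; i++) {
        for (int j = 0; j < c; j++) {
            // source rows have length c, destination rows have length c + shiftc
            nmatrix[(i + shiftr) * new_c + (j + shiftc)] = matrix[i * c + j];
        }
    }
    return nmatrix;
}
```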
You cannot define an array with run-time dimensions the way you do it.
You need to use the C++ keyword new:
int *nmatrix = new int[(r+shiftr) * (c+shiftc)]; // release with delete[] nmatrix when done
You cannot define arrays the way you did, with a non-constant int value for a dimension, because the size of such arrays must be known at compile time. Thus the dimensions must be constant expressions.
With the keyword new, by contrast, you can choose the size at run time, because the memory is allocated dynamically. Only the first dimension of a new[] expression may be a run-time value, so the matrix is allocated here as one flat block and indexed as nmatrix[i*(c+shiftc)+j].
There are more detailed answers in this SO question here.
I have a question related to copying a structure containing a 2D pointer from the host to the device. My code is as follows:
struct mymatrix
{
matrix m;
int x;
};
size_t pitch;
mymatrix m_h[5];
for(int i=0; i<5;i++){
m_h[i].m = (float**) malloc(4 * sizeof(float*));
for (int idx = 0; idx < 4; ++idx)
{
m_h[i].m[idx] = (float*)malloc(4 * sizeof(float));
}
}
mymatrix *m_hh = (mymatrix*)malloc(5*sizeof(mymatrix));
memcpy(m_hh,m_h,5*sizeof(mymatrix));
for(int i=0 ; i<5 ;i++)
{
cudaMallocPitch((void**)&(m_hh[i].m),&pitch,4*sizeof(float),4);
cudaMemcpy2D(m_hh[i].m, pitch, m_h[i].m, 4*sizeof(float), 4*sizeof(float),4,cudaMemcpyHostToDevice);
}
mymatrix *m_d;
cudaMalloc((void**)&m_d,5*sizeof(mymatrix));
cudaMemcpy(m_d,m_hh,5*sizeof(mymatrix),cudaMemcpyHostToDevice);
distance_calculation_begins<<<1,16>>>(m_d,pitch);
Problem
With this code I am unable to access the elements of the 2D pointer in the structure, but I can access x from that structure on the device. For example, if the kernel receives m_d as mymatrix* m and I initialize
m[0].m[0][0] = 5;
and print this value with
cuPrintf("The value is %f",m[0].m[0][0]);
on the device, I get no output. This means I am unable to use the 2D pointer, but if I try to access
m[0].x = 5;
then I am able to print this. I think my initializations are correct, but I am unable to figure out the problem. Help from anyone will be greatly appreciated.
In addition to the issues that @RobertCrovella noted in your code, also note:
You are only getting a shallow copy of your structure with the memcpy that copies m_h to m_hh.
You are assuming that pitch is the same in all calls to cudaMemcpy2D() (you overwrite the pitch and use only the latest copy at the end). I think that might be a safe assumption for now, but it could change in the future.
You are using cudaMemcpy2D() with cudaMemcpyHostToDevice to copy to m_hh, which is on the host, not the device.
Using many small buffers and tables of pointers is not efficient in CUDA. The small allocations and deallocations can end up taking a lot of time. Also, using tables of pointers cause extra memory transactions because the pointers must be retrieved from memory before they can be used as bases for indexing. So, if you consider a construct such as this:
a[10][20][30] = 3
The pointer at a[10] must first be retrieved from memory, causing your warp to be put on hold for a long time (up to around 600 cycles on Fermi). Then, the same thing happens for the second pointer, adding another 600 cycles. In addition, these requests are unlikely to be coalesced causing even more memory transactions.
As Robert mentioned, the solution is to flatten your memory structures. I've included an example for this, which you may be able to use as a basis for your program. As you can see, the code is overall much simpler. The part that does become a bit more complex is the index calculations. Also, this approach assumes that your matrixes are all of the same size.
I have added error checking as well. If you had added error checking in your code, you would have found at least a couple of the bugs without any extra effort.
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#include <stdlib.h>
typedef float* mymatrix;
const int n_matrixes(5);
const int w(4);
const int h(4);
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
if (code != cudaSuccess)
{
fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
if (abort) exit(code);
}
}
__global__ void test(mymatrix m_d, size_t pitch_floats)
{
// Print the value at [2][3][4].
printf("%f ", m_d[3 + (2 * h + 4) * pitch_floats]);
}
int main()
{
mymatrix m_h;
gpuErrchk(cudaMallocHost(&m_h, n_matrixes * w * sizeof(float) * h));
// Set the value at [2][3][4].
m_h[2 * (w * h) + 3 + 4 * w] = 5.0f;
// Create a device copy of the matrix.
mymatrix m_d;
size_t pitch;
gpuErrchk(cudaMallocPitch((void**)&m_d, &pitch, w * sizeof(float), n_matrixes * h));
gpuErrchk(cudaMemcpy2D(m_d, pitch, m_h, w * sizeof(float), w * sizeof(float), n_matrixes * h, cudaMemcpyHostToDevice));
test<<<1,1>>>(m_d, pitch / sizeof(float));
gpuErrchk(cudaPeekAtLastError());
gpuErrchk(cudaDeviceSynchronize());
}
Your matrix m class/struct member appears to be some sort of double pointer based on how you are initializing it on the host:
m_h[i].m = (float**) malloc(4 * sizeof(float*));
Copying an array of structures with embedded pointers between host and device is somewhat complicated. Copying a data structure that is pointed to by a double pointer is also complicated.
For an array of structures with embedded pointers, refer to this posting.
For copying a 2D array (double pointer, i.e. **), refer to this posting. We don't use cudaMallocPitch/cudaMemcpy2D to accomplish this. (Note that cudaMemcpy2D takes single pointer * arguments, you are passing it double pointer ** arguments e.g. m_h[i].m)
Instead of the above approaches, it's recommended that you flatten your data so that it can all be referenced with single pointer referencing, with no embedded pointers.
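As a host-side illustration of what flattening could look like for the structure in the question, the sketch below replaces the per-struct float** with one contiguous buffer plus a parallel array for x. The type FlatMatrices and its members are hypothetical, and no CUDA calls are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat replacement for an array of structs that each hold a
// float** matrix: one contiguous buffer plus a parallel array for x.
struct FlatMatrices {
    std::size_t count, h, w;
    std::vector<float> values;  // count * h * w floats, no embedded pointers
    std::vector<int> x;         // one x per matrix

    FlatMatrices(std::size_t count_, std::size_t h_, std::size_t w_)
        : count(count_), h(h_), w(w_), values(count_ * h_ * w_), x(count_) {}

    // Element (row, col) of matrix m, addressed through a single index.
    float &at(std::size_t m, std::size_t row, std::size_t col) {
        assert(m < count && row < h && col < w);
        return values[(m * h + row) * w + col];
    }
};
```

Because values.data() is a single pointer to one block, the whole set of matrices can be transferred with one copy call instead of one per row. With a pitched device allocation, the row length w in the index is replaced by the pitch measured in elements.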