Passing dynamic array of structs to GPU kernel

I am trying to pass my dynamic array of structs to a kernel, but it doesn't work. I get "Segmentation fault (core dumped)".
My code (edited):
#include <stdio.h>
#include <stdlib.h>
struct Test {
unsigned char *array;
};
__global__ void kernel(Test *dev_test) {
}
int main(void) {
int n = 4;
int size = 5;
unsigned char *array[size];
Test *dev_test;
// allocate for host
Test *test = (Test*)malloc(sizeof(Test)*n);
for(int i = 0; i < n; i++)
test[i].array = (unsigned char*)malloc(size);
// fill data
for(int i=0; i<n; i++) {
unsigned char temp[] = { 'a', 'b', 'c', 'd' , 'e' };
memcpy(test[i].array, temp, size);
}
// allocate for gpu
cudaMalloc((void**)&dev_test, n * sizeof(Test));
for(int i=0; i < n; i++) {
cudaMalloc((void**)&(array[i]), size * sizeof(unsigned char));
cudaMemcpy(&(dev_test[i].array), &(array[i]), sizeof(unsigned char *), cudaMemcpyHostToDevice);
}
kernel<<<1, 1>>>(dev_test);
return 0;
}
How should I correctly allocate GPU memory and copy the data into it?

You need to allocate memory for the struct member array.
Test *test = (Test*)malloc(sizeof(Test)*n);
for(int i = 0; i < n; i++)
test[i].array = (unsigned char*)malloc(size);
I would suggest reading this answer to cope with the other issues that remain after this fix.

Which card do you have? If it supports compute capability >= 3.0, try the unified memory system, so that the same data is visible from both host and device memory.
It might look something like this:
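int main(void) {
int n = 4;
int size = 5;
Test *test;
// managed memory is accessible from both host and device
cudaMallocManaged(&test, n * sizeof(Test));
unsigned char values[] = { 'a', 'b', 'c', 'd' , 'e' };
for(int i=0; i<n; i++)
{
// allocate each member array as managed memory too, then fill it on the host
cudaMallocManaged(&test[i].array, size * sizeof(unsigned char));
memcpy(test[i].array, values, sizeof(values));
}
// no explicit host-to-device copy is needed
kernel<<<1, 1>>>(test);
// wait for the kernel before the host touches or frees the memory
cudaDeviceSynchronize();
for(int i=0; i<n; i++)
cudaFree(test[i].array);
cudaFree(test);
return 0;
}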
I hope you already know this, but don't forget to call cudaFree and delete/free on allocated memory. (On the host side it is better to use std::vector and call data() to get the raw pointer.)

Related

Different between fixed array and dynamic allocated array in C++

I have some C++ functions like the ones below:
void fillBuffer(char* buffer, int len) {
for (int i = 0; i < len; i++) {
buffer[i] = 1;
}
}
void fillByPointerOfBuffer(char **pBuffer, int len) {
fillBuffer(*pBuffer, len);
}
void printBuffer(char* buffer, int len) {
for (int i = 0; i < len; i++) {
printf("%d", buffer[i]);
}
}
In my main program, I ran the following tests:
Test 1:
char *buffer = new char[6];
fillByPointerOfBuffer(&buffer, 6);
printBuffer(buffer, 6);
delete[] buffer;
// --> Output: 111111
Test 2:
char buffer[6];
fillByPointerOfBuffer( (char**)(&buffer), 6);
printBuffer(buffer, 6);
// --> Exception thrown: write access violation.
Test 3:
char buffer[6];
char *buffer2 = buffer;
fillByPointerOfBuffer(&buffer2, 6);
printBuffer(buffer, 6);
// --> Output: 111111
In Test 1, a pointer to a dynamically allocated array is passed to fillByPointerOfBuffer.
In Test 2, a pointer to a fixed-length array is passed to fillByPointerOfBuffer.
In Test 3, a pointer to an "alias" variable of a fixed-length array is passed to fillByPointerOfBuffer.
I thought they would all behave the same, so why do Test 1 and Test 3 work while Test 2 does not?
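One way to look at the difference between Test 2 and Test 3, as a rough sketch rather than a definitive answer: for char buffer[6], the expression &buffer has type char (*)[6] and holds the address of the array's own bytes. No char* object is stored there, so after the cast to char** the function reads the array contents as if they were a pointer. The names fillFixedArray and alias below are made up for illustration.
```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Same helpers as in the question.
void fillBuffer(char* buffer, int len) {
    for (int i = 0; i < len; i++) buffer[i] = 1;
}
void fillByPointerOfBuffer(char** pBuffer, int len) {
    fillBuffer(*pBuffer, len);
}

// Hypothetical helper: returns true when all six bytes were set to 1.
bool fillFixedArray() {
    char buffer[6] = {0};

    // &buffer is a pointer to the whole array, not a pointer to a pointer.
    static_assert(std::is_same<decltype(&buffer), char (*)[6]>::value,
                  "&buffer has type char (*)[6]");
    // It holds the same address as the first element, so there is no
    // char* object at that address for a char** to point to.
    assert(static_cast<void*>(&buffer) == static_cast<void*>(buffer));

    // Test 3 works because alias is a real char* object in memory.
    char* alias = buffer;
    fillByPointerOfBuffer(&alias, 6);

    const char expected[6] = {1, 1, 1, 1, 1, 1};
    return std::memcmp(buffer, expected, 6) == 0;
}
```
Calling fillFixedArray() exercises only the Test 3 pattern; the Test 2 cast is deliberately left out because dereferencing it is undefined behaviour.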

Memory Allocation Problems with structure

Why does the provided code crash at the following line?
data *fillA = (data*)calloc(matrixa->nzmax, sizeof(data));
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <algorithm>
#include <time.h>
using namespace std;
struct csr
{
int rows;
int cols;
int nzmax;
int *rowPtr;
int *colInd;
double *values;
};
struct data
{
int entry;
int index;
};
bool descend(const data &a, const data &b)
{
return a.entry > b.entry;
}
static bool ascend(const data &a, const data &b)
{
return a.entry < b.entry;
}
void csrtranspose(struct csr *matrixa)
{
int i, j, counter;
double *tArray = NULL;
data *fillA = (data*)calloc(matrixa->nzmax, sizeof(data));//fails here
for (int i = 0; i < matrixa->nzmax; i++)
{
fillA[i].entry = matrixa->colInd[i];
fillA[i].index = i;
}
sort(fillA, fillA + matrixa->nzmax, ascend);
tArray = (double*)calloc(matrixa->nzmax, sizeof(double));
for (int i = 0; i < matrixa->nzmax; i++)
{
tArray[i] = matrixa->values[i];
}
for (int i = 0; i < matrixa->nzmax; i++)
{
matrixa->colInd[i] = fillA[i].entry;
matrixa->values[i] = tArray[fillA[i].index];
}
free(tArray);
free(fillA);
}
int main()
{
int i;
struct data *total = 0;
struct csr *s = 0;
int nrows = 6, ncols = 5, counter = 0, nzmax = 10, rows = 3, cols = 5;
double values[10] = {0.2135, 0.8648, 7, 0.3446, 0.1429, 6, 0.02311, 0.3599, 0.0866, 8 };
int rowPtr[4] = { 0, 3, 6, 10 };
int colInd[10] = { 0, 2, 4, 1, 2, 3, 0, 1, 2, 4 };
s = (struct csr*) calloc(1, sizeof(struct csr));
s->rows = rows;
s->cols = cols;
s->nzmax = nzmax;
s->rowPtr = (int*)calloc(s->rows + 1, sizeof(int));
s->colInd = (int*)calloc(s->nzmax, sizeof(int));
s->values = (double*)calloc(s->nzmax, sizeof(int));
for (i = 0; i<10; i++)
{
s->colInd[i] = colInd[i];
s->values[i] = values[i];
if (i <= s->rows)
{
s->rowPtr[i] = rowPtr[i];
}
}
csrtranspose(s);
getchar();
}

It crashes there because memory has already been corrupted by faulty code that ran earlier. So the problem is not at the line where it crashes; it is in code that executed before it.
Specifically, this line:
s->values = (double*)calloc(s->nzmax, sizeof(int));
This allocates an array of doubles but uses sizeof(int), so it does not allocate enough memory.
EDIT
Recommendations:
As others have already pointed out, when working with C++, use the new operator instead of C-style memory allocation. It will save you from LOTS of problems.
If you insist on using C-style allocation, never use p = (type*)malloc( sizeof(type) ), always use p = (type*)malloc( sizeof( *p ) ). This will at least make it more evident when you make the very common mistake of allocating memory for the wrong type.
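A minimal sketch of the sizeof(*p) idiom described above, applied to an array of doubles. The helper names allocValues and fillAndSum are invented for this illustration and do not come from the question.
```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: allocates n zero-initialised doubles.
double* allocValues(std::size_t n) {
    double* values = NULL;
    // sizeof(*values) follows the pointer's type, so the element size stays
    // correct even if the type of values is changed later.
    values = (double*)std::calloc(n, sizeof(*values));
    return values;
}

// Hypothetical helper: writes to every element and returns the sum,
// or -1 if the allocation failed.
double fillAndSum(std::size_t n) {
    double* values = allocValues(n);
    if (values == NULL) return -1.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < n; i++) {
        values[i] = 0.5;   // in bounds: n * sizeof(double) bytes were allocated
        sum += values[i];
    }
    std::free(values);
    return sum;
}
```
With the question's sizeof(int) version, the same loop would write past the end of the allocation on platforms where double is larger than int.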

The line (double*)calloc(s->nzmax, sizeof(int)); in itself is a good reason to switch to C++ allocation, where it's impossible to make that mistake even if you copy-and-paste.
You're allocating too little memory and writing out of bounds.
Since all your sizes are known at compile time, you don't really need dynamic allocation at all.
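As one possible illustration of avoiding hand-written allocation sizes, the sketch below stores the matrix data in std::vector members. CsrMatrix and makeExample are hypothetical names, and this is a variant of the question's struct, not the asker's code.
```cpp
#include <cassert>
#include <vector>

// Hypothetical variant of the question's csr struct: the vectors own their storage.
struct CsrMatrix {
    int rows;
    int cols;
    std::vector<int> rowPtr;
    std::vector<int> colInd;
    std::vector<double> values;
};

CsrMatrix makeExample() {
    const double values[10] = {0.2135, 0.8648, 7, 0.3446, 0.1429,
                               6, 0.02311, 0.3599, 0.0866, 8};
    const int rowPtr[4] = {0, 3, 6, 10};
    const int colInd[10] = {0, 2, 4, 1, 2, 3, 0, 1, 2, 4};

    CsrMatrix m;
    m.rows = 3;
    m.cols = 5;
    // Each vector takes its element size from its own type; no sizeof is written by hand.
    m.rowPtr.assign(rowPtr, rowPtr + 4);
    m.colInd.assign(colInd, colInd + 10);
    m.values.assign(values, values + 10);
    return m;
}
```
The number of non-zero entries is then m.values.size(), so a separate nzmax field is no longer needed in this variant.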

Something wrong in c++ code

What is wrong with this code?
#include <iostream>
#include <vector>
unsigned __int32 ConvertToChars(std::vector<int> container, char** pChars)
{
*pChars = (char*)&container[0];
return container.size() * sizeof(int);
}
void ConvertToIntegers(char* chars, short size, std::vector<int>& container)
{
int count = size / sizeof(int);
int* pIntegers = (int*)chars;
for(int i=0; i < count; ++i)
{
container.push_back(*(pIntegers++));
}
}
void Print(const std::vector<int>& container)
{
for(int i=0; i < container.size(); ++i)
{
std::cout << container[i] << std::endl;
}
}
void main()
{
std::vector<int> vec1;
vec1.push_back(1);
vec1.push_back(2);
vec1.push_back(3);
char* buffer = 0;
short bufferSize = ConvertToChars(vec1, &buffer);
std::vector<int> vec2;
ConvertToIntegers(buffer, bufferSize, vec2);
Print(vec2);
char c;
std::cin >> c;
}
The Print function prints these values:
-572662307
-842150451
-572662307
Thank you!!!

You're copying the container when you pass it to ConvertToChars, then taking a pointer to one of its elements, then seeing the copy go out of scope, which invalidates your pointer.
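A sketch of one way to avoid the dangling pointer, assuming the goal is just a byte round trip: take the vector by const reference so the pointer refers to the caller's elements. The roundTrip helper, the const char* parameters and the use of memcpy are choices made for this illustration, not part of the original program.
```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Takes the vector by const reference, so the returned pointer refers to the
// caller's elements and stays valid while that vector is alive and unmodified.
std::size_t ConvertToChars(const std::vector<int>& container, const char** pChars) {
    *pChars = reinterpret_cast<const char*>(container.data());
    return container.size() * sizeof(int);
}

// Copies the bytes back into ints with memcpy.
void ConvertToIntegers(const char* chars, std::size_t size, std::vector<int>& container) {
    container.resize(size / sizeof(int));
    if (!container.empty())
        std::memcpy(container.data(), chars, container.size() * sizeof(int));
}

// Hypothetical helper: converts to bytes and back again.
std::vector<int> roundTrip(const std::vector<int>& input) {
    const char* buffer = 0;
    std::size_t bufferSize = ConvertToChars(input, &buffer);
    std::vector<int> output;
    ConvertToIntegers(buffer, bufferSize, output);
    return output;
}
```
The size is returned as std::size_t here instead of being narrowed to short, which would overflow for larger vectors.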

I don't really understand the point of your program, but part of your problem is in your ConvertToIntegers function, where you make your program interpret a char * as int *.
int* pIntegers = (int*)chars;
for(int i=0; i < count; ++i)
{
container.push_back(*(pIntegers++));
}
You're interpreting the underlying bytes as int types, which could lead to the numbers you're seeing. I'm surprised you're not running into segmentation faults as this would cause you to overstep the block of the memory pointed to by the original char *.

c++ array to vector issue

I'm trying to copy an array to a vector; however, once the data is copied to the vector, it differs from the original array.
int arraySize = 640000;
std::vector<unsigned char> vector_buffer;
unsigned char buffer[arraySize];
populateArray(&buffer);
for(int i = 0; i < arraySize; i++)
cout << buffer[i]; // this prints out data
std::copy ( buffer, buffer + arraySize, std::back_inserter(vector_buffer));
for(int i = 0; i < arraySize; i++)
cout << vector_buffer[i]; // this prints out different data
The data seems to get compressed somehow. Every approach I have tried for copying the array to a vector does the same thing.
I'm using it to create a video from images. If I use the array data all is well, but if I use the vector data it doesn't work.
Any help would be highly appreciated.
Cheers

The
int arraySize = 640000;
needs to be const in standard C++. g++ allows variable length arrays as a C99-inspired language extension. It's best to turn that extension off. :-)
std::vector<unsigned char> vector_buffer;
unsigned char buffer[arraySize];
OK when arraySize is const, but will not compile with e.g. Visual C++ with your original code.
populateArray(&buffer);
This should most probably be populateArray(buffer), unless you have a really weird declaration of populateArray.
for(int i = 0; i < arraySize; i++)
cout << buffer[i]; // this prints out data
The above prints the data with no spacing between the elements. Better add some spacing. Or newlines.
std::copy ( buffer, buffer + arraySize, std::back_inserter(vector_buffer));
Better just use the assign method of std::vector, like vector_buffer.assign( buffer, buffer + arraySize ).
for(int i = 0; i < arraySize; i++)
cout << vector_buffer[i]; // this prints out different data
Again, this displays the elements with no spacing between.
Is the apparent problem there still when you have fixed these things?
If so, then please post also your populateArray function.
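The two suggestions above (copy with assign, print with spacing) could be sketched roughly as follows. The helpers dump and toVector are hypothetical, and printing each byte as a number via a cast to int is an extra assumption on top of the advice about spacing.
```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats each byte as a number, separated by spaces.
std::string dump(const unsigned char* data, std::size_t size) {
    std::ostringstream out;
    for (std::size_t i = 0; i < size; ++i) {
        if (i != 0) out << ' ';
        out << static_cast<int>(data[i]);   // as a number, not a raw character
    }
    return out.str();
}

// Hypothetical helper: copies an array into a vector with assign.
std::vector<unsigned char> toVector(const unsigned char* buffer, std::size_t size) {
    std::vector<unsigned char> vector_buffer;
    vector_buffer.assign(buffer, buffer + size);
    return vector_buffer;
}
```
Comparing dump(buffer, n) with dump(v.data(), v.size()) makes any real difference between the array and the vector visible byte by byte.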

I can see nothing wrong with your code. The following code
#include <iostream>
#include <vector>
int main()
{
const std::size_t arraySize = 640000;
unsigned char buffer[arraySize];
for(std::size_t idx = 0; idx < arraySize; ++idx)
buffer[idx] = idx;
std::vector<unsigned char> vector_buffer(buffer, buffer + arraySize);
//std::vector<unsigned char> vector_buffer;
//std::copy (buffer, buffer + arraySize, std::back_inserter(vector_buffer));
for(std::size_t idx = 0; idx < arraySize; ++idx)
if( buffer[idx] != vector_buffer[idx] )
{
std::cout << "error #" << idx << '\n';
return 1;
}
std::cout << "Ok.\n";
return 0;
}
prints Ok. for me. (Even if I use the less-than-optimal way of copying into the vector.)
From the fact that the code you showed wouldn't compile I conclude that you're not showing the real code. Please do so. Somewhere in the differences between your real code and my code must be the problem.

I've written a complete compilable program for you. The code appears fine. I run it and get expected output. Perhaps you need to re-check the code you posted against the real code.
#include <cstdlib>
#include <vector>
#include <iostream>
#include <iterator>
using namespace std;
void populateArray(unsigned char* buf, size_t buf_size)
{
unsigned char* buf_end = &buf[buf_size];
for( unsigned char c = 'A'; buf != buf_end; c = (c=='Z'?'A':c+1), ++buf )
*buf = c;
}
int main()
{
static const int arraySize = 64;
std::vector<unsigned char> vector_buffer;
unsigned char buffer[arraySize];
populateArray(buffer, sizeof(buffer));
for(int i = 0; i < arraySize; i++)
cout << buffer[i]; // this prints out data
cout << endl;
std::copy ( buffer, buffer + arraySize, std::back_inserter(vector_buffer));
for(int i = 0; i < arraySize; i++)
cout << vector_buffer[i]; // this prints out different data
return 0;
}

C++ Passing a dynamicly allocated 2D array by reference

This question builds off of a previously asked question:
Pass by reference multidimensional array with known size
I have been trying to figure out how to get my functions to play nicely with 2d array references. A simplified version of my code is:
unsigned int ** initialize_BMP_array(int height, int width)
{
unsigned int ** bmparray;
bmparray = (unsigned int **)malloc(height * sizeof(unsigned int *));
for (int i = 0; i < height; i++)
{
bmparray[i] = (unsigned int *)malloc(width * sizeof(unsigned int));
}
for(int i = 0; i < height; i++)
for(int j = 0; j < width; j++)
{
bmparray[i][j] = 0;
}
return bmparray;
}
I don't know how to rewrite this function so that bmparray is passed in by reference as an empty unsigned int **. That way I could allocate the space for the array in one function and set the values in another.

Use a class to wrap it, then pass objects by reference
class BMP_array
{
public:
BMP_array(int height, int width)
: buffer(NULL), height(height)
{
buffer = (unsigned int **)malloc(height * sizeof(unsigned int *));
for (int i = 0; i < height; i++)
{
buffer[i] = (unsigned int *)malloc(width * sizeof(unsigned int));
}
}
~BMP_array()
{
// free each row, then the array of row pointers
for (int i = 0; i < height; i++)
{
free(buffer[i]);
}
free(buffer);
}
unsigned int ** data()
{
return buffer;
}
private:
// copying is hidden: a copy would free the same rows twice
BMP_array(const BMP_array &);
BMP_array & operator=(const BMP_array &);
unsigned int ** buffer;
int height;
};

typedef unsigned int ** array_type;
void initialize_BMP_array(array_type& bmparray, int height, int width);
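A sketch of how a reference to the pointer lets one function allocate and another set the values, which is what the question asks for. The function names allocate_BMP_array, fill_BMP_array and free_BMP_array are made up for this example, and error handling for failed allocations is left out.
```cpp
#include <cassert>
#include <cstdlib>

typedef unsigned int** array_type;

// Allocates height rows of width elements. bmparray is a reference to the
// caller's pointer, so the caller sees the new allocation.
void allocate_BMP_array(array_type& bmparray, int height, int width) {
    bmparray = (unsigned int**)std::malloc(height * sizeof(*bmparray));
    for (int i = 0; i < height; i++)
        bmparray[i] = (unsigned int*)std::malloc(width * sizeof(**bmparray));
}

// Sets every element. The pointer itself is not changed, so it is passed by value.
void fill_BMP_array(array_type bmparray, int height, int width, unsigned int value) {
    for (int i = 0; i < height; i++)
        for (int j = 0; j < width; j++)
            bmparray[i][j] = value;
}

// Frees each row, then the row table, and resets the caller's pointer.
void free_BMP_array(array_type& bmparray, int height) {
    for (int i = 0; i < height; i++)
        std::free(bmparray[i]);
    std::free(bmparray);
    bmparray = NULL;
}
```
A caller would declare array_type bmp = NULL; and then call the three functions in order with the same height and width.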

Mmm... maybe I don't understand your question well, but in C you can pass "by reference" by adding another level of pointer indirection. That is, you pass a pointer to the double pointer bmparray itself:
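unsigned int ** initialize_BMP_array(int height, int width, unsigned int *** bmparray)
{
/* Note the first asterisk */
*bmparray = (unsigned int **)malloc(height * sizeof(unsigned int *));
for (int i = 0; i < height; i++)
{
(*bmparray)[i] = (unsigned int *)malloc(width * sizeof(unsigned int));
}
/* the rest is the same but with one more level of indirection */
for (int i = 0; i < height; i++)
for (int j = 0; j < width; j++)
(*bmparray)[i][j] = 0;
return *bmparray;
}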
So the memory for the bmparray is reserved inside the function (and then, passed by reference).
Hope this helps.

To use the safer and more modern C++ idiom, you should be using vectors rather than dynamically allocated arrays.
void initialize_BMP_array(vector<vector<unsigned int> > &bmparray, int height, int width);
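A minimal sketch of what the body of that function might look like, assuming every element should start at zero as in the question:
```cpp
#include <cassert>
#include <vector>

// Resizes bmparray to height rows, each holding width zeros.
// The vectors release their memory automatically when they go out of scope.
void initialize_BMP_array(std::vector<std::vector<unsigned int> >& bmparray,
                          int height, int width) {
    bmparray.assign(height, std::vector<unsigned int>(width, 0u));
}
```
Elements are then accessed as bmparray[i][j], the same way as with the pointer-based version.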

We Keep Coding
