Passing an array of structs to a kernel results in segfault on write? - c++

Maybe I missed something from the tutorials because this is driving me nuts.
What I'm trying to accomplish: I want to create an array of structs for the OpenCL device to use as a work area. The host doesn't need to see it or interact with it in any way, it's just meant as a "scratch" space for the kernel to work within.
Here's what I have:
Declaration of struct inside header file accessible by both the main program and the OpenCL kernel:
typedef struct {
uint64_t a;
uint32_t b;
} result_list;
Initializing the scratch space buffer "outputBuffer" to hold MAX_SIZE elements:
cl_mem outputBuffer;
outputBuffer = clCreateBuffer(this->context,
CL_MEM_READ_WRITE,
sizeof(result_list) * MAX_SIZE,
NULL,
&status);
I never call clEnqueueWriteBuffer because the host doesn't care what's in the memory; it's simply meant to be a working space for the kernel. I leave it uninitialized but allocated.
Setting it as an argument for the kernel to use:
status = clSetKernelArg(myKernel,
1,
sizeof(cl_mem),
&this->outputBuffer);
The kernel (simplified to remove non-issue sections):
__kernel void kernelFunc(__global const uint32_t *input, __global result_list *outputBuffer) {
if (get_global_id(0) >= MAX_SIZE) { return; }
// Make a few local variables and play with them
outputBuffer[0].a = 1234; // Memory access violation here
// Code never reaches here
}
What am I doing wrong?
I installed CodeXL from AMD and it doesn't help much with debugging issues like these. The most it gives me is "The thread tried to read from or write to a virtual address to which it does not have access."
edit: It seems like it really doesn't like typedefs. Instead of using a struct, I simplified it to typedef uint64_t result_list and it refused to compile, saying "a value of type 'ulong' cannot be assigned to an entity of type 'result_list'", even though result_list -> uint64_t -> unsigned long.

Your problem is that you cannot put both the HOST and DEVICE definitions in a single header.
You have to separate them like this:
//HOST header
struct mystruct{
cl_ulong a;
cl_uint b;
};
//DEVICE header
typedef struct{
ulong a;
uint b;
} mystruct;
Notice that I also changed the datatypes to the standard OpenCL datatypes. You should use those instead for compatibility.
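If you would rather keep a single shared header, one common approach is to guard the two definitions with the __OPENCL_VERSION__ macro, which only the OpenCL device compiler predefines. A minimal sketch, assuming the host build has CL/cl_platform.h available:
#ifdef __OPENCL_VERSION__ /* defined only by the OpenCL device compiler */
typedef struct {
    ulong a; /* 64-bit unsigned on the device */
    uint b;  /* 32-bit unsigned on the device */
} result_list;
#else
#include <CL/cl_platform.h> /* host-side typedefs with guaranteed sizes */
typedef struct {
    cl_ulong a; /* same size and alignment as the device-side ulong */
    cl_uint b;
} result_list;
#endif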

Related

Using custom data type in separate library

I am trying to tie together a few different libraries for a project I'm working on. The code is for a Teensy 4.1, built in VS Code with PlatformIO.
I am trying to use lvgl https://lvgl.io/ (a graphics library) together with the ILI9341_t3n LCD library https://github.com/KurtE/ILI9341_t3n.
The ILI9341_t3n library has functions that use DMA to asynchronously update the display. I would like to use it as my display driver.
The ILI9341_t3n library uses a framebuffer and the pointer is pointing to a uint16_t.
However, the lvgl library does things a bit differently. It uses a _lv_disp_draw_buf_t struct to hold several more variables, like fb size, which buffer to use (if double buffer) and other status indicators.
typedef struct _lv_disp_draw_buf_t {
void * buf1; /**< First display buffer.*/
void * buf2; /**< Second display buffer.*/
/*Internal, used by the library*/
void * buf_act;
uint32_t size; /*In pixel count*/
/*1: flushing is in progress. (It can't be a bit field because when it's cleared from IRQ Read-Modify-Write issue might occur)*/
volatile int flushing;
/*1: It was the last chunk to flush. (It can't be a bit field because when it's cleared from IRQ Read-Modify-Write issue might occur)*/
volatile int flushing_last;
volatile uint32_t last_area : 1; /*1: the last area is being rendered*/
volatile uint32_t last_part : 1; /*1: the last part of the current area is being rendered*/
} lv_disp_draw_buf_t;
The code to create and initialize a buffer is as follows.
static _lv_disp_draw_buf_t draw_buf;
static lv_color_t buf_1[MY_DISP_HOR_RES * 10]; /*A screen sized buffer*/
static lv_color_t buf_2[MY_DISP_HOR_RES * 10]; /*Another screen sized buffer*/
lv_disp_draw_buf_init(&draw_buf, buf_1, buf_2, MY_DISP_HOR_RES * 10); /*Initialize the display buffer*/
The init code is:
void lv_disp_draw_buf_init(lv_disp_draw_buf_t * draw_buf, void * buf1, void * buf2, uint32_t size_in_px_cnt)
{
lv_memset_00(draw_buf, sizeof(uint16_t));
draw_buf->buf1 = buf1;
draw_buf->buf2 = buf2;
draw_buf->buf_act = draw_buf->buf1;
draw_buf->size = size_in_px_cnt;
}
The ILI9341_t3n code for initializing the frame buffer is much simpler.
void ILI9341_t3n::setFrameBuffer(uint16_t *frame_buffer) {
#ifdef ENABLE_ILI9341_FRAMEBUFFER
_pfbtft = frame_buffer;
_dma_state &= ~ILI9341_DMA_INIT; // clear that we init the dma chain as our
// buffer has changed...
#endif
}
I would like to create a framebuffer like the lvgl library recommends, and then pass it into this function. I can modify the code so that the LCD library can do something like:
_pfbtft = draw_buf->buf1;
//or//
_pfbtft = draw_buf->buf2;
That way, I can use lvgl to update the framebuffer, and use the ILI9341 library just for handling the DMA transactions.
The problem is I don't know how to incorporate the _lv_disp_draw_buf_t data type into the ILI9341 library. I tried changing the uint16_t to _lv_disp_draw_buf_t just to see what would happen, and it says it's an undefined type.
Is this a somewhat trivial task or will it require rewriting a lot of code?
Or maybe I don't need to incorporate it. If I call:
tft.setFrameBuffer(&frame_buffer);
Is there a way I could access the framebuffer pointer contained in the _lv_disp_draw_buf_t data type and pass it instead?
If it's not a simple answer, please advise on some topics I should study so I can learn how to do this.
It's possible I'm going about this all wrong, so constructive criticism is appreciated.
I did several Google searches to see if this has been answered before, but my search terms are limited by my knowledge of the terminology, so if it has, I apologize.
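Not an authoritative answer, but here is a minimal sketch of that last idea (pulling the pointer out of the struct, so no library changes are needed). It assumes LV_COLOR_DEPTH is 16, so that lv_color_t has the same width as the uint16_t the driver expects, and that tft is your ILI9341_t3n instance:
static lv_disp_draw_buf_t draw_buf;
static lv_color_t buf_1[MY_DISP_HOR_RES * 10];
static lv_color_t buf_2[MY_DISP_HOR_RES * 10];

void setupDisplayBuffers() {
    lv_disp_draw_buf_init(&draw_buf, buf_1, buf_2, MY_DISP_HOR_RES * 10);
    // buf1 is a void*, so cast it to the pointer type the driver expects
    tft.setFrameBuffer(static_cast<uint16_t *>(draw_buf.buf1));
}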

How to allocate Dynamic Memory to Device Pointer variable in C++ CUDA

Hello, all programmers.
I'm converting an existing C++ project into a CUDA-enhanced program.
I'm new to CUDA, so I'm learning on the job.
I have to allocate memory for a dynamic struct member variable while the struct variable itself is allocated as a device variable,
like this:
_cuda_params* dcuda_params;
cudaMalloc(&dcuda_params, sizeof(_cuda_params));
cudaMemcpy((void *)dcuda_params, (void*)cuda_params, sizeof(_cuda_params), cudaMemcpyHostToDevice);
dcuda_params->DPht = (hashtb_entry *)malloc(c); // c is size to allocate.
But during run time I get Exception 0xC0000022.
I also tried this:
cudaMalloc(&dcuda_params->DPht, c);
but the result is the same.
How can I handle this?
.h file
typedef struct {
int blocksPerGrid;
int threadsPerBlock;
uint64_t HASH_SIZE;
hashtb_entry* DPht;
} _cuda_params;
.cu file
void _GpuSearch(_cuda_params* cuda_params){
...
_cuda_params* dcuda_params;
cudaMalloc(&dcuda_params, sizeof(_cuda_params));
cudaMemcpy((void *)dcuda_params, (void *)cuda_params, sizeof(_cuda_params), cudaMemcpyHostToDevice);
dcuda_params->DPht = (hashtb_entry *)malloc(c); //c: size to allocate.
...
}
You are dereferencing a device pointer in dcuda_params->DPht = (hashtb_entry *)malloc(c); that is not allowed, because the host has no access to device memory.
The easy solution to your problem is to not use a pointer to an instance of your struct; you are not using an array of it anyway. The function call then changes to:
void _GpuSearch(_cuda_params cuda_params)
Now that cuda_params is no longer a pointer, you can simply do:
cudaMalloc(&cuda_params.DPht, sizeof(hashtb_entry));
From then on you are fine to pass cuda_params by value to the kernel, and if needed you copy from the host into cuda_params.DPht.
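Here is a minimal self-contained sketch of that pattern, with hypothetical sizes and a toy kernel, just to show the pass-by-value mechanics:
#include <cuda_runtime.h>
#include <cstdint>

struct hashtb_entry { uint64_t key; uint64_t value; }; // stand-in definition

struct _cuda_params {
    int blocksPerGrid;
    int threadsPerBlock;
    uint64_t HASH_SIZE;
    hashtb_entry *DPht;
};

__global__ void searchKernel(_cuda_params p) {
    uint64_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < p.HASH_SIZE)
        p.DPht[i].key = 0; // device code may dereference the device pointer
}

int main() {
    _cuda_params params{}; // lives on the host and is passed to the kernel by value
    params.HASH_SIZE = 1024;
    // allocate the member on the device; the host never dereferences DPht
    cudaMalloc(&params.DPht, params.HASH_SIZE * sizeof(hashtb_entry));
    searchKernel<<<4, 256>>>(params);
    cudaDeviceSynchronize();
    cudaFree(params.DPht);
    return 0;
}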

list.h list_del() giving kernel paging error

I'm trying to implement a kernel system call to remove the first element from a queue. I'm getting a SIGKILL when debugging in gdb, with a line in the kernel logs: BUG: unable to handle kernel paging request at ....
My struct for the queue is as follows:
typedef struct msgQueue
{
long len;
void *data;
struct list_head queue;
} msgQueue;
As you can see, it contains the pointer to a block of data, the length in bytes of that data, and a list_head struct object from list.h.
I initialize an object of type msgQueue (above) with these lines:
myQueue = (struct msgQueue *) kmalloc(sizeof(struct msgQueue), GFP_KERNEL);
INIT_LIST_HEAD(&myQueue->queue);
I implement a write function that is working correctly; the queue is not empty when I'm trying to delete from it. Here's the initialization of the new queue element that I'm adding, and the lines to add it:
Function header:
asmlinkage long sys_writeMsgQueue(const void __user *data, long len)
Other lines:
tempQueue = (struct msgQueue *)kmalloc(sizeof(struct list_head), GFP_KERNEL);
tempQueue->data = kmalloc((size_t)len, GFP_KERNEL);
tempQueue->len = len;
uncopiedBytes = __copy_from_user(tempQueue->data, data, len);
list_add_tail(&(tempQueue->queue), &(myQueue->queue));
I can't paste all of even just my read function, because this is for a course that I'm taking. But here are what I hope are the relevant parts:
asmlinkage long sys_readMsgQueue(void __user *data, long len)
{
long uncopiedBytes;
uncopiedBytes = __copy_to_user(myQueue, data, len);
printk("REMOVING FROM QUEUE AND FREEING\n\n\n");
list_del(&(myQueue->queue));
}
When I implement this basic functionality in a self-contained C program in Eclipse to try to debug it, it runs fine. Granted, I had to adjust it for user-space code, so all of the kernel-specific stuff was removed or changed (malloc instead of kmalloc, no system-call-specific syntax, etc.). I included the list.h that I downloaded, so I'm using all of the same functions as far as list.h goes.
Does anything stand out to you that would cause the kernel paging error in my kernel logs?
tempQueue = (struct msgQueue *)kmalloc(sizeof(struct list_head), GFP_KERNEL);
looks wrong; you probably want
tempQueue = kmalloc(sizeof *tempQueue, GFP_KERNEL);
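For reference, a sketch of the corrected write path using the names from the question (minimal error handling added; the surrounding syscall boilerplate from the question is assumed):
struct msgQueue *tempQueue;

tempQueue = kmalloc(sizeof(*tempQueue), GFP_KERNEL); /* whole struct, not just the list_head */
if (!tempQueue)
    return -ENOMEM;

tempQueue->data = kmalloc(len, GFP_KERNEL);
if (!tempQueue->data) {
    kfree(tempQueue);
    return -ENOMEM;
}
tempQueue->len = len;
if (__copy_from_user(tempQueue->data, data, len)) {
    kfree(tempQueue->data);
    kfree(tempQueue);
    return -EFAULT;
}
list_add_tail(&tempQueue->queue, &myQueue->queue);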

Sending and receiving structs using UDP (WinSock & C++)?

I have two programs. I need one of them to send data and the other to receive that data.
I have some code in place that is hopefully sending a struct across the network.
However, I don't even know if it is working properly because I don't know how to code the receiving program to receive structs and pass the data it receives into a local struct to be manipulated.
Here is the code I'm using to send, if it helps any:
gamePacket.PplayerX = userSprite.x;
gamePacket.PplayerY = userSprite.y;
gamePacket.Plives = lives;
gamePacket.Pstate = state;
for(int z=0;z<8;z++)
{
gamePacket.PenemyX[z] = enemySprite[z].x;
gamePacket.PenemyY[z] = enemySprite[z].y;
}
char Buffer[sizeof(gamePacket)];
UDPSocket.Send(Buffer);
The struct is called Packet and gamePacket is an instance of it.
What I am stuck with is:
Is the code I posted even sending the struct?
How do I receive the struct in the receiving program so that I can use the data inside it?
It's not sent; you only declare a buffer. To send the struct, you need to fill the buffer first. Also be careful with sizeof: it returns the padded size of the struct, which may not equal the sum of the field sizes, so you may want to count them up yourself.
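For instance, a minimal sketch of filling the buffer before sending (gamePacket and UDPSocket are the names from the question; the exact Send signature depends on your wrapper):
char Buffer[sizeof(gamePacket)];
memcpy(Buffer, &gamePacket, sizeof(gamePacket)); // copy the struct's bytes into the buffer
UDPSocket.Send(Buffer);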
When you have received everything, you do the opposite: you allocate a struct and fill it using offsets.
If you need examples, just ask. But learning is doing research, so a push in the right direction is, I think, enough. (There are a thousand examples of this.)
PS: you can use pointer + offset because the members of a struct are laid out next to each other in memory. It's a block of memory, just like an array.
EDIT: this link is what you need: Passing a structure through Sockets in C
EDIT: Example using pointers:
EDIT: Is this C# or C/C++? I'm sorry if so; change the example to C/C++ ;)
struct StructExample
{
int x;
int y;
};
int GetBytes(struct StructExample* s, void* buf)
{
//Access the memory locations and store each field
*(int*)((char*)buf + 0) = s->x;
*(int*)((char*)buf + sizeof(int)) = s->y;
return sizeof(s->x) + sizeof(s->y);
}
PS: I typed it on my mobile; I'm not 100% sure it compiles/works.
In C and C++ it is possible to use this code:
struct StructExample
{
int x;
int y;
};
struct StructExample a;
a.x = 1;
a.y = 2;
send(FSocket, (char *)&a, sizeof(a), 0);
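On the receiving side, a rough sketch (assuming both machines share endianness and struct padding, and that RSocket is a hypothetical, already set-up socket):
struct StructExample b;
int received = recv(RSocket, (char *)&b, sizeof(b), 0);
if (received == sizeof(b)) {
    // b.x and b.y now hold the values sent by the other program
}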

Problem with reading data into structs using a C casting method

I'm writing an MD3 model loader in C++. I understand the file format and what I need to do, but I can't seem to get the syntax right. I have a class for the model, and within that class there is a set of structs which the model data will be read into. In the implementation of the class there is a constructor which reads data in when initialised with an MD3 file. I am using C-style casting to do this. The first struct seems to be working and the data appears to be read in correctly, but the following two seem to have all values left as zero, which shouldn't be the case. The code compiles without error, but I'm fairly new to C++ so I've probably made a simple mistake here!
The main file just set up the object with an MD3 file and then goes on to set up some OpenGL things but that's all working correctly and doesn't effect the constructor.
The interface, GameObject1.h:
class GameObject1 {
public:
GameObject1();
GameObject1(const char * filename);
virtual ~GameObject1();
virtual void Draw();
private:
struct md3_header_t
{
int IDENT; //id of file, always "IDP3"
int Version; //version number, always 15
char Name[64]; //name of character
int Flags; //blank but needed
int Num_frames; //number of Frames
int Num_surfaces; // number of shaders
int Num_skins; //...
int Num_triangles; //num triangles - important one
int Ofs_triangles; // offset of triangles
int Ofs_frames; // frames offset
int Ofs_tags; // tags offset
int Ofs_surfaces; //offset to surfaces
int Ofs_eof; //offset of end of header
};
typedef float vec3[3];
struct md3_frame_header_t
{
vec3 Min_bounds; //first corner of bounding box
vec3 Max_bounds; //other corner
vec3 local_origin; //usually 0 0 0
float Radius; //radius of bounding sphere
char NAME[16]; // name of frame
};
struct md3_tag_header_t
{
char NAME[64]; //name of tag
vec3 origin; //origin of tag eg head or torso
vec3 Axis[3]; //axis stuff
};
struct md3_surface_header_t
{
int IDENT; //id, must be IDP3
char Name[64]; //name of mesh
int Flags; // blank space
int Num_frames; // number of frames
int Num_shaders; // no shaders
int Num_vert; // number verts
int Num_triangles; //number of triangles
int Ofs_triangles; //offset of triangle data from surface start
int Ofs_shaders; // offset of shaders
int Ofs_st; // offset texture data
int Ofs_xyznormal; // offset of verts
int Ofs_end; // offset of end of surface section from start
};
And the implementation, GameObject1.cpp. NOTE: I've only included the constructor method here, since the destructor and the Draw method are both currently empty:
#include "GameObject1.h"
GameObject1::GameObject1() {
//if we have no model...
}
//constructor if a model has been provided
GameObject1::GameObject1(const char * filename) {
ifstream md3file;
md3file.open(filename, ios::in|ios::binary);
// C stuff
md3_header_t * md3header = (struct md3_header_t *)
malloc(sizeof(struct md3_header_t));
md3file.read((char *) md3header, sizeof (struct md3_header_t));
// Check the file
if (md3header->IDENT != 860898377) {
// Error!
cerr << "Error: bad version or identifier" << endl;
}
// seekg to search through the file to add new data to structs
// frame struct
md3_frame_header_t * md3frame = (struct md3_frame_header_t *)
malloc(sizeof(struct md3_frame_header_t));
md3file.seekg(md3header->Ofs_frames);
md3file.read((char *) md3frame, sizeof (struct md3_frame_header_t));
//surface struct
md3_surface_header_t * md3surface = (struct md3_surface_header_t *)
malloc(sizeof( md3_surface_header_t));
md3file.seekg(md3header->Ofs_surfaces);
md3file.read((char *) md3surface, sizeof (struct md3_surface_header_t));
md3file.close();
}
GameObject1::~GameObject1() {
// deconstructor
}
void GameObject1::Draw(){
// eventually a method to draw the model to screen using OpenGL and SDL
}
I'm really lost on this one so any help would be greatly appreciated.
Thanks,
Jon
Check the values of md3header->Ofs_frames and md3header->Ofs_surfaces to make sure they aren't trying to index past the end of the file. If they are, then the MD3 file may be corrupt, or reading the file is not mapping properly onto the structs.
Not sure if this is an issue, but you may run into problems on a 64-bit system if the file was generated on a 32-bit one (and vice versa), since the sizes of the ints would be different, which would throw off the mapping to the struct.
You may want to find a tool that you know can read the files properly and compare those values with the ones you are getting, or use a hex editor to manually look through the file and verify the values.
Have a look at the memory layout of your structs: print out sizeof(float), sizeof(int), sizeof(char), sizeof(md3_frame_header_t), etc., and see if they are exactly what you expect. If not, you might want to play around with #pragma pack to force the in-memory layout of your struct to match the file layout. Also, is the endianness of the bytes in the file the same as in memory?
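For example, a small sketch of that size check, with #pragma pack forcing the packed layout (the 56-byte expectation assumes the field sizes in the question match the MD3 spec):
#include <cstdio>

#pragma pack(push, 1) // remove padding so the in-memory layout matches the file
struct md3_frame_header_t {
    float Min_bounds[3];
    float Max_bounds[3];
    float local_origin[3];
    float Radius;
    char NAME[16];
};
#pragma pack(pop)

int main() {
    // 3 * 12 + 4 + 16 = 56 bytes if the layout matches the on-disk format
    std::printf("sizeof(md3_frame_header_t) = %zu\n", sizeof(md3_frame_header_t));
    return 0;
}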
In general I would also advise against reading structs directly from files. The few CPU cycles you save by doing this mattered in the '90s, but these days they don't anymore. You should instead implement proper read and write code that reads and parses the fields as specified in the file format documentation, then stores them with set methods in your own in-memory representation. This makes the in-memory layout independent of the file layout, makes the code easier to debug, and makes it easier to port if you ever switch to a different architecture.
Your structures need to be aligned properly; look here.
Thanks for the help. I found the problem eventually by checking the size of the md3_header_t struct. There was an extra int in there (the Ofs_triangles) which shouldn't have been there. It's now reading the data perfectly so I'll get to work on the next part.
Thanks again,
Jon
My suggestion: the code has memory leaks in the constructor, since you never release the memory you allocated on the heap; free it once it is no longer of any use.
Also, print out the exact sizes first, and check that each struct's size is what you expect.
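For instance, a sketch against the constructor as posted (free each allocation once it is no longer needed):
// at the end of GameObject1::GameObject1(const char * filename):
free(md3header);
free(md3frame);
free(md3surface);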