What's the correct way to assign one GPU memory buffer value from another GPU memory buffer with some arithmetic on each source buffer's element?

What's the correct way to assign one GPU memory buffer value from another GPU memory buffer with some arithmetic on each source buffer's element? - c++

I'm a newbie for GPU programming using Cuda toolkit, and I have to write some code offering the functionality as I mentioned in the title.
I'd like to paste the code to show what exactly I want to do.
void CTrtModelWrapper::forward(void **bindings,
unsigned height,
unsigned width,
short channel,
ColorSpaceFmt colorFmt,
PixelDataType pixelType) {
uint16_t *devInRawBuffer_ptr = (uint16_t *) bindings[0];
uint16_t *devOutRawBuffer_ptr = (uint16_t *) bindings[1];
const unsigned short bit = 16;
float *devInputBuffer_ptr = nullptr;
float *devOutputBuffer_ptr = nullptr;
unsigned volume = height * width * channel;
common::cudaCheck(cudaMalloc((void **) &devInputBuffer_ptr, volume * getElementSize(nvinfer1::DataType::kFLOAT)));
common::cudaCheck(cudaMalloc((void **) &devOutputBuffer_ptr, volume * getElementSize(nvinfer1::DataType::kFLOAT)));
unsigned short npos = 0;
switch (pixelType) {
case PixelDataType::PDT_INT8: // high 8bit
npos = bit - 8;
break;
case PixelDataType::PDT_INT10: // high 10bit
npos = bit - 10;
break;
default:
break;
}
switch (colorFmt) {
case CFMT_RGB: {
for (unsigned i = 0; i < volume; ++i) {
devInputBuffer_ptr[i] = float((devInRawBuffer_ptr[i]) >> npos); // SEGMENTATION Fault at this line
}
}
break;
default:
break;
}
void *rtBindings[2] = {devInputBuffer_ptr, devOutputBuffer_ptr};
// forward
this->_forward(rtBindings);
// convert output
unsigned short ef_bit = bit - npos;
switch (colorFmt) {
case CFMT_RGB: {
for (unsigned i = 0; i < volume; ++i) {
devOutRawBuffer_ptr[i] = clip< uint16_t >((uint16_t) devOutputBuffer_ptr[i],
0,
(uint16_t) pow(2, ef_bit)) << npos;
}
}
break;
default:
break;
}
}
bindings is a pointer to an array, the 1st element in the array is a device pointer that points to a buffer allocated using cudaMalloc on the gpu, each element in the buffer is a 16bit integer.the 2nd one the same, used to store the output data.
height,width,channel,colorFmt(RGB here),pixelType(PDT_INT8, aka 8bit) respective to the image height, width,channel number, colorspace, bits to store one pixel value.
the _forward function requires a pointer to an array, similar to bindings except that each element in the buffer should be a 32bit float number.
so I make some transformation using a loop
for (unsigned i = 0; i < volume; ++i) {
devInputBuffer_ptr[i] = float((devInRawBuffer_ptr[i]) >> npos); // SEGMENTATION Fault at this line
}
the >> operation is because the actual 8bit data is stored in the high 8 bit.
SEGMENTATION FAULT occurred at this line of code devInputBuffer_ptr[i] = float((devInRawBuffer_ptr[i]) >> npos); and i equals 0.
I try to separate this code into several line:
uint16_t value = devInRawBuffer_ptr[i];
float transferd = float(value >> npos);
devInputBuffer_ptr[i] = transferd;
and SEGMENTATION FAULT occurred at this line uint16_t value = devInRawBuffer_ptr[i];
I wonder that is this a valid way to assign value to an allocated gpu memory buffer?
PS: the buffer given in bindings are totally fine. they are from host memory using cudaMemcpy before the call to forward function, but I still paste the code below
nvinfer1::DataType type = nvinfer1::DataType::kHALF;
HostBuffer hostInputBuffer(volume, type);
DeviceBuffer deviceInputBuffer(volume, type);
HostBuffer hostOutputBuffer(volume, type);
DeviceBuffer deviceOutputBuffer(volume, type);
// HxWxC --> WxHxC
auto *hostInputDataBuffer = static_cast<unsigned short *>(hostInputBuffer.data());
for (unsigned w = 0; w < W; ++w) {
for (unsigned h = 0; h < H; ++h) {
for (unsigned c = 0; c < C; ++c) {
hostInputDataBuffer[w * H * C + h * C + c] = (unsigned short )(*(ppm.buffer.get() + h * W * C + w * C + c));
}
}
}
auto ret = cudaMemcpy(deviceInputBuffer.data(), hostInputBuffer.data(), volume * getElementSize(type),
cudaMemcpyHostToDevice);
if (ret != 0) {
std::cout << "CUDA failure: " << ret << std::endl;
return EXIT_FAILURE;
}
void *bindings[2] = {deviceInputBuffer.data(), deviceOutputBuffer.data()};
model->forward(bindings, H, W, C, sbsisr::ColorSpaceFmt::CFMT_RGB, sbsisr::PixelDataType::PDT_INT8);

In CUDA, it's generally not advisable to dereference a device pointer in host code. For example, you are creating a "device pointer" when you use cudaMalloc:
common::cudaCheck(cudaMalloc((void **) &devInputBuffer_ptr, volume * getElementSize(nvinfer1::DataType::kFLOAT)));
From the code you have posted, it's not possible to deduce that for devInRawBuffer_ptr but I'll assume it also is a device pointer.
In that case, to perform this operation:
for (unsigned i = 0; i < volume; ++i) {
devInputBuffer_ptr[i] = float((devInRawBuffer_ptr[i]) >> npos);
}
You would launch a CUDA kernel, something like this:
// put this function definition at file scope
__global__ void shift_kernel(float *dst, uint16_t *src, size_t sz, unsigned short npos){
for (size_t idx = blockIdx.x*blockDim.x+threadIdx.x, idx < sz; idx += gridDim.x*blockDim.x) dst[idx] = (float)((src[idx]) >> npos);
}
// call it like this in your code:
kernel<<<160, 1024>>>(devInputBuffer_ptr, devInRawBuffer_ptr, volume, npos);
(coded in browser, not tested)
If you'd like to learn more about what's going on here, you may wish to study CUDA. For example, you can get most of the basic concepts here and by studying the CUDA sample code vectorAdd. The grid-stride loop is discussed here.

Related

problem with sending a float number in a stream in vivado_hls

I am trying to do a simple image processing filter where the pixel values will be divided by half to reduce the intensity and I am trying to develop the hardware for the same. hence I am using vivado hls to generate the IP. As explained here https://forums.xilinx.com/t5/High-Level-Synthesis-HLS/Float-numbers-with-hls-stream/m-p/942747 to send floating numbers in a hls stream , an union needs to be used and I did the same. However, the results don't seem to be matching for the red and green components of the image whereas it is matching for the blue component of the image. It is a very simple algorithm where a pixel value will be divided by half.
I have been trying to resolve it but I am not able to see where the problem is. I have attached all the files below, can someone can help me resolve it??
////header file
#include "ap_fixed.h"
#include "hls_stream.h"
typedef union {
unsigned int i;
float r;
float g;
float b;
} conv;
typedef hls::stream <unsigned int> Stream_t;
void ftest(Stream_t& Sin,Stream_t& Sout);
////testbench
#include "stream_check_h.hpp"
int main()
{
Mat img_rev = imread("C:/Users/20181217/Desktop/images/imgs/output_fwd_v3.png");//(256x512)
Mat final_img(img_rev.rows,img_rev.cols,CV_8UC3);
Mat ref_img(img_rev.rows,img_rev.cols,CV_8UC3);
Stream_t S1,S2;
int err_r = 0;
int err_g = 0;
int err_b = 0;
for(int i=0;i<256;i++)
{
for(int j=0;j<512;j++)
{
conv c;
c.r = (float)img_rev.at<Vec3b>(i,j)[0];
c.g = (float)img_rev.at<Vec3b>(i,j)[1];
c.b = (float)img_rev.at<Vec3b>(i,j)[2];
S1 << c.i;
}
}
ftest(S1,S2);
conv c;
for(int i=0;i<256;i++)
{
for(int j=0;j<512;j++)
{
S2 >> c.i;
final_img.at<Vec3b>(i,j)[0]=(unsigned char)c.r;
final_img.at<Vec3b>(i,j)[1]=(unsigned char)c.g;
final_img.at<Vec3b>(i,j)[2]=(unsigned char)c.b;
ref_img.at<Vec3b>(i,j)[0] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[0])/2.0);
ref_img.at<Vec3b>(i,j)[1] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[1])/2.0);
ref_img.at<Vec3b>(i,j)[2] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[2])/2.0);
}
}
Mat diff;
cout<<diff;
diff= abs(final_img-ref_img);
for(int i=0;i<256;i++)
{
for(int j=0;j<512;j++)
{
if((int)diff.at<Vec3b>(i,j)[0] > 0)
{
err_r++;
cout<<"expected value: "<<(int)ref_img.at<Vec3b>(i,j)[0]<<", final_value: "<<(int)final_img.at<Vec3b>(i,j)[0]<<", actual value:"<<(int)img_rev.at<Vec3b>(i,j)[0]<<endl;
}
if((int)diff.at<Vec3b>(i,j)[1] > 0)
err_g++;
if((int)diff.at<Vec3b>(i,j)[2] > 0)
err_b++;
}
}
cout<<"number of errors: "<<err_r<<", "<<err_g<<", "<<err_b;
return 0;
}
////core
#include "stream_check_h.hpp"
void ftest(Stream_t& Sin,Stream_t& Sout)
{
conv cin,cout;
for(int i=0;i<256;i++)
{
for(int j=0;j<512;j++)
{
Sin >> cin.i;
cout.r = cin.r/2.0 ;
cout.g = cin.g/2.0 ;
cout.b = cin.b/2.0 ;
Sout << cout.i;
}
}
}
when I debugged, it showed that the blue components of the pixels are matching. for one red pixel it showed me the following:
expected value: 22, final_value: 14, actual value:45
and the total errors for red, green, and blue are:
number of errors: 126773, 131072, 0
I am not able to see why it is going wrong for red and green. I posted here hoping a fresh set of eyes would help my problem.
Thanks in advance

I'm assuming you're using a 32bit-wide stream with 3 RGB pixels 8bit unsigned (CV_8U3). I believe the problem with the union type in your case is the overlapping of its three members (not just like the one float value in the example you cite). This means that by doing the division, you're actually doing it over the whole 32bit data you're receiving.
I possible workaround I quickly cam up with would be to cast the unsigned int you're getting from the stream into an ap_uint<32> type, then chop it in the R, G, B chunks (with the range() method) and divide. Finally, assemble back the result and stream it back.
unsigned int packet;
Sin >> packet;
ap_uint<32> packet_uint32 = *((ap_uint<32>*)&packet); // casting (not elegant, but works)
ap_int<8> b = packet_uint32.range(7, 0);
ap_int<8> g = packet_uint32.range(15, 8);
ap_int<8> r = packet_uint32.range(23, 16); // In case they are in the wrong bit range/order, just flip the r, g, b assignements
b /= 2;
g /= 2;
r /= 2;
packet_uint32.range(7, 0) = b;
packet_uint32.range(15, 8) = g;
packet_uint32.range(23, 16) = r;
packet = packet_uint32.to_int();
Sout << packet;
NOTE: I've reused the same variables in the code above: HLS shouldn't complain about it and come out with a good RTL anyway. In case it shouldn't, just create new ones.

BMP Exporter not working C++

So, as the title states, I'm having trouble exporting a .bmp (24-bit bmp) with C++. I am doing it as a school project type thing, and I need some help. To learn how .BMPs work I looked at the wikipedia page, and I got some help from here, but I still can't figure it out. Here is what I have:
//Export the map as a .bmp
void PixelMap::exportMap(const char* fileName)
{
//Size of the file in bytes
int fileSize = 54 + (3 * width * height);
//The sections of the file
unsigned char generalHeader[14] = {'B','M',0,0, 0,0,0,0, 0,0,54,0, 0,0};
unsigned char DIBHeader[40] = {40,0,0,0, 0,0,0,0, 0,0,0,0, 1,0,24,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0};
unsigned char pixelArray[] = "";
//Set the binary portion of the generalHeader, mainly just file size
generalHeader[2] = (unsigned char)(fileSize);
generalHeader[3] = (unsigned char)(fileSize << 8);
generalHeader[4] = (unsigned char)(fileSize << 16);
generalHeader[5] = (unsigned char)(fileSize << 24);
//The binary variable portion of the DIB header
DIBHeader[4] = (unsigned char)(width);
DIBHeader[5] = (unsigned char)(width << 8);
DIBHeader[6] = (unsigned char)(width << 16);
DIBHeader[7] = (unsigned char)(width << 24);
DIBHeader[8] = (unsigned char)(height);
DIBHeader[9] = (unsigned char)(height << 8);
DIBHeader[10] = (unsigned char)(height << 16);
DIBHeader[11] = (unsigned char)(height << 24);
int picSize = 3 * width * height;
DIBHeader[20] = (unsigned char)(picSize);
DIBHeader[21] = (unsigned char)(picSize << 8);
DIBHeader[22] = (unsigned char)(picSize << 16);
DIBHeader[23] = (unsigned char)(picSize << 24);
//Loop through all width and height places to add all pixels
int counter = 0;
for(short j = height; j >= 0; j--)
{
for(short i = 0; i < width; i++)
{
//Add all 3 RGB values
pixelArray[counter] = pixelColour[i, j].red;
counter++;
pixelArray[counter] = pixelColour[i, j].green;
counter++;
pixelArray[counter] = pixelColour[i, j].blue;
counter++;
}
}
//Open it
ofstream fileWorking(fileName);
//Write the sections
fileWorking << generalHeader;
fileWorking << DIBHeader;
fileWorking << pixelArray;
//NO MEMORY LEAKS 4 ME
fileWorking.close();
}
This is part of a class called 'PixelMap,' basically a frame buffer or surface. The PixelMap has the variables 'width,' 'height,' and the struct array 'pixelColour.' (The struct containing 3 chars called 'red' 'green' and 'blue') If you would like to see the class, here it is. (It's just a skeleton, trying to get the .bmp down first)
//This is a pixel map, mainly for exporting BMPs
class PixelMap
{
public:
//The standard pixel variables
int width;
int height;
Colour pixelColour[];
//The constructor will set said variables
PixelMap(int Width, int Height);
//Manipulate pixels
void setPixel(int X, int Y, char r, char g, char b);
//Export the map
void exportMap(const char* fileName);
};
(Colour is the struct)
So my problem here is that when I try to run this, I get this:
So pixelArray, the array of colours to be exported gets corrupted. I assume this has to do with not being properly given a size, but I try to assign it's proper value (3 * width * height (3 being RGB)) but it says that it needs to be a constant value.
Any help with this issue is greatly appreciated!

Instead of
unsigned char pixelArray[] = "";
you could use:
std::vector<unsigned char> pixelArray(3*width*height,0);
This declares a vector with 3*width*height elements, initialized to 0. You can access the elements using the same syntax you've used for the array version (except, as pointed out in comments, you'll have to take care to write the binary values correctly to the output file).

OpenCV: how to read .pfm files?

Is there a way to read .pfm files in OpenCV?
Thank you very much for any suggestions!

PFM is an uncommon image format and I don't know why the Middlebury dataset chose to use that, probably because it uses floating point values.
Anyway I was able to read the images with OpenCV:
import numpy as np
import cv2
groundtruth = cv2.imread('disp0.pfm', cv2.IMREAD_UNCHANGED)
Note the IMREAD_UNCHANGED flag. Somehow it is able to read all the correct values even if OpenCV does not support it.
But wait a minute: inf values are commonly used to set INVALID pixel disparity, so to properly display the image you should do:
# Remove infinite value to display
groundtruth[groundtruth==np.inf] = 0
# Normalize and convert to uint8
groundtruth = cv2.normalize(groundtruth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
# Show
cv2.imshow("groundtruth", groundtruth)
cv2.waitKey(0)
cv2.destroyAllWindows()

Based on the description of the ".pfm" file formate (see http://netpbm.sourceforge.net/doc/pfm.html), I wrote the following read/write functions, which only depend standard C/C++ library. It is proved to work well on reading/writing the pfm file, like, the ground truth disparity ".pfm" files from MiddleBury Computer Vision (see http://vision.middlebury.edu/stereo/submit3/).
#ifndef _PGM_H_
#define _PGM_H_
#include <fstream>
#include <iostream>
#include <algorithm>
#include <string>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <bitset>　/*std::bitset<32>*/
#include <cstdio>
enum PFM_endianness { BIG, LITTLE, ERROR};
class PFM {
public:
PFM();
inline bool is_little_big_endianness_swap(){
if (this->endianess == 0.f) {
std::cerr << "this-> endianness is not assigned yet!\n";
exit(0);
}
else {
uint32_t endianness = 0xdeadbeef;
//std::cout << "\n" << std::bitset<32>(endianness) << std::endl;
unsigned char * temp = (unsigned char *)&endianness;
//std::cout << std::bitset<8>(*temp) << std::endl;
PFM_endianness endianType_ = ((*temp) ^ 0xef == 0 ?
LITTLE : (*temp) ^ (0xde) == 0 ? BIG : ERROR);
// ".pfm" format file specifies that:
// positive scale means big endianess;
// negative scale means little endianess.
return ((BIG == endianType_) && (this->endianess < 0.f))
|| ((LITTLE == endianType_) && (this->endianess > 0.f));
}
}
template<typename T>
T * read_pfm(const std::string & filename) {
FILE * pFile;
pFile = fopen(filename.c_str(), "rb");
char c[100];
if (pFile != NULL) {
fscanf(pFile, "%s", c);
// strcmp() returns 0 if they are equal.
if (!strcmp(c, "Pf")) {
fscanf(pFile, "%s", c);
// atoi: ASCII to integer.
// itoa: integer to ASCII.
this->width = atoi(c);
fscanf(pFile, "%s", c);
this->height = atoi(c);
int length_ = this->width * this->height;
fscanf(pFile, "%s", c);
this->endianess = atof(c);
fseek(pFile, 0, SEEK_END);
long lSize = ftell(pFile);
long pos = lSize - this->width*this->height * sizeof(T);
fseek(pFile, pos, SEEK_SET);
T* img = new T[length_];
//cout << "sizeof(T) = " << sizeof(T);
fread(img, sizeof(T), length_, pFile);
fclose(pFile);
/* The raster is a sequence of pixels, packed one after another,
* with no delimiters of any kind. They are grouped by row,
* with the pixels in each row ordered left to right and
* the rows ordered bottom to top.
*/
T* tbimg = (T *)malloc(length_ * sizeof(T));// top-to-bottom.
//PFM SPEC image stored bottom -> top reversing image
for (int i = 0; i < this->height; i++) {
memcpy(&tbimg[(this->height - i - 1)*(this->width)],
&img[(i*(this->width))],
(this->width) * sizeof(T));
}
if (this->is_little_big_endianness_swap()){
std::cout << "little-big endianness transformation is needed.\n";
// little-big endianness transformation is needed.
union {
T f;
unsigned char u8[sizeof(T)];
} source, dest;
for (int i = 0; i < length_; ++i) {
source.f = tbimg[i];
for (unsigned int k = 0, s_T = sizeof(T); k < s_T; k++)
dest.u8[k] = source.u8[s_T - k - 1];
tbimg[i] = dest.f;
//cout << dest.f << ", ";
}
}
delete[] img;
return tbimg;
}
else {
std::cout << "Invalid magic number!"
<< " No Pf (meaning grayscale pfm) is missing!!\n";
fclose(pFile);
exit(0);
}
}
else {
std::cout << "Cannot open file " << filename
<< ", or it does not exist!\n";
fclose(pFile);
exit(0);
}
}
template<typename T>
void write_pfm(const std::string & filename, const T* imgbuffer,
const float & endianess_) {
std::ofstream ofs(filename.c_str(), std::ifstream::binary);
// ** 1) Identifier Line: The identifier line contains the characters
// "PF" or "Pf". PF means it's a color PFM.
// Pf means it's a grayscale PFM.
// ** 2) Dimensions Line:
// The dimensions line contains two positive decimal integers,
// separated by a blank. The first is the width of the image;
// the second is the height. Both are in pixels.
// ** 3) Scale Factor / Endianness:
// The Scale Factor / Endianness line is a queer line that jams
// endianness information into an otherwise sane description
// of a scale. The line consists of a nonzero decimal number,
// not necessarily an integer. If the number is negative, that
// means the PFM raster is little endian. Otherwise, it is big
// endian. The absolute value of the number is the scale
// factor for the image.
// The scale factor tells the units of the samples in the raster.
// You use somehow it along with some separately understood unit
// information to turn a sample value into something meaningful,
// such as watts per square meter.
ofs << "Pf\n"
<< this->width << " " << this->height << "\n"
<< endianess_ << "\n";
/* PFM raster:
* The raster is a sequence of pixels, packed one after another,
* with no delimiters of any kind. They are grouped by row,
* with the pixels in each row ordered left to right and
* the rows ordered bottom to top.
* Each pixel consists of 1 or 3 samples, packed one after another,
* with no delimiters of any kind. 1 sample for a grayscale PFM
* and 3 for a color PFM (see the Identifier Line of the PFM header).
* Each sample consists of 4 consecutive bytes. The bytes represent
* a 32 bit string, in either big endian or little endian format,
* as determined by the Scale Factor / Endianness line of the PFM
* header. That string is an IEEE 32 bit floating point number code.
* Since that's the same format that most CPUs and compiler use,
* you can usually just make a program use the bytes directly
* as a floating point number, after taking care of the
* endianness variation.
*/
int length_ = this->width*this->height;
this->endianess = endianess_;
T* tbimg = (T *)malloc(length_ * sizeof(T));
// PFM SPEC image stored bottom -> top reversing image
for (int i = 0; i < this->height; i++) {
memcpy(&tbimg[(this->height - i - 1)*this->width],
&imgbuffer[(i*this->width)],
this->width * sizeof(T));
}
if (this->is_little_big_endianness_swap()) {
std::cout << "little-big endianness transformation is needed.\n";
// little-big endianness transformation is needed.
union {
T f;
unsigned char u8[sizeof(T)];
} source, dest;
for (int i = 0; i < length_; ++i) {
source.f = tbimg[i];
for (size_t k = 0, s_T = sizeof(T); k < s_T; k++)
dest.u8[k] = source.u8[s_T - k - 1];
tbimg[i] = dest.f;
//cout << dest.f << ", ";
}
}
ofs.write((char *)tbimg, this->width*this->height * sizeof(T));
ofs.close();
free(tbimg);
}
inline float getEndianess(){return endianess;}
inline int getHeight(void){return height;}
inline int getWidth(void){return width;}
inline void setHeight(const int & h){height = h;}
inline void setWidth(const int & w){width = w;}
private:
int height;
int width;
float endianess;
};
#endif /* PGM_H_ */
Forgive me to leave lots of useless comments in the code.
A simple example shows the write/read:
int main(){
PFM pfm_rw;
string temp = "img/Motorcycle/disp0GT.pfm";
float * p_disp_gt = pfm_rw.read_pfm<float>(temp);
//int imgH = pfm_rw.getHeight();
//int imgW = pfm_rw.getWidth();
//float scale = pfm_rw.getEndianess();
string temp2 = "result/Motorcycle/disp0GT_n1.pfm";
pfm_rw.write_pfm<float>(temp2, p_disp_gt, -1.0f);
return 1;
}

As far as I know, OpenCV doesn't support to read PFM files directly.
You can refer to the code snippet here for a simple PFM reader, which will enable you to read PFM files into COLOR *data with COLOR defined as follows:
typedef struct {
float r;
float g;
float b;
} COLOR;

C++ Pointer to byte array optimization

I am currently using this approach to copy some byte values over:
for (int i = 0; i < (iLen + 1); i++)
{
*(pBuffer + i) = Image.pVid[i];
}
I would like to ask if there is a way to copy these values over in one go, perhaps by using memcopy to gain more speed.
The entire code is:
extern "C" __declspec(dllexport) int __stdcall GetCameraImage(BYTE pBuffer[], int Type, int uWidth, int uHeight)
{
CameraImage Image;
int ret;
Image.pVid = (unsigned int*)malloc(4 * uWidth*uHeight);
ret = stGetCameraImage(&Image, 1, uWidth, uHeight);
if (ret == ERR_SUCCESS)
{
int iLen = (4 * uWidth * uHeight);
for (int i = 0; i < (iLen + 1); i++)
{
*(pBuffer + i) = Image.pVid[i];
}
////print(“ImageType = %d, width = %d, height = %d”, Image.Type, Image.Width,
//// Image.Height);
////print(“First Pixel : B = %d, G = %d, R = %d”, Image.pVid[0], Image.pVid[1],
//// Image.pVid[2]);
////print(“Second Pixel : B = %d, G = %d, R = %d”, Image.pVid[4], Image.pVid[5],
//// Image.pVid[6]);
}
free(Image.pVid);
return ret;
}
Edit:
*pVid is this:
unsigned int *pVid; // pointer to image data (Format RGB32...)

The way your code is currently written, each assignment in your loop will overflow and give you some garbage value in pBuffer because you're trying to assign an unsigned int to a BYTE. On top of that, you will run off the end of the Image.pVid array because i is counting bytes, not unsigned ints
You could fix your code by doing this:
*(pBuffer + i) = ((BYTE*)Image.pVid)[i];
But that is pretty inefficient. Better to move whole words at a time, or you could just use memcpy instead:
memcpy(pBuffer,Image.pVid,iLen) //pBuffer must be at least iLen bytes long

SDL_GetPixel pointer problems

This is my very first question:
First of these 2 functions you see here below works fine to some extent:
Uint32 AWSprite::get_pixelColor_location(SDL_Surface * surface, int x, int y) {
int bpp = surface->format->BytesPerPixel;
/* Here p is the address to the pixel we want to retrieve */
Uint8 *p = (Uint8 *)surface->pixels + y * surface->pitch + x * bpp;
switch (bpp) {
case 1:
return *p;
case 2:
return *(Uint16 *)p;
case 3:
if (SDL_BYTEORDER == SDL_BIG_ENDIAN)
return p[0] << 16 | p[1] << 8 | p[2];
else
return p[0] | p[1] << 8 | p[2] << 16;
case 4:
return *(Uint32 *)p;
default:
return 0;
}
}
void AWSprite::set_all_frame_image_actual_size() {
/* This function finds an entire rows that has transparency
then stores the amount of rows to a Frame_image_absolute structure
*/
absolute_sprite = new Frame_image_absolute*[howManyFrames];
for (int f = 0; f < howManyFrames; f++) {
SDL_LockSurface(frames[f]);
int top_gap = 0; int bottom_gap = 0;
int per_transparent_px_count = 1;
for (int i = 0; i < frames[f]->h; i++) {
int per_transparent_px_count = 1;
if (this->get_pixelColor_location(frames[f], j, i) == transparentColour) per_transparent_px_count++;
if (per_transparent_px_count >= frames[f]->w) {
if (i < frames[f]->h / 2) {
per_transparent_px_count = 1;
top_gap++;
} else {
per_transparent_px_count = 1;
bottom_gap++;
}
}
}
}
int realHeight = frames[f]->h - (top_gap + bottom_gap);
absolute_sprite[f] = new Frame_image_absolute();
absolute_sprite[f]->offset_y = top_gap;
absolute_sprite[f]->height = realHeight;
}
}
When i ran this i get:
Unhandled exception at 0x00173746 in SE Game.exe: 0xC0000005: Access violation reading location 0x03acc0b8.
When i when through debuging, i found that it crashes at:
When iterators variable f == 31, i == 38, j = 139
And stops at AWSprite::get_pixelColor_location() in the line at " return *(Uint32 *)p;
I found that if i ran it again and go through debugging line by line then i will works sometime and sometime it dont!!! So i mean that "It crash at randomly when f > 30, i, j iterators value"
What is going on...

I cannot comment on the question yet, but here are some questions:
Where does j come from? Based on the get_pixelColor_location function I would assume that you're iterating over the width of the surface. This part seems to be missing from the code you posted.
Did you validate that i and j are within the bounds of your surface?
Also, you don't seem to Unlock the surface.
Running your function seems to work adequately here so I suspect you're reading outside of your buffer with invalid parameters.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

What's the correct way to assign one GPU memory buffer value from another GPU memory buffer with some arithmetic on each source buffer's element? - c++

Related

problem with sending a float number in a stream in vivado_hls

BMP Exporter not working C++

OpenCV: how to read .pfm files?

C++ Pointer to byte array optimization

SDL_GetPixel pointer problems

Categories

Resources