BMP reader not functioning correctly - c++

I wrote a small BMP loader for 24-bit BMP, and it works, except it does not display the colors. Everything is grayscale with bits of color (not the correct ones) here and there. The loader for my code is below
void BMP::Read(char* filename)
{
FILE* f;
unsigned char info[54];
if ((f = fopen(filename, "rb")) == NULL) return;
fread(info, sizeof(unsigned char), 54, f);
m_width = *(int*)&info[18];
m_height = *(int*)&info[22];
m_size = 3* m_width * m_height;
m_pdata = new unsigned char[m_size];
fread(m_pdata, sizeof(unsigned char), m_size, f);
fclose(f);
}
I then access the array using the following formula:
red = m_pdata[(y * m_width + x) + 2];
blue = m_pdata[(y * m_width + x) + 0];
green = m_pdata[(y * m_width + x) + 1];
Any suggestions here? I think the problem is in the load function, but not sure.

You forgot to include the pixel width when extracting the pixel channels:
int pixel_width = 3; // 3 bytes for 24 bit
According to Strange values when reading pixels from 24-bit bitmap, there's also padding at the end of each row. The padding can be calculated by:
int row_padding = (4 - (m_width * pixel_width) % 4) % 4;
The final resulting formula to access the pixel color channels would be:
red = m_pdata[(y * m_width + x) * pixel_width + y * row_padding + 2];
blue = m_pdata[(y * m_width + x) * pixel_width + y * row_padding + 0];
green = m_pdata[(y * m_width + x) * pixel_width + y * row_padding + 1];
Since rows have padding, your calculation of m_size is slightly too small. You can account for the row padding as:
m_size = pixel_width * m_width * m_height + row_padding * m_height;

Related

Image subtraction with CUDA and textures

My goal is to use C++ with CUDA to subtract a dark frame from a raw image. I want to use textures for acceleration. The input of the images is cv::Mat with the type CV_8UC4 (I use the pointer to the data of the cv::Mat). This is the kernel I came up with, but I have no idea how to eventually subtract the textures from each other:
__global__ void DarkFrameSubtractionKernel(unsigned char* outputImage, size_t pitchOutputImage,
cudaTextureObject_t inputImage, cudaTextureObject_t darkImage, int width, int height)
{
const int x = blockIdx.x * blockDim.x + threadIdx.x;
const int y = blockDim.y * blockIdx.y + threadIdx.y;
const float tx = (x + 0.5f);
const float ty = (y + 0.5f);
if (x >= width || y >= height) return;
uchar4 inputImageTemp = tex2D<uchar4>(inputImage, tx, ty);
uchar4 darkImageTemp = tex2D<uchar4>(darkImage, tx, ty);
outputImage[y * pitchOutputImage + x] = inputImageTemp - darkImageTemp; // this line will throw an error
}
This is the function that calls the kernel (you can see that I create the textures from unsigned char):
void subtractDarkImage(unsigned char* inputImage, size_t pitchInputImage, unsigned char* outputImage,
size_t pitchOutputImage, unsigned char* darkImage, size_t pitchDarkImage, int width, int height,
cudaStream_t stream)
{
cudaResourceDesc resDesc = {};
resDesc.resType = cudaResourceTypePitch2D;
resDesc.res.pitch2D.width = width;
resDesc.res.pitch2D.height = height;
resDesc.res.pitch2D.devPtr = inputImage;
resDesc.res.pitch2D.pitchInBytes = pitchInputImage;
resDesc.res.pitch2D.desc = cudaCreateChannelDesc(8, 8, 8, 8, cudaChannelFormatKindUnsigned);
cudaTextureDesc texDesc = {};
texDesc.readMode = cudaReadModeElementType;
texDesc.addressMode[0] = cudaAddressModeBorder;
texDesc.addressMode[1] = cudaAddressModeBorder;
cudaTextureObject_t imageInputTex, imageDarkTex;
CUDA_CHECK(cudaCreateTextureObject(&imageInputTex, &resDesc, &texDesc, 0));
resDesc.res.pitch2D.devPtr = darkImage;
resDesc.res.pitch2D.pitchInBytes = pitchDarkImage;
CUDA_CHECK(cudaCreateTextureObject(&imageDarkTex, &resDesc, &texDesc, 0));
dim3 block(32, 8);
dim3 grid = paddedGrid(block.x, block.y, width, height);
DarkImageSubtractionKernel << <grid, block, 0, stream >> > (reinterpret_cast<uchar4*>(outputImage), pitchOutputImage / sizeof(uchar4),
imageInputTex, imageDarkTex, width, height);
CUDA_CHECK(cudaDestroyTextureObject(imageInputTex));
CUDA_CHECK(cudaDestroyTextureObject(imageDarkTex));
}
The code does not compile as I can not subtract a uchar4 from another one (in the kernel). Is there an easy way of subtraction here?
Help is very much appreciated.
Is there an easy way of subtraction here?
There are no arithmetic operators defined for CUDA built-in vector types. If you replace
outputImage[y * pitchOutputImage + x] = inputImageTemp - darkImageTemp;
with
uchar4 val;
val.x = inputImageTemp.x - darkImageTemp.x;
val.y = inputImageTemp.y - darkImageTemp.y;
val.z = inputImageTemp.z - darkImageTemp.z;
val.w = inputImageTemp.w - darkImageTemp.w;
outputImage[y * pitchOutputImage + x] = val;
things will work. If this offends you, I suggest writing a small library of helper functions to hide the mess.

What is the highest bit depth greyscale image I can export from FreeImage?

As context, I'm working with building a topographic program which needs relatively extreme detail. I do not expect the files to be small, and they do not formally need to be viewed on a monitor, they just need to have very high resolution.
I know that most image formats are limited to 8 bpp, on account of the standard limits on both monitors (at a reasonable price) and on human perception. However, 2⁸ is just 256 possible values, which induces plateauing artifacts in a reconstructed displacement. 2¹⁶ may be close enough at 65,536 possible values, which I have achieved.
I'm using FreeImage and DLang to construct the data, currently on a Linux Mint machine.
However, when I went on to 2³², software support seemed to fade on me. I tried a TIFF of this form and nothing seemed to be able to interpret it, either showing a completely (or mostly) transparent image (remembering that I didn't expect any monitor to really support 2³² shades of a channel) or complaining about being unable to decode the RGB data. I imagine that it's because it was assumed to be an RGB or RGBA image.
FreeImage is reasonably well documented for most purposes, but I'm now wondering, what is the highest-precision single-channel format I can export, and how would I do it? Can anyone provide an example? Am I really limited, in any typical and not-home-rolled image format, to 16-bit? I know that's high enough for, say, medical imaging, but I'm sure I'm not the first person to try to aim higher and we science-types can be pretty ambitious about our precision-level…
Did I make a glaring mistake in my code? Is there something else I should try instead for this kind of precision?
Here's my code.
The 16-bit TIFF that worked
void writeGrayscaleMonochromeBitmap(const double width, const double height) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT16, cast(int)width, cast(int)height);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
ushort v = cast(ushort)((x * 0xFFFF)/width);
ubyte[2] bytes = nativeToLittleEndian(cast(ushort)(x/width * 0xFFFF));
scanline[x * ushort.sizeof + 0] = bytes[0];
scanline[x * ushort.sizeof + 1] = bytes[1];
}
}
FreeImage_Save(FIF_TIFF, bitmap, "test.tif", TIFF_DEFAULT);
FreeImage_Unload(bitmap);
}
The 32-bit TIFF that didn't really work
void writeGrayscaleMonochromeBitmap32(const double width, const double height) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT32, cast(int)width, cast(int)height);
writeln(width, ", ", height);
writeln("Width: ", FreeImage_GetWidth(bitmap));
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
writeln(y, ": ", scanline);
for(int x = 0; x < width; x++) {
//writeln(x, " < ", width);
uint v = cast(uint)((x/width) * 0xFFFFFFFF);
writeln("V: ", v);
ubyte[4] bytes = nativeToLittleEndian(v);
scanline[x * uint.sizeof + 0] = bytes[0];
scanline[x * uint.sizeof + 1] = bytes[1];
scanline[x * uint.sizeof + 2] = bytes[2];
scanline[x * uint.sizeof + 3] = bytes[3];
}
}
FreeImage_Save(FIF_TIFF, bitmap, "test32.tif", TIFF_NONE);
FreeImage_Unload(bitmap);
}
Thanks for any pointers.
For a single channel, the highest available from FreeImage is 32-bit, as FIT_UINT32. However, the file format must be capable of this, and as of the moment, only TIFF appears to be up to the task (See page 104 of the Stanford Documentation). Additionally, most monitors are incapable of representing more than 8-bits-per-sample, 12 in extreme cases, so it is very difficult to read data back out and have it render properly.
A unit test involving comparing bytes before marshaling to the bitmap, and sampled from the same bitmap afterward, show that the data is in fact being encoded.
To imprint data to a 16-bit gray scale (currently supported by J2K, JP2, PGM, PGMRAW, PNG and TIF), you would do something like this:
void toFreeImageUINT16PNG(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT16, cast(int)width, cast(int)height);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
//This magic has to happen with the y-coordinate in order to keep FreeImage from following its default behavior, and generating
//the image upside down.
ushort v = cast(ushort)(data[cast(ulong)(((height - 1) - y) * width + x)] * 0xFFFF); //((x * 0xFFFF)/width);
ubyte[2] bytes = nativeToLittleEndian(v);
scanline[x * ushort.sizeof + 0] = bytes[0];
scanline[x * ushort.sizeof + 1] = bytes[1];
}
}
FreeImage_Save(FIF_PNG, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
Of course you would want to make adjustments for your target file type. To export as 48-bit RGB16, you would do this.
void toFreeImageColorPNG(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_RGB16, cast(int)width, cast(int)height);
uint pitch = FreeImage_GetPitch(bitmap);
uint bpp = FreeImage_GetBPP(bitmap);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
ulong offset = cast(ulong)((((height - 1) - y) * width + x) * 3);
ushort r = cast(ushort)(data[(offset + 0)] * 0xFFFF);
ushort g = cast(ushort)(data[(offset + 1)] * 0xFFFF);
ushort b = cast(ushort)(data[(offset + 2)] * 0xFFFF);
ubyte[6] bytes = nativeToLittleEndian(r) ~ nativeToLittleEndian(g) ~ nativeToLittleEndian(b);
scanline[(x * 3 * ushort.sizeof) + 0] = bytes[0];
scanline[(x * 3 * ushort.sizeof) + 1] = bytes[1];
scanline[(x * 3 * ushort.sizeof) + 2] = bytes[2];
scanline[(x * 3 * ushort.sizeof) + 3] = bytes[3];
scanline[(x * 3 * ushort.sizeof) + 4] = bytes[4];
scanline[(x * 3 * ushort.sizeof) + 5] = bytes[5];
}
}
FreeImage_Save(FIF_PNG, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
Lastly, to encode a UINT32 greyscale image (limited purely to TIFF at the moment), you would do this.
void toFreeImageTIF32(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT32, cast(int)width, cast(int)height);
//DEBUG
int xtest = cast(int)(width/2);
int ytest = cast(int)(height/2);
uint comp1a = cast(uint)(data[cast(ulong)(((height - 1) - ytest) * width + xtest)] * 0xFFFFFFFF);
writeln("initial: ", nativeToLittleEndian(comp1a));
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
//This magic has to happen with the y-coordinate in order to keep FreeImage from following its default behavior, and generating
//the image upside down.
ulong i = cast(ulong)(((height - 1) - y) * width + x);
uint v = cast(uint)(data[i] * 0xFFFFFFFF);
ubyte[4] bytes = nativeToLittleEndian(v);
scanline[x * uint.sizeof + 0] = bytes[0];
scanline[x * uint.sizeof + 1] = bytes[1];
scanline[x * uint.sizeof + 2] = bytes[2];
scanline[x * uint.sizeof + 3] = bytes[3];
}
}
//DEBUG
ulong index = cast(ulong)(xtest * uint.sizeof);
writeln("Final: ", FreeImage_GetScanLine(bitmap, ytest)
[index .. index + uint.sizeof]);
FreeImage_Save(FIF_TIFF, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
I've yet to find a program, built by anyone else, which will readily render a 32-bit gray-scale image on a monitor's available palette. However, I left my checking code in which will consistently write out the same array both at the top DEBUG and the bottom one, and that's consistent enough for me.
Hopefully this will help someone else out in the future.

Using swscale for image composing

I have an input image A and a resulting image B with the size 800x600 stored in YUV420 format and I need to scale image A into 100x100 size and place it into resulting image B at some point (x=100, y=100). To decrease memory and CPU usage I put swscale result right into final B image.
Here is a code snippets (pretty straightforward):
//here we a creating sws context for scaling into 100x100
sws_ctx = sws_getCachedContext(sws_ctx, frame.hdr.width, frame.hdr.height, AV_PIX_FMT_YUV420P,
100, 100, AV_PIX_FMT_YUV420P, SWS_BILINEAR, nullptr, nullptr, nullptr);
Next we create corresponding slice and strides describing image A
int src_y_plane_sz = frame.hdr.width * frame.hdr.height;
int src_uv_plane_sz = src_y_plane_sz / 2;
std::int32_t src_stride[] = {
frame.hdr.width,
frame.hdr.width / 2,
frame.hdr.width / 2,
0};
const uint8_t* const src_slice[] = {
&frame.raw_frame[0],
&frame.raw_frame[0] + src_y_plane_sz,
&frame.raw_frame[0] + src_y_plane_sz + src_uv_plane_sz,
nullptr};
Now doing the same for a destination B image
std::int32_t dst_stride[] = {
current_frame.hdr.width,
current_frame.hdr.width /2,
current_frame.hdr.width /2,
0
};
std::int32_t y_plane_sz = current_frame.hdr.width * current_frame.hdr.height;
std::int32_t uv_plane_sz = y_plane_sz / 2;
//calculate offset in slices for x=100, y=100 position
std::int32_t y_offset = current_frame.hdr.width * 100 + 100;
uint8_t* const dst_slice[] = {
&current_frame.raw_frame[0] + y_offset,
&current_frame.raw_frame[0] + y_plane_sz + y_offset / 2,
&current_frame.raw_frame[0] + y_plane_sz + uv_plane_sz + y_offset / 2,
nullptr};
After all - calling swscale
int ret = sws_scale(sws_ctx, src_slice, src_stride, 0, frame.hdr.height,
dst_slice, dst_stride);
After using a testing sequence I having some invalid result with the following problems:
Y component got some padding line
UV components got misplaced -
they are a bit lower then original Y components.
Does anyone have had the same problems with swscale function? I am pretty new to this FFmpeg library collection so I am open to any opinions how to perform this task correctly.
FFmpeg version used 3.3
YUV420 format scales both width and height of the image by two. That is each chromatic plane is 4 times smaller than luma plane:
int src_uv_plane_sz = src_y_plane_sz / 4;
Also i'm not sure whether calculated stride values are correct. Typically stride != width.
Thanks to #VTT for pointing out the possible problem - I have fixed destination slice pointers calculation to the following:
int dest_x = 200, dest_y = 70;
//into 100x100 position
std::int32_t y_offset = current_frame.hdr.width * dest_y + dest_x;
std::int32_t u_offset = ( current_frame.hdr.width * dest_y ) / 4 + dest_x /2;
std::int32_t v_offset = u_offset + y_plane_sz / 4;
uint8_t* const dst_slice[] = {
&current_frame.raw_frame[0] + y_offset,
&current_frame.raw_frame[0] + y_plane_sz + u_offset,
&current_frame.raw_frame[0] + y_plane_sz + v_offset,
nullptr};
And the second problem with the "line artefact" is solved by using scaled dimensions sizes factor by 8.
One more addition for proper position calculations for destination slice pointers - that y coordinates must be re-adjust according to a current Y plane pointing because per each two Y strides there is only one U or V stride. For example (see adjusted_uv_y variable):
std::int32_t adjusted_uv_y = dest_y % 2 == 0 ? dest_y : dest_y - 1;
std::int32_t y_offset = current_frame.hdr.width * dest_y + dest_x;
std::int32_t u_offset = ( current_frame.hdr.width * adjusted_uv_y ) / 4 + dest_x /2;
std::int32_t v_offset = u_offset + y_plane_sz / 4;

Flipping the 2D texture on a sphere with Ray-Tracing

I am working on my ray-tracer and I think I've made some significant achievements. I am currently trying to place texture images onto objects. However they don't place quite well. They appear flipped on the sphere. Here is the final image of my current code:
Here are the relevant code:
-Image Class for opening image
class Image
{
public:
Image() {}
void read_bmp_file(char* filename)
{
int i;
FILE* f = fopen(filename, "rb");
unsigned char info[54];
fread(info, sizeof(unsigned char), 54, f); // read the 54-byte header
// extract image height and width from header
width = *(int*)&info[18];
height = *(int*)&info[22];
int size = 3 * width * height;
data = new unsigned char[size]; // allocate 3 bytes per pixel
fread(data, sizeof(unsigned char), size, f); // read the rest of the data at once
fclose(f);
for(i = 0; i < size; i += 3)
{
unsigned char tmp = data[i];
data[i] = data[i+2];
data[i+2] = tmp;
}
/*Now data should contain the (R, G, B) values of the pixels. The color of pixel (i, j) is stored at
data[j * 3* width + 3 * i], data[j * 3 * width + 3 * i + 1] and data[j * 3 * width + 3*i + 2].
In the last part, the swap between every first and third pixel is done because windows stores the
color values as (B, G, R) triples, not (R, G, B).*/
}
public:
int width;
int height;
unsigned char* data;
};
-Texture class
class Texture: public Material
{
public:
Texture(char* filename): Material() {
image_ptr = new Image;
image_ptr->read_bmp_file(filename);
}
virtual ~Texture() {}
virtual void set_mapping(Mapping* mapping)
{ mapping_ptr = mapping;}
virtual Vec get_color(const ShadeRec& sr) {
int row, col;
if(mapping_ptr)
mapping_ptr->get_texel_coordinates(sr.local_hit_point, image_ptr->width, image_ptr->height, row, col);
return Vec (image_ptr->data[row * 3 * image_ptr->width + 3*col ]/255.0,
image_ptr->data[row * 3 * image_ptr->width + 3*col+1]/255.0,
image_ptr->data[row * 3 * image_ptr->width + 3*col+2]/255.0);
}
public:
Image* image_ptr;
Mapping* mapping_ptr;
};
-Mapping class
class SphericalMap: public Mapping
{
public:
SphericalMap(): Mapping() {}
virtual ~SphericalMap() {}
virtual void get_texel_coordinates (const Vec& local_hit_point,
const int hres,
const int vres,
int& row,
int& column) const
{
float theta = acos(local_hit_point.y);
float phi = atan2(local_hit_point.z, local_hit_point.x);
if(phi < 0.0)
phi += 2*PI;
float u = phi/(2*PI);
float v = (PI - theta)/PI;
column = (int)((hres - 1) * u);
row = (int)((vres - 1) * v);
}
};
-Local hit points:
virtual void Sphere::set_local_hit_point(ShadeRec& sr)
{
sr.local_hit_point.x = sr.hit_point.x - c.x;
sr.local_hit_point.y = (sr.hit_point.y - c.y)/R;
sr.local_hit_point.z = sr.hit_point.z -c.z;
}
-This is how I constructed the sphere in main:
Texture* t1 = new Texture("Texture\\earthmap2.bmp");
SphericalMap* sm = new SphericalMap();
t1->set_mapping(sm);
t1->set_ka(0.55);
t1->set_ks(0.0);
Sphere *s1 = new Sphere(Vec(-60,0,50), 149);
s1->set_material(t1);
w.add_object(s1);
Sorry for long codes but if I had any idea where that problem might occur, I'd have posted that part. Finally this is how I call get_color() function from the main:
xShaded += sr.material_ptr->get_color(sr).x * in.x * max(0.0, sr.normal.dot(l)) +
sr.material_ptr->ks * in.x * pow((max(0.0,sr.normal.dot(h))),1);
yShaded += sr.material_ptr->get_color(sr).y * in.y * max(0.0, sr.normal.dot(l)) +
sr.material_ptr->ks * in.y * pow((max(0.0,sr.normal.dot(h))),1);
zShaded += sr.material_ptr->get_color(sr).z * in.z * max(0.0, sr.normal.dot(l)) +
sr.material_ptr->ks * in.z * pow((max(0.0,sr.normal.dot(h))),1);
Shot in the dark: if memory serves, BMPs are stored from the bottom up, while many other image formats are top-down. Could that possibly be the problem? Perhaps your file reader just needs to reverse the rows?
Changing float phi = atan2(local_hit_point.z, local_hit_point.x); to float phi = atan2(local_hit_point.x, local_hit_point.z); solved the problem.

Issue with writing YUV image frame in C/C++

I am trying to convert an RGB frame, which is taken from OpenGL glReadPixels(), to a YUV frame, and write the YUV frame to a file (.yuv). Later on I would like to write it to a named_pipe as an input for FFMPEG, but as for now I just want to write it to a file and view the image result using a YUV Image Viewer. So just disregard the "writing to pipe" for now.
After running my code, I encountered the following errors:
The number of frames shown in the YUV Image Viewer software is always 1/3 of the number of frames I declared in my program. When I declare fps as 10, I could only view 3 frames. When I declared fps as 30, I could only view 10 frames. However when I view the file in Text Editor, I could see that I have the correct amount of word "FRAME" printed in the file.
This is the example output that I got: http://www.bobdanani.net/image.yuv
I could not see the correct image, but just some distorted green, blue, yellow, and black pixels.
I read about YUV format from http://wiki.multimedia.cx/index.php?title=YUV4MPEG2 and http://www.fourcc.org/fccyvrgb.php#mikes_answer and http://kylecordes.com/2007/pipe-ffmpeg
Here is what I have tried so far. I know that this conversion approach is quite in-efficient, and I can optimize it later. Now I just want to get this naive approach to work and have the image shown properly.
int frameCounter = 1;
int windowWidth = 0, windowHeight = 0;
unsigned char *yuvBuffer;
unsigned long bufferLength = 0;
unsigned long frameLength = 0;
int fps = 10;
void display(void) {
/* clear the color buffers */
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
/* DRAW some OPENGL animation, i.e. cube, sphere, etc
.......
.......
*/
glutSwapBuffers();
if ((frameCounter % fps) == 1){
bufferLength = 0;
windowWidth = glutGet(GLUT_WINDOW_WIDTH);
windowHeight = glutGet (GLUT_WINDOW_HEIGHT);
frameLength = (long) (windowWidth * windowHeight * 1.5 * fps) + 100; // YUV 420 length (width*height*1.5) + header length
yuvBuffer = new unsigned char[frameLength];
write_yuv_frame_header();
}
write_yuv_frame();
frameCounter = (frameCounter % fps) + 1;
if ( (frameCounter % fps) == 1){
snprintf(filename, 100, "out/image-%d.yuv", seq_num);
ofstream out(filename, ios::out | ios::binary);
if(!out) {
cout << "Cannot open file.\n";
}
out.write (reinterpret_cast<char*> (yuvBuffer), bufferLength);
out.close();
bufferLength = 0;
delete[] yuvBuffer;
}
}
void write_yuv_frame_header (){
char *yuvHeader = new char[100];
sprintf (yuvHeader, "YUV4MPEG2 W%d H%d F%d:1 Ip A0:0 C420mpeg2 XYSCSS=420MPEG2\n", windowWidth, windowHeight, fps);
memcpy ((char*)yuvBuffer + bufferLength, yuvHeader, strlen(yuvHeader));
bufferLength += strlen (yuvHeader);
delete (yuvHeader);
}
void write_yuv_frame() {
int width = glutGet(GLUT_WINDOW_WIDTH);
int height = glutGet(GLUT_WINDOW_HEIGHT);
memcpy ((void*) (yuvBuffer+bufferLength), (void*) "FRAME\n", 6);
bufferLength +=6;
long length = windowWidth * windowHeight;
long yuv420FrameLength = (float)length * 1.5;
long lengthRGB = length * 3;
unsigned char *rgb = (unsigned char *) malloc(lengthRGB * sizeof(unsigned char));
unsigned char *yuvdest = (unsigned char *) malloc(yuv420FrameLength * sizeof(unsigned char));
glReadPixels(0, 0, windowWidth, windowHeight, GL_RGB, GL_UNSIGNED_BYTE, rgb);
int r, g, b, y, u, v, ypos, upos, vpos;
for (int j = 0; j < windowHeight; ++j){
for (int i = 0; i < windowWidth; ++i){
r = (int)rgb[(j * windowWidth + i) * 3 + 0];
g = (int)rgb[(j * windowWidth + i) * 3 + 1];
b = (int)rgb[(j * windowWidth + i) * 3 + 2];
y = (int)(r * 0.257 + g * 0.504 + b * 0.098) + 16;
u = (int)(r * 0.439 + g * -0.368 + b * -0.071) + 128;
v = (int)(r * -0.148 + g * -0.291 + b * 0.439 + 128);
ypos = j * windowWidth + i;
upos = (j/2) * (windowWidth/2) + i/2 + length;
vpos = (j/2) * (windowWidth/2) + i/2 + length + length/4;
yuvdest[ypos] = y;
yuvdest[upos] = u;
yuvdest[vpos] = v;
}
}
memcpy ((void*) (yuvBuffer + bufferLength), (void*)yuvdest, yuv420FrameLength);
bufferLength += yuv420FrameLength;
free (yuvdest);
free (rgb);
}
This is just the very basic approach, and I can optimize the conversion algorithm later.
Can anyone tell me what is wrong in my approach? My guess is that one of the issues is with the outstream.write() call, because I converted the unsigned char* data to char* data that it may lose data precision. But if I don't cast it to char* I will get a compile error. However this doesn't explain why the output frames are corrupted (only account to 1/3 of the number of total frames).
It looks to me like you have too many bytes per frame for 4:2:0 data. ACcording to the spec you linked to, the number of bytes for a 200x200 pixel 4:2:0 frame should be 200 * 200 * 3 / 2 = 60,000. But you have ~90,000 bytes. Looking at your code, I don't see where you are convert from 4:4:4 to 4:2:0. So you have 2 choices - either set the header to 4:4:4, or convert the YCbCr data to 4:2:0 before writing it out.
I compiled your code and surely there is a problem when computing upos and vpos values.
For me this worked (RGB to YUV NV12):
vpos = length + (windowWidth * (j/2)) + (i/2)*2;
upos = vpos + 1;