How to compress YUYV raw data to JPEG using libjpeg? - c++

I'm looking for an example of how to save a YUYV format frame to a JPEG file using the libjpeg library.

In typical computer APIs, "YUV" actually means YCbCr, and "YUYV" means "YCbCr 4:2:2" stored as Y0, Cb01, Y1, Cr01, Y2 ...
Thus, if you have a "YUV" image, you can save it to libjpeg using the JCS_YCbCr color space.
When you have a 422 image (YUYV) you have to duplicate the Cb/Cr values to the two pixels that need them before writing the scanline to libjpeg. Thus, this write loop will do it for you:
// "base" is an unsigned char const * with the YUYV data
// jrow is a libjpeg row of samples array of 1 row pointer
cinfo.image_width = width & ~1; // round down to even: the loop below handles two pixels at a time
cinfo.image_height = height;
cinfo.input_components = 3;
cinfo.in_color_space = JCS_YCbCr;
jpeg_set_defaults(&cinfo);
jpeg_set_quality(&cinfo, 92, TRUE);
jpeg_start_compress(&cinfo, TRUE);
unsigned char *buf = new unsigned char[width * 3];
while (cinfo.next_scanline < height) {
for (int i = 0; i < cinfo.image_width; i += 2) {
buf[i*3] = base[i*2];
buf[i*3+1] = base[i*2+1];
buf[i*3+2] = base[i*2+3];
buf[i*3+3] = base[i*2+2];
buf[i*3+4] = base[i*2+1];
buf[i*3+5] = base[i*2+3];
}
jrow[0] = buf;
base += width * 2;
jpeg_write_scanlines(&cinfo, jrow, 1);
}
jpeg_finish_compress(&cinfo);
delete[] buf;
Use a smart pointer or std::vector for "buf" to avoid leaking it if your write function can throw. Note that longjmp (the usual route out of a libjpeg error handler) is not guaranteed to run C++ destructors, so free "buf" explicitly on that path.
Providing YCbCr to libjpeg directly is preferable to converting to RGB, because libjpeg stores the image in that format anyway, which saves a lot of conversion work. When the image comes from a webcam or other video source, it's also usually most efficient to get it in YCbCr of some sort (such as YUYV).
Finally, "U" and "V" mean something slightly different in analog component video, so the naming of YUV in computer APIs that really mean YCbCr is highly confusing.
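The chroma duplication in the write loop above can also be pulled out into a small helper that is easy to check without libjpeg. This is only a sketch: the name expandYuyvRow is made up for illustration, and it assumes an even width and the Y0 Cb Y1 Cr byte order described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of libjpeg): expands one row of packed
// YUYV (Y0 Cb Y1 Cr ...) into interleaved Y Cb Cr triples, duplicating
// each chroma pair across the two pixels that share it.
// `width` is the row width in pixels and is assumed to be even.
std::vector<unsigned char> expandYuyvRow(const unsigned char *yuyv, std::size_t width)
{
    assert(width % 2 == 0);
    std::vector<unsigned char> out(width * 3);
    for (std::size_t x = 0; x < width; x += 2) {
        const unsigned char *in = yuyv + x * 2;  // Y0 Cb Y1 Cr
        unsigned char *px = out.data() + x * 3;  // two Y Cb Cr triples
        px[0] = in[0]; px[1] = in[1]; px[2] = in[3];
        px[3] = in[2]; px[4] = in[1]; px[5] = in[3];
    }
    return out;
}
```

The returned vector's data() pointer could then be used as the single row passed to jpeg_write_scanlines.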

libjpeg also has a raw data mode, whereby you can directly supply the raw downsampled data (which is almost what you have in the YUYV format). This is more efficient than duplicating the UV values only to have libjpeg downscale them again internally.
To do so, you use jpeg_write_raw_data instead of jpeg_write_scanlines, and by default it will process exactly 16 scanlines at a time. JPEG expects the U and V planes to be 2x downsampled by default. YUYV format already has the horizontal dimension downsampled but not the vertical, so I skip U and V every other scanline.
Initialization:
cinfo.image_width = /* width in pixels */;
cinfo.image_height = /* height in pixels */;
cinfo.input_components = 3;
cinfo.in_color_space = JCS_YCbCr;
jpeg_set_defaults(&cinfo);
cinfo.raw_data_in = true;
JSAMPLE y_plane[16][cinfo.image_width];
JSAMPLE u_plane[8][cinfo.image_width / 2];
JSAMPLE v_plane[8][cinfo.image_width / 2];
JSAMPROW y_rows[16];
JSAMPROW u_rows[8];
JSAMPROW v_rows[8];
for (int i = 0; i < 16; ++i)
{
y_rows[i] = &y_plane[i][0];
}
for (int i = 0; i < 8; ++i)
{
u_rows[i] = &u_plane[i][0];
}
for (int i = 0; i < 8; ++i)
{
v_rows[i] = &v_plane[i][0];
}
JSAMPARRAY rows[] { y_rows, u_rows, v_rows };
Compressing:
jpeg_start_compress(&cinfo, true);
while (cinfo.next_scanline < cinfo.image_height)
{
for (JDIMENSION i = 0; i < 16; ++i)
{
auto offset = (cinfo.next_scanline + i) * cinfo.image_width * 2;
for (JDIMENSION j = 0; j < cinfo.image_width; j += 2)
{
y_plane[i][j] = image.data[offset + j * 2 + 0];
y_plane[i][j + 1] = image.data[offset + j * 2 + 2];
if (i % 2 == 0)
{
u_plane[i / 2][j / 2] = image.data[offset + j * 2 + 1];
v_plane[i / 2][j / 2] = image.data[offset + j * 2 + 3];
}
}
}
jpeg_write_raw_data(&cinfo, rows, 16);
}
jpeg_finish_compress(&cinfo);
I was able to get about a 33% decrease in compression time with this method compared to the one in @JonWatte's answer. This solution isn't for everyone though; some caveats:
You can only compress images with dimensions that are a multiple of 8. If you have different-sized images, you will have to write code to pad in the edges. If you're getting the images from a camera though, they will most likely be this way.
The quality is somewhat impaired by the fact that I simply skip color values for alternating scanlines instead of something fancier like averaging them. For my application though, speed was more important than quality.
The way it's written right now it allocates a ton of memory on the stack. This was acceptable for me because my images were small (640x480) and enough memory was available.
Documentation for libjpeg-turbo: https://raw.githubusercontent.com/libjpeg-turbo/libjpeg-turbo/master/libjpeg.txt
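The quality caveat above (chroma from every other scanline is simply dropped) could be softened by averaging the chroma of each pair of scanlines instead. The following is a minimal sketch of that idea, not what the answer's code does: averageChromaRows is a hypothetical helper, and it assumes an even width and two vertically adjacent YUYV rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: averages the Cb and Cr samples of two vertically
// adjacent YUYV rows (with rounding), producing width/2 samples each.
void averageChromaRows(const unsigned char *row0, const unsigned char *row1,
                       std::size_t width,
                       std::vector<unsigned char> &cb,
                       std::vector<unsigned char> &cr)
{
    assert(width % 2 == 0);
    cb.resize(width / 2);
    cr.resize(width / 2);
    for (std::size_t j = 0; j < width; j += 2) {
        // In YUYV, Cb sits at byte offset 1 and Cr at offset 3 of each 4-byte pair.
        cb[j / 2] = static_cast<unsigned char>((row0[j * 2 + 1] + row1[j * 2 + 1] + 1) / 2);
        cr[j / 2] = static_cast<unsigned char>((row0[j * 2 + 3] + row1[j * 2 + 3] + 1) / 2);
    }
}
```

The results would fill one row of u_plane and v_plane per pair of input scanlines; the extra reads cost some of the speed gained by the raw data mode.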

Related

Why does my bitmap image have a color overlay after converting from 32-bit to 8-bit?

I'm working on resizing a bitmap image and converting it to 8-bit (grayscale). The problem is that when I convert a 32-bit image to 8-bit, the result has a color overlay, while the same code works perfectly for 24-bit images. I guess the cause is the alpha channel, but I don't know exactly where the problem is.
This is my code to generate 8-bit palette color and write it after DIB part:
char* palette = new char[1024];
for (int i = 0; i < 256; i++) {
palette[i * 4] = palette[i * 4 + 1] = palette[i * 4 + 2] = (char)i;
palette[i * 4 + 3] = 255;
}
fout.write(palette, 1024);
delete[] palette;
As I said, my code works perfectly on 24-bit. In 32-bit the color is still kept after resizing, but when converting to 8-bit, it will look like this:
expected image (when converted from 24-bit) //
unexpected image (when converted from 32-bit)
This is how I get the colors and save it to srcPixel[]:
int i = 0;
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
int index = getIndex(width, x, y);
srcPixel[index].A = srcBMP.pImageData[i];
i += alpha;
srcPixel[index].B = srcBMP.pImageData[i++];
srcPixel[index].G = srcBMP.pImageData[i++];
srcPixel[index].R = srcBMP.pImageData[i++];
}
i += padding;
}
And this is the code that converts it by averaging the four channels A, B, G and R from srcPixel[]:
int i = 0;
for (int y = 0; y < dstHeight; y++) {
for (int x = 0; x < dstWidth; x++) {
int index = getIndex(dstWidth, x, y);
dstBMP.pImageData[i++] = (srcPixel[index].A + srcPixel[index].B + srcPixel[index].G + srcPixel[index].R) / 4;
}
i += dstPadding;
}
If I remove and skip all alpha bytes in my code, the converted image still looks like that, and I get another problem: when resizing, the image gets a color overlay like the one from the 8-bit conversion (resizing without alpha channel).
If I skip the alpha channel when averaging (changing the line to dstBMP.pImageData[i++] = (srcPixel[index].B + srcPixel[index].G + srcPixel[index].R) / 3), almost nothing changes; the overlay is still there.
If I remove palette[i * 4 + 3] = 255; or change it in any way, the result is not affected.
Thank you very much.
You are adding the alpha channel to the color average, which is why the result becomes brighter. As far as I can find, opaque is 255 and transparent is 0, so for an opaque image you are mixing an extra channel that is always 'white' into your result.
Remove the alpha channel from your equation and see if I'm right.
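A minimal sketch of averaging only the color channels follows. It rests on assumptions that are not in the question: the pixel bytes are taken to be tightly packed in B, G, R (and optionally A) order, which is the usual layout for uncompressed 24-bit and 32-bit BMP data, and row padding is ignored. The name toGray is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts packed pixels to 8-bit gray by averaging
// B, G and R only. bytesPerPixel is 3 (B,G,R) or 4 (B,G,R,A); when a
// fourth byte is present it is skipped rather than averaged in.
std::vector<std::uint8_t> toGray(const std::uint8_t *pixels,
                                 std::size_t pixelCount,
                                 std::size_t bytesPerPixel)
{
    assert(bytesPerPixel == 3 || bytesPerPixel == 4);
    std::vector<std::uint8_t> gray(pixelCount);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        const std::uint8_t *p = pixels + i * bytesPerPixel;
        gray[i] = static_cast<std::uint8_t>((p[0] + p[1] + p[2]) / 3);
    }
    return gray;
}
```

If the overlay persists with a routine like this, it may be worth checking the order in which the four bytes of each source pixel are read, since a shifted read would move alpha into one of the color slots.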

How to convert CMSampleBufferRef/CIImage/UIImage into pixels e.g. uint8_t[]

I have input from captured camera frame as CMSampleBufferRef and I need to get the raw pixels preferably in C type uint8_t[].
I also need to find the color scheme of the input image.
I know how to convert CMSampleBufferRef to UIImage and then to NSData in PNG format, but I don't know how to get the raw pixels from there. Perhaps I could get them directly from the CMSampleBufferRef/CIImage?
This code shows the need and the missing bits.
Any thoughts where to start?
int convertCMSampleBufferToPixelArray (CMSampleBufferRef sampleBuffer)
{
// inputs
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CIImage *ciImage = [CIImage imageWithCVPixelBuffer:imageBuffer];
CIContext *imgContext = [CIContext new];
CGImageRef cgImage = [imgContext createCGImage:ciImage fromRect:ciImage.extent];
UIImage *uiImage = [UIImage imageWithCGImage:cgImage];
NSData *nsData = UIImagePNGRepresentation(uiImage);
// Need to fill this gap
uint8_t* data = XXXXXXXXXXXXXXXX;
ImageFormat format = XXXXXXXXXXXXXXXX; // one of: GRAY8, RGB_888, YV12, BGRA_8888, ARGB_8888
// sample showing expected data values
// this routine converts the image data to gray
//
int width = uiImage.size.width;
int height = uiImage.size.height;
const int size = width * height;
std::unique_ptr<uint8_t[]> new_data(new uint8_t[size]);
for (int i = 0; i < size; ++i) {
new_data[i] = uint8_t(data[i * 3] * 0.299f + data[i * 3 + 1] * 0.587f +
data[i * 3 + 2] * 0.114f + 0.5f);
}
return 1;
}
Some pointers you can use to search for more info. It's nicely documented and you shouldn't have an issue.
int convertCMSampleBufferToPixelArray (CMSampleBufferRef sampleBuffer) {
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
if (imageBuffer == NULL) {
return -1;
}
// Get address of the image buffer
CVPixelBufferLockBaseAddress(imageBuffer, 0);
uint8_t* data = (uint8_t*)CVPixelBufferGetBaseAddress(imageBuffer); // returns void*, so cast (required in Objective-C++)
// Get size
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
// Get bytes per row
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
// At `data` you have a bytesPerRow * height bytes of the image data
// To get pixel info you can call CVPixelBufferGetPixelFormatType, ...
// you can call CVImageBufferGetColorSpace and inspect it, ...
// When you're done, unlock the base address
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
return 0;
}
There are a couple of things you should be aware of.
First one is that it can be planar. Check the CVPixelBufferIsPlanar, CVPixelBufferGetPlaneCount, CVPixelBufferGetBytesPerRowOfPlane, etc.
Second one is that you have to calculate pixel size based on CVPixelBufferGetPixelFormatType. Something like:
OSType pixelFormat = CVPixelBufferGetPixelFormatType(imageBuffer);
size_t pixelSize = 0;
switch (pixelFormat) {
case kCVPixelFormatType_32BGRA:
case kCVPixelFormatType_32ARGB:
case kCVPixelFormatType_32ABGR:
case kCVPixelFormatType_32RGBA:
pixelSize = 4;
break;
// + other cases
}
Let's say that the buffer is not planar and:
CVPixelBufferGetWidth returns 200 (pixels)
Your pixelSize is 4 (calculated bytes per row is 200 * 4 = 800)
CVPixelBufferGetBytesPerRow can return anything >= 800
In other words, the pointer you have is not a pointer to a contiguous buffer. If you need row data you have to do something like this:
uint8_t* data = (uint8_t*)CVPixelBufferGetBaseAddress(imageBuffer);
// Get size
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
size_t pixelSize = 4; // Let's pretend it's calculated pixel size
size_t realRowSize = width * pixelSize;
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
for (size_t row = 0 ; row < height ; row++) {
// bytesPerRow acts like an offset where the next row starts
// bytesPerRow can be >= realRowSize
uint8_t *rowData = data + row * bytesPerRow;
// realRowSize = how many bytes are available for this row
// copy them somewhere
}
You have to allocate a buffer and copy these row data there if you'd like to have contiguous buffer. How many bytes to allocate? CVPixelBufferGetDataSize.
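The row-by-row copy described above can be sketched in plain C++ without any CoreVideo calls. The function name packRows is hypothetical; data, width, height, pixelSize and bytesPerRow stand for the values obtained from the pixel buffer as shown earlier, and the buffer is assumed to be non-planar.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `height` rows of `width * pixelSize` bytes
// out of a padded buffer (row stride `bytesPerRow`) into a tightly
// packed, contiguous vector.
std::vector<std::uint8_t> packRows(const std::uint8_t *data, std::size_t width,
                                   std::size_t height, std::size_t pixelSize,
                                   std::size_t bytesPerRow)
{
    const std::size_t realRowSize = width * pixelSize;
    assert(bytesPerRow >= realRowSize);
    std::vector<std::uint8_t> packed(realRowSize * height);
    if (realRowSize == 0) {
        return packed;  // nothing to copy
    }
    for (std::size_t row = 0; row < height; ++row) {
        // bytesPerRow is the source stride; realRowSize is the destination stride.
        std::memcpy(packed.data() + row * realRowSize,
                    data + row * bytesPerRow,
                    realRowSize);
    }
    return packed;
}
```

The packed result here is width * pixelSize * height bytes, which is never larger than the padded size of the source. The copy has to happen while the base address is still locked.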

Generating .pfm image in c++

I have a small program that outputs an rgb image. And I need it to be in .pfm format.
So, I have some data in the range [0, 255].
float * data;
data = new float[PixelWidth * PixelHeight * 3];
for (int i = 0; i < PixelWidth * PixelHeight * 3; i += 3) {
int idx = i / 3;
data[i] = img[idx].x;
data[i + 1] = img[idx].y;
data[i + 2] = img[idx].z;
}
(img[] here is Vec3[] of unsigned char)
Now I generate the image.
char sizes[256];
f = fopen("outputimage.pfm", "wb");
double scale = -1.0;
fprintf(f, "PF\n%d %d\n%lf\n", PixelWidth, PixelHeight, scale);
for (int i = 0; i < PixelWidth*PixelHeight*3; i++) {
float d = data[i];
fwrite((void *)&d, 1, 4, f);
}
fclose(f);
But somehow I get a grayscale image instead of RGB.
The data is fine. I tried to output it as .ppm and it works fine.
I guess the problem is with scaling, but I am not really sure how it should be done correctly.
To close the question.
I just had to convert all the values from [0-255] range to [0.0-1.0]. So, I divided each rgb value by 255.
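A minimal sketch of that fix is below. The helper names normalize8 and writePfm are made up for illustration, and the code assumes a little-endian host with 32-bit IEEE floats (which is what the negative scale in the header declares). As far as I know, PFM viewers also expect rows ordered bottom to top; this sketch writes the samples in the order given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: maps 8-bit samples to floats in [0.0, 1.0].
std::vector<float> normalize8(const unsigned char *samples, std::size_t count)
{
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = samples[i] / 255.0f;
    }
    return out;
}

// Hypothetical helper: writes interleaved RGB floats as a colour ("PF") PFM.
// Returns false if the file could not be opened or fully written.
bool writePfm(const char *path, const std::vector<float> &rgb, int width, int height)
{
    assert(width > 0 && height > 0);
    assert(rgb.size() == static_cast<std::size_t>(width) * height * 3);
    std::FILE *f = std::fopen(path, "wb");
    if (!f) {
        return false;
    }
    // A negative scale marks the sample data as little-endian.
    std::fprintf(f, "PF\n%d %d\n-1.0\n", width, height);
    const std::size_t written = std::fwrite(rgb.data(), sizeof(float), rgb.size(), f);
    const bool closed = std::fclose(f) == 0;
    return written == rgb.size() && closed;
}
```

With the question's variables, the call might look like writePfm("outputimage.pfm", normalize8(bytes, PixelWidth * PixelHeight * 3), PixelWidth, PixelHeight), where bytes is a flat array of the unsigned char channel values.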

How to efficiently render a 24-bpp image on a 32-bpp display?

First of all, I'm programming in the kernel context so no existing libraries exist. In fact this code is going to go into a library of my own.
Two questions, one more important than the other:
1. As the title suggests, how can I efficiently render a 24-bpp image onto a 32-bpp device, assuming that I have the address of the frame buffer?
Currently I have this code:
void BitmapImage::Render24(uint16_t x, uint16_t y, void (*r)(uint16_t, uint16_t, uint32_t))
{
uint32_t imght = Math::AbsoluteValue(this->DIB->GetBitmapHeight());
uint64_t ptr = (uint64_t)this->ActualBMP + this->Header->BitmapArrayOffset;
uint64_t rowsize = ((this->DIB->GetBitsPerPixel() * this->DIB->GetBitmapWidth() + 31) / 32) * 4;
uint64_t oposx = x;
uint64_t posx = oposx;
uint64_t posy = y + (this->DIB->Type == InfoHeaderV1 && this->DIB->GetBitmapHeight() < 0 ? 0 : this->DIB->GetBitmapHeight());
for(uint32_t d = 0; d < imght; d++)
{
for(uint32_t w = 0; w < rowsize / (this->DIB->GetBitsPerPixel() / 8); w++)
{
r(posx, posy, (*((uint32_t*)ptr) & 0xFFFFFF));
ptr += this->DIB->GetBitsPerPixel() / 8;
posx++;
}
posx = oposx;
posy--;
}
}
r is a function pointer to a PutPixel-esque thing that accepts x, y, and colour parameters.
Obviously this code is terribly slow, since plotting pixels one at a time is never a good idea.
For my 32-bpp rendering code (which I also have a question about, more on that later) I can easily Memory::Copy() the bitmap array (I'm loading bmp files here) to the frame buffer.
However, how do I do this with 24bpp images? On a 24bpp display this would be fine but I'm working with a 32bpp one.
One solution I can think of right now is to create another bitmap array which essentially contains values of 0x00(colour) and then use that to draw to the screen -- I don't think this is very good though, so I'm looking for a better alternative.
Next question:
2. Given, for obvious reasons, one cannot simply Memory::Copy() the entire array at once onto the frame buffer, the next best thing would be to copy them row by row.
Is there a better way?
Basically something like this:
for (uint32_t l = 0; l < h; ++l) // l line index in pixels
{
// srcPitch is distance between lines in bytes
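const unsigned char* srcLine = (const unsigned char*)srcBuffer + l * srcPitch;
// trgPitch is distance between lines in 32-bit pixels (the pointer below is unsigned*)
unsigned* trgLine = ((unsigned*)trgBuffer) + l * trgPitch;
for (uint32_t c = 0; c < w; ++c) // c is column index in pixels
{
// build target pixel. arrange indexes to fit your render target (0, 1, 2)
// store the pixel, then advance the target pointer
*trgLine++ = ((unsigned)srcLine[0] << 16) | ((unsigned)srcLine[1] << 8)
| srcLine[2] | (0xffu << 24);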
srcLine += 3;
}
}
A few notes:
- better to write to a different buffer than the render buffer so the image is displayed at once.
- using functions for pixel placement like you did is very (very very) slow.
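The per-row expansion can be sketched as a standalone function. This is an illustration under assumptions that are not in the answer: the source bytes are taken to be in B, G, R order (as in BMP files), the target is assumed to take 0xAARRGGBB values, and the name expandRow24To32 is made up. The assert is only there for the sketch; in a kernel without a C library it would be replaced by whatever check is available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: expands one row of 24-bpp pixels (bytes B, G, R)
// into 32-bpp 0xAARRGGBB values with alpha forced to 0xFF.
void expandRow24To32(const std::uint8_t *src, std::uint32_t *dst, std::size_t width)
{
    assert(src != nullptr && dst != nullptr);
    for (std::size_t x = 0; x < width; ++x) {
        const std::uint32_t b = src[0];
        const std::uint32_t g = src[1];
        const std::uint32_t r = src[2];
        *dst++ = 0xFF000000u | (r << 16) | (g << 8) | b;
        src += 3;  // next 24-bpp pixel
    }
}
```

Calling this once per row into an off-screen buffer and then copying that buffer to the frame buffer would follow the first note above.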

array, copy pixels to correct index, algorithm

I have an image of size 2x2, so the pixel count is 4.
One pixel is 4 bytes,
so I have an array of 16 bytes: mas[16] (width * height * 4 = 16).
I want to make the same image scaled up by a factor of 2, so that each source pixel becomes four pixels.
The new array will have a size of 64 bytes: newMas[64] (width*2 * height*2 * 4).
The problem is that I can't work out how to copy the pixels into newMas correctly, in a way that works for images of any size.
This code copies the pixels to mas[16]:
size_t width = CGImageGetWidth(imgRef);
size_t height = CGImageGetHeight(imgRef);
const size_t bytesPerRow = width * 4;
const size_t bitmapByteCount = bytesPerRow * height;
size_t mas[bitmapByteCount];
UInt8* data = (UInt8*)CGBitmapContextGetData(bmContext);
for (size_t i = 0; i < bitmapByteCount; i +=4)
{
UInt8 a = data[i];
UInt8 r = data[i + 1];
UInt8 g = data[i + 2];
UInt8 b = data[i + 3];
mas[i] = a;
mas[i+1] = r;
mas[i+2] = g;
mas[i+3] = b;
}
In general, using the built-in image drawing API will be faster and less error-prone than writing your own image-manipulation code. There are at least three potential errors in the code above:
It assumes that there's no padding at the end of rows (iOS seems to pad up to a multiple of 16 bytes); you need to use CGImageGetBytesPerRow().
It assumes a fixed pixel format.
It gets the width/height from a CGImage but the data from a CGBitmapContext.
Assuming you have a UIImage,
CGRect r = {{0,0},img.size};
r.size.width *= 2;
r.size.height *= 2;
UIGraphicsBeginImageContext(r.size);
// This turns off interpolation in order to do pixel-doubling.
CGContextSetInterpolationQuality(UIGraphicsGetCurrentContext(), kCGInterpolationNone);
[img drawInRect:r];
UIImage * bigImg = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
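If you do want to do the copy by hand, the index mapping the question asks about can be sketched as follows. This is only an illustration: doublePixels is a hypothetical name, and it assumes 4 bytes per pixel and tightly packed rows with no padding, which (as noted above) is not guaranteed for real bitmap contexts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: scales a tightly packed 4-byte-per-pixel image by 2
// in each direction, so every source pixel becomes a 2x2 block.
std::vector<std::uint8_t> doublePixels(const std::vector<std::uint8_t> &src,
                                       std::size_t width, std::size_t height)
{
    const std::size_t bpp = 4;  // bytes per pixel
    assert(src.size() == width * height * bpp);
    const std::size_t dstWidth = width * 2;
    const std::size_t dstHeight = height * 2;
    std::vector<std::uint8_t> dst(dstWidth * dstHeight * bpp);
    for (std::size_t y = 0; y < dstHeight; ++y) {
        for (std::size_t x = 0; x < dstWidth; ++x) {
            // Destination pixel (x, y) takes its value from source pixel (x/2, y/2).
            const std::size_t s = ((y / 2) * width + (x / 2)) * bpp;
            const std::size_t d = (y * dstWidth + x) * bpp;
            for (std::size_t c = 0; c < bpp; ++c) {
                dst[d + c] = src[s + c];
            }
        }
    }
    return dst;
}
```

For the 2x2 example in the question this turns the 16-byte array into a 64-byte one.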