Raw RGB values to JPEG - C++

I have an array with raw RGB values in it, and I need to write these values to a JPEG file. Is there an easy way to do this?
I tried:
std::ofstream ofs("./image.JPG", std::ios::out | std::ios::binary);
for (unsigned i = 0; i < width * height; ++i) {
    ofs << (int)(std::min(1.0f, image[i].x) * 255)
        << (int)(std::min(1.0f, image[i].y) * 255)
        << (int)(std::min(1.0f, image[i].z) * 255);
}
but the format isn't recognized.

If you're trying to produce an image file, you might look at Netpbm. You could write one of its intermediate formats (PPM or PAM) fairly simply from what you have; a sketch follows. There is then a large number of existing programs that will convert your intermediate file into many other image types.
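For example, a minimal sketch of writing a binary PPM (P6), assuming the same float-triple layout as the question's code. Vec3f is a stand-in for whatever three-float pixel type the question's image array actually uses:

#include <algorithm>
#include <fstream>

struct Vec3f { float x, y, z; };  // stand-in for the question's pixel type

// Write float RGB triples as a binary PPM (P6).
// Note the cast to unsigned char: the question's version streamed the
// values as decimal text, which is why no viewer recognized the file.
void writePPM(const char* path, const Vec3f* image, unsigned width, unsigned height)
{
    std::ofstream ofs(path, std::ios::out | std::ios::binary);
    ofs << "P6\n" << width << " " << height << "\n255\n";  // PPM header
    for (unsigned i = 0; i < width * height; ++i) {
        ofs << (unsigned char)(std::min(1.0f, image[i].x) * 255)
            << (unsigned char)(std::min(1.0f, image[i].y) * 255)
            << (unsigned char)(std::min(1.0f, image[i].z) * 255);
    }
}

Netpbm's pnmtojpeg (or ImageMagick's convert) will then turn the result into a JPEG.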

WHOA THERE!
JPEG is MUCH more complicated than raw RGB values. You are going to need to use a library, like libjpeg, to store the data as JPEG (a sketch follows the list below).
If you wrote it yourself you'd have to:
Convert from RGB to YCbCr
Subsample the chroma channels
Divide the image into 8x8 blocks
Perform the DCT on each block
Quantize, then run-length/Huffman-encode the coefficients
Write these values out in properly formatted JPEG segments
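In practice, libjpeg does all of the above for you once you hand it raw scanlines. A minimal sketch, assuming rgb is a tightly packed width*height*3 buffer; the function name, output path, and quality setting are placeholders, and error handling is omitted:

#include <cstdio>
#include <jpeglib.h>

// Compress a packed 24-bpp RGB buffer to out.jpg with libjpeg.
void writeJPEG(const unsigned char* rgb, int width, int height)
{
    jpeg_compress_struct cinfo;
    jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_compress(&cinfo);

    FILE* f = std::fopen("out.jpg", "wb");
    jpeg_stdio_dest(&cinfo, f);

    cinfo.image_width = width;
    cinfo.image_height = height;
    cinfo.input_components = 3;       // R, G, B
    cinfo.in_color_space = JCS_RGB;
    jpeg_set_defaults(&cinfo);
    jpeg_set_quality(&cinfo, 90, TRUE);

    jpeg_start_compress(&cinfo, TRUE);
    while (cinfo.next_scanline < cinfo.image_height) {
        JSAMPROW row = const_cast<unsigned char*>(rgb) + cinfo.next_scanline * width * 3;
        jpeg_write_scanlines(&cinfo, &row, 1);
    }
    jpeg_finish_compress(&cinfo);
    jpeg_destroy_compress(&cinfo);
    std::fclose(f);
}

As with the Boost GIL answer below, link with -ljpeg.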

You could use Boost GIL: it's free, portable (it's part of the Boost libraries), and usable across a broad spectrum of operating systems, including Windows.
Popular Linux and Unix distributions such as Fedora, Debian and NetBSD include pre-built Boost packages.
The code is quite simple:
#include <boost/gil/extension/io/jpeg_io.hpp>

const unsigned width  = 320;
const unsigned height = 200;

// Raw data.
unsigned char r[width * height]; // red
unsigned char g[width * height]; // green
unsigned char b[width * height]; // blue

int main()
{
    boost::gil::rgb8c_planar_view_t view =
        boost::gil::planar_rgb_view(width, height, r, g, b, width);
    boost::gil::jpeg_write_view("out.jpg", view);
    return 0;
}
jpeg_write_view saves the view to a JPEG file with the given name (it throws std::ios_base::failure if it fails to create the file).
Remember to link your program with -ljpeg.

Related

AVFrame buf size calculation

I am having trouble encoding a video with the ffmpeg libraries, as I am getting segfaults and/or out-of-bounds memory writes when I write raw video data to an AVFrame. I therefore just wanted to ask if one of my assumptions is right.
Am I right to assume that the size of AVFrame.data[i] is always equal to AVFrame.linesize[i] * AVFrame.height? Or could there be scenarios where that is not the case, and if so, how can I then reliably calculate the size of AVFrame.data[i]?
It works in most cases, but I wouldn't rely on it for all the different formats:
AVFrame.linesize[i] = AVFrame.width * PixelSize (where PixelSize is e.g. 4 bytes for RGBA)
BufferSize = AVFrame.linesize[i] * AVFrame.height
The best way is to use FFmpeg's official av_image_get_buffer_size:
int buffer_size = av_image_get_buffer_size(AV_PIX_FMT_RGBA, codecCtx->width, codecCtx->height, 1);
It depends on the pixel format. For example, for YUV 4:4:4, yes, every plane is linesize*height. But for 4:2:0 it is linesize*height for the Y plane and linesize*height/2 for the U and V planes, because the chroma planes have half the number of rows.
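A short sketch of the reliable route: let libavutil size one buffer and derive the per-plane pointers and linesizes from it, instead of assuming linesize[i]*height (the function name below is a placeholder):

extern "C" {
#include <libavutil/imgutils.h>
}
#include <cstdint>
#include <vector>

// Size one buffer with libavutil, then fill data[] and linesize[]
// from it; no per-plane size assumptions are needed.
void allocExample(AVPixelFormat fmt, int width, int height)
{
    int size = av_image_get_buffer_size(fmt, width, height, 1);
    std::vector<uint8_t> buffer(size);

    uint8_t* data[4];
    int linesize[4];
    av_image_fill_arrays(data, linesize, buffer.data(), fmt, width, height, 1);
    // data[0..3] now point at correctly sized planes inside buffer.
}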

Can't display a PNG using Glut or OpenGL

Code is here:
void readOIIOImage(const char* fname, float* img)
{
    int xres, yres;
    ImageInput* in = ImageInput::create(fname);
    if (!in) { return; }
    ImageSpec spec;
    in->open(fname, spec);
    xres = spec.width;
    yres = spec.height;
    iwidth = spec.width;
    iheight = spec.height;
    channels = spec.nchannels;
    cout << "\n";
    pixels = new float[xres * yres * channels];
    in->read_image(TypeDesc::FLOAT, pixels);
    long index = 0;
    // Copy into img, flipping vertically so row 0 ends up at the bottom.
    for (int j = 0; j < yres; j++)
    {
        for (int i = 0; i < xres; i++)
        {
            for (int c = 0; c < channels; c++)
            {
                img[(i + xres * (yres - j - 1)) * channels + c] = pixels[index++];
            }
        }
    }
    in->close();
    delete in;
}
Currently, my code produces JPG files fine. It has the ability to read the file's information, and display it fine. However, when I try reading in a PNG file, it doesn't display correctly at all. Usually, it kind of displays the same distorted version of the image in three separate columns on the display. It's very strange. Any idea why this is happening with the given code?
Additionally, the JPG files all have 3 channels. The PNG has 2.
fname is simply a filename, and img is `new float[3*size]`.
Any help would be great. Thanks.
Usually, it kind of displays the same distorted version of the image in three separate columns on the display. It's very strange. Any idea why this is happening with the given code?
This reads a lot like the output you get from the decoder is in row-planar format. Planar means that you get individual rows, one for every channel, one after another. The distortion, and the discrepancy between the number of channels in the PNG and the apparent channel count, are likely due to an alignment mismatch. You didn't specify which image decoder library you're using exactly, so I can't look up how it communicates the layout of the pixel buffer; I suppose you can read the necessary information from the ImageSpec.
Anyway, you'll have to adjust the indexing in your pixel-buffer rearrangement loop so that consecutive row planes are interleaved into channel tuples; a sketch follows.
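A minimal sketch of that rearrangement, reusing the variables from readOIIOImage above, and assuming the decoder really does emit one complete row per channel for each scanline (that layout is an assumption; verify it against the ImageSpec first):

// Interleave row-planar rows (R-row, G-row, B-row, ...) into channel
// tuples, flipping vertically as before. Only the loop order changes.
long index = 0;
for (int j = 0; j < yres; j++)
{
    for (int c = 0; c < channels; c++)   // one full row per channel
    {
        for (int i = 0; i < xres; i++)
        {
            img[(i + xres * (yres - j - 1)) * channels + c] = pixels[index++];
        }
    }
}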
Of course you could just as well use a ready-made image-file-to-OpenGL loader library. DevIL is thrown around a lot, but it's not very well maintained; SOIL seems to be a popular choice these days.

embedding a PNG image in C++ as a vector or array

Can anyone point me in a direction so that I can take the PNG images I have, read them in, and store the PNG image's data in an array or vector (or some other data structure), so I can use that directly in my code instead of having to read in the PNG at runtime?
I know I can use libpng to read the image, but once it is read in, I am a little stumped on how to take what was read and convert it into a data structure I can use in my game.
So my thought is that I can write a simple console program that I feed a list of PNGs; it reads them and writes out to a file the data for me to hardcode into a data structure in my actual game.
After you have read the data in, as Jason said, you could use a struct to contain the data for each pixel:
struct RGBAQUAD
{
    int Red;
    int Green;
    int Blue;
    int Alpha;
};
You could then create a 2D array like the following to represent all of the pixel structs as one image (note that this allocation is not contiguous, and a drawback is having to manage the memory yourself):
RGBAQUAD** arr = 0;
arr = new RGBAQUAD*[y];
for (int i = 0; i < y; i++)
    arr[i] = new RGBAQUAD[x];
Alternatively, you could pack each pixel into a single int to conserve RAM; a sketch follows.
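A small sketch of that packing (the 0xAARRGGBB channel order here is an arbitrary choice):

#include <cstdint>

// Pack four 8-bit channels into one 32-bit value (0xAARRGGBB).
inline uint32_t packPixel(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return (uint32_t(a) << 24) | (uint32_t(r) << 16) | (uint32_t(g) << 8) | uint32_t(b);
}

// Unpack a channel again, e.g. red:
inline uint8_t unpackRed(uint32_t p) { return (p >> 16) & 0xFF; }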
Load the image; take its width, height and bit depth (and possibly more info, if you need it); read its color data; and save the information you extracted.
Example of what a data structure could be:
int width;
int height;
int depth; // bytes per pixel
std::vector<int> data(width * height * depth);
When you use that data structure:
/*
  Before these lines, you must get the image dimensions and save them to the
  width, height, and depth variables: either read them from somewhere, or
  simply assign them (e.g. width = 1024) if you already know them, which I
  assume you do, as you're going to hardcode the image.
*/
std::vector<int> sprite(width * height * depth);
for (int i = 0; i < width * height * depth; i++) {
    // sprite[i] = (read pixel i from somewhere...);
}
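To generate the hardcoded data in the first place, here is a sketch of such a console tool using libpng's simplified read API (libpng 1.6 or later); the RGBA output format and the emitted variable names are placeholder choices, and error handling is minimal:

#include <cstdio>
#include <cstring>
#include <png.h>
#include <vector>

// Read a PNG as RGBA and print it as a C++ array ready to paste into a game.
int main(int argc, char** argv)
{
    if (argc < 2) return 1;

    png_image image;
    std::memset(&image, 0, sizeof image);
    image.version = PNG_IMAGE_VERSION;
    if (!png_image_begin_read_from_file(&image, argv[1])) return 1;

    image.format = PNG_FORMAT_RGBA;  // force 4 bytes per pixel
    std::vector<png_byte> buffer(PNG_IMAGE_SIZE(image));
    if (!png_image_finish_read(&image, nullptr, buffer.data(), 0, nullptr)) return 1;

    std::printf("const unsigned spriteWidth = %u, spriteHeight = %u;\n",
                image.width, image.height);
    std::printf("const unsigned char spriteData[] = {");
    for (std::size_t i = 0; i < buffer.size(); ++i)
        std::printf("%s%u", i ? "," : "", (unsigned)buffer[i]);
    std::printf("};\n");
    return 0;
}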

Setting individual pixels of an RGB frame for ffmpeg encoding

I'm trying to change the test pattern of an ffmpeg streamer, Trouble syncing libavformat/ffmpeg with x264 and RTP, into a familiar RGB format. My broader goal is to compute frames of a streamed video on the fly.
So I replaced its AV_PIX_FMT_MONOWHITE with AV_PIX_FMT_RGB24, which is "packed RGB 8:8:8, 24bpp, RGBRGB..." according to http://libav.org/doxygen/master/pixfmt_8h.html .
To stuff its pixel array called data, I've tried many variations on
for (int y = 0; y < HEIGHT; ++y) {
    for (int x = 0; x < WIDTH; ++x) {
        uint8_t* rgb = data + ((y * WIDTH + x) * 3);
        const double i = x / double(WIDTH);
        // const double j = y/double(HEIGHT);
        rgb[0] = 255 * i;
        rgb[1] = 0;
        rgb[2] = 255 * (1 - i);
    }
}
At HEIGHT x WIDTH = 80x60, this version yields a distorted four-column pattern [image omitted], when I expect a single blue-to-red horizontal gradient.
640x480 yields the same 4-column pattern, but with far more horizontal stripes.
640x640, 160x160, etc, yield three columns, cyan-ish / magenta-ish / yellow-ish, with the same kind of horizontal stripiness.
Vertical gradients behave even more weirdly.
Appearance was unaffected by an AV_PIX_FMT_RGBA attempt (4 bytes per pixel instead of 3, with alpha = 255). It was also unaffected by a port from C to C++.
The argument srcStrides passed to sws_scale() is a length-1 array, containing the single int HEIGHT.
Access each Pixel of AVFrame asks the same question in less detail, so far unanswered.
The streamer emits one warning, which I doubt affects appearance:
[rtp @ 0x269c0a0] Encoder did not produce proper pts, making some up.
So. How do you set the RGB value of a pixel in a frame to be sent to sws_scale() (and then to x264_encoder_encode() and av_interleaved_write_frame())?
Use avpicture_fill() as described in Encoding a screenshot into a video using FFMPEG .
Instead of passing data directly to sws_scale(), do this:
AVFrame* pic = avcodec_alloc_frame();
avpicture_fill((AVPicture *)pic, data, AV_PIX_FMT_RGB24, WIDTH, HEIGHT);
and then replace the 2nd and 3rd args of sws_scale() with
pic->data, pic->linesize,
Then the gradients above work properly, at many resolutions.
The argument srcStrides passed to sws_scale() is a length-1 array, containing the single int HEIGHT.
Stride (AKA linesize) is the distance in bytes between two lines. For various reasons, mostly to do with optimization, it is often larger than simply the width in bytes, so there is padding at the end of each line.
In your case, without any padding, the stride should be WIDTH * 3 (bytes per row), not HEIGHT.
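As a short sketch of that fix in isolation, reusing data, WIDTH, and HEIGHT from the question, and assuming swsCtx and the destination frame dstFrame are already set up (both names are placeholders):

extern "C" {
#include <libswscale/swscale.h>
}

// Pass the packed RGB24 buffer with an explicit byte stride of WIDTH * 3.
const uint8_t* srcSlice[1] = { data };
const int srcStride[1] = { WIDTH * 3 };  // bytes per row, not HEIGHT
sws_scale(swsCtx, srcSlice, srcStride, 0, HEIGHT,
          dstFrame->data, dstFrame->linesize);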

Creating BMP File

I've been working on image processing for a while, and I've noticed some weird things.
I'm reading a BMP file using simple methods like ReadFile and Microsoft's BMP structures.
Here is the code:
ReadFile(_bmpFile, &bmpfh, sizeof(bmpfh), &data, NULL);
ReadFile(_bmpFile, &bmpih, sizeof(bmpih), &data, NULL);
imagesize = bmpih.biWidth * bmpih.biHeight;
image = new RGBQUAD[imagesize];
ReadFile(_bmpFile, image, imagesize * sizeof(RGBQUAD), &written, NULL);
That is how I read the file and then I'm turning it into gray scale using a simple for-loop.
for (int i = 0; i < imagesize; i++)
{
    RED   = image[i].rgbRed;
    GREEN = image[i].rgbGreen;
    BLUE  = image[i].rgbBlue;
    avg = (RED + GREEN + BLUE) / 3;
    image[i].rgbRed   = avg;
    image[i].rgbGreen = avg;
    image[i].rgbBlue  = avg;
}
Now when I write the file using this code:
#pragma pack(push, 1)
WriteFile(_bmpFile, &bmpfh, sizeof(bmpfh), &data, NULL);
WriteFile(_bmpFile, &bmpih, sizeof(bmpih), &data, NULL);
WriteFile(_bmpFile, image, imagesize * sizeof(RGBQUAD), &written, NULL);
#pragma pack(pop)
The file gets much bigger (30 MB -> 40 MB).
The reason is that I'm using RGBQUAD instead of RGBTRIPLE; but if I use RGBTRIPLE, I have a problem converting small pictures to gray scale: I can't open the picture after creating it (the viewer says it's not in the right structure).
Also, the file size is missing one byte (1174 KB before, 1173 KB after).
Has anybody seen this before (it only occurs with small pictures)?
In a BMP file, every scan line has to be padded out so the next scan line starts on a 32-bit boundary. If you do 32 bits per pixel, that happens automatically, but if you use 24 bits per pixel, you'll need to add code to do it explicitly.
You are ignoring stride (Jerry's comment) and the pixel format of the bitmap, which is 24bpp judging by the file-size increase; you are writing it as though it were 32bpp. Your grayscale conversion is also wrong: the human eye isn't equally sensitive to red, green and blue.
Consider using GDI+; #include <gdiplus.h> in your code to use the Bitmap class. Its LockBits() method gives you access to the bitmap bits, and the ColorMatrixEffect class lets you apply a color transformation in a single operation. Check this answer for the color matrix you need to get a grayscale image. The MSDN docs start here.
Each horizontal row in a BMP must be a multiple of 4 bytes long.
If the pixel data does not take up a multiple of 4 bytes, then 0x00 bytes are added at the end of the row. For a 24-bpp image, the number of bytes per row is (imageWidth*3 + 3) & ~3. The number of padding bytes is ((imageWidth*3 + 3) & ~3) - (imageWidth*3).
This was answered by immibis.
I would like to add that the size of the pixel array is ((imageWidth*3 + 3) & ~3) * imageHeight.
I hope this helps; a sketch of the padded write follows.
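As a concrete sketch of that padding logic, assuming the image is stored as 24-bpp RGBTRIPLE pixels and using the same WinAPI-style calls as the question (the variable names mirror the question's and are otherwise placeholders):

// Write 24-bpp rows padded out to a 4-byte boundary.
// image is assumed to be an RGBTRIPLE* here, not RGBQUAD*.
const int rowBytes = bmpih.biWidth * 3;         // payload bytes per row
const int paddedBytes = (rowBytes + 3) & ~3;    // rounded up to a multiple of 4
const unsigned char pad[3] = { 0, 0, 0 };
DWORD written = 0;
for (int y = 0; y < bmpih.biHeight; ++y)
{
    WriteFile(_bmpFile, &image[y * bmpih.biWidth], rowBytes, &written, NULL);
    if (paddedBytes > rowBytes)
        WriteFile(_bmpFile, pad, paddedBytes - rowBytes, &written, NULL);
}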