FFmpeg audio encoder new encode function - c++

I would like to migrate an audio encoder from the deprecated function avcodec_encode_audio to avcodec_encode_audio2, without modifying the structure of the existing encoder:
outBytes = avcodec_encode_audio(m_handle, dst, sizeBytes, (const short int*)m_samBuf);
where:
1) m_handle AVCodecContext
2) dst, uint8_t * destination buffer
3) sizeBytes, uint32_t size of the destination buffer
4) m_samBuf void * to the input chunk of data to encode (this is casted to: const short int*)
Is there a simple way to do it?
I'm trying with:
int gotPack = 1;
memset(&m_Packet, 0, sizeof(m_Packet));
m_Frame = av_frame_alloc();
av_init_packet(&m_Packet);
m_Packet.data = dst;
m_Packet.size = sizeBytes;
uint8_t* buffer = (uint8_t*)m_samBuf;
m_Frame->nb_samples = m_handle->frame_size;
avcodec_fill_audio_frame(m_Frame, m_handle->channels, m_handle->sample_fmt, buffer, m_FrameSize, 1);
outBytes = avcodec_encode_audio2(m_handle, &m_Packet, m_Frame, &gotPack);
char error[256];
av_strerror(outBytes, error, 256);
if (outBytes < 0) {
    m_server->log(1, 1, "Input data: %d, encode function call error: %s \n", gotPack, error);
    return AUDIOWRAPPER_ERROR;
}
av_frame_free(&m_Frame);
It compiles, but it does not encode anything: I don't hear audio at the output if I pipe the output stream to mplayer, which was working prior to the upgrade.
What am I doing wrong?
The encoder accepts only two sample formats:
AV_SAMPLE_FMT_S16, ///< signed 16 bits
AV_SAMPLE_FMT_FLT, ///< float
Here is how the buffer is allocated:
free(m_samBuf);
int bps = 2;
if (m_handle->codec->sample_fmts[0] == AV_SAMPLE_FMT_FLT) {
    bps = 4;
}
m_FrameSize = bps * m_handle->frame_size * m_handle->channels;
m_samBuf = malloc(m_FrameSize);
m_numSam = 0;

avcodec_fill_audio_frame should get you there:
memset(&m_Packet, 0, sizeof(m_Packet));
av_init_packet(&m_Packet);
m_Packet.data = dst;
m_Packet.size = sizeBytes;
m_Frame = av_frame_alloc();
m_Frame->nb_samples = // you need to get this value from somewhere; it is the number of samples (per channel) this frame represents
avcodec_fill_audio_frame(m_Frame, m_handle->channels, m_handle->sample_fmt,
                         buffer,
                         sizeBytes, 1);
int gotPack = 1;
avcodec_encode_audio2(m_handle, &m_Packet, m_Frame, &gotPack);
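For reference, here is a fuller sketch of the migrated call, reusing the names from the question (a hedged example, not a verified drop-in: the frame's format and channel_layout fields are set explicitly, since avcodec_fill_audio_frame does not set them for you). The key detail is that avcodec_encode_audio2 returns 0 on success, not the number of bytes written, so the byte count the old API returned now has to be read from m_Packet.size:
// Hedged sketch; names (m_handle, m_samBuf, dst, sizeBytes, m_FrameSize,
// AUDIOWRAPPER_ERROR) follow the question's code.
m_Frame = av_frame_alloc();
m_Frame->nb_samples     = m_handle->frame_size;
m_Frame->format         = m_handle->sample_fmt;
m_Frame->channel_layout = m_handle->channel_layout;

int ret = avcodec_fill_audio_frame(m_Frame, m_handle->channels,
                                   m_handle->sample_fmt,
                                   (const uint8_t*)m_samBuf, m_FrameSize, 1);
if (ret < 0) { av_frame_free(&m_Frame); return AUDIOWRAPPER_ERROR; }

av_init_packet(&m_Packet);
m_Packet.data = dst;       // caller-provided output buffer
m_Packet.size = sizeBytes; // capacity of that buffer

int gotPack = 0;
ret = avcodec_encode_audio2(m_handle, &m_Packet, m_Frame, &gotPack);
av_frame_free(&m_Frame);
if (ret < 0) return AUDIOWRAPPER_ERROR;

// gotPack == 0 is not an error: the encoder may buffer input and emit the
// packet on a later call. On success the encoded byte count is in
// m_Packet.size, which is what the old avcodec_encode_audio returned.
outBytes = gotPack ? m_Packet.size : 0;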


ffmpeg hevc encoding failure

I use ffmpeg to encode YUV data to H.265, but the image after encoding is always incorrect (screenshot of the broken output omitted here).
However, the following command can be used to encode correctly: ffmpeg -f rawvideo -s 480x256 -pix_fmt yuv420p -i origin.yuv -c:v hevc -f hevc -x265-params keyint=1:crf=18 out.h265 (screenshot of the correct output omitted here).
Here is my code:
void H265ImageCodec::InitCPUEncoder() {
  avcodec_register_all();
  AVCodec* encoder = avcodec_find_encoder(AV_CODEC_ID_H265);
  CHECK(encoder) << "Can not find encoder with h265.";
  // context
  encode_context_ = avcodec_alloc_context3(encoder);
  CHECK(encode_context_) << "Could not allocate video codec context.";
  encode_context_->codec_id = AV_CODEC_ID_H265;
  encode_context_->profile = FF_PROFILE_HEVC_MAIN;
  encode_context_->codec_type = AVMEDIA_TYPE_VIDEO;
  encode_context_->width = width_;   // it's 480
  encode_context_->height = height_; // it's 256
  encode_context_->bit_rate = 384 * 1024;
  encode_context_->pix_fmt = AVPixelFormat::AV_PIX_FMT_YUV420P;
  encode_context_->time_base = (AVRational){1, 25};
  encode_context_->framerate = (AVRational){25, 1};
  AVDictionary* options = NULL;
  av_dict_set(&options, "preset", "ultrafast", 0);
  av_dict_set(&options, "tune", "zero-latency", 0);
  av_opt_set(encode_context_->priv_data, "x265-params", "keyint=1:crf=18",
             0); // crf: Quality-controlled variable bitrate
  avcodec_open2(encode_context_, encoder, &options);
  encode_frame_ = av_frame_alloc();
  encode_frame_->format = encode_context_->pix_fmt;
  encode_frame_->width = encode_context_->width;
  encode_frame_->height = encode_context_->height;
  av_frame_get_buffer(encode_frame_, 0);
  // packet init
  encode_packet_ = av_packet_alloc();
}
std::string H265ImageCodec::EncodeImage(std::string_view raw_image) {
  av_packet_unref(encode_packet_);
  av_frame_make_writable(encode_frame_);
  const int64 y_size = width_ * height_;
  int64 offset = 0;
  memcpy(encode_frame_->data[0], raw_image.data() + offset, y_size);
  offset += y_size;
  memcpy(encode_frame_->data[1], raw_image.data() + offset, y_size / 4);
  offset += y_size / 4;
  memcpy(encode_frame_->data[2], raw_image.data() + offset, y_size / 4);
  avcodec_send_frame(encode_context_, encode_frame_);
  int ret = avcodec_receive_packet(encode_context_, encode_packet_);
  CHECK_EQ(ret, 0) << "receive encode packet ret: " << ret;
  std::string h265_frame(reinterpret_cast<char*>(encode_packet_->data),
                         encode_packet_->size);
  return h265_frame;
}
Any idea what might cause this?
As commented, the issue is that the rows of the U and V buffers in encode_frame_ are not contiguous in memory.
When av_frame_get_buffer(encode_frame_, 0) allocates the frame buffers, it pads each row to an alignment boundary, with the following result:
encode_frame_->linesize[0] = 480
The value equals the width, so the Y channel is contiguous in memory.
encode_frame_->linesize[1] = 256 (not 480/2 = 240).
encode_frame_->linesize[2] = 256 (not 480/2 = 240).
The rows of the U and V channels are therefore not contiguous in memory.
Illustration for destination U channel in memory:
<----------- 256 bytes ----------->
<------- 240 elements ------->
^ uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu xxxx
| uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu xxxx
128 rows uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu xxxx
| uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu xxxx
V uuuuuuuuuuuuuuuuuuuuuuuuuuuuuu xxxx
To check, we may print the linesize values:
printf("encode_frame_->linesize[0] = %d\n", encode_frame_->linesize[0]); //480
printf("encode_frame_->linesize[1] = %d\n", encode_frame_->linesize[1]); //256 (not 240)
printf("encode_frame_->linesize[2] = %d\n", encode_frame_->linesize[2]); //256 (not 240)
Inspired by cudaMemcpy2D, we may implement the function memcpy2D:
//memcpy from src to dst with optional source "pitch" and destination "pitch".
//The "pitch" is the step in bytes between two rows.
//The function interface is based on cudaMemcpy2D.
static void memcpy2D(void* dst,
                     size_t dpitch,
                     const void* src,
                     size_t spitch,
                     size_t width,
                     size_t height)
{
    const unsigned char* I = (unsigned char*)src;
    unsigned char* J = (unsigned char*)dst;
    for (size_t y = 0; y < height; y++)
    {
        const unsigned char* I0 = I + y*spitch; //Pointer to the beginning of the source row
        unsigned char* J0 = J + y*dpitch;       //Pointer to the beginning of the destination row
        memcpy(J0, I0, width); //Copy width bytes from row I0 to row J0
    }
}
Use memcpy2D instead of memcpy to copy data to a destination frame that may not be contiguous in memory:
//Copy Y channel:
memcpy2D(encode_frame_->data[0],     //void* dst,
         encode_frame_->linesize[0], //size_t dpitch,
         raw_image.data() + offset,  //const void* src,
         width_,                     //size_t spitch,
         width_,                     //size_t width,
         height_);                   //size_t height
offset += y_size;
//Copy U channel:
memcpy2D(encode_frame_->data[1],     //void* dst,
         encode_frame_->linesize[1], //size_t dpitch,
         raw_image.data() + offset,  //const void* src,
         width_/2,                   //size_t spitch,
         width_/2,                   //size_t width,
         height_/2);                 //size_t height
offset += y_size / 4;
//Copy V channel:
memcpy2D(encode_frame_->data[2],     //void* dst,
         encode_frame_->linesize[2], //size_t dpitch,
         raw_image.data() + offset,  //const void* src,
         width_/2,                   //size_t spitch,
         width_/2,                   //size_t width,
         height_/2);                 //size_t height
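For what it's worth, libavutil already ships an equivalent helper, av_image_copy_plane (in libavutil/imgutils.h), which performs exactly this row-by-row pitched copy. A sketch of the same fix without a custom function, assuming raw_image and y_size as in the question's EncodeImage:
#include <libavutil/imgutils.h>

// Per-plane pitched copies using FFmpeg's own helper instead of memcpy2D:
av_image_copy_plane(encode_frame_->data[0], encode_frame_->linesize[0],
                    (const uint8_t*)raw_image.data(), width_,
                    width_, height_);                              // Y
av_image_copy_plane(encode_frame_->data[1], encode_frame_->linesize[1],
                    (const uint8_t*)raw_image.data() + y_size, width_ / 2,
                    width_ / 2, height_ / 2);                      // U
av_image_copy_plane(encode_frame_->data[2], encode_frame_->linesize[2],
                    (const uint8_t*)raw_image.data() + y_size + y_size / 4, width_ / 2,
                    width_ / 2, height_ / 2);                      // V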

VP8 C/C++ source, how to encode frames in ARGB format to frame instead of from file

I'm trying to get started with the VP8 library. I'm not building it in the standard way they tell you to; I just loaded all of the main files and the "encoder" folder into a new Visual Studio C++ DLL project, and included the C files in an extern "C" DLL export function, which so far builds fine. I just have no idea where to start with the API to encode, say, 3 frames of ARGB data into a very basic video, just to get started.
The only example I could find is simple_encoder.c in the examples folder, but its premise is loading another file, parsing its frames and then converting them, so it seems a bit complicated. I just want to pass in a byte array of a few ARGB frames and have it output a very simple VP8 video.
I've seen How to encode series of images into VP8 using WebM VP8 Encoder API? (C/C++), but the accepted answer just links to the build instructions and the general specification of the VP8 format; the closest thing there is the example encoding parameters. I want to do everything from C++, and I can't seem to find any other examples besides the default simple_encoder.c.
Just to cite some of the relevant parts I think I understand, but still need more help on:
//in int main...
...
vpx_image_t raw;
if (!vpx_img_alloc(&raw, VPX_IMG_FMT_I420, info.frame_width,
                   info.frame_height, 1)) {
    //"Failed to allocate image." error
}
That part I think I understand for the most part. VPX_IMG_FMT_I420 is the only piece that isn't defined in this file itself; it's in vpx_image.h, first as
#define VPX_IMG_FMT_PLANAR
//then after...
typedef enum vpx_img_fmt {
VPX_IMG_FMT_NONE,
VPX_IMG_FMT_RGB24, /**< 24 bit per pixel packed RGB */
///some other formats....
VPX_IMG_FMT_ARGB, /**< 32 bit packed ARGB, alpha=255 */
VPX_IMG_FMT_YV12 = VPX_IMG_FMT_PLANAR | VPX_IMG_FMT_UV_FLIP | 1, /**< planar YVU */
VPX_IMG_FMT_I420 = VPX_IMG_FMT_PLANAR | 2,
} vpx_img_fmt_t; /**< alias for enum vpx_img_fmt */
So part of my question is answered already just from writing this: one of the formats is VPX_IMG_FMT_ARGB, although I don't know where it's defined, but I'm guessing in the above code I would replace it with
const VpxInterface *encoder = get_vpx_encoder_by_name("vp8");
vpx_image_t raw;
VpxVideoInfo info = { 0, 0, 0, { 0, 0 } };
info.frame_width = 1920;
info.frame_height = 1080;
info.codec_fourcc = encoder->fourcc;
info.time_base.numerator = 1;
info.time_base.denominator = 24;
bool didIt = vpx_img_alloc(&raw, VPX_IMG_FMT_ARGB,
                           info.frame_width, info.frame_height /*example width and height*/, 1);
//check didIt..
vpx_codec_enc_cfg_t cfg;
vpx_codec_ctx_t codec;
vpx_codec_err_t res;
res = vpx_codec_enc_config_default(encoder->codec_interface(), &cfg, 0);
//check if !res for error
cfg.g_w = info.frame_width;
cfg.g_h = info.frame_height;
cfg.g_timebase.num = info.time_base.numerator;
cfg.g_timebase.den = info.time_base.denominator;
cfg.rc_target_bitrate = 200;
VpxVideoWriter *writer = NULL;
writer = vpx_video_writer_open(outfile_arg, kContainerIVF, &info);
//check if !writer for error
bool startIt = vpx_codec_enc_init(&codec, encoder->codec_interface(), &cfg, 0);
//not even sure where codec was set actually..
//check !startIt for error starting
//now the next part in the original is where it reads from the input file, but instead
//I need to pass in an array of some ARGB byte arrays..
//thing is, in the next step they use a while loop for
//vpx_img_read(&raw, fopen("path/to/YV12formatVideo", "rb"))
//to set the contents of the raw vpx image allocated earlier, then
//they call another program that writes it to the writer object,
//but I don't know how to read the actual ARGB data directly into the raw image
//without using fopen, so that's one question (review at end)
//so I'll just put a placeholder here for the **question**
//assuming I have an array of byte arrays stored individually
//for simplicity sake
int size = 1920 * 1080 * 4;
uint8_t imgOne[size] = {/*some big byte array*/};
uint8_t imgTwo[size] = {/*some big byte array*/};
uint8_t imgThree[size] = {/*some big byte array*/};
uint8_t *images[] = {imgOne, imgTwo, imgThree};
int framesDone = 0;
int maxFrames = 3;
//so now I can replace the while loop with a filler function
//until I find out how to set the raw image with ARGB data
while (framesDone < maxFrames) {
    magicalFunctionToSetARGBOfRawImage(&raw, images[framesDone]);
    encode_frame(&codec, &raw, framesDone, 0, writer);
    framesDone++;
}
//now apparently it needs to be flushed after
while(encode_frame(&codec, 0, -1, 0, writer)){}
vpx_img_free(&raw);
bool isDestroyed = vpx_codec_destroy(&codec);
//check if !isDestroyed for error
//now we gotta define the encode_frame function, but simpler
//(and put it above the other function for reference purposes,
//or in a header)
static int encode_frame(vpx_codec_ctx_t *coydek,
                        vpx_image_t *pic,
                        int currentFrame,
                        int flags,
                        VpxVideoWriter *koysayv /*writer*/) {
    //now to substitute their encodeFrame function for
    //the actual raw calls to simplify things
    const vpx_codec_err_t DidIt = vpx_codec_encode(
        coydek,
        pic,
        currentFrame,
        1,                //duration I think
        flags,            //whatever that is
        VPX_DL_REALTIME); //different than simple_encoder
    if (DidIt != VPX_CODEC_OK) return 0; //error here (VPX_CODEC_OK == 0 is success)
    vpx_codec_iter_t iter = 0;
    const vpx_codec_cx_pkt_t *pkt = 0;
    int gotThings = 0;
    while ((pkt = vpx_codec_get_cx_data(coydek, &iter)) != 0) {
        gotThings = 1;
        if (pkt->kind == VPX_CODEC_CX_FRAME_PKT) { //don't exactly understand this part
            const int keyframe = (pkt->data.frame.flags & VPX_FRAME_IS_KEY) != 0;
            //don't exactly understand the & operator here or how it gets the keyframe
            bool wroteFrame = vpx_video_writer_write_frame(
                koysayv,
                pkt->data.frame.buf, //I'm guessing this is the encoded frame data
                pkt->data.frame.sz,
                pkt->data.frame.pts);
            if (!wroteFrame) return 0; //error
        }
    }
    return gotThings;
}
Thing is, though, I don't know how to actually read the ARGB data into the raw image buffer itself. As mentioned above, the original example uses
vpx_img_read(&raw, fopen("path/to/file", "rb"))
but if I'm starting off with the byte arrays themselves, then what function do I use instead of reading from a file?
I have a feeling it can be solved by adapting the source of vpx_img_read, found in tools_common.c:
int vpx_img_read(vpx_image_t *img, FILE *file) {
int plane;
for (plane = 0; plane < 3; ++plane) {
unsigned char *buf = img->planes[plane];
const int stride = img->stride[plane];
const int w = vpx_img_plane_width(img, plane) *
((img->fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? 2 : 1);
const int h = vpx_img_plane_height(img, plane);
int y;
for (y = 0; y < h; ++y) {
if (fread(buf, 1, w, file) != (size_t)w) return 0;
buf += stride;
}
}
return 1;
}
Although I personally am not experienced enough to know for sure how to get a single frame's ARGB data in, I think the key part is fread(buf, 1, w, file), which reads part of the file into buf; since buf points into img->planes[plane], reading into buf fills img->planes[plane]. I'm not sure that is the case, though, and I'm also not sure how to replace the fread so it takes a byte array that is already loaded into memory instead of a file...
VPX_IMG_FMT_ARGB is not defined because it is not supported by libvpx (as far as I have seen). To compress an image using this library, you must first convert it to one of the supported formats, like I420 (VPX_IMG_FMT_I420). The code here (not mine), https://gist.github.com/racerxdl/8164330, does it well for the RGB format. If you don't want to use libswscale to do the conversion from RGB to I420, you can do something like this (this code converts an RGBA array of bytes to an I420 vpx_image that can be used by libvpx):
unsigned int tx = <width of your image>;
unsigned int ty = <height of your image>;
unsigned char *image = <array of bytes : RGBARGBA... of size ty*tx*4>;
vpx_image_t *imageVpx = <result that must have been properly initialized by libvpx>;
imageVpx->stride[VPX_PLANE_U]     = tx/2;
imageVpx->stride[VPX_PLANE_V]     = tx/2;
imageVpx->stride[VPX_PLANE_Y]     = tx;
imageVpx->stride[VPX_PLANE_ALPHA] = tx;
imageVpx->planes[VPX_PLANE_U]     = new unsigned char[ty*tx/4];
imageVpx->planes[VPX_PLANE_V]     = new unsigned char[ty*tx/4];
imageVpx->planes[VPX_PLANE_Y]     = new unsigned char[ty*tx];
imageVpx->planes[VPX_PLANE_ALPHA] = new unsigned char[ty*tx];
unsigned char *planeY = imageVpx->planes[VPX_PLANE_Y];
unsigned char *planeU = imageVpx->planes[VPX_PLANE_U];
unsigned char *planeV = imageVpx->planes[VPX_PLANE_V];
unsigned char *planeA = imageVpx->planes[VPX_PLANE_ALPHA];
for (unsigned int y = 0; y < ty; y++)
{
    if (!(y % 2)) //even rows: U and V are sampled once per 2x2 block
    {
        for (unsigned int x = 0; x < tx; x += 2)
        {
            int r = *image++;
            int g = *image++;
            int b = *image++;
            int a = *image++;
            *planeY++ = max(0, min(255, (( 66*r + 129*g +  25*b) >> 8) +  16));
            *planeU++ = max(0, min(255, ((-38*r + -74*g + 112*b) >> 8) + 128));
            *planeV++ = max(0, min(255, ((112*r + -94*g + -18*b) >> 8) + 128));
            *planeA++ = a;
            r = *image++;
            g = *image++;
            b = *image++;
            a = *image++;
            *planeA++ = a;
            *planeY++ = max(0, min(255, ((66*r + 129*g + 25*b) >> 8) + 16));
        }
    }
    else //odd rows: Y and alpha only
    {
        for (unsigned int x = 0; x < tx; x++)
        {
            int const r = *image++;
            int const g = *image++;
            int const b = *image++;
            int const a = *image++;
            *planeA++ = a;
            *planeY++ = max(0, min(255, ((66*r + 129*g + 25*b) >> 8) + 16));
        }
    }
}
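Back to the original question of replacing vpx_img_read's fread with an in-memory source: here is a minimal sketch, assuming data already holds tightly packed I420 planes (for example, produced by a conversion like the one above). vpx_img_read_from_memory is a hypothetical name, and vpx_img_plane_width/vpx_img_plane_height are the static helpers from tools_common.c, so copy them alongside or inline the plane math:
//Modeled on vpx_img_read from tools_common.c, but reading from a byte
//array instead of a FILE*.
static void vpx_img_read_from_memory(vpx_image_t *img, const uint8_t *data) {
    int plane;
    for (plane = 0; plane < 3; ++plane) {
        unsigned char *buf = img->planes[plane];
        const int stride = img->stride[plane];
        const int w = vpx_img_plane_width(img, plane) *
                      ((img->fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? 2 : 1);
        const int h = vpx_img_plane_height(img, plane);
        int y;
        for (y = 0; y < h; ++y) {
            memcpy(buf, data, w); //replaces fread(buf, 1, w, file)
            data += w;            //source rows are tightly packed
            buf += stride;        //destination rows honor the image stride
        }
    }
}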

FreeImage wrong image color

I am trying to extract frames from a stream which I create with GStreamer, and to save them with FreeImage or QImage (the latter just for testing).
GstMapInfo bufferInfo;
GstBuffer *sampleBuffer;
GstStructure *capsStruct;
GstSample *sample;
GstCaps *caps;
int width, height;
const int BitsPP = 32;
/* Retrieve the buffer */
g_signal_emit_by_name (sink, "pull-sample", &sample);
if (sample) {
    sampleBuffer = gst_sample_get_buffer(sample);
    gst_buffer_map(sampleBuffer, &bufferInfo, GST_MAP_READ);
    if (!bufferInfo.data) {
        g_printerr("Warning: could not map GStreamer buffer!\n");
        throw;
    }
    caps = gst_sample_get_caps(sample);
    capsStruct = gst_caps_get_structure(caps, 0);
    gst_structure_get_int(capsStruct, "width", &width);
    gst_structure_get_int(capsStruct, "height", &height);
    auto bitmap = FreeImage_Allocate(width, height, BitsPP, 0, 0, 0);
    memcpy(FreeImage_GetBits(bitmap), bufferInfo.data, width * height * (BitsPP/8));
    // int pitch = ((((BitsPP * width) + 31) / 32) * 4);
    // auto bitmap = FreeImage_ConvertFromRawBits(bufferInfo.data, width, height, pitch, BitsPP, 0, 0, 0);
    FreeImage_FlipHorizontal(bitmap);
    bitmap = FreeImage_RotateClassic(bitmap, 180);
    static int id = 0;
    std::string name = "/home/stadmin/pic/sample" + std::to_string(id++) + ".png";
#ifdef FREE_SAVE
    FreeImage_Save(FIF_PNG, bitmap, name.c_str());
#endif
#ifdef QT_SAVE
    //Format_ARGB32
    QImage image(bufferInfo.data, width, height, QImage::Format_ARGB32);
    image.save(QString::fromStdString(name));
#endif
    fibPipeline.push(bitmap);
    gst_sample_unref(sample);
    gst_buffer_unmap(sampleBuffer, &bufferInfo);
}
return GST_FLOW_OK;
The colors in the FreeImage output are totally wrong, like with Qt's Format_ARGB32 [greens look blue, blues look orange, etc.], but when I test with Qt's Format_RGBA8888 I get correct output. I need to use FreeImage, and I wish to learn how to correct this.
Since you say Qt succeeds using Format_RGBA8888, I can only guess: the GStreamer frame has bytes in RGBA order, while FreeImage on a little-endian machine stores pixels in BGRA order.
Quick fix:
//have a buffer the same length as the incoming bytes
size_t length = width * height * (BitsPP/8);
BYTE *bytes = (BYTE *) malloc(length);
//copy the incoming bytes to it, in the right order:
size_t index = 0;
while (index < length)
{
    bytes[index]     = bufferInfo.data[index + 2]; //B
    bytes[index + 1] = bufferInfo.data[index + 1]; //G
    bytes[index + 2] = bufferInfo.data[index];     //R
    bytes[index + 3] = bufferInfo.data[index + 3]; //A
    index += 4;
}
//fill the bitmap using the buffer
auto bitmap = FreeImage_Allocate(width, height, BitsPP,0,0,0);
memcpy( FreeImage_GetBits( bitmap ), bytes, length);
//don't forget to
free(bytes);
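A variant of the same idea that avoids the temporary buffer: copy the frame as-is and then swap the R and B bytes in place on the bitmap's own pixel store (a sketch; for 32-bit bitmaps FreeImage's pitch equals width * 4, so the buffer can be walked linearly):
auto bitmap = FreeImage_Allocate(width, height, BitsPP, 0, 0, 0);
BYTE *bits = FreeImage_GetBits(bitmap);
size_t length = (size_t)width * height * (BitsPP / 8);
memcpy(bits, bufferInfo.data, length);
//swap R and B in place: RGBA -> BGRA
for (size_t i = 0; i < length; i += 4)
{
    BYTE tmp    = bits[i];
    bits[i]     = bits[i + 2];
    bits[i + 2] = tmp;
}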

Memory leak in jpeg compression. Bug or my mistake?

I wrote an npm module for capturing webcam input on Linux. The captured frame, in YUYV format, is converted to RGB24 and then compressed to a JPEG image. In the JPEG compression there appears to be a memory leak, so memory usage increases continuously.
Image* rgb24_to_jpeg(Image *img, Image *jpeg) { // img = RGB24
    jpeg_compress_struct cinfo;
    jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jerr.trace_level = 10;
    jpeg_create_compress(&cinfo);
    unsigned char *imgd = new unsigned char[img->size];
    long unsigned int size = 0;
    jpeg_mem_dest(&cinfo, &imgd, &size);
    cinfo.image_width = img->width;
    cinfo.image_height = img->height;
    cinfo.input_components = 3;
    cinfo.in_color_space = JCS_RGB;
    jpeg_set_defaults(&cinfo);
    jpeg_set_quality(&cinfo, 100, true);
    jpeg_start_compress(&cinfo, true);
    int row_stride = cinfo.image_width * 3;
    JSAMPROW row_pointer[1];
    while (cinfo.next_scanline < cinfo.image_height) {
        row_pointer[0] = &img->data[cinfo.next_scanline * row_stride];
        jpeg_write_scanlines(&cinfo, row_pointer, 1);
    }
    jpeg_finish_compress(&cinfo);
    jpeg_destroy_compress(&cinfo);
    // size += 512; // TODO: actual value to expand jpeg buffer... JPEG header?
    if (jpeg->data == NULL) {
        jpeg->data = (unsigned char *) malloc(size);
    } else {
        jpeg->data = (unsigned char *) realloc(jpeg->data, size);
    }
    memcpy(jpeg->data, imgd, size);
    delete[] imgd;
    jpeg->size = size;
    return jpeg;
}
The rgb24 and jpeg buffers are reallocated on every cycle, so it looks like the leak is inside the libjpeg layer. Is this true, or did I simply make a mistake somewhere in the code?
Note: the compressed image should not be saved to a file, since the data might be used for live streaming.
You are using jpeg_mem_dest in the wrong way: the second parameter is a pointer to pointer to char because the library may (re)allocate the buffer itself and set the pointer, and you must free that buffer once you are done. Right now you initialize it with your own pointer; when the library grows the buffer, your pointer gets overwritten, so you free the memory region allocated by the library while the region you originally allocated is leaked.
This is how you should change your function:
Image* rgb24_to_jpeg(Image *img, Image *jpeg) { // img = RGB24
    jpeg_compress_struct cinfo;
    jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jerr.trace_level = 10;
    jpeg_create_compress(&cinfo);
    unsigned char *imgd = 0;
    long unsigned int size = 0;
    cinfo.image_width = img->width;
    cinfo.image_height = img->height;
    cinfo.input_components = 3;
    cinfo.in_color_space = JCS_RGB;
    jpeg_set_defaults(&cinfo);
    jpeg_set_quality(&cinfo, 100, true);
    jpeg_mem_dest(&cinfo, &imgd, &size); // imgd will be set by the library
    jpeg_start_compress(&cinfo, true);
    int row_stride = cinfo.image_width * 3;
    JSAMPROW row_pointer[1];
    while (cinfo.next_scanline < cinfo.image_height) {
        row_pointer[0] = &img->data[cinfo.next_scanline * row_stride];
        jpeg_write_scanlines(&cinfo, row_pointer, 1);
    }
    jpeg_finish_compress(&cinfo);
    jpeg_destroy_compress(&cinfo);
    // size += 512; // TODO: actual value to expand jpeg buffer... JPEG header?
    if (jpeg->data == NULL) {
        jpeg->data = (unsigned char *) malloc(size);
    } else if (jpeg->size != size) {
        jpeg->data = (unsigned char *) realloc(jpeg->data, size);
    }
    memcpy(jpeg->data, imgd, size);
    free(imgd); // dispose of imgd when you are done
    jpeg->size = size;
    return jpeg;
}
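A possible further simplification (my suggestion, not part of the answer above): since jpeg_mem_dest allocates its buffer with malloc, the staging copy can be skipped by handing the library's buffer straight to the caller, provided jpeg->data is always released with free():
// ...after jpeg_destroy_compress(&cinfo):
free(jpeg->data);  // release the previous frame's buffer, if any (free(NULL) is a no-op)
jpeg->data = imgd; // take ownership of the buffer libjpeg allocated
jpeg->size = size;
return jpeg;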
This snippet from jpeg_mem_dest explains the memory management:
if (*outbuffer == NULL || *outsize == 0) {
    /* Allocate initial buffer */
    dest->newbuffer = *outbuffer = (unsigned char *) malloc(OUTPUT_BUF_SIZE);
    if (dest->newbuffer == NULL)
        ERREXIT1(cinfo, JERR_OUT_OF_MEMORY, 10);
    *outsize = OUTPUT_BUF_SIZE;
}
So, if you pass a null pointer or a zero-sized buffer, the library will perform the allocation for you. Conversely, another approach is to set the size correctly up front, in which case the library keeps writing into the originally supplied pointer.
In my case the previous answer did not solve the issue; there was no way to free the image pointer. The only way was to reserve enough memory for the image myself, so that the library does not allocate any: that way I control the memory, and it lives on the same heap as my application instead of the library's. Here is my example:
//previous code...
struct jpeg_compress_struct cinfo;
//reserving enough memory for my image (width * height)
unsigned char* _image = (unsigned char*)malloc(Width * Height);
//putting the reserved size into _imageSize
unsigned long _imageSize = Width * Height;
//call the function like this:
jpeg_mem_dest(&cinfo, &_image, &_imageSize);
//...
//releasing the reserved memory
free(_image);
NOTE: if you pass _imageSize = 0, the library will assume that you have not reserved memory and will allocate its own buffer, so you need to put into _imageSize the number of bytes reserved in _image.
That way you have total control over the reserved memory, and you can release it whenever you want in your software.

encode x264(libx264) raw yuv frame data

I am trying to encode an MP4 video using raw YUV frame data, but I am not sure how I can fill the plane data (preferably without using other libraries like ffmpeg).
The frame data is already encoded in I420, and does not need conversion.
Here is what I am trying to do:
const char *frameData = /* Raw frame data */;
x264_t *encoder = x264_encoder_open(&param);
x264_picture_t imgInput, imgOutput;
x264_picture_alloc(&imgInput, X264_CSP_I420, width, height);
// how can I fill the struct data of imgInput
x264_nal_t *nals;
int i_nals;
int frameSize = x264_encoder_encode(encoder, &nals, &i_nals, &imgInput, &imgOutput);
The equivalent command line that I have found is:
x264 --output video.mp4 --fps 15 --input-res 1280x800 imgdata_01.raw
But I could not figure out how the app does it.
Thanks.
Look at the libx264 API usage example. That example uses fread() to fill a frame allocated by x264_picture_alloc() with actual I420 data from stdin. If you already have I420 data in memory and want to skip the memcpy step, then instead you can:
Use x264_picture_init() instead of x264_picture_alloc() and x264_picture_clean(), because you don't need to allocate heap memory for the frame data.
Fill the x264_picture_t.img struct fields:
i_csp = X264_CSP_I420;
i_plane = 3;
plane[0] = pointer to Y-plane;
i_stride[0] = stride in bytes for Y-plane;
plane[1] = pointer to U-plane;
i_stride[1] = stride in bytes for U-plane;
plane[2] = pointer to V-plane;
i_stride[2] = stride in bytes for V-plane;
To complete the answer above, here is an example of filling an x264_picture_t image. It follows the advice to use x264_picture_init (x264_picture_alloc would allocate plane buffers that the pointer assignments below would then leak):
int fillImage(uint8_t* buffer, int width, int height, x264_picture_t *pic) {
    // x264_picture_init zero-initializes the struct without allocating
    // plane buffers, so the planes can point into the caller's I420 buffer.
    x264_picture_init(pic);
    pic->img.i_csp = X264_CSP_I420;
    pic->img.i_plane = 3; // Y, U and V
    pic->img.i_stride[0] = width;
    // U and V planes are half the width of the Y plane
    pic->img.i_stride[1] = width / 2;
    pic->img.i_stride[2] = width / 2;
    int uvsize = ((width + 1) >> 1) * ((height + 1) >> 1);
    pic->img.plane[0] = buffer;                     // Y plane pointer
    pic->img.plane[1] = buffer + (width * height);  // U plane pointer
    pic->img.plane[2] = pic->img.plane[1] + uvsize; // V plane pointer
    return 0;
}
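For completeness, a hedged usage sketch (encoder and parameter setup omitted; frameData is assumed to hold width * height * 3 / 2 bytes of I420 data, and frameNumber and outFile are hypothetical names for a running frame counter used as the PTS and an open FILE*):
x264_picture_t imgInput, imgOutput;
fillImage((uint8_t *)frameData, width, height, &imgInput);
imgInput.i_pts = frameNumber;

x264_nal_t *nals;
int i_nals;
int frameSize = x264_encoder_encode(encoder, &nals, &i_nals, &imgInput, &imgOutput);
if (frameSize > 0) {
    // x264 guarantees the payloads of all returned NALs are sequential in
    // memory, so nals[0].p_payload points at frameSize bytes of encoded data.
    fwrite(nals[0].p_payload, 1, frameSize, outFile);
}
Note that dumping NALs this way produces a raw Annex-B elementary stream; getting an actual .mp4 requires a container muxer on top (which is what the x264 command line links in for --output video.mp4).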