Why does adding an audio stream to ffmpeg's libavcodec output container cause a crash? - c++

As it stands, my project correctly uses libavcodec to decode a video, where each frame is manipulated (it doesn't matter how) and output to a new video. I've cobbled this together from examples found online, and it works. The result is a perfect .mp4 of the manipulated frames, minus the audio.
My problem is that when I try to add an audio stream to the output container, I get a crash in mux.c that I can't explain. It happens in static int compute_muxer_pkt_fields(AVFormatContext *s, AVStream *st, AVPacket *pkt): where st->internal->priv_pts->val = pkt->dts; is attempted, priv_pts is nullptr.
I don't recall the version number, but this is from a November 4, 2020 ffmpeg build from git.
My MediaContainerMgr is much bigger than what I have here. I'm stripping out everything to do with the frame manipulation, so if I'm missing anything, please let me know and I'll edit.
The code that, when added, triggers the nullptr exception is called out inline.
The .h:
#ifndef _API_EXAMPLE_H
#define _API_EXAMPLE_H
#include <glad/glad.h>
#include <GLFW/glfw3.h>
#include "glm/glm.hpp"
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>
}
#include "shader_s.h"
class MediaContainerMgr {
public:
MediaContainerMgr(const std::string& infile, const std::string& vert, const std::string& frag,
const glm::vec3* extents);
~MediaContainerMgr();
void render();
bool recording() { return m_recording; }
// Major thanks to "shi-yan" who helped make this possible:
// https://github.com/shi-yan/videosamples/blob/master/libavmp4encoding/main.cpp
bool init_video_output(const std::string& video_file_name, unsigned int width, unsigned int height);
bool output_video_frame(uint8_t* buf);
bool finalize_output();
private:
AVFormatContext* m_format_context;
AVCodec* m_video_codec;
AVCodec* m_audio_codec;
AVCodecParameters* m_video_codec_parameters;
AVCodecParameters* m_audio_codec_parameters;
AVCodecContext* m_codec_context;
AVFrame* m_frame;
AVPacket* m_packet;
uint32_t m_video_stream_index;
uint32_t m_audio_stream_index;
// The members below are referenced by the stripped-down .cpp but were elided
// from the original post (the real class is bigger); declared here so it compiles:
unsigned int m_width;
unsigned int m_height;
GLuint m_VAO;
GLuint m_VBO;
bool advance_frame();
void advance_to(int64_t pts); // implementation elided; seeks the input
void init_rendering(const glm::vec3* extents);
int decode_packet();
// For writing the output video:
void free_output_assets();
bool m_recording;
AVOutputFormat* m_output_format;
AVFormatContext* m_output_format_context;
AVCodec* m_output_video_codec;
AVCodecContext* m_output_video_codec_context;
AVFrame* m_output_video_frame;
SwsContext* m_output_scale_context;
AVStream* m_output_video_stream;
AVCodec* m_output_audio_codec;
AVStream* m_output_audio_stream;
AVCodecContext* m_output_audio_codec_context;
};
#endif
And, the hellish .cpp:
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>
#include "media_container_manager.h"
MediaContainerMgr::MediaContainerMgr(const std::string& infile, const std::string& vert, const std::string& frag,
const glm::vec3* extents) :
m_video_stream_index(-1),
m_audio_stream_index(-1),
m_recording(false),
m_output_format(nullptr),
m_output_format_context(nullptr),
m_output_video_codec(nullptr),
m_output_video_codec_context(nullptr),
m_output_video_frame(nullptr),
m_output_scale_context(nullptr),
m_output_video_stream(nullptr)
{
// AVFormatContext holds header info from the format specified in the container:
m_format_context = avformat_alloc_context();
if (!m_format_context) {
throw "ERROR could not allocate memory for Format Context";
}
// open the file and read its header. Codecs are not opened here.
if (avformat_open_input(&m_format_context, infile.c_str(), NULL, NULL) != 0) {
throw "ERROR could not open input file for reading";
}
printf("format %s, duration %lldus, bit_rate %lld\n", m_format_context->iformat->name, m_format_context->duration, m_format_context->bit_rate);
//read avPackets (?) from the avFormat (?) to get stream info. This populates format_context->streams.
if (avformat_find_stream_info(m_format_context, NULL) < 0) {
throw "ERROR could not get stream info";
}
for (unsigned int i = 0; i < m_format_context->nb_streams; i++) {
AVCodecParameters* local_codec_parameters = NULL;
local_codec_parameters = m_format_context->streams[i]->codecpar;
printf("AVStream->time base before open coded %d/%d\n", m_format_context->streams[i]->time_base.num, m_format_context->streams[i]->time_base.den);
printf("AVStream->r_frame_rate before open coded %d/%d\n", m_format_context->streams[i]->r_frame_rate.num, m_format_context->streams[i]->r_frame_rate.den);
printf("AVStream->start_time %" PRId64 "\n", m_format_context->streams[i]->start_time);
printf("AVStream->duration %" PRId64 "\n", m_format_context->streams[i]->duration);
printf("duration(s): %lf\n", (float)m_format_context->streams[i]->duration / m_format_context->streams[i]->time_base.den * m_format_context->streams[i]->time_base.num);
AVCodec* local_codec = NULL;
local_codec = avcodec_find_decoder(local_codec_parameters->codec_id);
if (local_codec == NULL) {
throw "ERROR unsupported codec!";
}
if (local_codec_parameters->codec_type == AVMEDIA_TYPE_VIDEO) {
if (m_video_stream_index == -1) {
m_video_stream_index = i;
m_video_codec = local_codec;
m_video_codec_parameters = local_codec_parameters;
}
m_height = local_codec_parameters->height;
m_width = local_codec_parameters->width;
printf("Video Codec: resolution %dx%d\n", m_width, m_height);
}
else if (local_codec_parameters->codec_type == AVMEDIA_TYPE_AUDIO) {
if (m_audio_stream_index == -1) {
m_audio_stream_index = i;
m_audio_codec = local_codec;
m_audio_codec_parameters = local_codec_parameters;
}
printf("Audio Codec: %d channels, sample rate %d\n", local_codec_parameters->channels, local_codec_parameters->sample_rate);
}
printf("\tCodec %s ID %d bit_rate %lld\n", local_codec->name, local_codec->id, local_codec_parameters->bit_rate);
}
m_codec_context = avcodec_alloc_context3(m_video_codec);
if (!m_codec_context) {
throw "ERROR failed to allocate memory for AVCodecContext";
}
if (avcodec_parameters_to_context(m_codec_context, m_video_codec_parameters) < 0) {
throw "ERROR failed to copy codec params to codec context";
}
if (avcodec_open2(m_codec_context, m_video_codec, NULL) < 0) {
throw "ERROR avcodec_open2 failed to open codec";
}
m_frame = av_frame_alloc();
if (!m_frame) {
throw "ERROR failed to allocate AVFrame memory";
}
m_packet = av_packet_alloc();
if (!m_packet) {
throw "ERROR failed to allocate AVPacket memory";
}
}
MediaContainerMgr::~MediaContainerMgr() {
avformat_close_input(&m_format_context);
av_packet_free(&m_packet);
av_frame_free(&m_frame);
avcodec_free_context(&m_codec_context);
glDeleteVertexArrays(1, &m_VAO);
glDeleteBuffers(1, &m_VBO);
}
bool MediaContainerMgr::advance_frame() {
while (true) {
if (av_read_frame(m_format_context, m_packet) < 0) {
// Do we actually need to unref the packet if it failed?
av_packet_unref(m_packet);
continue;
//return false;
}
else {
if (m_packet->stream_index == m_video_stream_index) {
//printf("AVPacket->pts %" PRId64 "\n", m_packet->pts);
int response = decode_packet();
av_packet_unref(m_packet);
if (response != 0) {
continue;
//return false;
}
return true;
}
else {
printf("m_packet->stream_index: %d\n", m_packet->stream_index);
printf(" m_packet->pts: %lld\n", m_packet->pts);
printf(" mpacket->size: %d\n", m_packet->size);
if (m_recording) {
int err = 0;
//err = avcodec_send_packet(m_output_video_codec_context, m_packet);
printf(" encoding error: %d\n", err);
}
}
}
// We're done with the packet (it's been unpacked to a frame), so deallocate & reset to defaults:
/*
if (m_frame == NULL)
return false;
if (m_frame->data[0] == NULL || m_frame->data[1] == NULL || m_frame->data[2] == NULL) {
printf("WARNING: null frame data");
continue;
}
*/
}
}
int MediaContainerMgr::decode_packet() {
// Supply raw packet data as input to a decoder
// https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga58bc4bf1e0ac59e27362597e467efff3
int response = avcodec_send_packet(m_codec_context, m_packet);
if (response < 0) {
char buf[256];
av_strerror(response, buf, 256);
printf("Error while receiving a frame from the decoder: %s\n", buf);
return response;
}
// Return decoded output data (into a frame) from a decoder
// https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga11e6542c4e66d3028668788a1a74217c
response = avcodec_receive_frame(m_codec_context, m_frame);
if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
return response;
} else if (response < 0) {
char buf[256];
av_strerror(response, buf, 256);
printf("Error while receiving a frame from the decoder: %s\n", buf);
return response;
} else {
printf(
"Frame %d (type=%c, size=%d bytes) pts %lld key_frame %d [DTS %d]\n",
m_codec_context->frame_number,
av_get_picture_type_char(m_frame->pict_type),
m_frame->pkt_size,
m_frame->pts,
m_frame->key_frame,
m_frame->coded_picture_number
);
}
return 0;
}
bool MediaContainerMgr::init_video_output(const std::string& video_file_name, unsigned int width, unsigned int height) {
if (m_recording)
return true;
m_recording = true;
advance_to(0L); // I've deleted the implementation. Just seeks to beginning of vid. Works fine.
if (!(m_output_format = av_guess_format(nullptr, video_file_name.c_str(), nullptr))) {
printf("Cannot guess output format.\n");
return false;
}
int err = avformat_alloc_output_context2(&m_output_format_context, m_output_format, nullptr, video_file_name.c_str());
if (err < 0) {
printf("Failed to allocate output context.\n");
return false;
}
//TODO(P0): Break out the video and audio inits into their own methods.
m_output_video_codec = avcodec_find_encoder(m_output_format->video_codec);
if (!m_output_video_codec) {
printf("Failed to create video codec.\n");
return false;
}
m_output_video_stream = avformat_new_stream(m_output_format_context, m_output_video_codec);
if (!m_output_video_stream) {
printf("Failed to find video format.\n");
return false;
}
m_output_video_codec_context = avcodec_alloc_context3(m_output_video_codec);
if (!m_output_video_codec_context) {
printf("Failed to create video codec context.\n");
return(false);
}
m_output_video_stream->codecpar->codec_id = m_output_format->video_codec;
m_output_video_stream->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
m_output_video_stream->codecpar->width = width;
m_output_video_stream->codecpar->height = height;
m_output_video_stream->codecpar->format = AV_PIX_FMT_YUV420P;
// Use the same bit rate as the input stream.
m_output_video_stream->codecpar->bit_rate = m_format_context->streams[m_video_stream_index]->codecpar->bit_rate;
m_output_video_stream->avg_frame_rate = m_format_context->streams[m_video_stream_index]->avg_frame_rate;
avcodec_parameters_to_context(m_output_video_codec_context, m_output_video_stream->codecpar);
m_output_video_codec_context->time_base = m_format_context->streams[m_video_stream_index]->time_base;
//TODO(P1): Set these to match the input stream?
m_output_video_codec_context->max_b_frames = 2;
m_output_video_codec_context->gop_size = 12;
m_output_video_codec_context->framerate = m_format_context->streams[m_video_stream_index]->r_frame_rate;
//m_output_codec_context->refcounted_frames = 0;
if (m_output_video_stream->codecpar->codec_id == AV_CODEC_ID_H264) {
av_opt_set(m_output_video_codec_context, "preset", "ultrafast", 0);
} else if (m_output_video_stream->codecpar->codec_id == AV_CODEC_ID_H265) {
av_opt_set(m_output_video_codec_context, "preset", "ultrafast", 0);
} else {
av_opt_set_int(m_output_video_codec_context, "lossless", 1, 0);
}
avcodec_parameters_from_context(m_output_video_stream->codecpar, m_output_video_codec_context);
m_output_audio_codec = avcodec_find_encoder(m_output_format->audio_codec);
if (!m_output_audio_codec) {
printf("Failed to create audio codec.\n");
return false;
}
I've commented out all of the audio stream init beyond this next line, because this is where
the trouble begins. Creating this output stream causes the null reference I mentioned. If I
uncomment everything below here, I still get the null deref. If I comment out this line, the
deref exception vanishes. (In other words, I commented out more and more code until I found
that this was the trigger that caused the problem.)
I assume that there's something I'm doing wrong in the rest of the commented-out code that,
when fixed, will fix the nullptr and give me a working audio stream.
m_output_audio_stream = avformat_new_stream(m_output_format_context, m_output_audio_codec);
if (!m_output_audio_stream) {
printf("Failed to find audio format.\n");
return false;
}
/*
m_output_audio_codec_context = avcodec_alloc_context3(m_output_audio_codec);
if (!m_output_audio_codec_context) {
printf("Failed to create audio codec context.\n");
return(false);
}
m_output_audio_stream->codecpar->codec_id = m_output_format->audio_codec;
m_output_audio_stream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
m_output_audio_stream->codecpar->format = m_format_context->streams[m_audio_stream_index]->codecpar->format;
m_output_audio_stream->codecpar->bit_rate = m_format_context->streams[m_audio_stream_index]->codecpar->bit_rate;
m_output_audio_stream->avg_frame_rate = m_format_context->streams[m_audio_stream_index]->avg_frame_rate;
avcodec_parameters_to_context(m_output_audio_codec_context, m_output_audio_stream->codecpar);
m_output_audio_codec_context->time_base = m_format_context->streams[m_audio_stream_index]->time_base;
*/
//TODO(P2): Free assets that have been allocated.
err = avcodec_open2(m_output_video_codec_context, m_output_video_codec, nullptr);
if (err < 0) {
printf("Failed to open codec.\n");
return false;
}
if (!(m_output_format->flags & AVFMT_NOFILE)) {
err = avio_open(&m_output_format_context->pb, video_file_name.c_str(), AVIO_FLAG_WRITE);
if (err < 0) {
printf("Failed to open output file.");
return false;
}
}
err = avformat_write_header(m_output_format_context, NULL);
if (err < 0) {
printf("Failed to write header.\n");
return false;
}
av_dump_format(m_output_format_context, 0, video_file_name.c_str(), 1);
return true;
}
//TODO(P2): make this a member. (Thanks to https://emvlo.wordpress.com/2016/03/10/sws_scale/)
void PrepareFlipFrameJ420(AVFrame* pFrame) {
for (int i = 0; i < 4; i++) {
if (i)
pFrame->data[i] += pFrame->linesize[i] * ((pFrame->height >> 1) - 1);
else
pFrame->data[i] += pFrame->linesize[i] * (pFrame->height - 1);
pFrame->linesize[i] = -pFrame->linesize[i];
}
}
This is where we take an altered frame and write it to the output container. This works fine
as long as we haven't set up an audio stream in the output container.
bool MediaContainerMgr::output_video_frame(uint8_t* buf) {
int err;
if (!m_output_video_frame) {
m_output_video_frame = av_frame_alloc();
m_output_video_frame->format = AV_PIX_FMT_YUV420P;
m_output_video_frame->width = m_output_video_codec_context->width;
m_output_video_frame->height = m_output_video_codec_context->height;
err = av_frame_get_buffer(m_output_video_frame, 32);
if (err < 0) {
printf("Failed to allocate output frame.\n");
return false;
}
}
if (!m_output_scale_context) {
m_output_scale_context = sws_getContext(m_output_video_codec_context->width, m_output_video_codec_context->height,
AV_PIX_FMT_RGB24,
m_output_video_codec_context->width, m_output_video_codec_context->height,
AV_PIX_FMT_YUV420P, SWS_BICUBIC, nullptr, nullptr, nullptr);
}
int inLinesize[1] = { 3 * m_output_video_codec_context->width };
sws_scale(m_output_scale_context, (const uint8_t* const*)&buf, inLinesize, 0, m_output_video_codec_context->height,
m_output_video_frame->data, m_output_video_frame->linesize);
PrepareFlipFrameJ420(m_output_video_frame);
//TODO(P0): Switch m_frame to be m_input_video_frame so I don't end up using the presentation timestamp from
// an audio frame if I threadify the frame reading.
m_output_video_frame->pts = m_frame->pts;
printf("Output PTS: %d, time_base: %d/%d\n", m_output_video_frame->pts,
m_output_video_codec_context->time_base.num, m_output_video_codec_context->time_base.den);
err = avcodec_send_frame(m_output_video_codec_context, m_output_video_frame);
if (err < 0) {
printf(" ERROR sending new video frame output: ");
switch (err) {
case AVERROR(EAGAIN):
printf("AVERROR(EAGAIN): %d\n", err);
break;
case AVERROR_EOF:
printf("AVERROR_EOF: %d\n", err);
break;
case AVERROR(EINVAL):
printf("AVERROR(EINVAL): %d\n", err);
break;
case AVERROR(ENOMEM):
printf("AVERROR(ENOMEM): %d\n", err);
break;
}
return false;
}
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = nullptr;
pkt.size = 0;
pkt.flags |= AV_PKT_FLAG_KEY;
int ret = 0;
if ((ret = avcodec_receive_packet(m_output_video_codec_context, &pkt)) == 0) {
static int counter = 0;
printf("pkt.key: 0x%08x, pkt.size: %d, counter:\n", pkt.flags & AV_PKT_FLAG_KEY, pkt.size, counter++);
uint8_t* size = ((uint8_t*)pkt.data);
printf("sizes: %d %d %d %d %d %d %d %d %d\n", size[0], size[1], size[2], size[2], size[3], size[4], size[5], size[6], size[7]);
av_interleaved_write_frame(m_output_format_context, &pkt);
}
printf("push: %d\n", ret);
av_packet_unref(&pkt);
return true;
}
bool MediaContainerMgr::finalize_output() {
if (!m_recording)
return true;
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = nullptr;
pkt.size = 0;
for (;;) {
avcodec_send_frame(m_output_video_codec_context, nullptr);
if (avcodec_receive_packet(m_output_video_codec_context, &pkt) == 0) {
av_interleaved_write_frame(m_output_format_context, &pkt);
printf("final push:\n");
} else {
break;
}
}
av_packet_unref(&pkt);
av_write_trailer(m_output_format_context);
if (!(m_output_format->flags & AVFMT_NOFILE)) {
int err = avio_close(m_output_format_context->pb);
if (err < 0) {
printf("Failed to close file. err: %d\n", err);
return false;
}
}
return true;
}
EDIT
The call stack on the crash (which I should have included in the original question):
avformat-58.dll!compute_muxer_pkt_fields(AVFormatContext * s, AVStream * st, AVPacket * pkt) Line 630 C
avformat-58.dll!write_packet_common(AVFormatContext * s, AVStream * st, AVPacket * pkt, int interleaved) Line 1122 C
avformat-58.dll!write_packets_common(AVFormatContext * s, AVPacket * pkt, int interleaved) Line 1186 C
avformat-58.dll!av_interleaved_write_frame(AVFormatContext * s, AVPacket * pkt) Line 1241 C
CamBot.exe!MediaContainerMgr::output_video_frame(unsigned char * buf) Line 553 C++
CamBot.exe!main() Line 240 C++
If I move the call to avformat_write_header so it's immediately before the audio stream initialization, I still get a crash, but in a different place. The crash happens on line 6459 of movenc.c, where we have:
/* Non-seekable output is ok if using fragmentation. If ism_lookahead
* is enabled, we don't support non-seekable output at all. */
if (!(s->pb->seekable & AVIO_SEEKABLE_NORMAL) && // CRASH IS HERE
(!(mov->flags & FF_MOV_FLAG_FRAGMENT) || mov->ism_lookahead)) {
av_log(s, AV_LOG_ERROR, "muxer does not support non seekable output\n");
return AVERROR(EINVAL);
}
The exception is a nullptr exception, where s->pb is NULL. The call stack is:
avformat-58.dll!mov_init(AVFormatContext * s) Line 6459 C
avformat-58.dll!init_muxer(AVFormatContext * s, AVDictionary * * options) Line 407 C
[Inline Frame] avformat-58.dll!avformat_init_output(AVFormatContext *) Line 489 C
avformat-58.dll!avformat_write_header(AVFormatContext * s, AVDictionary * * options) Line 512 C
CamBot.exe!MediaContainerMgr::init_video_output(const std::string & video_file_name, unsigned int width, unsigned int height) Line 424 C++
CamBot.exe!main() Line 183 C++

Please note that you should always try to provide a self-contained minimal working example to make it easier for others to help. With the actual code, the matching FFmpeg version, and an input video that triggers the segmentation fault (to be sure), the issue would be a matter of analyzing the control flow to identify why st->internal->priv_pts was not allocated. Without the full scenario, I have to resort to making assumptions that may or may not correspond to your actual code.
Based on your description, I attempted to reproduce the issue by cloning https://github.com/FFmpeg/FFmpeg.git and creating a new branch from commit b52e0d95 (November 4, 2020) to approximate your FFmpeg version.
I recreated your scenario using the provided code snippets by:
including the avformat_new_stream() call for the audio stream,
keeping the remaining audio initialization commented out, and
including the original avformat_write_header() call site (unchanged order).
With that scenario, the video write with MP4 video/audio input fails in avformat_write_header():
[mp4 @ 0x2b39f40] sample rate not set 0
The call stack of the error location:
#0 0x00007ffff75253d7 in raise () from /lib64/libc.so.6
#1 0x00007ffff7526ac8 in abort () from /lib64/libc.so.6
#2 0x000000000094feca in init_muxer (s=0x2b39f40, options=0x0) at libavformat/mux.c:309
#3 0x00000000009508f4 in avformat_init_output (s=0x2b39f40, options=0x0) at libavformat/mux.c:490
#4 0x0000000000950a10 in avformat_write_header (s=0x2b39f40, options=0x0) at libavformat/mux.c:514
[...]
In init_muxer(), the sample rate in the stream parameters is checked unconditionally:
case AVMEDIA_TYPE_AUDIO:
if (par->sample_rate <= 0) {
av_log(s, AV_LOG_ERROR, "sample rate not set %d\n", par->sample_rate); abort();
ret = AVERROR(EINVAL);
goto fail;
}
That condition has been in effect since 2014-06-18 at the very least (didn't go back any further) and still exists. With a version from November 2020, the check must be active and the parameter must be set accordingly.
If I uncomment the remaining audio initialization, the situation remains unchanged (as expected). So, to satisfy the condition, I added the missing parameter as follows:
m_output_audio_stream->codecpar->sample_rate =
m_format_context->streams[m_audio_stream_index]->codecpar->sample_rate;
With that, the check succeeds, avformat_write_header() succeeds, and the actual video write succeeds.
As you indicated in your question, the segmentation fault is caused by st->internal->priv_pts being NULL at this location:
#0 0x00000000009516db in compute_muxer_pkt_fields (s=0x2b39f40, st=0x2b3a580, pkt=0x7fffffffe2d0) at libavformat/mux.c:632
#1 0x0000000000953128 in write_packet_common (s=0x2b39f40, st=0x2b3a580, pkt=0x7fffffffe2d0, interleaved=1) at libavformat/mux.c:1125
#2 0x0000000000953473 in write_packets_common (s=0x2b39f40, pkt=0x7fffffffe2d0, interleaved=1) at libavformat/mux.c:1188
#3 0x0000000000953634 in av_interleaved_write_frame (s=0x2b39f40, pkt=0x7fffffffe2d0) at libavformat/mux.c:1243
[...]
In the FFmpeg code base, the allocation of priv_pts is handled by init_pts() for all streams referenced by the context. init_pts() has two call sites:
libavformat/mux.c:496:
if (s->oformat->init && ret) {
if ((ret = init_pts(s)) < 0)
return ret;
return AVSTREAM_INIT_IN_INIT_OUTPUT;
}
libavformat/mux.c:530:
if (!s->internal->streams_initialized) {
if ((ret = init_pts(s)) < 0)
goto fail;
}
In both cases, the calls are triggered by avformat_write_header() (indirectly via avformat_init_output() for the first, directly for the second). According to control flow analysis, there's no success case that would leave priv_pts unallocated.
Considering the high probability that our versions of FFmpeg are compatible in terms of behavior, I have to assume that 1) the sample rate must be provided for audio streams and 2) priv_pts is always allocated by avformat_write_header() in the absence of errors. Therefore, two possible root causes come to mind:
Your stream is not an audio stream (unlikely; the type is based on the codec, which in turn is based on the output file extension, assuming mp4).
You do not call avformat_write_header() (unlikely), or you do not handle the error in the caller of your C++ member function. The return value of avformat_write_header() is checked in the code provided, but I do not have the code for the caller of that C++ member function; your actual code might differ significantly from the code provided, so this is the only plausible conclusion that can be drawn from the available data.
The solution: Ensure that processing does not continue if avformat_write_header() fails. By adding the audio stream, avformat_write_header() starts to fail unless you set the stream sample rate. If the error is ignored, av_interleaved_write_frame() triggers a segmentation fault by accessing the unallocated st->internal->priv_pts.
As mentioned initially, the scenario is incomplete. If you do call avformat_write_header() and stop processing in case of an error (meaning you do not call av_interleaved_write_frame()), more information is needed. As it stands now, that is unlikely. For further analysis, the executable output (stdout, stderr) is required to see your traces and FFmpeg log messages. If that does not reveal new information, a self-contained minimal working example and the video input are needed to get the full picture.

Related

ffmpeg api alternate transcoding and remuxing for same file

Context
Hello!
I'm currently working on a small library that can cut an H.264 video at any frame without re-encoding (transcoding) the whole video. The idea is to re-encode only the GOP on which we want to cut, and to rewrite (remux) the other GOPs directly.
The avcut project (https://github.com/anyc/avcut) can do this, but it requires systematically decoding every packet, and it seems not to work with recent versions of ffmpeg, judging from my tests and from recent feedback in the GitHub issues.
As a beginner, I started from the code examples provided in the ffmpeg documentation, in particular: transcoding.c and remuxing.c.
Problem encountered
The problem I'm having is that I can't get both transcoding and remuxing to work properly at the same time. In particular, depending on the method I use to initialize the AVCodecParameters of the output video stream, transcoding works, or remuxing works:
avcodec_parameters_copy works well for remuxing
avcodec_parameters_from_context works well for transcoding
In case I choose avcodec_parameters_from_context, the transcoded GOPs are correctly read by my video player (parole), but the remuxed packets are not read, and ffprobe does not show/detect them.
In case I choose avcodec_parameters_copy, the remuxed GOPs are correctly read by my video player, but the transcoded key_frames are bugged (I have the impression that the b-frames and p-frames are OK), and ffprobe -i returns an error about the NAL of the key-frames:
[h264 @ 0x55ec8a079300] sps_id 32 out of range
[h264 @ 0x55ec8a079300] Invalid NAL unit size (1677727148 > 735).
[h264 @ 0x55ec8a079300] missing picture in access unit with size 744
I suspect that the problem is related to the extradata of the packets. Through some experiments on the different attributes of the output AVCodecParameters, it seems that the extradata and extradata_size attributes determine which of the two methods works.
Version
ffmpeg development branch retrieved on 2022-05-17 from https://github.com/FFmpeg/FFmpeg.
Compiled with --enable-libx264 --enable-gpl --enable-decoder=png --enable-encoder=png
Code
My code is written in C++ and is based on two classes: a class defining the parameters and methods on the input file (InputContexts) and a class defining them for the output file (OutputContexts). The code of these two classes is defined in the following files:
contexts.h
contexts.cpp
The code normally involved in the problem is the following:
stream initialization
int OutputContexts::init(const char* out_filename, InputContexts* input_contexts){
int ret;
int stream_index = 0;
avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, out_filename);
if (!ofmt_ctx) {
fprintf(stderr, "Could not create output context\n");
ret = AVERROR_UNKNOWN;
return ret;
}
av_dump_format(ofmt_ctx, 0, out_filename, 1);
encoders.resize(input_contexts->ifmt_ctx->nb_streams, nullptr);
codecs.resize(input_contexts->ifmt_ctx->nb_streams, nullptr);
// stream mapping
for (int i = 0; i < input_contexts->ifmt_ctx->nb_streams; i++) {
AVStream *out_stream;
AVStream *in_stream = input_contexts->ifmt_ctx->streams[i];
AVCodecContext* decoder_ctx = input_contexts->decoders[i];
// add new stream to output context
out_stream = avformat_new_stream(ofmt_ctx, NULL);
if (!out_stream) {
fprintf(stderr, "Failed allocating output stream\n");
ret = AVERROR_UNKNOWN;
return ret;
}
// from avcut blog
av_dict_copy(&out_stream->metadata, in_stream->metadata, 0);
out_stream->time_base = in_stream->time_base;
// encoder
if (decoder_ctx->codec_type == AVMEDIA_TYPE_VIDEO){
ret = prepare_encoder_video(i, input_contexts);
if (ret < 0){
fprintf(stderr, "Error while preparing encoder for stream #%u\n", i);
return ret;
}
// from avcut
out_stream->sample_aspect_ratio = in_stream->sample_aspect_ratio;
// works well for remuxing
ret = avcodec_parameters_copy(out_stream->codecpar, in_stream->codecpar);
if (ret < 0) {
fprintf(stderr, "Failed to copy codec parameters\n");
return ret;
}
// works well for transcoding
// ret = avcodec_parameters_from_context(out_stream->codecpar, encoders[i]);
// if (ret < 0) {
// av_log(NULL, AV_LOG_ERROR, "Failed to copy encoder parameters to output stream #%u\n", i);
// return ret;
// }
} else if (decoder_ctx->codec_type == AVMEDIA_TYPE_AUDIO) {
...
} else {
...
}
// TODO useful ???
// set current stream position to 0
// out_stream->codecpar->codec_tag = 0;
}
// opening output file in write mode with the output context
if (!(ofmt_ctx->oformat->flags & AVFMT_NOFILE)) {
ret = avio_open(&ofmt_ctx->pb, out_filename, AVIO_FLAG_WRITE);
if (ret < 0) {
fprintf(stderr, "Could not open output file '%s'", out_filename);
return ret;
}
}
// write headers from output context in output file
ret = avformat_write_header(ofmt_ctx, NULL);
if (ret < 0) {
fprintf(stderr, "Error occurred when opening output file\n");
return ret;
}
return ret;
}
AVCodecContext initialization for encoder
int OutputContexts::prepare_encoder_video(int stream_index, InputContexts* input_contexts){
int ret;
const AVCodec* encoder;
AVCodecContext* decoder_ctx = input_contexts->decoders[stream_index];
AVCodecContext* encoder_ctx;
if (video_index >= 0){
fprintf(stderr, "Impossible to mark stream #%u as video, stream #%u is already registered as video stream.\n",
stream_index, video_index);
return -1; //TODO change this value for correct error code
}
video_index = stream_index;
if(decoder_ctx->codec_id == AV_CODEC_ID_H264){
encoder = avcodec_find_encoder_by_name("libx264");
if (!encoder) {
av_log(NULL, AV_LOG_FATAL, "Encoder libx264 not found\n");
return AVERROR_INVALIDDATA;
}
fmt::print("Encoder libx264 will be used for stream {}.\n", stream_index);
} else {
std::string s = fmt::format("No video encoder found for the given codec_id: {}\n", avcodec_get_name(decoder_ctx->codec_id));
av_log(NULL, AV_LOG_FATAL, "%s", s.c_str()); // pass as argument, not as format string
return AVERROR_INVALIDDATA;
}
encoder_ctx = avcodec_alloc_context3(encoder);
if (!encoder_ctx) {
av_log(NULL, AV_LOG_FATAL, "Failed to allocate the encoder context\n");
return AVERROR(ENOMEM);
}
// from avcut
encoder_ctx->time_base = decoder_ctx->time_base;
encoder_ctx->ticks_per_frame = decoder_ctx->ticks_per_frame;
encoder_ctx->delay = decoder_ctx->delay;
encoder_ctx->width = decoder_ctx->width;
encoder_ctx->height = decoder_ctx->height;
encoder_ctx->pix_fmt = decoder_ctx->pix_fmt;
encoder_ctx->sample_aspect_ratio = decoder_ctx->sample_aspect_ratio;
encoder_ctx->color_primaries = decoder_ctx->color_primaries;
encoder_ctx->color_trc = decoder_ctx->color_trc;
encoder_ctx->colorspace = decoder_ctx->colorspace;
encoder_ctx->color_range = decoder_ctx->color_range;
encoder_ctx->chroma_sample_location = decoder_ctx->chroma_sample_location;
encoder_ctx->profile = decoder_ctx->profile;
encoder_ctx->level = decoder_ctx->level;
encoder_ctx->thread_count = 1; // spawning more threads causes avcodec_close to free threads multiple times
encoder_ctx->codec_tag = 0;
// correct values ???
encoder_ctx->qmin = 16;
encoder_ctx->qmax = 26;
encoder_ctx->max_qdiff = 4;
// end from avcut
// according to avcut, should not be set
// if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER){
// encoder_ctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
// }
ret = avcodec_open2(encoder_ctx, encoder, NULL);
if (ret < 0) {
av_log(NULL, AV_LOG_ERROR, "Cannot open video encoder for stream #%u\n", stream_index);
return ret;
}
codecs[stream_index] = encoder;
encoders[stream_index] = encoder_ctx;
return ret;
}
Example
To illustrate my problem, I provide here test code using the two classes that alternates between transcoding and remuxing at each key-frame encountered in the file.
trans_remux.cpp
To compile the code:
g++ -o trans_remux trans_remux.cpp contexts.cpp -D__STDC_CONSTANT_MACROS `pkg-config --libs libavfilter` -lfmt -g
Currently the code is using avcodec_parameters_copy (contexts.cpp:333), so it works well for remuxing. If you want to test the version with avcodec_parameters_from_context, please comment out lines 333 to 337 in contexts.cpp, uncomment lines 340 to 344, and recompile.

Xcode app for macOS. This is how I set up to get audio from USB mic input. Worked a year ago, now doesn't. Why?

Here is my audio init code. My app responds when queue buffers are ready, but all the data in the buffer is zero. Checking Sound in System Preferences shows that USB Audio CODEC is active in the sound input dialog. audioInit() is called right after the app launches.
{
#pragma mark user data struct
typedef struct MyRecorder
{
AudioFileID recordFile;
SInt64 recordPacket;
Float32 *pSampledData;
MorseDecode *pMorseDecoder;
} MyRecorder;
#pragma mark utility functions
void CheckError(OSStatus error, const char *operation)
{
if(error == noErr) return;
char errorString[20];
// see if it appears to be a 4 char code
*(UInt32*)(errorString + 1) = CFSwapInt32HostToBig(error);
if (isprint(errorString[1]) && isprint(errorString[2]) &&
isprint(errorString[3]) && isprint(errorString[4]))
{
errorString[0] = errorString[5] = '\'';
errorString[6] = '\0';
}
else
{
sprintf(errorString, "%d", (int)error);
}
fprintf(stderr, "Error: %s (%s)\n", operation, errorString);
}
OSStatus MyGetDefaultInputDeviceSampleRate(Float64 *outSampleRate)
{
OSStatus error;
AudioDeviceID deviceID = 0;
AudioObjectPropertyAddress propertyAddress;
UInt32 propertySize;
propertyAddress.mSelector = kAudioHardwarePropertyDefaultInputDevice;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(AudioDeviceID);
error = AudioObjectGetPropertyData(kAudioObjectSystemObject,
&propertyAddress,
0,
NULL,
&propertySize,
&deviceID);
if(error)
return error;
propertyAddress.mSelector = kAudioDevicePropertyNominalSampleRate;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(Float64);
error = AudioObjectGetPropertyData(deviceID,
&propertyAddress,
0,
NULL,
&propertySize,
outSampleRate);
return error;
}
static int MyComputeRecordBufferSize(const AudioStreamBasicDescription *format,
AudioQueueRef queue,
float seconds)
{
int packets, frames, bytes;
frames = (int)ceil(seconds * format->mSampleRate);
if(format->mBytesPerFrame > 0)
{
bytes = frames * format->mBytesPerFrame;
}
else
{
UInt32 maxPacketSize;
if(format->mBytesPerPacket > 0)
{
// constant packet size
maxPacketSize = format->mBytesPerPacket;
}
else
{
// get the largest single packet size possible
UInt32 propertySize = sizeof(maxPacketSize);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterPropertyMaximumOutputPacketSize,
&maxPacketSize,
&propertySize),
"Couldn't get queues max output packet size");
}
if(format->mFramesPerPacket > 0)
packets = frames / format->mFramesPerPacket;
else
// worst case scenario: 1 frame in a packet
packets = frames;
// sanity check
if(packets == 0)
packets = 1;
bytes = packets * maxPacketSize;
}
return bytes;
}
extern void bridgeToMainThread(MorseDecode *pDecode);
static int callBacks = 0;
// ---------------------------------------------
static void MyAQInputCallback(void *inUserData,
AudioQueueRef inQueue,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc)
{
MyRecorder *recorder = (MyRecorder*)inUserData;
Float32 *pAudioData = (Float32*)(inBuffer->mAudioData);
recorder->pMorseDecoder->pBuffer = pAudioData;
recorder->pMorseDecoder->bufferSize = inNumPackets;
bridgeToMainThread(recorder->pMorseDecoder);
CheckError(AudioQueueEnqueueBuffer(inQueue,
inBuffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
printf("packets = %ld, bytes = %ld\n",(long)inNumPackets,(long)inBuffer->mAudioDataByteSize);
callBacks++;
//printf("\ncallBacks = %d\n",callBacks);
//if(callBacks == 0)
//audioStop();
}
static AudioQueueRef queue = {0};
static MyRecorder recorder = {0};
static AudioStreamBasicDescription recordFormat;
void audioInit()
{
// set up format
memset(&recordFormat,0,sizeof(recordFormat));
recordFormat.mFormatID = kAudioFormatLinearPCM;
recordFormat.mChannelsPerFrame = 2;
recordFormat.mBitsPerChannel = 32;
recordFormat.mBytesPerPacket = recordFormat.mBytesPerFrame = recordFormat.mChannelsPerFrame * sizeof(Float32);
recordFormat.mFramesPerPacket = 1;
//recordFormat.mFormatFlags = kAudioFormatFlagsCanonical;
recordFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
MyGetDefaultInputDeviceSampleRate(&recordFormat.mSampleRate);
UInt32 propSize = sizeof(recordFormat);
CheckError(AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&propSize,
&recordFormat),
"AudioFormatProperty failed");
recorder.pMorseDecoder = MorseDecode::pInstance();
recorder.pMorseDecoder->m_sampleRate = recordFormat.mSampleRate;
// recorder.pMorseDecoder->setCircularBuffer();
//set up queue
CheckError(AudioQueueNewInput(&recordFormat,
MyAQInputCallback,
&recorder,
NULL,
kCFRunLoopCommonModes,
0,
&queue),
"AudioQueueNewInput failed");
UInt32 size = sizeof(recordFormat);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterCurrentOutputStreamDescription,
&recordFormat,
&size), "Couldn't get queue's format");
// set up buffers and enqueue
const int kNumberRecordBuffers = 3;
int bufferByteSize = MyComputeRecordBufferSize(&recordFormat, queue, AUDIO_BUFFER_DURATION);
for(int bufferIndex = 0; bufferIndex < kNumberRecordBuffers; bufferIndex++)
{
AudioQueueBufferRef buffer;
CheckError(AudioQueueAllocateBuffer(queue,
bufferByteSize,
&buffer),
"AudioQueueAllocateBuffer failed");
CheckError(AudioQueueEnqueueBuffer(queue,
buffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
}
}
void audioRun()
{
CheckError(AudioQueueStart(queue, NULL), "AudioQueueStart failed");
}
void audioStop()
{
CheckError(AudioQueuePause(queue), "AudioQueuePause failed");
}
}
This sounds like the new macOS 'microphone privacy' setting, which, if set to 'no access' for your app, will cause precisely this behaviour. So:
Open the System Preferences pane.
Click on 'Security and Privacy'.
Select the Privacy tab.
Click on 'Microphone' in the left-hand pane.
Locate your app in the right-hand pane and tick the checkbox next to it.
Then restart your app and test it.
Tedious, no?
Edit: As stated in the comments, you can't directly request microphone access, but you can detect whether it has been granted to your app or not by calling [AVCaptureDevice authorizationStatusForMediaType: AVMediaTypeAudio].

FFMPEG buffer underflow

I'm recording a video with FFMPEG and getting some weird messages in the process:
[mpeg @ 01011c80] packet too large, ignoring buffer limits to mux it
[mpeg @ 01011c80] buffer underflow st=0 bufi=236198 size=412405
[mpeg @ 01011c80] buffer underflow st=0 bufi=238239 size=412405
and I have no idea how to deal with it. Here's my code for adding frames:
void ofxFFMPEGVideoWriter::addFrame(const uint8_t* pixels)
{
memcpy(picture_rgb24->data[0], pixels, size);
sws_scale(swsContext, picture_rgb24->data, picture_rgb24->linesize, 0, codecContext->height, picture->data, picture->linesize);
AVPacket packet = { 0 };
int got_packet;
av_init_packet(&packet);
int ret = avcodec_encode_video2(codecContext, &packet, picture, &got_packet);
if (ret < 0) qDebug() << "Error encoding video frame: " << ret;
if (!ret && got_packet && packet.size)
{
packet.stream_index = videoStream->index;
ret = av_interleaved_write_frame(formatContext, &packet);
}
picture->pts += av_rescale_q(1, videoStream->codec->time_base, videoStream->time_base);
}
The file itself seems to be fine, and readable, but that message is really bugging me. Does anybody know how to fix it?

Losing quality when encoding with ffmpeg

I am using the C libraries of ffmpeg to read frames from a video and create an output file that is supposed to be identical to the input.
However, somewhere during this process some quality gets lost and the result is "less sharp". My guess is that the problem is the encoding and that the frames are too compressed (also because the size of the file decreases quite significantly). Is there some parameter in the encoder that allows me to control the quality of the result? I found that AVCodecContext has a compression_level member, but changing it does not seem to have any effect.
I post here part of my code in case it could help. I would say that something must be changed in the init function of OutputVideoBuilder when I set the codec. The AVCodecContext that is passed to the method is the same as InputVideoHandler's.
Here are the two main classes that I created to wrap the ffmpeg functionalities:
// This class opens the video files and sets the decoder
class InputVideoHandler {
public:
InputVideoHandler(char* name);
~InputVideoHandler();
AVCodecContext* getCodecContext();
bool readFrame(AVFrame* frame, int* success);
private:
InputVideoHandler();
void init(char* name);
AVFormatContext* formatCtx;
AVCodec* codec;
AVCodecContext* codecCtx;
AVPacket packet;
int streamIndex;
};
void InputVideoHandler::init(char* name) {
streamIndex = -1;
int numStreams;
if (avformat_open_input(&formatCtx, name, NULL, NULL) != 0)
throw std::exception("Invalid input file name.");
if (avformat_find_stream_info(formatCtx, NULL)<0)
throw std::exception("Could not find stream information.");
numStreams = formatCtx->nb_streams;
if (numStreams < 0)
throw std::exception("No streams in input video file.");
for (int i = 0; i < numStreams; i++) {
if (formatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
streamIndex = i;
break;
}
}
if (streamIndex < 0)
throw std::exception("No video stream in input video file.");
// find decoder using id
codec = avcodec_find_decoder(formatCtx->streams[streamIndex]->codec->codec_id);
if (codec == nullptr)
throw std::exception("Could not find suitable decoder for input file.");
// copy context from input stream
codecCtx = avcodec_alloc_context3(codec);
if (avcodec_copy_context(codecCtx, formatCtx->streams[streamIndex]->codec) != 0)
throw std::exception("Could not copy codec context from input stream.");
if (avcodec_open2(codecCtx, codec, NULL) < 0)
throw std::exception("Could not open decoder.");
}
// frame must be initialized with av_frame_alloc() before!
// Returns true if there are other frames, false if not.
// success == 1 if frame is valid, 0 if not.
bool InputVideoHandler::readFrame(AVFrame* frame, int* success) {
*success = 0;
if (av_read_frame(formatCtx, &packet) < 0)
return false;
if (packet.stream_index == streamIndex) {
avcodec_decode_video2(codecCtx, frame, success, &packet);
}
av_free_packet(&packet);
return true;
}
// This class opens the output and write frames to it
class OutputVideoBuilder{
public:
OutputVideoBuilder(char* name, AVCodecContext* inputCtx);
~OutputVideoBuilder();
void writeFrame(AVFrame* frame);
void writeVideo();
private:
OutputVideoBuilder();
void init(char* name, AVCodecContext* inputCtx);
void logMsg(AVPacket* packet, AVRational* tb);
AVFormatContext* formatCtx;
AVCodec* codec;
AVCodecContext* codecCtx;
AVStream* stream;
};
void OutputVideoBuilder::init(char* name, AVCodecContext* inputCtx) {
if (avformat_alloc_output_context2(&formatCtx, NULL, NULL, name) < 0)
throw std::exception("Could not determine file extension from provided name.");
codec = avcodec_find_encoder(inputCtx->codec_id);
if (codec == nullptr) {
throw std::exception("Could not find suitable encoder.");
}
codecCtx = avcodec_alloc_context3(codec);
if (avcodec_copy_context(codecCtx, inputCtx) < 0)
throw std::exception("Could not copy output codec context from input");
codecCtx->time_base = inputCtx->time_base;
codecCtx->compression_level = 0;
if (avcodec_open2(codecCtx, codec, NULL) < 0)
throw std::exception("Could not open encoder.");
stream = avformat_new_stream(formatCtx, codec);
if (stream == nullptr) {
throw std::exception("Could not allocate stream.");
}
stream->id = formatCtx->nb_streams - 1;
stream->codec = codecCtx;
stream->time_base = codecCtx->time_base;
av_dump_format(formatCtx, 0, name, 1);
if (!(formatCtx->oformat->flags & AVFMT_NOFILE)) {
if (avio_open(&formatCtx->pb, name, AVIO_FLAG_WRITE) < 0) {
throw std::exception("Could not open output file.");
}
}
if (avformat_write_header(formatCtx, NULL) < 0) {
throw std::exception("Error occurred when opening output file.");
}
}
void OutputVideoBuilder::writeFrame(AVFrame* frame) {
AVPacket packet = { 0 };
int success;
av_init_packet(&packet);
if (avcodec_encode_video2(codecCtx, &packet, frame, &success))
throw std::exception("Error encoding frames");
if (success) {
av_packet_rescale_ts(&packet, codecCtx->time_base, stream->time_base);
packet.stream_index = stream->index;
logMsg(&packet,&stream->time_base);
av_interleaved_write_frame(formatCtx, &packet);
}
av_free_packet(&packet);
}
This is the part of the main function that reads and write frames:
while (inputHandler->readFrame(frame,&gotFrame)) {
if (gotFrame) {
try {
outputBuilder->writeFrame(frame);
}
catch (std::exception& e) {
std::cout << e.what() << std::endl;
return -1;
}
}
}
Your qmin/qmax answer is partially correct, but it misses the point: the quality indeed goes up, but the compression ratio (in terms of quality per bit) will suffer significantly as you restrict the qmin/qmax range, i.e. you will spend many more bits to accomplish the same quality than would be necessary if you used the encoder optimally.
To increase quality without hurting the compression ratio, you need to actually increase the quality target. How you do this differs a little depending on the codec, but you typically increase the target CRF value or target bitrate. For command-line options, see e.g. the H264 docs. There are identical docs for HEVC/VP9 also. To use these options in the C API, use av_opt_set() with the same option names/values.
In case this could be useful to someone else, I'm adding the answer that damjeux suggested, which worked for me. AVCodecContext has two members, qmin and qmax, which control the QP (quantization parameter) of the encoder. By default in my case qmin is 2 and qmax is 31. By setting qmax to a lower value, the quality of the output improves.

Where is the iteration running in this code?

I'm using the NatNet SDK to receive data from the cameras through a TCP connection. The example code below prints the stream data from the DataHandler function while waiting in a while(c = _getch()) loop. I have no idea how the code can wait for keyboard input and at the same time print the camera's data in the DataHandler function in real time. I don't see the code running multiple threads.
My second question is how I can get the stream data in a loop in the main function. If someone can tell me how to use that DataHandler to get data in the main function, that would be great.
//=============================================================================
// Copyright © 2014 NaturalPoint, Inc. All Rights Reserved.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall NaturalPoint, Inc. or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//=============================================================================
/*
SampleClient.cpp
This program connects to a NatNet server, receives a data stream, and writes that data stream
to an ascii file. The purpose is to illustrate using the NatNetClient class.
Usage [optional]:
SampleClient [ServerIP] [LocalIP] [OutputFilename]
[ServerIP] IP address of the server (e.g. 192.168.0.107) ( defaults to local machine)
[OutputFilename] Name of points file (pts) to write out. defaults to Client-output.pts
*/
#include <stdio.h>
#include <tchar.h>
#include <conio.h>
#include <winsock2.h>
#include "NatNetTypes.h"
#include "NatNetClient.h"
#pragma warning( disable : 4996 )
void _WriteHeader(FILE* fp, sDataDescriptions* pBodyDefs);
void _WriteFrame(FILE* fp, sFrameOfMocapData* data);
void _WriteFooter(FILE* fp);
void __cdecl DataHandler(sFrameOfMocapData* data, void* pUserData); // receives data from the server
void __cdecl MessageHandler(int msgType, char* msg); // receives NatNet error mesages
void resetClient();
int CreateClient(int iConnectionType);
unsigned int MyServersDataPort = 3130;
unsigned int MyServersCommandPort = 3131;
int iConnectionType = ConnectionType_Multicast;
//int iConnectionType = ConnectionType_Unicast;
NatNetClient* theClient;
FILE* fp;
char szMyIPAddress[128] = "";
char szServerIPAddress[128] = "";
int analogSamplesPerMocapFrame = 0;
int _tmain(int argc, _TCHAR* argv[])
{
int iResult;
// parse command line args
if(argc>1)
{
strcpy(szServerIPAddress, argv[1]); // specified on command line
printf("Connecting to server at %s...\n", szServerIPAddress);
}
else
{
strcpy(szServerIPAddress, ""); // not specified - assume server is local machine
printf("Connecting to server at LocalMachine\n");
}
if(argc>2)
{
strcpy(szMyIPAddress, argv[2]); // specified on command line
printf("Connecting from %s...\n", szMyIPAddress);
}
else
{
strcpy(szMyIPAddress, ""); // not specified - assume server is local machine
printf("Connecting from LocalMachine...\n");
}
// Create NatNet Client
iResult = CreateClient(iConnectionType);
if(iResult != ErrorCode_OK)
{
printf("Error initializing client. See log for details. Exiting");
return 1;
}
else
{
printf("Client initialized and ready.\n");
}
// send/receive test request
printf("[SampleClient] Sending Test Request\n");
void* response;
int nBytes;
iResult = theClient->SendMessageAndWait("TestRequest", &response, &nBytes);
if (iResult == ErrorCode_OK)
{
printf("[SampleClient] Received: %s", (char*)response);
}
// Retrieve Data Descriptions from server
printf("\n\n[SampleClient] Requesting Data Descriptions...");
sDataDescriptions* pDataDefs = NULL;
int nBodies = theClient->GetDataDescriptions(&pDataDefs);
if(!pDataDefs)
{
printf("[SampleClient] Unable to retrieve Data Descriptions.");
}
else
{
printf("[SampleClient] Received %d Data Descriptions:\n", pDataDefs->nDataDescriptions );
for(int i=0; i < pDataDefs->nDataDescriptions; i++)
{
printf("Data Description # %d (type=%d)\n", i, pDataDefs->arrDataDescriptions[i].type);
if(pDataDefs->arrDataDescriptions[i].type == Descriptor_MarkerSet)
{
// MarkerSet
sMarkerSetDescription* pMS = pDataDefs->arrDataDescriptions[i].Data.MarkerSetDescription;
printf("MarkerSet Name : %s\n", pMS->szName);
for(int i=0; i < pMS->nMarkers; i++)
printf("%s\n", pMS->szMarkerNames[i]);
}
else if(pDataDefs->arrDataDescriptions[i].type == Descriptor_RigidBody)
{
// RigidBody
sRigidBodyDescription* pRB = pDataDefs->arrDataDescriptions[i].Data.RigidBodyDescription;
printf("RigidBody Name : %s\n", pRB->szName);
printf("RigidBody ID : %d\n", pRB->ID);
printf("RigidBody Parent ID : %d\n", pRB->parentID);
printf("Parent Offset : %3.2f,%3.2f,%3.2f\n", pRB->offsetx, pRB->offsety, pRB->offsetz);
}
else if(pDataDefs->arrDataDescriptions[i].type == Descriptor_Skeleton)
{
// Skeleton
sSkeletonDescription* pSK = pDataDefs->arrDataDescriptions[i].Data.SkeletonDescription;
printf("Skeleton Name : %s\n", pSK->szName);
printf("Skeleton ID : %d\n", pSK->skeletonID);
printf("RigidBody (Bone) Count : %d\n", pSK->nRigidBodies);
for(int j=0; j < pSK->nRigidBodies; j++)
{
sRigidBodyDescription* pRB = &pSK->RigidBodies[j];
printf(" RigidBody Name : %s\n", pRB->szName);
printf(" RigidBody ID : %d\n", pRB->ID);
printf(" RigidBody Parent ID : %d\n", pRB->parentID);
printf(" Parent Offset : %3.2f,%3.2f,%3.2f\n", pRB->offsetx, pRB->offsety, pRB->offsetz);
}
}
else if(pDataDefs->arrDataDescriptions[i].type == Descriptor_ForcePlate)
{
// Force Plate
sForcePlateDescription* pFP = pDataDefs->arrDataDescriptions[i].Data.ForcePlateDescription;
printf("Force Plate ID : %d\n", pFP->ID);
printf("Force Plate Serial : %s\n", pFP->strSerialNo);
printf("Force Plate Width : %3.2f\n", pFP->fWidth);
printf("Force Plate Length : %3.2f\n", pFP->fLength);
printf("Force Plate Electrical Center Offset (%3.3f, %3.3f, %3.3f)\n", pFP->fOriginX,pFP->fOriginY, pFP->fOriginZ);
for(int iCorner=0; iCorner<4; iCorner++)
printf("Force Plate Corner %d : (%3.4f, %3.4f, %3.4f)\n", iCorner, pFP->fCorners[iCorner][0],pFP->fCorners[iCorner][1],pFP->fCorners[iCorner][2]);
printf("Force Plate Type : %d\n", pFP->iPlateType);
printf("Force Plate Data Type : %d\n", pFP->iChannelDataType);
printf("Force Plate Channel Count : %d\n", pFP->nChannels);
for(int iChannel=0; iChannel<pFP->nChannels; iChannel++)
printf("\tChannel %d : %s\n", iChannel, pFP->szChannelNames[iChannel]);
}
else
{
printf("Unknown data type.");
// Unknown
}
}
}
// Create data file for writing received stream into
char szFile[MAX_PATH];
char szFolder[MAX_PATH];
GetCurrentDirectory(MAX_PATH, szFolder);
if(argc > 3)
sprintf(szFile, "%s\\%s", szFolder, argv[3]);
else
sprintf(szFile, "%s\\Client-output.pts",szFolder);
fp = fopen(szFile, "w");
if(!fp)
{
printf("error opening output file %s. Exiting.", szFile);
exit(1);
}
if(pDataDefs)
_WriteHeader(fp, pDataDefs);
// Ready to receive marker stream!
printf("\nClient is connected to server and listening for data...\n");
int c;
bool bExit = false;
while(c =_getch())
{
switch(c)
{
case 'q':
bExit = true;
break;
case 'r':
resetClient();
break;
case 'p':
sServerDescription ServerDescription;
memset(&ServerDescription, 0, sizeof(ServerDescription));
theClient->GetServerDescription(&ServerDescription);
if(!ServerDescription.HostPresent)
{
printf("Unable to connect to server. Host not present. Exiting.");
return 1;
}
break;
case 'f':
{
sFrameOfMocapData* pData = theClient->GetLastFrameOfData();
printf("Most Recent Frame: %d", pData->iFrame);
}
break;
case 'm': // change to multicast
iConnectionType = ConnectionType_Multicast;
iResult = CreateClient(iConnectionType);
if(iResult == ErrorCode_OK)
printf("Client connection type changed to Multicast.\n\n");
else
printf("Error changing client connection type to Multicast.\n\n");
break;
case 'u': // change to unicast
iConnectionType = ConnectionType_Unicast;
iResult = CreateClient(iConnectionType);
if(iResult == ErrorCode_OK)
printf("Client connection type changed to Unicast.\n\n");
else
printf("Error changing client connection type to Unicast.\n\n");
break;
case 'c' : // connect
iResult = CreateClient(iConnectionType);
break;
case 'd' : // disconnect
// note: applies to unicast connections only - indicates to Motive to stop sending packets to that client endpoint
iResult = theClient->SendMessageAndWait("Disconnect", &response, &nBytes);
if (iResult == ErrorCode_OK)
printf("[SampleClient] Disconnected");
break;
default:
break;
}
if(bExit)
break;
}
// Done - clean up.
theClient->Uninitialize();
_WriteFooter(fp);
fclose(fp);
return ErrorCode_OK;
}
// Establish a NatNet Client connection
int CreateClient(int iConnectionType)
{
// release previous server
if(theClient)
{
theClient->Uninitialize();
delete theClient;
}
// create NatNet client
theClient = new NatNetClient(iConnectionType);
// set the callback handlers
theClient->SetVerbosityLevel(Verbosity_Warning);
theClient->SetMessageCallback(MessageHandler);
theClient->SetDataCallback( DataHandler, theClient ); // this function will receive data from the server
// [optional] use old multicast group
//theClient->SetMulticastAddress("224.0.0.1");
// print version info
unsigned char ver[4];
theClient->NatNetVersion(ver);
printf("NatNet Sample Client (NatNet ver. %d.%d.%d.%d)\n", ver[0], ver[1], ver[2], ver[3]);
// Init Client and connect to NatNet server
// to use NatNet default port assignments
int retCode = theClient->Initialize(szMyIPAddress, szServerIPAddress);
// to use a different port for commands and/or data:
//int retCode = theClient->Initialize(szMyIPAddress, szServerIPAddress, MyServersCommandPort, MyServersDataPort);
if (retCode != ErrorCode_OK)
{
printf("Unable to connect to server. Error code: %d. Exiting", retCode);
return ErrorCode_Internal;
}
else
{
// get # of analog samples per mocap frame of data
void* pResult;
int ret = 0;
int nBytes = 0;
ret = theClient->SendMessageAndWait("AnalogSamplesPerMocapFrame", &pResult, &nBytes);
if (ret == ErrorCode_OK)
{
analogSamplesPerMocapFrame = *((int*)pResult);
printf("Analog Samples Per Mocap Frame : %d", analogSamplesPerMocapFrame);
}
// print server info
sServerDescription ServerDescription;
memset(&ServerDescription, 0, sizeof(ServerDescription));
theClient->GetServerDescription(&ServerDescription);
if(!ServerDescription.HostPresent)
{
printf("Unable to connect to server. Host not present. Exiting.");
return 1;
}
printf("[SampleClient] Server application info:\n");
printf("Application: %s (ver. %d.%d.%d.%d)\n", ServerDescription.szHostApp, ServerDescription.HostAppVersion[0],
ServerDescription.HostAppVersion[1],ServerDescription.HostAppVersion[2],ServerDescription.HostAppVersion[3]);
printf("NatNet Version: %d.%d.%d.%d\n", ServerDescription.NatNetVersion[0], ServerDescription.NatNetVersion[1],
ServerDescription.NatNetVersion[2], ServerDescription.NatNetVersion[3]);
printf("Client IP:%s\n", szMyIPAddress);
printf("Server IP:%s\n", szServerIPAddress);
printf("Server Name:%s\n\n", ServerDescription.szHostComputerName);
}
return ErrorCode_OK;
}
// DataHandler receives data from the server
void __cdecl DataHandler(sFrameOfMocapData* data, void* pUserData)
{
NatNetClient* pClient = (NatNetClient*) pUserData;
if(fp)
_WriteFrame(fp,data);
int i=0;
printf("FrameID : %d\n", data->iFrame);
printf("Timestamp : %3.2lf\n", data->fTimestamp);
printf("Latency : %3.2lf\n", data->fLatency);
// FrameOfMocapData params
bool bIsRecording = ((data->params & 0x01)!=0);
bool bTrackedModelsChanged = ((data->params & 0x02)!=0);
if(bIsRecording)
printf("RECORDING\n");
if(bTrackedModelsChanged)
printf("Models Changed.\n");
// timecode - for systems with an eSync and SMPTE timecode generator - decode to values
int hour, minute, second, frame, subframe;
bool bValid = pClient->DecodeTimecode(data->Timecode, data->TimecodeSubframe, &hour, &minute, &second, &frame, &subframe);
// decode to friendly string
char szTimecode[128] = "";
pClient->TimecodeStringify(data->Timecode, data->TimecodeSubframe, szTimecode, 128);
printf("Timecode : %s\n", szTimecode);
// Other Markers
printf("Other Markers [Count=%d]\n", data->nOtherMarkers);
for(i=0; i < data->nOtherMarkers; i++)
{
printf("Other Marker %d : %3.2f\t%3.2f\t%3.2f\n",
i,
data->OtherMarkers[i][0],
data->OtherMarkers[i][1],
data->OtherMarkers[i][2]);
}
// Rigid Bodies
printf("Rigid Bodies [Count=%d]\n", data->nRigidBodies);
for(i=0; i < data->nRigidBodies; i++)
{
// params
// 0x01 : bool, rigid body was successfully tracked in this frame
bool bTrackingValid = data->RigidBodies[i].params & 0x01;
printf("Rigid Body [ID=%d Error=%3.2f Valid=%d]\n", data->RigidBodies[i].ID, data->RigidBodies[i].MeanError, bTrackingValid);
printf("\tx\ty\tz\tqx\tqy\tqz\tqw\n");
printf("\t%3.2f\t%3.2f\t%3.2f\t%3.2f\t%3.2f\t%3.2f\t%3.2f\n",
data->RigidBodies[i].x,
data->RigidBodies[i].y,
data->RigidBodies[i].z,
data->RigidBodies[i].qx,
data->RigidBodies[i].qy,
data->RigidBodies[i].qz,
data->RigidBodies[i].qw);
printf("\tRigid body markers [Count=%d]\n", data->RigidBodies[i].nMarkers);
for(int iMarker=0; iMarker < data->RigidBodies[i].nMarkers; iMarker++)
{
printf("\t\t");
if(data->RigidBodies[i].MarkerIDs)
printf("MarkerID:%d", data->RigidBodies[i].MarkerIDs[iMarker]);
if(data->RigidBodies[i].MarkerSizes)
printf("\tMarkerSize:%3.2f", data->RigidBodies[i].MarkerSizes[iMarker]);
if(data->RigidBodies[i].Markers)
printf("\tMarkerPos:%3.2f,%3.2f,%3.2f\n" ,
data->RigidBodies[i].Markers[iMarker][0],
data->RigidBodies[i].Markers[iMarker][1],
data->RigidBodies[i].Markers[iMarker][2]);
}
}
// skeletons
printf("Skeletons [Count=%d]\n", data->nSkeletons);
for(i=0; i < data->nSkeletons; i++)
{
sSkeletonData skData = data->Skeletons[i];
printf("Skeleton [ID=%d Bone count=%d]\n", skData.skeletonID, skData.nRigidBodies);
for(int j=0; j< skData.nRigidBodies; j++)
{
sRigidBodyData rbData = skData.RigidBodyData[j];
printf("Bone %d\t%3.2f\t%3.2f\t%3.2f\t%3.2f\t%3.2f\t%3.2f\t%3.2f\n",
rbData.ID, rbData.x, rbData.y, rbData.z, rbData.qx, rbData.qy, rbData.qz, rbData.qw );
printf("\tRigid body markers [Count=%d]\n", rbData.nMarkers);
for(int iMarker=0; iMarker < rbData.nMarkers; iMarker++)
{
printf("\t\t");
if(rbData.MarkerIDs)
printf("MarkerID:%d", rbData.MarkerIDs[iMarker]);
if(rbData.MarkerSizes)
printf("\tMarkerSize:%3.2f", rbData.MarkerSizes[iMarker]);
if(rbData.Markers)
printf("\tMarkerPos:%3.2f,%3.2f,%3.2f\n" ,
data->RigidBodies[i].Markers[iMarker][0],
data->RigidBodies[i].Markers[iMarker][1],
data->RigidBodies[i].Markers[iMarker][2]);
}
}
}
// labeled markers
bool bOccluded; // marker was not visible (occluded) in this frame
bool bPCSolved; // reported position provided by point cloud solve
bool bModelSolved; // reported position provided by model solve
printf("Labeled Markers [Count=%d]\n", data->nLabeledMarkers);
for(i=0; i < data->nLabeledMarkers; i++)
{
bOccluded = ((data->LabeledMarkers[i].params & 0x01)!=0);
bPCSolved = ((data->LabeledMarkers[i].params & 0x02)!=0);
bModelSolved = ((data->LabeledMarkers[i].params & 0x04)!=0);
sMarker marker = data->LabeledMarkers[i];
int modelID, markerID;
pClient->DecodeID(marker.ID, &modelID, &markerID);
printf("Labeled Marker [ModelID=%d, MarkerID=%d, Occluded=%d, PCSolved=%d, ModelSolved=%d] [size=%3.2f] [pos=%3.2f,%3.2f,%3.2f]\n",
modelID, markerID, bOccluded, bPCSolved, bModelSolved, marker.size, marker.x, marker.y, marker.z);
}
// force plates
if(data->nForcePlates==0)
{
printf("No Plates\n");
}
printf("Force Plate [Count=%d]\n", data->nForcePlates);
for(int iPlate=0; iPlate < data->nForcePlates; iPlate++)
{
printf("Force Plate %d\n", data->ForcePlates[iPlate].ID);
for(int iChannel=0; iChannel < data->ForcePlates[iPlate].nChannels; iChannel++)
{
printf("\tChannel %d:\t", iChannel);
if(data->ForcePlates[iPlate].ChannelData[iChannel].nFrames == 0)
{
printf("\tEmpty Frame\n");
}
else if(data->ForcePlates[iPlate].ChannelData[iChannel].nFrames != analogSamplesPerMocapFrame)
{
printf("\tPartial Frame [Expected:%d Actual:%d]\n", analogSamplesPerMocapFrame, data->ForcePlates[iPlate].ChannelData[iChannel].nFrames);
}
for(int iSample=0; iSample < data->ForcePlates[iPlate].ChannelData[iChannel].nFrames; iSample++)
printf("%3.2f\t", data->ForcePlates[iPlate].ChannelData[iChannel].Values[iSample]);
printf("\n");
}
}
}
// MessageHandler receives NatNet error/debug messages
void __cdecl MessageHandler(int msgType, char* msg)
{
printf("\n%s\n", msg);
}
/* File writing routines */
void _WriteHeader(FILE* fp, sDataDescriptions* pBodyDefs)
{
int i=0;
if(pBodyDefs->arrDataDescriptions[0].type != Descriptor_MarkerSet)
return;
sMarkerSetDescription* pMS = pBodyDefs->arrDataDescriptions[0].Data.MarkerSetDescription;
fprintf(fp, "<MarkerSet>\n\n");
fprintf(fp, "<Name>\n%s\n</Name>\n\n", pMS->szName);
fprintf(fp, "<Markers>\n");
for(i=0; i < pMS->nMarkers; i++)
{
fprintf(fp, "%s\n", pMS->szMarkerNames[i]);
}
fprintf(fp, "</Markers>\n\n");
fprintf(fp, "<Data>\n");
fprintf(fp, "Frame#\t");
for(i=0; i < pMS->nMarkers; i++)
{
fprintf(fp, "M%dX\tM%dY\tM%dZ\t", i, i, i);
}
fprintf(fp,"\n");
}
void _WriteFrame(FILE* fp, sFrameOfMocapData* data)
{
fprintf(fp, "%d", data->iFrame);
for(int i=0; i < data->MocapData[0].nMarkers; i++)
{
fprintf(fp, "\t%.5f\t%.5f\t%.5f", data->MocapData[0].Markers[i][0], data->MocapData[0].Markers[i][1], data->MocapData[0].Markers[i][2]);
}
fprintf(fp, "\n");
}
void _WriteFooter(FILE* fp)
{
fprintf(fp, "</Data>\n\n");
fprintf(fp, "</MarkerSet>\n");
}
void resetClient()
{
int iSuccess;
printf("\n\nre-setting Client\n\n.");
iSuccess = theClient->Uninitialize();
if(iSuccess != 0)
printf("error un-initting Client\n");
iSuccess = theClient->Initialize(szMyIPAddress, szServerIPAddress);
if(iSuccess != 0)
printf("error re-initting Client\n");
}
Very likely NatNetClient() or theClient->Initialize() creates a new thread in the background. Since I don't have access to their code, I can't verify this reliably.
You can either create a synchronous loop in main and always send and receive messages via NatNetClient::SendMessageAndWait,
or
receive data asynchronously. To do that, you need a way to poll stdin (for _getch) and wait on a lock at the same time; I'm not sure how to do this on the Microsoft console. If you figure that out, then when the handler gets control, enqueue the message into a linked list (or queue) and signal a condition variable to wake the main thread. A minimal sketch of that producer/consumer idea follows.
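Here is one way that queue-and-wake pattern could look, assuming sFrameOfMocapData can be shallow-copied and that the NatNet headers are already included; the names QueueingDataHandler, PopFrame, g_queueMutex, g_queueCV, and g_frameQueue are illustrative, not part of the NatNet SDK:
#include <condition_variable>
#include <deque>
#include <mutex>

std::mutex g_queueMutex;
std::condition_variable g_queueCV;
std::deque<sFrameOfMocapData> g_frameQueue;

// Producer: runs on the NatNet thread. Copy the frame and wake the consumer.
void __cdecl QueueingDataHandler(sFrameOfMocapData* data, void* pUserData)
{
    {
        std::lock_guard<std::mutex> lock(g_queueMutex);
        g_frameQueue.push_back(*data); // shallow copy; deep-copy any embedded pointers you keep
    }
    g_queueCV.notify_one();
}

// Consumer: runs on the main thread (or a worker). Blocks until a frame arrives.
sFrameOfMocapData PopFrame()
{
    std::unique_lock<std::mutex> lock(g_queueMutex);
    g_queueCV.wait(lock, [] { return !g_frameQueue.empty(); });
    sFrameOfMocapData frame = g_frameQueue.front();
    g_frameQueue.pop_front();
    return frame;
}
Note that PopFrame() still blocks, so if the console input loop must stay responsive you would either run it on another thread or replace wait with wait_for and poll.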
The NatNet sample spins off a different thread for handling the data using the DataHandler function. The following line is where this happens:
theClient->SetDataCallback( DataHandler, theClient );
I would recommend doing the processing strictly in the DataHandler function. However, if communicating between threads is necessary, then it's worth looking into the C++11 standard library headers:
#include <thread>
#include <mutex>
Specifically, std::mutex is used to serialize access to shared state between threads. You could create a global variable, write to it from the DataHandler function, and then read that global variable in your main function. The problem is that main and DataHandler run on different threads, so you might read and write the data at the same time. With a mutex you can hold a lock while reading or writing that global variable, so that only one thread touches it at a time.
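A minimal sketch of that pattern, again assuming a shallow copy of sFrameOfMocapData is acceptable; g_dataMutex, g_lastFrame, g_haveFrame, SharedStateDataHandler, and PrintLatestFrame are illustrative names:
#include <cstdio>
#include <mutex>

std::mutex g_dataMutex;            // guards g_lastFrame and g_haveFrame
sFrameOfMocapData g_lastFrame;
bool g_haveFrame = false;

// Writer: runs on the NatNet thread.
void __cdecl SharedStateDataHandler(sFrameOfMocapData* data, void* pUserData)
{
    std::lock_guard<std::mutex> lock(g_dataMutex);
    g_lastFrame = *data;           // shallow copy; embedded pointers still belong to the SDK
    g_haveFrame = true;
}

// Reader: call from main() under the same lock.
void PrintLatestFrame()
{
    std::lock_guard<std::mutex> lock(g_dataMutex);
    if (g_haveFrame)
        printf("FrameID : %d\n", g_lastFrame.iFrame);
}
Unlike the condition-variable queue above, this keeps only the most recent frame and the reader never blocks waiting for new data, which is often enough for a status display.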
Note: The NatNet sample actually uses UDP, not TCP.