C++/C FFmpeg artifact build-up across video frames

Context:
I am building a recorder for capturing video and audio in separate threads (using Boost thread groups) using FFmpeg 2.8.6 on Ubuntu 16.04. I followed the demuxing_decoding example here: https://www.ffmpeg.org/doxygen/2.8/demuxing_decoding_8c-example.html
Video capture specifics:
I am reading H264 off a Logitech C920 webcam and writing the video to a raw file. The issue I notice with the video is that artifacts seem to build up across frames until a particular frame resets them. Here are my frame-grabbing and decoding functions:
// Used for injecting decoding functions for different media types, allowing
// for a generic decode loop
typedef std::function<int(AVPacket*, int*, int)> PacketDecoder;
/**
* Decodes a video packet.
* If the decoding operation is successful, returns the number of bytes decoded,
* else returns the result of the decoding process from ffmpeg
*/
int decode_video_packet(AVPacket *packet,
int *got_frame,
int cached){
int ret = 0;
int decoded = packet->size;
*got_frame = 0;
//Decode video frame
ret = avcodec_decode_video2(video_decode_context,
video_frame, got_frame, packet);
if (ret < 0) {
//FFmpeg users should use av_err2str
char errbuf[128];
av_strerror(ret, errbuf, sizeof(errbuf));
std::cerr << "Error decoding video frame " << errbuf << std::endl;
decoded = ret;
} else {
if (*got_frame) {
video_frame->pts = av_frame_get_best_effort_timestamp(video_frame);
//Write to log file
AVRational *time_base = &video_decode_context->time_base;
log_frame(video_frame, time_base,
video_frame->coded_picture_number, video_log_stream);
#if( DEBUG )
std::cout << "Video frame " << ( cached ? "(cached)" : "" )
<< " coded:" << video_frame->coded_picture_number
<< " pts:" << video_frame->pts << std::endl;
#endif
/*Copy decoded frame to destination buffer:
*This is required since rawvideo expects non aligned data*/
av_image_copy(video_dest_attr.video_destination_data,
video_dest_attr.video_destination_linesize,
(const uint8_t **)(video_frame->data),
video_frame->linesize,
video_decode_context->pix_fmt,
video_decode_context->width,
video_decode_context->height);
//Write to rawvideo file
fwrite(video_dest_attr.video_destination_data[0],
1,
video_dest_attr.video_destination_bufsize,
video_out_file);
//Unref the refcounted frame
av_frame_unref(video_frame);
}
}
return decoded;
}
/**
* Grabs frames in a loop and decodes them using the specified decoding function
*/
int process_frames(AVFormatContext *context,
PacketDecoder packet_decoder) {
int ret = 0;
int got_frame;
AVPacket packet;
//Initialize packet, set data to NULL, let the demuxer fill it
av_init_packet(&packet);
packet.data = NULL;
packet.size = 0;
// read frames from the file
for (;;) {
ret = av_read_frame(context, &packet);
if (ret < 0) {
if (ret == AVERROR(EAGAIN)) {
continue;
} else {
break;
}
}
//Convert timing fields to the decoder timebase
unsigned int stream_index = packet.stream_index;
av_packet_rescale_ts(&packet,
context->streams[stream_index]->time_base,
context->streams[stream_index]->codec->time_base);
AVPacket orig_packet = packet;
do {
ret = packet_decoder(&packet, &got_frame, 0);
if (ret < 0) {
break;
}
packet.data += ret;
packet.size -= ret;
} while (packet.size > 0);
av_free_packet(&orig_packet);
if(stop_recording == true) {
break;
}
}
//Flush cached frames
std::cout << "Flushing frames" << std::endl;
packet.data = NULL;
packet.size = 0;
do {
packet_decoder(&packet, &got_frame, 1);
} while (got_frame);
av_log(0, AV_LOG_INFO, "Done processing frames\n");
return ret;
}
Questions:
How do I go about debugging the underlying issue?
Is it possible that running the decoding code in a thread other than the one in which the decoding context was opened is causing the problem?
Am I doing something wrong in the decoding code?
Things I have tried/found:
I found this thread that is about the same problem here: FFMPEG decoding artifacts between keyframes
(I cannot post samples of my corrupted frames due to privacy issues, but the image linked to in that question depicts the same issue I have)
However, the answer to the question is posted by the OP without specific details about how the issue was fixed. The OP only mentions that he wasn't 'preserving the packets correctly', but nothing about what was wrong or how to fix it. I do not have enough reputation to post a comment seeking clarification.
I was initially passing the packet into the decoding function by value, but switched to passing by pointer on the off chance that the packet freeing was being done incorrectly.
I found another question about debugging decoding issues, but couldn't find anything conclusive: How is video decoding corruption debugged?
I'd appreciate any insight. Thanks a lot!
[EDIT] In response to Ronald's answer, I am adding a little more information that wouldn't fit in a comment:
I am only calling decode_video_packet() from the thread processing video frames; the other thread processing audio frames calls a similar decode_audio_packet() function. So only one thread calls the function. I should mention that I have set the thread_count in the decoding context to 1, failing which I would get a segfault in malloc.c while flushing the cached frames.
I can see this being a problem if process_frames and the frame decoder function were run on separate threads, which is not the case. Is there a specific reason why it would matter whether the freeing is done within the function or after it returns? I believe the freeing function is passed a copy of the original packet because multiple decode calls would be required for an audio packet in case the decoder doesn't decode the entire packet in one call.
A general problem is that the corruption does not occur all the time. I can debug better if it is deterministic. Otherwise, I can't even say if a solution works or not.

A few things to check:
Are you running multiple threads that call decode_video_packet()? If you are: don't do that! FFmpeg has built-in support for multi-threaded decoding, and you should let FFmpeg do threading internally and transparently.
You are calling av_free_packet() right after calling the frame decoder function, but at that point it may not yet have had a chance to copy the contents. You should probably let decode_video_packet() free the packet instead, after calling avcodec_decode_video2().
General debugging advice:
Run it without any threading and see if that works.
If it does, and it fails with threading, use thread debuggers such as tsan or helgrind to help find race conditions that point to your code.
It can also help to know whether the output you're getting is reproducible (this suggests a non-threading-related bug in your code) or changes from one run to the next (this suggests a race condition in your code).
And yes, the periodic clean-ups are because of keyframes.
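One way to test the reproducibility point above is to log a checksum of every decoded frame and compare the logs of two runs. The following is a minimal sketch rather than part of the original code: `frame_checksum` is a hypothetical helper based on the FNV-1a hash. It only gives comparable results if both runs decode the same recorded input (for example, packets dumped to a file); a live webcam capture will not hash identically from run to run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: 64-bit FNV-1a hash over a raw frame buffer.
// Two runs that decode the same input should log identical values per frame.
uint64_t frame_checksum(const uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= 1099511628211ULL;             // FNV-1a 64-bit prime
    }
    return hash;
}
```

In the question's decode_video_packet(), the natural place to call it would presumably be just before fwrite(), on video_dest_attr.video_destination_data[0] with video_dest_attr.video_destination_bufsize, logging the value next to coded_picture_number.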

Related

Update parameters during ffmpeg encoding process is running

I want to update video encoder parameters such as fps, bitrate, and GOP size after they have already been set on the AVCodecContext structure, and I want each change to take effect as soon as I make it.
One option is to close the codec with avcodec_close() and open it again.
But I don't think that is a good approach.
Here is my ffmpeg's source code for video encoding:
int got_output = 0, ret = 0;
//av_init_packet(&pkt);
pkt.data = NULL; // packet data will be allocated by the encoder
pkt.size = 0;
ret = avcodec_encode_video2(c, &pkt, frame, &got_output);
if (ret < 0)
{
cerr << "Error sending a frame for encoding\n";
exit(1);
}
Is there any FFmpeg API that can be used to reload encoding parameters?
No, FFmpeg does not have an API for a running process. It is something you would need to develop yourself.

Replacing av_read_frame() to reduce delay

I am implementing a (very) low latency video streaming C++ application using ffmpeg. The client receives a video which is encoded with x264’s zerolatency preset, so there is no need for buffering. As described here, if you use av_read_frame() to read packets of the encoded video stream, you will always have at least one frame delay because of internal buffering done in ffmpeg. So when I call av_read_frame() after frame n+1 has been sent to the client, the function will return frame n.
Getting rid of this buffering by setting the AVFormatContext flags AVFMT_FLAG_NOPARSE | AVFMT_FLAG_NOFILLIN as suggested in the source disables packet parsing and therefore breaks decoding, as noted in the source.
Therefore, I am writing my own packet receiver and parser. First, here are the relevant steps of the working solution (including one frame delay) using av_read_frame():
AVFormatContext *fctx;
AVCodecContext *cctx;
AVPacket *pkt;
AVFrame *frm;
//Initialization of AV structures
//…
//Main Loop
while(true){
//Receive packet
av_read_frame(fctx, pkt);
//Decode:
avcodec_send_packet(cctx, pkt);
avcodec_receive_frame(cctx, frm);
//Display frame
//…
}
And below is my solution, which mimics the behavior of av_read_frame() as far as I could reproduce it. I was able to trace the source code of av_read_frame() down to ff_read_packet(), but I cannot find the source of AVInputFormat.read_packet().
int tcpsocket;
AVCodecContext *cctx;
AVPacket *pkt;
AVFrame *frm;
uint8_t recvbuf[(int)10e5];
memset(recvbuf,0,10e5);
int pos = 0;
AVCodecParserContext * parser = av_parser_init(AV_CODEC_ID_H264);
parser->flags |= PARSER_FLAG_COMPLETE_FRAMES;
parser->flags |= PARSER_FLAG_USE_CODEC_TS;
//Initialization of AV structures and the tcpsocket
//…
//Main Loop
while(true){
//Receive packet
int length = read(tcpsocket, recvbuf, 10e5);
if (length >= 0) {
//Creating temporary packet
AVPacket * tempPacket = new AVPacket;
av_init_packet(tempPacket);
av_new_packet(tempPacket, length);
memcpy(tempPacket->data, recvbuf, length);
tempPacket->pos = pos;
pos += length;
memset(recvbuf,0,length);
//Parsing temporary packet into pkt
av_init_packet(pkt);
av_parser_parse2(parser, cctx,
&(pkt->data), &(pkt->size),
tempPacket->data, tempPacket->size,
tempPacket->pts, tempPacket->dts, tempPacket->pos
);
pkt->pts = parser->pts;
pkt->dts = parser->dts;
pkt->pos = parser->pos;
//Set keyframe flag
if (parser->key_frame == 1 ||
(parser->key_frame == -1 &&
parser->pict_type == AV_PICTURE_TYPE_I))
pkt->flags |= AV_PKT_FLAG_KEY;
if (parser->key_frame == -1 && parser->pict_type == AV_PICTURE_TYPE_NONE && (pkt->flags & AV_PKT_FLAG_KEY))
pkt->flags |= AV_PKT_FLAG_KEY;
pkt->duration = 96000; //Same result as in av_read_frame()
//Decode:
avcodec_send_packet(cctx, pkt);
avcodec_receive_frame(cctx, frm);
//Display frame
//…
}
}
I checked the fields of the resulting packet (pkt) just before avcodec_send_packet() in both solutions. They are as far as I can tell identical. The only difference might be the actual content of pkt->data. My solution decodes I-Frames fine, but the references in P-Frames seem to be broken, causing heavy artifacts and error messages such as “invalid level prefix”, “error while decoding MB xx”, and similar.
I would be very grateful for any hints.
Edit 1: I have developed a workaround for the time being: in the video server, after sending the packet containing the encoded data of a frame, I send one dummy packet which only contains the delimiters marking the beginning and end of the packet. This way, I push the actual video data frames through av_read_frame(). I discard the dummy packets immediately after av_read_frame().
Edit 2: Solved here by rom1v, as written in his comment to this question.
av_parser_parse2() does not necessarily consume your tempPacket in one go. You have to call it in a loop and check its return value (the number of input bytes it consumed), as shown in the API docs.
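The shape of that loop can be illustrated without FFmpeg. The sketch below uses a toy stand-in parser (`ToyParser`, a hypothetical type that treats '\n' as the frame terminator); it is not FFmpeg's H.264 parser and only mirrors the calling convention described above: each call consumes part of the input, returns the number of bytes consumed, and may or may not hand back a complete frame.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a stream parser (NOT FFmpeg's): frames end with '\n'.
struct ToyParser {
    std::string pending;  // bytes of a frame that is not complete yet
    std::string frame;    // last completed frame, valid until the next parse()

    // Consumes input up to and including the first terminator. Returns the
    // number of bytes consumed; *out / *out_size describe a completed frame,
    // or are nullptr / 0 when more input is needed.
    int parse(const char** out, int* out_size, const char* buf, int buf_size) {
        *out = nullptr;
        *out_size = 0;
        int used = 0;
        while (used < buf_size) {
            const char c = buf[used++];
            if (c == '\n') {
                frame.swap(pending);
                pending.clear();
                *out = frame.data();
                *out_size = static_cast<int>(frame.size());
                break;
            }
            pending.push_back(c);
        }
        return used;
    }
};

// Feeds one received chunk to the parser, looping until every byte has been
// consumed, and collects all frames completed along the way.
std::vector<std::string> feed_chunk(ToyParser& parser, const std::string& chunk) {
    std::vector<std::string> frames;
    const char* in = chunk.data();
    int in_len = static_cast<int>(chunk.size());
    while (in_len > 0) {
        const char* out = nullptr;
        int out_size = 0;
        const int used = parser.parse(&out, &out_size, in, in_len);
        in += used;       // advance past what the parser consumed
        in_len -= used;
        if (out_size > 0) {
            frames.emplace_back(out, out_size);  // here: send to the decoder
        }
    }
    return frames;
}
```

Calling parse() only once per received chunk, as the question's code does with av_parser_parse2(), would silently drop whatever the parser did not consume, which would be consistent with damaged P-frames.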

SDL2 & SMPEG2 - Empty sound buffer trying to read a MP3

I'm trying to load an MP3 into a buffer using the SMPEG2 library, which comes with SDL2. Every SMPEG function call returns without error, but when I'm done, the sound buffer is full of zeros.
Here's the code :
bool LoadMP3(char* filename)
{
bool success = false;
const Uint32 Mp3ChunkLen = 4096;
SMPEG* mp3;
SMPEG_Info infoMP3;
Uint8 * ChunkBuffer;
Uint32 MP3Length = 0;
// Allocate a chunk buffer
ChunkBuffer = (Uint8*)malloc(Mp3ChunkLen);
SDL_RWops *mp3File = SDL_RWFromFile(filename, "rb");
if (mp3File != NULL)
{
mp3 = SMPEG_new_rwops(mp3File, &infoMP3, 1, 0);
if(mp3 != NULL)
{
if(infoMP3.has_audio)
{
Uint32 readLen;
// Inform the MP3 of the output audio specifications
SMPEG_actualSpec(mp3, &asDeviceSpecs); // static SDL_AudioSpec asDeviceSpecs; containing valid values after a call to SDL_OpenAudioDevice
// Enable the audio and disable the video.
SMPEG_enableaudio(mp3, 1);
SMPEG_enablevideo(mp3, 0);
// Play the MP3 once to get the size of the needed final buffer
SMPEG_play(mp3);
while ((readLen = SMPEG_playAudio(mp3, ChunkBuffer, Mp3ChunkLen)) > 0)
{
MP3Length += readLen;
}
SMPEG_stop(mp3);
if(MP3Length > 0)
{
// Reallocate the buffer with the new length (if needed)
if (MP3Length != Mp3ChunkLen)
{
ChunkBuffer = (Uint8*)realloc(ChunkBuffer, MP3Length);
}
// Replay the entire MP3 into the new ChunkBuffer.
SMPEG_rewind(mp3);
SMPEG_play(mp3);
bool readBackSuccess = (MP3Length == SMPEG_playAudio(mp3, ChunkBuffer, MP3Length));
SMPEG_stop(mp3);
if(readBackSuccess)
{
// !!! Here, ChunkBuffer contains only zeros !!!
success = true;
}
}
}
SMPEG_delete(mp3);
mp3 = NULL;
}
SDL_RWclose(mp3File);
mp3File = NULL;
}
free(ChunkBuffer);
return success;
}
The code is largely based on SDL_Mixer, which I cannot use for my project because of its limitations.
I know Ogg Vorbis would be a better choice of file format, but I'm porting a very old project, and it worked entirely with MP3s.
I'm sure the sound system is initialized correctly because I can play WAV files just fine. It's initialized with a frequency of 44100, 2 channels, 1024 samples, and the AUDIO_S16SYS format (the latter being, as I understood from the SMPEG source, mandatory).
I've calculated the anticipated buffer size, based on the bitrate, the amount of data in the MP3 and the OpenAudioDevice audio specs, and everything is consistent.
I cannot figure out why everything but the buffer data seems to be working.
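For reference, the buffer size estimate mentioned above can be written down as a short calculation. This is a rough sketch under stated assumptions, not the question's actual code: `expected_pcm_bytes` is a hypothetical helper, and it assumes a constant-bitrate file and ignores ID3 tags and encoder delay or padding, so it gives a ballpark figure only.

```cpp
#include <cassert>
#include <cstdint>

// Rough size in bytes of the decoded PCM for a constant-bitrate MP3.
// duration [s] = mp3_bytes * 8 / bitrate_bps
// PCM bytes    = duration * freq * channels * bytes_per_sample
uint64_t expected_pcm_bytes(uint64_t mp3_bytes, uint32_t bitrate_bps,
                            uint32_t freq, uint32_t channels,
                            uint32_t bytes_per_sample) {
    assert(bitrate_bps > 0);
    // Multiply before dividing so the fractional second is not lost.
    return mp3_bytes * 8 * freq * channels * bytes_per_sample / bitrate_bps;
}
```

For example, 1,000,000 bytes at 128 kbit/s is 62.5 seconds, which at 44100 Hz, 2 channels, and 2 bytes per sample comes to 11,025,000 bytes of PCM.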
UPDATE #1
Still trying to figure out what's wrong, I thought the support for MP3 might not be working, so I created the following function :
SMPEG *mpeg;
SMPEG_Info info;
mpeg = SMPEG_new(filename,&info, 1);
SMPEG_play(mpeg);
do { SDL_Delay(50); } while(SMPEG_status(mpeg) == SMPEG_PLAYING);
SMPEG_delete(mpeg);
The MP3 played. So, the decoding should actually be working. But that's not what I need ; I really need the sound buffer data so I can send it to my mixer.
After much tinkering, research and digging through the SMPEG source code, I realized that I had to pass 1 as the SDLAudio parameter to SMPEG_new_rwops function.
The comment found in smpeg.h is misleading :
The sdl_audio parameter indicates if SMPEG should initialize the SDL audio subsystem. If not, you will have to use the SMPEG_playaudio() function below to extract the decoded data.
Since the audio subsystem was already initialized and I was using the SMPEG_playaudio() function, I had no reason to think I needed this parameter to be non-zero. In SMPEG, this parameter triggers the audio decompression at opening time, but even though I called SMPEG_enableaudio(mp3, 1); the data is never reparsed. This might be a bug/a shady feature.
I had another problem with the freesrc parameter which needed to be 0, since I freed the SDL_RWops object myself.
For future reference, once ChunkBuffer has the MP3 data, it needs to pass through SDL_BuildAudioCVT/SDL_ConvertAudio if it's to be played through an already opened audio device.
The final working code is :
// bool ReadMP3ToBuffer(char* filename)
bool success = false;
const Uint32 Mp3ChunkLen = 4096;
SDL_AudioSpec mp3Specs;
SMPEG* mp3;
SMPEG_Info infoMP3;
Uint8 * ChunkBuffer;
Uint32 MP3Length = 0;
// Allocate a chunk buffer
ChunkBuffer = (Uint8*)malloc(Mp3ChunkLen);
memset(ChunkBuffer, 0, Mp3ChunkLen);
SDL_RWops *mp3File = SDL_RWFromFile(filename, "rb"); // filename is a char* passed to the function.
if (mp3File != NULL)
{
mp3 = SMPEG_new_rwops(mp3File, &infoMP3, 0, 1);
if(mp3 != NULL)
{
if(infoMP3.has_audio)
{
Uint32 readLen;
// Get the MP3 audio specs for later conversion
SMPEG_wantedSpec(mp3, &mp3Specs);
SMPEG_enablevideo(mp3, 0);
// Play the MP3 once to get the size of the needed buffer in relation with the audio specs
SMPEG_play(mp3);
while ((readLen = SMPEG_playAudio(mp3, ChunkBuffer, Mp3ChunkLen)) > 0)
{
MP3Length += readLen;
}
SMPEG_stop(mp3);
if(MP3Length > 0)
{
// Reallocate the buffer with the new length (if needed)
if (MP3Length != Mp3ChunkLen)
{
ChunkBuffer = (Uint8*)realloc(ChunkBuffer, MP3Length);
memset(ChunkBuffer, 0, MP3Length);
}
// Replay the entire MP3 into the new ChunkBuffer.
SMPEG_rewind(mp3);
SMPEG_play(mp3);
bool readBackSuccess = (MP3Length == SMPEG_playAudio(mp3, ChunkBuffer, MP3Length));
SMPEG_stop(mp3);
if(readBackSuccess)
{
SDL_AudioCVT convertedSound;
// NOTE : static SDL_AudioSpec asDeviceSpecs; containing valid values after a call to SDL_OpenAudioDevice
if(SDL_BuildAudioCVT(&convertedSound, mp3Specs.format, mp3Specs.channels, mp3Specs.freq, asDeviceSpecs.format, asDeviceSpecs.channels, asDeviceSpecs.freq) >= 0)
{
Uint32 newBufferLen = MP3Length*convertedSound.len_mult;
// Make sure the audio length is a multiple of a sample size to avoid sound clicking
int sampleSize = ((asDeviceSpecs.format & 0xFF)/8)*asDeviceSpecs.channels;
newBufferLen &= ~(sampleSize-1);
// Allocate the new buffer and proceed with the actual conversion.
convertedSound.buf = (Uint8*)malloc(newBufferLen);
memcpy(convertedSound.buf, ChunkBuffer, MP3Length);
convertedSound.len = MP3Length;
if(SDL_ConvertAudio(&convertedSound) == 0)
{
// Save convertedSound.buf and convertedSound.len_cvt for future use in your mixer code.
// Don't forget to free convertedSound.buf once it's not used anymore.
success = true;
}
}
}
}
}
SMPEG_delete(mp3);
mp3 = NULL;
}
SDL_RWclose(mp3File);
mp3File = NULL;
}
free(ChunkBuffer);
return success;
NOTE: Some MP3 files I tried lost a few milliseconds and cut off too early during playback when I resampled them with this code. Others didn't. I could reproduce the same behaviour in Audacity, so I'm not sure what's going on. There may still be a bug in my code or a bug in SMPEG, or it may be a known issue with the MP3 format itself. If someone can provide an explanation in the comments, that would be great!
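One detail of the code above is worth a closer look: the bitmask used to align newBufferLen (`&= ~(sampleSize-1)`) only rounds down to a multiple of sampleSize when sampleSize is a power of two, which holds for 16-bit mono or stereo but not, for example, for 6 channels. A sketch of a more general variant using a modulo follows; `align_to_sample_frame` is a hypothetical helper, and the only SDL fact it relies on is that the low 8 bits of an SDL_AudioFormat value hold the bits per sample.

```cpp
#include <cassert>
#include <cstdint>

// Rounds a byte length down to a whole number of sample frames.
// `format` is an SDL_AudioFormat value; its low 8 bits are the bits per sample.
uint32_t align_to_sample_frame(uint32_t len, uint16_t format, uint8_t channels) {
    const uint32_t frame_size = ((format & 0xFFu) / 8u) * channels;
    assert(frame_size > 0);
    return len - (len % frame_size);
}
```

With AUDIO_S16LSB (0x8010) and 2 channels this gives the same result as the bitmask; with 6 channels the frame size is 12 bytes and only the modulo version lands on a frame boundary.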

How to read YUV8 data from avi file?

I have an AVI file that contains uncompressed gray video data. I need to extract frames from it. The size of the file is 22 GB.
How do I do that?
I have already tried ffmpeg, but it gives me a "could not find codec parameters for video stream" message, because there is no codec at work, just frames.
Since OpenCV just uses ffmpeg to read video, that rules out OpenCV as well.
The only path that seems to be left is to try to dig into the raw data, but I do not know how.
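For orientation on the raw-data route: AVI is a RIFF container, so the file is a sequence of chunks, each starting with a four-character code and a 32-bit little-endian payload size, with payloads padded to an even length. The sketch below shows only how such chunk headers could be walked in a memory buffer; `ChunkHeader`, `read_chunk_header`, and `next_chunk_offset` are hypothetical names, and a real reader would also have to descend into LIST chunks (such as 'movi', where the frame chunks live) and, for a file this large, presumably handle the OpenDML extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

struct ChunkHeader {
    std::string id;  // four-character code, e.g. "00db"
    uint32_t size;   // payload size in bytes, excluding the 8-byte header
};

// Reads the 8-byte chunk header at `offset`.
// Returns false if fewer than 8 bytes are available there.
bool read_chunk_header(const uint8_t* buf, std::size_t len,
                       std::size_t offset, ChunkHeader* out) {
    if (offset > len || len - offset < 8) {
        return false;
    }
    const uint8_t* p = buf + offset;
    out->id.assign(reinterpret_cast<const char*>(p), 4);
    // The size field is stored little-endian.
    out->size = uint32_t(p[4]) | (uint32_t(p[5]) << 8) |
                (uint32_t(p[6]) << 16) | (uint32_t(p[7]) << 24);
    return true;
}

// Offset of the next sibling chunk: header + payload, padded to even length.
std::size_t next_chunk_offset(std::size_t offset, const ChunkHeader& h) {
    return offset + 8 + h.size + (h.size & 1u);
}
```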
Edit: this is the code I use to read from the file with opencv. The failure occurs inside the second if. Running the ffmpeg binary on the file also fails with the message above ("could not find codec parameters", etc.).
/* register all formats and codecs */
av_register_all();
/* open input file, and allocate format context */
if (avformat_open_input(&fmt_ctx, src_filename, NULL, NULL) < 0) {
fprintf(stderr, "Could not open source file %s\n", src_filename);
ret = 1;
goto end;
}
fmt_ctx->seek2any = true;
/* retrieve stream information */
int res = avformat_find_stream_info(fmt_ctx, NULL);
if (res < 0) {
fprintf(stderr, "Could not find stream information\n");
ret = 1;
goto end;
}
Edit:
Here is the sample code I have tried for the extraction: pastebin. The result I get is an unchanging buffer after every call to AVIStreamRead.
If you do not need cross-platform functionality, the Video for Windows (VFW) API is a good alternative (http://msdn.microsoft.com/en-us/library/windows/desktop/dd756808(v=vs.85).aspx). I will not post an entire code block, since there is quite a lot to do, but you should be able to figure it out from the reference link. Basically, you call AVIFileOpen and then get the video stream via AVIFileGetStream with streamtypeVIDEO, or alternatively do both at once with AVIStreamOpenFromFile, and then read samples from the stream with AVIStreamRead. If you get to a point where you fail I can try to help, but it should be pretty straightforward.
Also, I'm not sure why ffmpeg is failing. I have been doing raw AVI reading with ffmpeg without any codecs involved. Can you post which call to ffmpeg actually fails?
EDIT:
Regarding the issue you are seeing where the read data size is 0: the AVI file has N slots for frames in each second, where N is the fps of the video. In real life the samples won't arrive at exactly that rate (e.g. with IP surveillance cameras), so the actual data sample indexes can be non-contiguous, like 1, 5, 11, ..., and VFW inserts empty samples between them (that is where you read a sample with a zero size). What you have to do is call AVIStreamRead with NULL as the buffer and 0 as the size until bRead is not 0 or you run past the last sample. When you get an actual size, you can call AVIStreamRead again on that sample index with the buffer pointer and size. I usually work with compressed video, so I don't use the suggested size, but going by your code snippet I would do something like this:
...
bRead = 0;
do
{
aviOpRes = AVIStreamRead(ppavi,smpS,1,NULL,0,&bRead,&smpN);
} while (bRead == 0 && ++smpS < si.dwLength + si.dwStart);
if(smpS >= si.dwLength + si.dwStart)
break;
PUCHAR tempBuffer = new UCHAR[bRead];
aviOpRes = AVIStreamRead(ppavi,smpS,1,tempBuffer,bRead,&bRead,&smpN);
/* do whatever you need */
delete[] tempBuffer;
...
EDIT 2:
Since this may come in handy for someone (or for you) when choosing between VFW and FFmpeg, I also updated your FFmpeg example so that it parses the same file (sorry for the code quality, since it lacks error checking, but I guess you can see the logical flow):
/* register all formats and codecs */
av_register_all();
AVFormatContext* fmt_ctx = NULL;
/* open input file, and allocate format context */
const char *src_filename = "E:\\Output.avi";
if (avformat_open_input(&fmt_ctx, src_filename, NULL, NULL) < 0) {
fprintf(stderr, "Could not open source file %s\n", src_filename);
abort();
}
/* retrieve stream information */
int res = avformat_find_stream_info(fmt_ctx, NULL);
if (res < 0) {
fprintf(stderr, "Could not find stream information\n");
abort();
}
int video_stream_index = 0; /* video stream is usually 0 but still better to look it up in case it's not present */
for(; video_stream_index < fmt_ctx->nb_streams; ++video_stream_index)
{
if(fmt_ctx->streams[video_stream_index]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
break;
}
if(video_stream_index == fmt_ctx->nb_streams)
abort();
AVPacket *packet = new AVPacket;
while(av_read_frame(fmt_ctx, packet) == 0)
{
if (packet->stream_index == video_stream_index)
printf("Sample nr %lld\n", (long long)packet->pts); /* pts is a 64-bit value */
av_free_packet(packet);
}
Basically you open the context and read packets from it. You will get both audio and video packets so you should check if the packet belongs to the stream of interest. FFMPEG will save you the trouble with empty frames and give only those samples that have data in them.

How do I create a stereo MP3 file with the latest version of FFmpeg?

I'm updating my code from an older version of ffmpeg (53) to a newer one (54/55). Code that used to work has now been deprecated or removed, so I'm having problems updating it.
Previously I could create a stereo MP3 file using a sample format called:
SAMPLE_FMT_S16
That matched up perfectly with my source stream. This has now been replaced with
AV_SAMPLE_FMT_S16
Which works fine for mono recordings but when I try to create a stereo MP3 file it bugs out at avcodec_open2 with:
"Specified sample_fmt is not supported."
Through trial and error I've found that using
AV_SAMPLE_FMT_S16P
...is accepted by avcodec_open2 but when I get through and create the MP3 file the sound is very distorted - it sounds about 2 octaves lower than usual with a massive hum in the background - here's an example recording:
http://hosting.ispyconnect.com/example.mp3
I've been told by the ffmpeg guys that this is because I now need to manually deinterleave my byte stream before calling:
avcodec_fill_audio_frame
How do I do that? I've tried using the swrescale library without success, and I've tried manually feeding L/R data into avcodec_fill_audio_frame, but the results I'm getting sound exactly the same as without deinterleaving.
Here is my code for encoding:
void add_audio_sample( AudioWriterPrivateData^ data, BYTE* soundBuffer, int soundBufferSize)
{
libffmpeg::AVCodecContext* c = data->AudioStream->codec;
memcpy(data->AudioBuffer + data->AudioBufferSizeCurrent, soundBuffer, soundBufferSize);
data->AudioBufferSizeCurrent += soundBufferSize;
uint8_t* pSoundBuffer = (uint8_t *)data->AudioBuffer;
DWORD nCurrentSize = data->AudioBufferSizeCurrent;
libffmpeg::AVFrame *frame;
int got_packet;
int ret;
int size = libffmpeg::av_samples_get_buffer_size(NULL, c->channels,
data->AudioInputSampleSize,
c->sample_fmt, 1);
while( nCurrentSize >= size) {
frame=libffmpeg::avcodec_alloc_frame();
libffmpeg::avcodec_get_frame_defaults(frame);
frame->nb_samples = data->AudioInputSampleSize;
ret = libffmpeg::avcodec_fill_audio_frame(frame, c->channels, c->sample_fmt, pSoundBuffer, size, 1);
if (ret<0)
{
throw gcnew System::IO::IOException("error filling audio");
}
//audio_pts = (double)audio_st->pts.val * audio_st->time_base.num / audio_st->time_base.den;
libffmpeg::AVPacket pkt = { 0 };
libffmpeg::av_init_packet(&pkt);
ret = libffmpeg::avcodec_encode_audio2(c, &pkt, frame, &got_packet);
if (ret<0)
throw gcnew System::IO::IOException("error encoding audio");
if (got_packet) {
pkt.stream_index = data->AudioStream->index;
if (pkt.pts != AV_NOPTS_VALUE)
pkt.pts = libffmpeg::av_rescale_q(pkt.pts, c->time_base, c->time_base);
if (pkt.duration > 0)
pkt.duration = av_rescale_q(pkt.duration, c->time_base, c->time_base);
pkt.flags |= AV_PKT_FLAG_KEY;
if (libffmpeg::av_interleaved_write_frame(data->FormatContext, &pkt) != 0)
throw gcnew System::IO::IOException("unable to write audio frame.");
}
nCurrentSize -= size;
pSoundBuffer += size;
}
memcpy(data->AudioBuffer, data->AudioBuffer + data->AudioBufferSizeCurrent - nCurrentSize, nCurrentSize);
data->AudioBufferSizeCurrent = nCurrentSize;
}
Would love to hear any ideas - I've been trying to get this working for 3 days now :(
You don't want to advance pSoundBuffer if a frame hasn't been fully encoded (e.g. got_packet isn't set to true), as no memory has been written yet. Also, you are allocating a frame on every loop iteration: there's no need for that, you can re-use the same AVFrame over and over. Your code is also leaking, as you never free the AVFrame.
I wrote some code as part of MythTV that encodes audio to AC3.
It also does what you were looking for: it deinterleaves the content.
https://github.com/MythTV/mythtv/blob/476b2a826d43fca5e658ebe787c3cb1ec2334f98/mythtv/libs/libmyth/audio/audiooutputdigitalencoder.cpp#L178
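As a rough illustration of what deinterleaving means here (a minimal sketch, not the MythTV code linked above): interleaved 16-bit stereo is laid out as L0 R0 L1 R1 ..., while a planar format such as AV_SAMPLE_FMT_S16P expects all samples of one channel in one contiguous plane and all samples of the next channel in another. `deinterleave_s16` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits interleaved samples (L0 R0 L1 R1 ...) into one contiguous plane per
// channel. `frames` is the number of samples per channel.
std::vector<std::vector<int16_t>> deinterleave_s16(const int16_t* interleaved,
                                                   std::size_t frames,
                                                   int channels) {
    assert(channels > 0);
    std::vector<std::vector<int16_t>> planes(
        channels, std::vector<int16_t>(frames));
    for (std::size_t i = 0; i < frames; ++i) {
        for (int ch = 0; ch < channels; ++ch) {
            planes[ch][i] = interleaved[i * channels + ch];
        }
    }
    return planes;
}
```

Each resulting plane would then back one channel pointer of the audio frame passed to the encoder. libswresample's swr_convert() can perform the same S16 to S16P conversion if a library routine is preferred.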
I know this question is old, but for posterity: I'm working on some audio resampling code, and after I arrived at an audio sounding very similar to the mp3 the author linked, I identified the cause as being a mismatch in audio sampling rate between the input the resampler expects and the actual data.