Portaudio + Opus encoding / decoding audio input - c++

I'm working on a VoIP client using PortAudio and Opus.
I read audio from the microphone one frame at a time, then:
- encode each frame with Opus and put it in a list
- pop the first element from the list and decode it
- play it back with PortAudio
If I do the same thing without encoding, the sound is fine. With Opus the sound is bad: I can't understand the voice, which is a problem for a VoIP client.
HandlerOpus::HandlerOpus(int sample_rate, int num_channels)
{
this->num_channels = num_channels;
this->enc = opus_encoder_create(sample_rate, num_channels, OPUS_APPLICATION_VOIP, &this->error);
this->dec = opus_decoder_create(sample_rate, num_channels, &this->error);
opus_int32 rate;
opus_encoder_ctl(enc, OPUS_GET_BANDWIDTH(&rate));
this->encoded_data_size = rate;
}
HandlerOpus::~HandlerOpus(void)
{
opus_encoder_destroy(this->enc);
opus_decoder_destroy(this->dec);
}
unsigned char *HandlerOpus::encodeFrame(const float *frame, int frame_size)
{
unsigned char *compressed_buffer;
int ret;
compressed_buffer = new (unsigned char[this->encoded_data_size]);
ret = opus_encode_float(this->enc, frame, frame_size, compressed_buffer, this->encoded_data_size);
return (compressed_buffer);
}
float *HandlerOpus::decodeFrame(const unsigned char *data, int frame_size)
{
int ret;
float *frame = new (float[frame_size * this->num_channels]);
opus_packet_get_nb_channels(data);
ret = opus_decode_float(this->dec, data, this->encoded_data_size, frame, frame_size, 0);
return (frame);
}
I can't change the library I have to use Opus.
The sample rate is 48000 and the frames per buffer is 480 and I tried in mono and stereo.
What am I doing wrong?

I solved the problem myself by changing the configuration: the sample rate is now 24000 and the frames per buffer is still 480.
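As a sanity check on configurations like the two mentioned above, here is a minimal sketch that tests whether a frame size (in samples per channel) corresponds to one of the frame durations listed in the Opus API documentation for opus_encode: 2.5, 5, 10, 20, 40 or 60 ms. The function names are hypothetical and nothing here calls libopus. Note that 480 frames is a listed duration at both 48000 Hz (10 ms) and 24000 Hz (20 ms), so this check alone does not explain why the change helped.

```cpp
#include <cassert>

// Returns true if `frame_size` samples per channel at `sample_rate` Hz lasts
// 2.5, 5, 10, 20, 40 or 60 ms (hypothetical helper, does not call libopus).
bool is_valid_opus_frame_size(int sample_rate, int frame_size)
{
    assert(sample_rate > 0 && frame_size > 0);
    // A 2.5 ms frame satisfies 400 * frame_size == sample_rate, so the other
    // durations are integer multiples of that; no floating point is needed.
    const long long scaled = 400LL * frame_size;
    return scaled == 1LL * sample_rate    // 2.5 ms
        || scaled == 2LL * sample_rate    // 5 ms
        || scaled == 4LL * sample_rate    // 10 ms
        || scaled == 8LL * sample_rate    // 20 ms
        || scaled == 16LL * sample_rate   // 40 ms
        || scaled == 24LL * sample_rate;  // 60 ms
}

// Frame duration in milliseconds, for logging or debugging.
double frame_duration_ms(int sample_rate, int frame_size)
{
    return 1000.0 * frame_size / sample_rate;
}
```

For example, is_valid_opus_frame_size(48000, 480) and is_valid_opus_frame_size(24000, 480) are both true, while a frame size of 500 at 48000 Hz is rejected.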

It's 6 years later, but I'm going to post an answer for future googlers like me:
I had a very similar problem and fixed it by changing the PortAudio sample type to paInt32 and switching from opus_decode_float to plain opus_decode.
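One more thing that may be worth checking in the question's code, independently of the sample type: encoded_data_size is filled from OPUS_GET_BANDWIDTH, which reports the encoder's bandpass setting rather than a byte count, and decodeFrame passes that same value as the packet length. In the Opus API, opus_encode_float returns the number of bytes it wrote, and the len argument of opus_decode_float is expected to be that number. Whether this was the cause of the bad sound is an assumption. The sketch below only shows one way to keep each packet's real length next to its bytes; PacketQueue and its methods are hypothetical names and nothing here calls libopus.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// One encoded packet; its size() is the real packet length in bytes.
using Packet = std::vector<unsigned char>;

// Hypothetical FIFO of encoded packets that remembers each packet's length.
class PacketQueue {
public:
    // Copy exactly `length` bytes out of a scratch buffer, where `length` is
    // the (positive) value the encoder returned for this frame.
    void push(const unsigned char* scratch, int length)
    {
        assert(scratch != nullptr && length > 0);
        packets_.emplace_back(scratch, scratch + length);
    }

    bool empty() const { return packets_.empty(); }

    // Remove and return the oldest packet; pass data() and size() to the decoder.
    Packet pop()
    {
        assert(!packets_.empty());
        Packet front = std::move(packets_.front());
        packets_.pop_front();
        return front;
    }

private:
    std::deque<Packet> packets_;
};
```

With this layout the encoder writes into a fixed scratch buffer, the return value is passed to push(), and the decoder side receives a packet whose size() matches what was actually encoded.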

Related

Sound playback using FFmpeg and libsoundio in c++

I am trying to make a video player desktop application in C++ using primarily FFmpeg and Qt6. For now, I can decode and play video frames correctly at the right speed, so that is not a problem. I am now trying to play back audio, which is much harder than I expected. I am using libsoundio as my audio library, but the documentation is really poor and there are not many examples or tutorials for it. I am also a beginner when it comes to audio programming, although I understand the basics. First off, if anyone can recommend an audio library for this type of job, let me know, but I would like to use open source libraries. Anyway, here is how I decode my audio data with FFmpeg. I'm not sure if I am doing it correctly, as I could barely find documentation on that either...
I have a struct that contains all the information which is initiated through a function:
struct VideoReader
{
bool valid;
int width, height;
int video_stream_index;
int audio_stream_index;
AVRational time_base;
AVFormatContext* av_format_ctx;
AVCodecContext* av_vi_codec_ctx;
AVCodecContext* av_au_codec_ctx;
AVPacket* packet;
AVFrame* frame;
SwsContext* sws_ctx;
SwrContext* swr_ctx;
};
The function that initiates it is quite long and not necessary to share, but it populates all those values except for sws_ctx and swr_ctx.
Here is how I decode packets. This function is simplified: I left the video decoding out of it, and I'll take care of syncing once I can properly play back audio:
bool video_reader_read_au_frame(VideoReader *video_reader, unsigned char **frame_buffer)
{
// Unpack video_reader
auto& av_format_ctx = video_reader->av_format_ctx;
auto& av_codec_ctx = video_reader->av_au_codec_ctx;
auto& av_packet = video_reader->packet;
auto& av_frame = video_reader->frame;
auto& swr_ctx = video_reader->swr_ctx;
int& audio_stream_index = video_reader->audio_stream_index;
// Decode the audio frame data
int response;
while (av_read_frame(av_format_ctx, av_packet) >= 0)
{
last_frame = false;
if (av_packet->stream_index != audio_stream_index)
{
av_packet_unref(av_packet);
continue;
}
response = avcodec_send_packet(av_codec_ctx, av_packet);
if (response < 0)
{
Logger::error("Could not decode packet.");
return false;
}
response = avcodec_receive_frame(av_codec_ctx, av_frame);
if (response == AVERROR(EAGAIN) || response == AVERROR_EOF)
{
av_packet_unref(av_packet);
continue;
}
else if (response < 0)
{
Logger::error("Could not decode packet.");
return false;
}
av_packet_unref(av_packet);
break;
}
// Initialize SwrContext
if (!swr_ctx) {
swr_ctx = swr_alloc_set_opts(nullptr,
av_codec_ctx->channel_layout, AV_SAMPLE_FMT_FLT,
av_codec_ctx->sample_rate, av_codec_ctx->channel_layout,
av_codec_ctx->sample_fmt, av_codec_ctx->sample_rate,
0, nullptr);
if (!swr_ctx)
{
Logger::error("Could not create SwrContext.");
return false;
}
if (swr_init(swr_ctx) < 0)
{
Logger::error("Could not initialize SwrContext.");
return false;
}
}
const int MAX_BUFFER_SIZE = av_samples_get_buffer_size(nullptr, av_frame->channels, av_frame->nb_samples, AV_SAMPLE_FMT_FLT, 1);
*frame_buffer = (unsigned char*)av_malloc(MAX_BUFFER_SIZE);
swr_convert(swr_ctx, frame_buffer, av_frame->nb_samples,
(const unsigned char**)av_frame->data, av_frame->nb_samples);
av_frame_unref(av_frame);
return true;
}
Here is how I would normally call this function:
VideoReader vr{};
if(!video_reader_open(&vr, "C:/Path/to/file.mp4"))
{
Logger::error("Could not initialize VideoReader.");
return 1;
}
unsigned char* buffer;
if(!video_reader_read_au_frame(&vr, &buffer))
{
Logger::error("Could not read audio data.");
return 1;
}
play_audio(&buffer); // <-- Find a way to play audio once buffer has data in it
video_reader_close(&vr);
return 0;
Obviously I will loop over video_reader_read_au_frame(&vr, &buffer) to play back the whole video.
I believe my code puts the samples from the decoded frame in buffer, but I am really not sure. I am also unsure whether I need to convert to the AV_SAMPLE_FMT_FLT audio format, to something else, or just leave it as it is. For libsoundio, I kind of understand this example: http://libsound.io/ but I'm not sure I fully understand how this library works, especially the callback function. I know I have to pass buffer in outstream->userdata as a void pointer, but I don't know how to use it in the callback function. Any help or guidance would be greatly appreciated. Note that later on in this project I might want to send this data over a network to play the video on another computer in sync.
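On the question of how a callback can use a void pointer to user data, here is a minimal, library-independent sketch. PlaybackState and fill_output are hypothetical names, not part of libsoundio; the idea is only that the pointer is cast back to a known struct and samples are copied from it into whatever output area the library hands to the callback. It assumes interleaved float samples and has no thread synchronization, which a real audio callback running on its own thread would need (for example via a lock-free ring buffer).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical state object whose address would be stored as the user data.
struct PlaybackState {
    std::vector<float> samples; // interleaved float samples waiting to be played
    std::size_t read_pos = 0;   // index of the next unread sample
};

// Hypothetical callback body: copy up to `count` samples into `out`, pad the
// remainder with silence, and return how many real samples were copied.
std::size_t fill_output(void* userdata, float* out, std::size_t count)
{
    assert(userdata != nullptr && out != nullptr);
    auto* state = static_cast<PlaybackState*>(userdata);
    const std::size_t available = state->samples.size() - state->read_pos;
    const std::size_t n = std::min(count, available);
    std::copy_n(state->samples.begin() + state->read_pos, n, out);
    std::fill(out + n, out + count, 0.0f); // underrun: output silence
    state->read_pos += n;
    return n;
}
```

The decoding loop would append converted samples to the state object, and the callback would drain it at whatever pace the device requests.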

Crash trying to convert PCM to MP3 using AudioKit

I am trying to convert in real time the audio from my iPhone mic to MP3.
I have it set up as follows:
let format = AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatInt16,
sampleRate: 44100.0,
channels: 1,
interleaved: true)
mic.avAudioUnitOrNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount((format?.sampleRate)!), format: format, block: { (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
let audioBuffer : AVAudioBuffer = buffer
self.audioProcessor?.processBuffer( audioBuffer.mutableAudioBufferList)
})
-(void)processBuffer: (AudioBufferList*) audioBufferList;
{
const int PCM_SIZE = 8192;
const int MP3_SIZE = 8192;
short int pcm_buffer[PCM_SIZE*2];
unsigned char mp3_buffer[MP3_SIZE];
int write = lame_encode_buffer_interleaved(mLame, pcm_buffer,(int*) audioBufferList->mBuffers[0].mData, mp3_buffer, MP3_SIZE);
//some other stuff
}
but I am getting a crash as soon as I get to the encoding portion.
EDIT:
I got it to stop crashing, but the audio quality is pretty harsh:
int size = audioBufferList->mBuffers[0].mDataByteSize / 2;
unsigned char mp3_buffer[size * 4];
int write = lame_encode_buffer(mLame, audioBufferList->mBuffers[0].mData, audioBufferList->mBuffers[0].mData, size, mp3_buffer, size*4);
There was a mismatch in sampling rate between the source audio and the encoder.
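To put a number on what a sampling-rate mismatch does, here is a small sketch with hypothetical helper names. It assumes the samples themselves are untouched and only the rate they are interpreted at is wrong, in which case both speed and pitch scale by the ratio of the two rates.

```cpp
#include <cassert>
#include <cmath>

// Speed factor heard when audio sampled at `source_rate` is treated as if it
// were sampled at `assumed_rate`: above 1 means faster and higher pitched.
double playback_speed_factor(double source_rate, double assumed_rate)
{
    assert(source_rate > 0.0 && assumed_rate > 0.0);
    return assumed_rate / source_rate;
}

// The same mismatch expressed as a pitch shift in octaves (negative = lower).
double pitch_shift_octaves(double source_rate, double assumed_rate)
{
    return std::log2(playback_speed_factor(source_rate, assumed_rate));
}
```

For instance, 48000 Hz audio interpreted as 24000 Hz plays at half speed, one octave lower.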

Playing audio without freezing draw loop in openGL

I'm working on a project in OpenGL, and it needs to be able to play simple sounds (MP3) from file without interrupting the draw loop.
I've been playing around with a few different libraries (OpenAL, PortAudio) and eventually settled on mpg123 to load the MP3 and libao to play it back.
The current playSound function works, but it blocks the OpenGL draw loop (i.e. freezes the game) until the audio has finished playing. I have tried messing around with std::thread, but it still blocked the draw loop.
Here is the audio playback function I've been testing with:
void playSound() {
mpg123_handle *mh;
unsigned char *buffer;
size_t buffer_size;
size_t done;
int err;
int driver;
ao_device *dev;
ao_sample_format format;
int channels, encoding;
long rate;
/* initializations */
ao_initialize();
driver = ao_default_driver_id();
mpg123_init();
mh = mpg123_new(NULL, &err);
buffer_size = mpg123_outblock(mh);
buffer = (unsigned char*) malloc(buffer_size * sizeof(unsigned char));
/* open the file and get the decoding format */
mpg123_open(mh, "sounds/door.mp3");
mpg123_getformat(mh, &rate, &channels, &encoding);
/* set the output format and open the output device */
format.bits = mpg123_encsize(encoding) * 8;
format.rate = rate;
format.channels = channels;
format.byte_format = AO_FMT_NATIVE;
format.matrix = 0;
dev = ao_open_live(driver, &format, NULL);
/* decode and play */
while (mpg123_read(mh, buffer, buffer_size, &done) == MPG123_OK)
ao_play(dev, (char*)buffer, done);
/* clean up */
free(buffer);
ao_close(dev);
mpg123_close(mh);
mpg123_delete(mh);
mpg123_exit();
ao_shutdown();
}
How would I go about fixing this so that my game continues to run smoothly and the audio plays in the background?
You should unpack a small amount of audio data and feed it to the audio device every frame.
The main trick is to find out how many samples the device has already played. I'm not sure how you can do this with libao, but it's pretty simple with OpenAL.
You can find details in "Play stream in OpenAL library".
Alternatively, you can always use an additional thread. It's overkill, but very simple to do and can work fine for a small or demo project.
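The additional-thread option could look roughly like the sketch below. BackgroundSound is a hypothetical wrapper, not part of libao or mpg123: it runs any blocking function (such as the question's playSound) on its own std::thread and joins that thread only in the destructor. One possible reason a std::thread attempt still blocks is calling join() right after starting the thread, but that is an assumption, since the question does not show that code.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical wrapper: runs a blocking playback function on a worker thread
// so the caller (e.g. the draw loop) can keep going in the meantime.
class BackgroundSound {
public:
    explicit BackgroundSound(std::function<void()> blocking_play)
        : worker_([this, play = std::move(blocking_play)] {
              play();           // e.g. a playSound()-style blocking function
              finished_ = true; // tells the caller that playback is over
          })
    {
    }

    ~BackgroundSound()
    {
        if (worker_.joinable())
            worker_.join(); // waits at destruction, not inside the draw loop
    }

    bool finished() const { return finished_; }

private:
    std::atomic<bool> finished_{false}; // declared before worker_ on purpose
    std::thread worker_;
};
```

A game would create the object when the sound should start (for example BackgroundSound door(playSound);), keep rendering, and destroy it once finished() returns true or at shutdown.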

FFmpeg + OpenAL - playback streaming sound from video won't work

I am decoding an OGG video (theora & vorbis as codecs) and want to show it on the screen (using Ogre 3D) while playing its sound. I can decode the image stream just fine and the video plays perfectly with the correct frame rate, etc.
However, I cannot get the sound to play at all with OpenAL.
Edit: I managed to make the playing sound resemble the actual audio in the video at least somewhat. Updated sample code.
Edit 2: I was able to get "almost" correct sound now. I had to set OpenAL to use AL_FORMAT_STEREO_FLOAT32 (after initializing the extension) instead of just STEREO16. Now the sound is "only" extremely high pitched and stuttering, but at the correct speed.
Here is how I decode audio packets (in a background thread, the equivalent works just fine for the image stream of the video file):
//------------------------------------------------------------------------------
int decodeAudioPacket( AVPacket& p_packet, AVCodecContext* p_audioCodecContext, AVFrame* p_frame,
FFmpegVideoPlayer* p_player, VideoInfo& p_videoInfo)
{
// Decode audio frame
int got_frame = 0;
int decoded = avcodec_decode_audio4(p_audioCodecContext, p_frame, &got_frame, &p_packet);
if (decoded < 0)
{
p_videoInfo.error = "Error decoding audio frame.";
return decoded;
}
// Frame is complete, store it in audio frame queue
if (got_frame)
{
int bufferSize = av_samples_get_buffer_size(NULL, p_audioCodecContext->channels, p_frame->nb_samples,
p_audioCodecContext->sample_fmt, 0);
int64_t duration = p_frame->pkt_duration;
int64_t dts = p_frame->pkt_dts;
if (staticOgreLog)
{
staticOgreLog->logMessage("Audio frame bufferSize / duration / dts: "
+ boost::lexical_cast<std::string>(bufferSize) + " / "
+ boost::lexical_cast<std::string>(duration) + " / "
+ boost::lexical_cast<std::string>(dts), Ogre::LML_NORMAL);
}
// Create the audio frame
AudioFrame* frame = new AudioFrame();
frame->dataSize = bufferSize;
frame->data = new uint8_t[bufferSize];
if (p_frame->channels == 2)
{
memcpy(frame->data, p_frame->data[0], bufferSize >> 1);
memcpy(frame->data + (bufferSize >> 1), p_frame->data[1], bufferSize >> 1);
}
else
{
memcpy(frame->data, p_frame->data, bufferSize);
}
double timeBase = ((double)p_audioCodecContext->time_base.num) / (double)p_audioCodecContext->time_base.den;
frame->lifeTime = duration * timeBase;
p_player->addAudioFrame(frame);
}
return decoded;
}
So, as you can see, I decode the frame and memcpy it to my own struct, AudioFrame. When the sound is played, I use these audio frames like this:
int numBuffers = 4;
ALuint buffers[4];
alGenBuffers(numBuffers, buffers);
ALenum success = alGetError();
if(success != AL_NO_ERROR)
{
CONSOLE_LOG("Error on alGenBuffers : " + Ogre::StringConverter::toString(success) + alGetString(success));
return;
}
// Fill a number of data buffers with audio from the stream
std::vector<AudioFrame*> audioBuffers;
std::vector<unsigned int> audioBufferSizes;
unsigned int numReturned = FFMPEG_PLAYER->getDecodedAudioFrames(numBuffers, audioBuffers, audioBufferSizes);
// Assign the data buffers to the OpenAL buffers
for (unsigned int i = 0; i < numReturned; ++i)
{
alBufferData(buffers[i], _streamingFormat, audioBuffers[i]->data, audioBufferSizes[i], _streamingFrequency);
success = alGetError();
if(success != AL_NO_ERROR)
{
CONSOLE_LOG("Error on alBufferData : " + Ogre::StringConverter::toString(success) + alGetString(success)
+ " size: " + Ogre::StringConverter::toString(audioBufferSizes[i]));
return;
}
}
// Queue the buffers into OpenAL
alSourceQueueBuffers(_source, numReturned, buffers);
success = alGetError();
if(success != AL_NO_ERROR)
{
CONSOLE_LOG("Error queuing streaming buffers: " + Ogre::StringConverter::toString(success) + alGetString(success));
return;
}
}
alSourcePlay(_source);
The format and frequency I give to OpenAL are AL_FORMAT_STEREO_FLOAT32 (it is a stereo sound stream, and I did initialize the FLOAT32 extension) and 48000 (which is the sample rate of the AVCodecContext of the audio stream).
And during playback, I do the following to refill OpenAL's buffers:
ALint numBuffersProcessed;
// Check if OpenAL is done with any of the queued buffers
alGetSourcei(_source, AL_BUFFERS_PROCESSED, &numBuffersProcessed);
if(numBuffersProcessed <= 0)
return;
// Fill a number of data buffers with audio from the stream
std::vector<AudioFrame*> audioBuffers;
std::vector<unsigned int> audioBufferSizes;
unsigned int numFilled = FFMPEG_PLAYER->getDecodedAudioFrames(numBuffersProcessed, audioBuffers, audioBufferSizes);
// Assign the data buffers to the OpenAL buffers
ALuint buffer;
for (unsigned int i = 0; i < numFilled; ++i)
{
// Pop the oldest queued buffer from the source,
// fill it with the new data, then re-queue it
alSourceUnqueueBuffers(_source, 1, &buffer);
ALenum success = alGetError();
if(success != AL_NO_ERROR)
{
CONSOLE_LOG("Error Unqueuing streaming buffers: " + Ogre::StringConverter::toString(success));
return;
}
alBufferData(buffer, _streamingFormat, audioBuffers[i]->data, audioBufferSizes[i], _streamingFrequency);
success = alGetError();
if(success != AL_NO_ERROR)
{
CONSOLE_LOG("Error on re- alBufferData: " + Ogre::StringConverter::toString(success));
return;
}
alSourceQueueBuffers(_source, 1, &buffer);
success = alGetError();
if(success != AL_NO_ERROR)
{
CONSOLE_LOG("Error re-queuing streaming buffers: " + Ogre::StringConverter::toString(success) + " "
+ alGetString(success));
return;
}
}
// Make sure the source is still playing,
// and restart it if needed.
ALint playStatus;
alGetSourcei(_source, AL_SOURCE_STATE, &playStatus);
if(playStatus != AL_PLAYING)
alSourcePlay(_source);
As you can see, I do quite heavy error checking. But I do not get any errors, neither from OpenAL nor from FFmpeg.
Edit: What I hear somewhat resembles the actual audio from the video, but VERY high pitched and stuttering VERY much. Also, it seems to be playing on top of TV noise. Very strange. Plus, it is playing much slower than the correct audio would.
Edit 2: After using AL_FORMAT_STEREO_FLOAT32, the sound plays at the correct speed, but is still very high pitched and stuttering (though less than before).
The video itself is not broken; it plays fine in any player. OpenAL can also play *.wav files just fine in the same application, so it is working too.
Any ideas what could be wrong here or how to do this correctly?
My only guess is that somehow FFmpeg's decode function does not produce data OpenAL can read. But this is as far as the FFmpeg decode example goes, so I don't know what's missing. As I understand it, the decode_audio4 function decodes the frame to raw data, and OpenAL should be able to work with raw data (or rather, doesn't work with anything else).
So, I finally figured out how to do it. Gee, what a mess. It was a hint from a user on the libav-users mailing list that put me on the correct path.
Here are my mistakes:
Using the wrong format in the alBufferData function. I used AL_FORMAT_STEREO16 (as that is what every single streaming example with OpenAL uses). I should have used AL_FORMAT_STEREO_FLOAT32, as the video I stream is Ogg and the Vorbis audio is decoded to floating-point samples. Using swr_convert to convert from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16 just crashed, and I have no idea why.
Not using swr_convert to convert the decoded audio frame to the target format. After my attempt to convert from FLTP to S16 with swr_convert simply crashed without giving a reason, I assumed it was broken. But after figuring out my first mistake, I tried again, converting from FLTP to FLT (non-planar), and then it worked! So OpenAL uses an interleaved format, not a planar one. Good to know.
So here is the decodeAudioPacket function that is working for me with Ogg video, vorbis audio stream:
int decodeAudioPacket( AVPacket& p_packet, AVCodecContext* p_audioCodecContext, AVFrame* p_frame,
SwrContext* p_swrContext, uint8_t** p_destBuffer, int p_destLinesize,
FFmpegVideoPlayer* p_player, VideoInfo& p_videoInfo)
{
// Decode audio frame
int got_frame = 0;
int decoded = avcodec_decode_audio4(p_audioCodecContext, p_frame, &got_frame, &p_packet);
if (decoded < 0)
{
p_videoInfo.error = "Error decoding audio frame.";
return decoded;
}
if(decoded <= p_packet.size)
{
/* Move the unread data to the front and clear the end bits */
int remaining = p_packet.size - decoded;
memmove(p_packet.data, &p_packet.data[decoded], remaining);
av_shrink_packet(&p_packet, remaining);
}
// Frame is complete, store it in audio frame queue
if (got_frame)
{
int outputSamples = swr_convert(p_swrContext,
p_destBuffer, p_destLinesize,
(const uint8_t**)p_frame->extended_data, p_frame->nb_samples);
int bufferSize = av_get_bytes_per_sample(AV_SAMPLE_FMT_FLT) * p_videoInfo.audioNumChannels
* outputSamples;
int64_t duration = p_frame->pkt_duration;
int64_t dts = p_frame->pkt_dts;
if (staticOgreLog)
{
staticOgreLog->logMessage("Audio frame bufferSize / duration / dts: "
+ boost::lexical_cast<std::string>(bufferSize) + " / "
+ boost::lexical_cast<std::string>(duration) + " / "
+ boost::lexical_cast<std::string>(dts), Ogre::LML_NORMAL);
}
// Create the audio frame
AudioFrame* frame = new AudioFrame();
frame->dataSize = bufferSize;
frame->data = new uint8_t[bufferSize];
memcpy(frame->data, p_destBuffer[0], bufferSize);
double timeBase = ((double)p_audioCodecContext->time_base.num) / (double)p_audioCodecContext->time_base.den;
frame->lifeTime = duration * timeBase;
p_player->addAudioFrame(frame);
}
return decoded;
}
And here is how I initialize the context and the destination buffer:
// Initialize SWR context
SwrContext* swrContext = swr_alloc_set_opts(NULL,
audioCodecContext->channel_layout, AV_SAMPLE_FMT_FLT, audioCodecContext->sample_rate,
audioCodecContext->channel_layout, audioCodecContext->sample_fmt, audioCodecContext->sample_rate,
0, NULL);
int result = swr_init(swrContext);
// Create destination sample buffer
uint8_t** destBuffer = NULL;
int destBufferLinesize;
av_samples_alloc_array_and_samples( &destBuffer,
&destBufferLinesize,
videoInfo.audioNumChannels,
2048,
AV_SAMPLE_FMT_FLT,
0);
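The planar versus interleaved distinction in the answer above can be illustrated without FFmpeg. In a planar layout such as AV_SAMPLE_FMT_FLTP every channel has its own array, while an interleaved layout such as AV_SAMPLE_FMT_FLT alternates the channels sample by sample. The sketch below, with a hypothetical interleave helper, shows the reordering for plain float vectors. It is not a replacement for swr_convert, which the answer uses for this step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert planar audio (one vector per channel: LLLL..., RRRR...) into the
// interleaved layout (LRLRLR...). Hypothetical helper for illustration only.
std::vector<float> interleave(const std::vector<std::vector<float>>& planes)
{
    assert(!planes.empty());
    const std::size_t channels = planes.size();
    const std::size_t frames = planes[0].size();
    std::vector<float> out(channels * frames);
    for (std::size_t ch = 0; ch < channels; ++ch) {
        assert(planes[ch].size() == frames); // all channels must be equally long
        for (std::size_t i = 0; i < frames; ++i)
            out[i * channels + ch] = planes[ch][i];
    }
    return out;
}
```

Copying the two planes one after the other, as the first version of decodeAudioPacket did with memcpy, produces LLLL...RRRR... instead, which a consumer expecting interleaved data will not read correctly.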

how do i create a stereo mp3 file with latest version of ffmpeg?

I'm updating my code from the older version of ffmpeg (53) to the newer (54/55). Code that did work has now been deprecated or removed, so I'm having problems updating it.
Previously I could create a stereo MP3 file using a sample format called:
SAMPLE_FMT_S16
That matched up perfectly with my source stream. This has now been replaced with
AV_SAMPLE_FMT_S16
Which works fine for mono recordings but when I try to create a stereo MP3 file it bugs out at avcodec_open2 with:
"Specified sample_fmt is not supported."
Through trial and error I've found that using
AV_SAMPLE_FMT_S16P
...is accepted by avcodec_open2 but when I get through and create the MP3 file the sound is very distorted - it sounds about 2 octaves lower than usual with a massive hum in the background - here's an example recording:
http://hosting.ispyconnect.com/example.mp3
I've been told by the ffmpeg guys that this is because I now need to manually deinterleave my byte stream before calling:
avcodec_fill_audio_frame
How do I do that? I've tried using the swrescale library without success, and I've tried manually feeding L/R data into avcodec_fill_audio_frame, but the results I'm getting sound exactly the same as without deinterleaving.
Here is my code for encoding:
void add_audio_sample( AudioWriterPrivateData^ data, BYTE* soundBuffer, int soundBufferSize)
{
libffmpeg::AVCodecContext* c = data->AudioStream->codec;
memcpy(data->AudioBuffer + data->AudioBufferSizeCurrent, soundBuffer, soundBufferSize);
data->AudioBufferSizeCurrent += soundBufferSize;
uint8_t* pSoundBuffer = (uint8_t *)data->AudioBuffer;
DWORD nCurrentSize = data->AudioBufferSizeCurrent;
libffmpeg::AVFrame *frame;
int got_packet;
int ret;
int size = libffmpeg::av_samples_get_buffer_size(NULL, c->channels,
data->AudioInputSampleSize,
c->sample_fmt, 1);
while( nCurrentSize >= size) {
frame=libffmpeg::avcodec_alloc_frame();
libffmpeg::avcodec_get_frame_defaults(frame);
frame->nb_samples = data->AudioInputSampleSize;
ret = libffmpeg::avcodec_fill_audio_frame(frame, c->channels, c->sample_fmt, pSoundBuffer, size, 1);
if (ret<0)
{
throw gcnew System::IO::IOException("error filling audio");
}
//audio_pts = (double)audio_st->pts.val * audio_st->time_base.num / audio_st->time_base.den;
libffmpeg::AVPacket pkt = { 0 };
libffmpeg::av_init_packet(&pkt);
ret = libffmpeg::avcodec_encode_audio2(c, &pkt, frame, &got_packet);
if (ret<0)
throw gcnew System::IO::IOException("error encoding audio");
if (got_packet) {
pkt.stream_index = data->AudioStream->index;
if (pkt.pts != AV_NOPTS_VALUE)
pkt.pts = libffmpeg::av_rescale_q(pkt.pts, c->time_base, c->time_base);
if (pkt.duration > 0)
pkt.duration = av_rescale_q(pkt.duration, c->time_base, c->time_base);
pkt.flags |= AV_PKT_FLAG_KEY;
if (libffmpeg::av_interleaved_write_frame(data->FormatContext, &pkt) != 0)
throw gcnew System::IO::IOException("unable to write audio frame.");
}
nCurrentSize -= size;
pSoundBuffer += size;
}
memcpy(data->AudioBuffer, data->AudioBuffer + data->AudioBufferSizeCurrent - nCurrentSize, nCurrentSize);
data->AudioBufferSizeCurrent = nCurrentSize;
}
Would love to hear any ideas - I've been trying to get this working for 3 days now :(
You don't want to advance pSoundBuffer if a frame hasn't been fully encoded (e.g. got_packet isn't set to true), as no memory has been written yet. Also, you are allocating a frame on each loop iteration: there's no need for that, you can reuse the same AVFrame over and over. Your code also leaks, as you never free the AVFrame.
I wrote some code as part of MythTV that encodes audio to AC3.
It also does what you are looking for: it deinterleaves the content.
https://github.com/MythTV/mythtv/blob/476b2a826d43fca5e658ebe787c3cb1ec2334f98/mythtv/libs/libmyth/audio/audiooutputdigitalencoder.cpp#L178
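For readers who only need the idea of deinterleaving, here is a minimal sketch that is independent of FFmpeg and of the MythTV code linked above. It splits interleaved 16-bit samples into one plane per channel, which is the layout a planar format such as AV_SAMPLE_FMT_S16P describes. The deinterleave helper is a hypothetical name, and attaching the resulting planes to an AVFrame is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split interleaved 16-bit samples (L R L R ...) into one vector per channel.
// `frames` is the number of samples per channel. Hypothetical helper.
std::vector<std::vector<std::int16_t>> deinterleave(const std::int16_t* interleaved,
                                                    std::size_t frames,
                                                    std::size_t channels)
{
    assert(interleaved != nullptr && channels > 0);
    std::vector<std::vector<std::int16_t>> planes(
        channels, std::vector<std::int16_t>(frames));
    for (std::size_t i = 0; i < frames; ++i)
        for (std::size_t ch = 0; ch < channels; ++ch)
            planes[ch][i] = interleaved[i * channels + ch];
    return planes;
}
```

For stereo input, planes[0] then holds the left channel and planes[1] the right channel, each frames samples long.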
I know this question is old, but for posterity: I'm working on some audio resampling code, and after I ended up with audio that sounded very similar to the MP3 the author linked, I identified the cause as a mismatch between the sampling rate the resampler expects for its input and the rate of the actual data.