VLC: huge buffering times over RTP for a local H264 stream - C++

I'm outputting an H264 stream, encoded by my application using ffmpeg. I can display it using ffplay, but when trying to view the stream in VLC, I only get the first frame, or so it appears.
VLC's message output shows that it is "buffering", taking around a minute to reach 100%, at which point the frame updates.
When using ffplay, the latency is about 50-100ms at worst.
I am sending to rtp://127.0.0.1:6666?pkt_size=1316 with the format rtp_mpegts.
I am new to this and it's highly likely I haven't set things up completely correctly. The process is (minus declarations and error checking):
codec_name = "libx264";
codec = avcodec_find_encoder_by_name(codec_name.c_str());
context = avcodec_alloc_context3(codec);
pkt = av_packet_alloc();
context->bit_rate = 5 * Mega;
context->width = info.DisplayWidth;
context->height = info.DisplayHeight;
context->time_base = { 1, FPS };
context->framerate = { FPS, 1 };
context->gop_size = 100;
context->max_b_frames = 1;
context->pix_fmt = AV_PIX_FMT_YUV420P;
if (codec->id == AV_CODEC_ID_H264)
{
check_ret("set option: preset", av_opt_set(context->priv_data, "preset", "fast", 0));
check_ret("set option: tune", av_opt_set(context->priv_data, "tune", "zerolatency", 0));
check_ret("set option: profile", av_opt_set(context->priv_data, "profile", "baseline", 0));
}
check_ret("open codec", avcodec_open2(context, codec, NULL));
// setup the stream
fmt = (AVOutputFormat*)av_guess_format("rtp_mpegts", NULL, NULL);
avformat_alloc_output_context2(&avfctx, fmt, fmt->name,
                               "rtp://127.0.0.1:6666?pkt_size=1316");
avio_open(&avfctx->pb, avfctx->url, AVIO_FLAG_WRITE);
AVStream* stream = avformat_new_stream(avfctx, codec);
avcodec_parameters_from_context(stream->codecpar, context);
stream->time_base.num = 1;
stream->time_base.den = FPS;
avformat_write_header(avfctx, NULL);
// then the encoding (in an output loop)
<not shown: get frame from swapchain, sws_scale from rgba to yuv>
yuvFrame->pts = i++; // i is incremented every frame
avcodec_send_frame(enc_ctx, yuvFrame);
while (ret >= 0) {
    ret = avcodec_receive_packet(enc_ctx, pkt);
    //ret = av_interleaved_write_frame(avfctx, pkt); was using this, don't seem to need it
    ret = av_write_frame(avfctx, pkt);
    av_packet_unref(pkt);
}
The VLC output looks like this:
main debug: using hw decoder module "d3d11va"
avcodec info: Using D3D11VA (NVIDIA GeForce RTX 2080 Super with Max-Q Design, vendor 10de(NVIDIA), device 1e93, revision a1) for hardware decoding
qt debug: Logical video size: 1280x720
main debug: resized to 1280x720
main debug: VoutDisplayEvent 'resize' 1280x720
main debug: Received first picture
main debug: Buffering 1%
main debug: Buffering 2%
main debug: Buffering 3%
main debug: auto hiding mouse cursor
main debug: Buffering 4%
main debug: Buffering 5%
main debug: Buffering 6%
main debug: Buffering 7%
main debug: Buffering 8%
main debug: Buffering 9%
main debug: Buffering 10%
main debug: auto hiding mouse cursor
main debug: Buffering 11%
rtp warning: 1 packet(s) lost
rtp warning: 1 packet(s) lost
rtp warning: 1 packet(s) lost
ts warning: discontinuity received 0x3 instead of 0xd (pid=256)
ts warning: discontinuity received 0x5 instead of 0xf (pid=256)
ts warning: discontinuity received 0x1 instead of 0xb (pid=256)
main debug: Buffering 12%
main debug: Buffering 13%
main debug: Buffering 14%
main debug: Buffering 15%
main debug: Buffering 16%
main debug: Buffering 17%
main debug: Buffering 18%
main debug: auto hiding mouse cursor
main debug: Buffering 19%
main debug: Buffering 20%

The problem with my approach above was that it was based on the ffmpeg example encode_video.c, with some bits for stream output borrowed from Google results.
Thanks to @rotem I started putting together a standalone executable and stumbled on the example muxing.c in the ffmpeg examples.
That let me find the steps I was missing: set the stream index on the packet, and rescale the timestamps:
av_packet_rescale_ts(pkt, context->time_base, stream->time_base);
pkt->stream_index = stream->index;
int ret = av_interleaved_write_frame(avfctx, pkt);
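In full, the corrected drain loop looks roughly like this (a sketch assembled from the snippets above; enc_ctx and context refer to the same encoder context, and real code should log the error cases):

// Corrected drain loop (sketch): check the receive status before writing,
// rescale timestamps, and set the stream index.
avcodec_send_frame(enc_ctx, yuvFrame);
while (true) {
    int ret = avcodec_receive_packet(enc_ctx, pkt);
    if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
        break;              // no complete packet yet / encoder fully drained
    else if (ret < 0)
        break;              // real error; handle or log as needed
    av_packet_rescale_ts(pkt, context->time_base, stream->time_base);
    pkt->stream_index = stream->index;
    av_interleaved_write_frame(avfctx, pkt); // unrefs the packet itself
}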

Related

FFmpeg library: transcode raw images to an h264 stream; the output file does not contain pts and dts info

I am trying to use the ffmpeg C++ library to convert several raw YUYV images to an H264 stream. The images come from memory and are passed as strings at about 24 fps. I do the conversion in the following steps:
init AVFormatContext, AVCodec, AVCodecContext and create a new AVStream. For this step I mainly referred to ffmpeg-libav-tutorial, and the AVFormatContext uses a customized write_buffer() function (refer to simplest_ffmpeg_mem_handler).
receive the raw frame data, set width and height (1920x1080), and set pts and dts. Here I manually set the output fps to 24 and use a global counter for the number of frames; the pts is calculated from this counter. Code snippet (video_avs is the AVStream, output_fps is 24, and time_base is 1/24):
input_frame->width = w; // 1920
input_frame->height = h; // 1080
input_frame->pkt_dts = input_frame->pts = global_pts;
global_pts += video_avs->time_base.den/video_avs->time_base.num / output_fps.num * output_fps.den;
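(As a worked example of that increment: with video_avs->time_base = 1/24 and output_fps = 24/1, it evaluates to (24 / 1) / 24 * 1 = 1, so the pts advances by exactly one tick per frame.)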
convert it from YUYV to YUV422 (because h264 does not support YUYV) and resize it from 1920x1080 to 640x480 (because I need this output resolution), using sws_scale().
use avcodec_send_frame() and avcodec_receive_packet() to get the output packet, set the output_packet duration and stream_index, then use av_write_frame() to write the frame data.
AVPacket *output_packet = av_packet_alloc();
int response = avcodec_send_frame(encoder->video_avcc, frame);
while (response >= 0) {
    response = avcodec_receive_packet(encoder->video_avcc, output_packet); // !! here output_packet.size is calculated
    if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
        break;
    }
    else if (response < 0) {
        printf("Error while receiving packet from encoder"); // av_err2str(response) fails to compile here
        return response;
    }
    // duration = next_pts - this_pts = timescale / fps = 1 / timebase / fps
    output_packet->duration = (encoder->video_avs->time_base.den / encoder->video_avs->time_base.num) /
                              (output_fps.num / output_fps.den);
    output_packet->stream_index = 0;
    int write_response = av_write_frame(encoder->avfc, output_packet); // packet order is not ensured
    if (write_response != 0) { printf("Error %d while writing packet", write_response); return -1; }
}
av_packet_unref(output_packet);
av_packet_free(&output_packet);
in the write_buffer() function, the video stream output is stored in a string variable, and then I write this string to a file with an ostream, with an .mp4 suffix.
After all the above steps, the resulting output.mp4 cannot be played. The output of ffprobe output.mp4 -show_frames is:
Input #0, h264, from '/Users/ming/code/dev/haomo/output.mp4':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: h264 (High 4:2:2), yuv422p(progressive), 640x480, 24.92 fps, 24 tbr, 1200k tbn, 48 tbc
[FRAME]
media_type=video
stream_index=0
key_frame=1
pkt_pts=N/A
pkt_pts_time=N/A
pkt_dts=N/A
pkt_dts_time=N/A
best_effort_timestamp=N/A
best_effort_timestamp_time=N/A
Note that before and after calling av_write_frame() in step 4, the passed argument output_packet contains correct pts and dts info; I cannot figure out why the output stream loses this info.
I figured it out: the output stream is a raw h264 stream, and I was directly storing this stream into a file with an ".mp4" suffix, so it is actually not a correct mp4 file.
I then stored the stream into an output.h264 file and used ffmpeg to convert it to an mp4 file: ffmpeg -framerate 24 -i output.h264 -c copy output.mp4. This final output.mp4 contains the right pts data and can be played.
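For comparison, here is a minimal sketch of muxing the packets into a real mp4 with libavformat instead of dumping the raw stream to disk (encoder_ctx and output_packet are stand-ins for the contexts above; error checking omitted):

// Sketch: write packets into an actual mp4 container so pts/dts survive.
AVFormatContext* ofmt = NULL;
avformat_alloc_output_context2(&ofmt, NULL, "mp4", "output.mp4");
AVStream* st = avformat_new_stream(ofmt, NULL);
avcodec_parameters_from_context(st->codecpar, encoder_ctx); // encoder_ctx: your AVCodecContext
avio_open(&ofmt->pb, "output.mp4", AVIO_FLAG_WRITE);
avformat_write_header(ofmt, NULL);
// for each encoded packet:
av_packet_rescale_ts(output_packet, encoder_ctx->time_base, st->time_base);
output_packet->stream_index = st->index;
av_interleaved_write_frame(ofmt, output_packet);
// when finished:
av_write_trailer(ofmt);
avio_closep(&ofmt->pb);
avformat_free_context(ofmt);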

Higher Than Expected Inherent Latency in OpenAL Playback (Windows, C++)

TL;DR: A simple echo program that records and plays back audio immediately is showing higher than expected latency.
I am working on a real-time audio broadcasting application. I have decided to use OpenAL to both capture and playback audio samples. I am planning on sending UDP packets with raw PCM data across a LAN network. My ideal latency between recording on one machine and playing back on another machine is 30ms. (A lofty goal).
As a test, I made a small program that records samples from a microphone and immediately plays them back to the host speakers. I did this to test the baseline latency.
However, I'm seeing an inherent latency of about 65 - 70 ms from simply recording the audio and playing it back.
I have reduced the buffer size that OpenAL uses to 100 samples at 44100 samples per second. Ideally, this would yield a latency of 2-3 ms (100 / 44100 ≈ 2.3 ms).
I have yet to try this on another platform (MacOS / Linux) to determine if this is an OpenAL issue or a Windows issue.
Here's the code:
using std::list;
#define FREQ 44100   // Sample rate
#define CAP_SIZE 100 // How much to capture at a time (affects latency)
#define NUM_BUFFERS 10
int main(int argC, char* argV[])
{
    list<ALuint> bufferQueue; // A quick and dirty queue of buffer objects
    ALuint helloBuffer[NUM_BUFFERS], helloSource;
    ALCdevice* audioDevice = alcOpenDevice(NULL); // Request default audio device
    ALCcontext* audioContext = alcCreateContext(audioDevice, NULL); // Create the audio context
    alcMakeContextCurrent(audioContext);
    // Request the default capture device with a ~2ms buffer
    ALCdevice* inputDevice = alcCaptureOpenDevice(NULL, FREQ, AL_FORMAT_MONO16, CAP_SIZE);
    alGenBuffers(NUM_BUFFERS, &helloBuffer[0]); // Create some buffer-objects
    // Queue our buffers onto an STL list
    for (int ii = 0; ii < NUM_BUFFERS; ++ii) {
        bufferQueue.push_back(helloBuffer[ii]);
    }
    alGenSources(1, &helloSource); // Create a sound source
    // A buffer to hold captured audio; sized to FREQ rather than CAP_SIZE
    // because alcCaptureSamples below is handed samplesIn, which can exceed
    // CAP_SIZE when the loop falls behind.
    short* buffer = new short[FREQ];
    ALCint samplesIn = 0;   // How many samples are captured
    ALint availBuffers = 0; // Buffers to be recovered
    ALuint myBuff; // The buffer we're using
    ALuint buffHolder[NUM_BUFFERS]; // An array to catch the unqueued buffers
    alcCaptureStart(inputDevice); // Begin capturing
    bool done = false;
    while (!done) { // Main loop
        // Poll for recoverable buffers
        alGetSourcei(helloSource, AL_BUFFERS_PROCESSED, &availBuffers);
        if (availBuffers > 0) {
            alSourceUnqueueBuffers(helloSource, availBuffers, buffHolder);
            for (int ii = 0; ii < availBuffers; ++ii) {
                // Push the recovered buffers back on the queue
                bufferQueue.push_back(buffHolder[ii]);
            }
        }
        // Poll for captured audio
        alcGetIntegerv(inputDevice, ALC_CAPTURE_SAMPLES, 1, &samplesIn);
        if (samplesIn > CAP_SIZE) {
            // Grab the sound
            alcCaptureSamples(inputDevice, buffer, samplesIn);
            // Stuff the captured data in a buffer-object
            if (!bufferQueue.empty()) { // We just drop the data if no buffers are available
                myBuff = bufferQueue.front(); bufferQueue.pop_front();
                alBufferData(myBuff, AL_FORMAT_MONO16, buffer, samplesIn * sizeof(short), FREQ);
                // Queue the buffer
                alSourceQueueBuffers(helloSource, 1, &myBuff);
                // Restart the source if needed
                // (if we take too long and the queue dries up,
                // the source stops playing).
                ALint sState = 0;
                alGetSourcei(helloSource, AL_SOURCE_STATE, &sState);
                if (sState != AL_PLAYING) {
                    alSourcePlay(helloSource);
                }
            }
        }
    }
    // Stop capture
    alcCaptureStop(inputDevice);
    alcCaptureCloseDevice(inputDevice);
    // Stop the sources
    alSourceStopv(1, &helloSource);
    alSourcei(helloSource, AL_BUFFER, 0);
    // Clean-up
    delete[] buffer;
    alDeleteSources(1, &helloSource);
    alDeleteBuffers(NUM_BUFFERS, &helloBuffer[0]);
    alcMakeContextCurrent(NULL);
    alcDestroyContext(audioContext);
    alcCloseDevice(audioDevice);
    return 0;
}
Here is an image of the waveform showing the delay between the input sound and the resulting echo. This example shows a latency of about 70ms.
Waveform with 70ms echo delay
System specs:
Intel Core i7-9750H
24 GB Ram
Windows 10 Home: V 2004 - Build 19041.508
Sound driver: Realtek Audio (Driver version 10.0.19041.1)
Input device: Logitech G330 Headset
Issue is reproducible on other Windows Systems.
EDIT:
I tried to use PortAudio to do a similar thing and achieved a similar result, so I've determined that this is due to the Windows audio drivers.
I rebuilt PortAudio with ASIO support only and installed the ASIO4ALL audio driver. This achieved an acceptable latency of <10ms.
I ultimately resolved this issue by ditching OpenAL in favor of PortAudio and Steinberg ASIO. I installed ASIO4ALL and rebuilt PortAudio to accept only ASIO device drivers; I needed the ASIO SDK from Steinberg to do this (followed the guide here). This has allowed me to achieve a latency of between 5 and 10 ms.
This post helped a lot: input delay with PortAudio callback and ASIO sdk
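For reference, the shape of the PortAudio echo test is roughly the following; this is a sketch rather than my exact code, and it assumes a PortAudio build whose default device resolves to the ASIO4ALL driver:

#include <cstring>
#include "portaudio.h"
// Full-duplex echo: the callback copies captured input straight to the output.
static int echoCallback(const void* input, void* output, unsigned long frameCount,
                        const PaStreamCallbackTimeInfo* timeInfo,
                        PaStreamCallbackFlags statusFlags, void* userData)
{
    if (input) std::memcpy(output, input, frameCount * sizeof(short)); // mono, 16-bit
    else       std::memset(output, 0, frameCount * sizeof(short));     // silence on dropouts
    return paContinue;
}
int main()
{
    Pa_Initialize();
    PaStream* stream = NULL;
    // 1 channel in, 1 channel out, 16-bit, 44.1 kHz, 100-frame buffers (~2.3 ms)
    Pa_OpenDefaultStream(&stream, 1, 1, paInt16, 44100, 100, echoCallback, NULL);
    Pa_StartStream(stream);
    Pa_Sleep(10000); // echo for ten seconds
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
    return 0;
}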

Update parameters while the ffmpeg encoding process is running

I want to update parameters like the fps, bitrate, and GOP of a video encoder that were already passed to the AVCodecContext structure, and I want the change to take effect immediately whenever I update any parameter.
One thing that can be done is to close the codec using avcodec_close and open it again.
But I don't think that is a good way.
Here is my ffmpeg video encoding code:
int got_output = 0, ret = 0;
//av_init_packet(&pkt);
pkt.data = NULL; // packet data will be allocated by the encoder
pkt.size = 0;
ret = avcodec_encode_video2(c, &pkt, frame, &got_output);
if (ret < 0)
{
    cerr << "Error sending a frame for encoding\n";
    exit(1);
}
Is there any FFMPEG's API that can be used to reload encoding parameters?
No, FFmpeg does not have an API for reconfiguring a running encoder; it is something you would need to develop yourself.
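The close-and-reopen approach mentioned in the question is therefore the practical workaround. A sketch of such a hypothetical helper, assuming you drain pending packets first and can tolerate the stream restarting on a keyframe:

// Hypothetical helper: apply new settings by recreating the codec context.
AVCodecContext* reopen_encoder(AVCodecContext** pctx, const AVCodec* codec,
                               int64_t new_bitrate, int new_gop, AVRational new_fps)
{
    AVCodecContext* ctx = avcodec_alloc_context3(codec);
    // carry over the fields that are not changing
    ctx->width   = (*pctx)->width;
    ctx->height  = (*pctx)->height;
    ctx->pix_fmt = (*pctx)->pix_fmt;
    // apply the new settings
    ctx->bit_rate  = new_bitrate;
    ctx->gop_size  = new_gop;
    ctx->framerate = new_fps;
    ctx->time_base = av_inv_q(new_fps); // e.g. {1, 30} for 30 fps
    avcodec_free_context(pctx);         // closes and frees the old context
    if (avcodec_open2(ctx, codec, NULL) < 0) {
        avcodec_free_context(&ctx);
        return NULL;
    }
    *pctx = ctx;
    return ctx;
}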

ALSA: cannot recovery from underrun, prepare failed: Broken pipe

I'm writing a program that reads from two mono ALSA devices and writes to one stereo ALSA device.
I use three threads and a ping-pong buffer to manage them: two reading threads and one writing thread. Their configurations are as follows:
// Capture ALSA device
alsaBufferSize = 16384;
alsaCaptureChunkSize = 4096;
bitsPerSample = 16;
samplingFrequency = 24000;
numOfChannels = 1;
block = true;
accessType = SND_PCM_ACCESS_RW_INTERLEAVED;
// Playback device (only list params that are different from above)
alsaBufferSize = 16384 * 2;
numOfChannels = 2;
accessType = SND_PCM_ACCESS_RW_NON_INTERLEAVED;
The two reading threads write the ping buffer and then the pong buffer. The writing thread waits for either buffer to become ready, locks it, reads from it, and then unlocks it.
But when I run this program, an xrun appears and cannot be recovered from:
ALSA lib pcm.c:7316:(snd_pcm_recover) underrun occurred
ALSA lib pcm.c:7319:(snd_pcm_recover) cannot recovery from underrun, prepare failed: Broken pipe
Below is my code for writing to ALSA playback device:
bool CALSAWriter::writen(uint8_t** a_pOutputBuffer, uint32_t a_rFrames)
{
    bool ret = false;
    // 1. write audio chunk to ALSA
    const snd_pcm_sframes_t alsaCaptureChunkSize = static_cast<snd_pcm_sframes_t>(a_rFrames); //(m_pALSACfg->alsaCaptureChunkSize);
    const snd_pcm_sframes_t writenFrames = snd_pcm_writen(m_pALSAHandle, (void**)a_pOutputBuffer, alsaCaptureChunkSize);
    if (0 < writenFrames)
    {// write succeeded
        ret = true;
    }
    else
    {// write failed
        logPrint("CALSAWriter WRITE FAILED for writen farmes = %d ", writenFrames);
        ret = false;
        const int alsaReadError = static_cast<int>(writenFrames); // alsa error is of int type
        if (ALSA_OK == snd_pcm_recover(m_pALSAHandle, alsaReadError, 0))
        {// recovery succeeded
            a_rFrames = 0; // only recovery was done, no write at all was done
        }
        else
        {
            logPrint("CALSAWriter: failed to recover from ALSA write error: %s (%i)", snd_strerror(alsaReadError), alsaReadError);
            ret = false;
        }
    }
    // 2. check current buffer load
    snd_pcm_sframes_t framesInBuffer = 0;
    snd_pcm_sframes_t delayedFrames = 0;
    snd_pcm_avail_delay(m_pALSAHandle, &framesInBuffer, &delayedFrames);
    // round to nearest int, cast is safe, buffer size is no bigger than uint32_t
    const int32_t ONE_HUNDRED_PERCENTS = 100;
    const uint32_t bufferLoadInPercents = ONE_HUNDRED_PERCENTS *
        static_cast<int32_t>(framesInBuffer) / static_cast<int32_t>(m_pALSACfg->alsaBufferSize);
    logPrint("write: ALSA buffer percentage: %u, delayed frames: %d", bufferLoadInPercents, delayedFrames);
    return ret;
}
Other diagnostic info:
02:53:00.465047 log info V 1 [write: ALSA buffer percentage: 75, delayed frames: 4096]
02:53:00.635758 log info V 1 [write: ALSA buffer percentage: 74, delayed frames: 4160]
02:53:00.805714 log info V 1 [write: ALSA buffer percentage: 74, delayed frames: 4152]
02:53:00.976781 log info V 1 [write: ALSA buffer percentage: 74, delayed frames: 4144]
02:53:01.147948 log info V 1 [write: ALSA buffer percentage: 0, delayed frames: 0]
02:53:01.317113 log error V 1 [CALSAWriter WRITE FAILED for writen farmes = -32 ]
02:53:01.317795 log error V 1 [CALSAWriter: failed to recover from ALSA write error: Broken pipe (-32)]
It took me about 3 days to find the solution. Thanks to @CL.'s tip that "writen is called too late".
Issue:
Thread switching time is not constant.
Solution:
Insert a buffer of silence before invoking "writen" for the first time (a sketch follows below). The length of this buffer can be any value large enough to absorb thread-switching jitter; I set it to 150ms.
Or you can set the thread priority to high, which I could not do; refer to ALSA: Ways to prevent underrun for speaker.
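A sketch of that priming step for the playback device above (primeWithSilence is a hypothetical helper; at 24000 Hz, 16-bit, 2-channel non-interleaved, 150ms is 3600 zeroed frames per channel):

#include <cstdint>
#include <vector>
// Hypothetical helper: write ~150 ms of silence before the first real
// writen() call, so thread-switching jitter cannot drain the buffer.
bool CALSAWriter::primeWithSilence(uint32_t a_milliseconds)
{
    const snd_pcm_uframes_t frames = 24000 * a_milliseconds / 1000; // 3600 frames for 150 ms
    std::vector<int16_t> left(frames, 0), right(frames, 0);         // zeroed samples = silence
    void* bufs[2] = { left.data(), right.data() };                  // one plane per channel
    return snd_pcm_writen(m_pALSAHandle, bufs, frames) >= 0;
}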
Problem diagnostic:
The facts are:
"readi" returns every 171ms (4096/24000 = 0.171). The reading thread then marks a buffer as ready.
Once a buffer is ready, "writen" is invoked in the writing thread. The buffer is copied to the ALSA playback device, and it takes the playback device 171ms to play this part of the buffer.
If the playback device finishes playing the whole buffer and no new buffer has been written, an underrun occurs.
The real scenario here:
At 0ms, "readi" starts. At 171ms, "readi" finishes.
At 172ms (1ms for thread switching), "writen" starts. At 343ms, an underrun will happen if no new buffer is written.
At 171ms, "readi" starts again. At 342ms, "readi" finishes.
This time, thread switching takes 2ms. Before "writen" can start at 344ms, the underrun has already occurred at 343ms.
When CPU load is high, there is no guarantee how long thread switching will take. That's why you insert an empty buffer at the first write, which turns the scenario into:
At 0ms, "readi" starts. At 171ms, "readi" finishes.
At 172ms (1ms for thread switching), "writen" starts with an extra 150ms-long buffer. At 493ms, an underrun would happen if no new buffer were written.
At 171ms, "readi" starts again. At 342ms, "readi" finishes.
This time, thread switching takes 50ms. "writen" starts at 392ms, and no underrun occurs at all.

How do I create a stereo MP3 file with the latest version of ffmpeg?

I'm updating my code from the older version of ffmpeg (53) to the newer (54/55). Code that did work has now been deprecated or removed, so I'm having problems updating it.
Previously I could create a stereo MP3 file using a sample format called:
SAMPLE_FMT_S16
That matched up perfectly with my source stream. This has now been replaced with
AV_SAMPLE_FMT_S16
Which works fine for mono recordings but when I try to create a stereo MP3 file it bugs out at avcodec_open2 with:
"Specified sample_fmt is not supported."
Through trial and error I've found that using
AV_SAMPLE_FMT_S16P
...is accepted by avcodec_open2, but when I get through and create the MP3 file the sound is very distorted - it sounds about 2 octaves lower than usual with a massive hum in the background - here's an example recording:
http://hosting.ispyconnect.com/example.mp3
I've been told by the ffmpeg guys that this is because I now need to manually deinterleave my byte stream before calling:
avcodec_fill_audio_frame
How do I do that? I've tried using the swresample library without success, and I've tried manually feeding L/R data into avcodec_fill_audio_frame, but the results sound exactly the same as without deinterleaving.
Here is my code for encoding:
void add_audio_sample( AudioWriterPrivateData^ data, BYTE* soundBuffer, int soundBufferSize)
{
    libffmpeg::AVCodecContext* c = data->AudioStream->codec;
    memcpy(data->AudioBuffer + data->AudioBufferSizeCurrent, soundBuffer, soundBufferSize);
    data->AudioBufferSizeCurrent += soundBufferSize;
    uint8_t* pSoundBuffer = (uint8_t *)data->AudioBuffer;
    DWORD nCurrentSize = data->AudioBufferSizeCurrent;
    libffmpeg::AVFrame *frame;
    int got_packet;
    int ret;
    int size = libffmpeg::av_samples_get_buffer_size(NULL, c->channels,
                                                     data->AudioInputSampleSize,
                                                     c->sample_fmt, 1);
    while (nCurrentSize >= size) {
        frame = libffmpeg::avcodec_alloc_frame();
        libffmpeg::avcodec_get_frame_defaults(frame);
        frame->nb_samples = data->AudioInputSampleSize;
        ret = libffmpeg::avcodec_fill_audio_frame(frame, c->channels, c->sample_fmt, pSoundBuffer, size, 1);
        if (ret < 0)
        {
            throw gcnew System::IO::IOException("error filling audio");
        }
        //audio_pts = (double)audio_st->pts.val * audio_st->time_base.num / audio_st->time_base.den;
        libffmpeg::AVPacket pkt = { 0 };
        libffmpeg::av_init_packet(&pkt);
        ret = libffmpeg::avcodec_encode_audio2(c, &pkt, frame, &got_packet);
        if (ret < 0)
            throw gcnew System::IO::IOException("error encoding audio");
        if (got_packet) {
            pkt.stream_index = data->AudioStream->index;
            if (pkt.pts != AV_NOPTS_VALUE)
                pkt.pts = libffmpeg::av_rescale_q(pkt.pts, c->time_base, c->time_base);
            if (pkt.duration > 0)
                pkt.duration = av_rescale_q(pkt.duration, c->time_base, c->time_base);
            pkt.flags |= AV_PKT_FLAG_KEY;
            if (libffmpeg::av_interleaved_write_frame(data->FormatContext, &pkt) != 0)
                throw gcnew System::IO::IOException("unable to write audio frame.");
        }
        nCurrentSize -= size;
        pSoundBuffer += size;
    }
    memcpy(data->AudioBuffer, data->AudioBuffer + data->AudioBufferSizeCurrent - nCurrentSize, nCurrentSize);
    data->AudioBufferSizeCurrent = nCurrentSize;
}
Would love to hear any ideas - I've been trying to get this working for 3 days now :(
You don't want to increase pSoundBuffer if a frame hasn't been fully encoded (e.g. got_packet isn't set to true), as no memory has been written yet. Also, you are allocating a frame during each loop: there's no need for that, you can re-use the same AVFrame over and over. Your code is also leaking, as you never free the AVFrame.
I wrote some code as part of MythTV that encodes audio to AC3.
It also does what you are looking for: it deinterleaves the content.
https://github.com/MythTV/mythtv/blob/476b2a826d43fca5e658ebe787c3cb1ec2334f98/mythtv/libs/libmyth/audio/audiooutputdigitalencoder.cpp#L178
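The deinterleave itself is straightforward once the layouts are clear: AV_SAMPLE_FMT_S16 packs samples as L R L R ..., while AV_SAMPLE_FMT_S16P wants one plane per channel. A minimal hand-rolled sketch (libswresample's swr_convert does the same job and also handles other formats):

#include <cstdint>
// Split packed S16 stereo (L R L R ...) into two contiguous planes,
// which is the layout AV_SAMPLE_FMT_S16P expects.
static void deinterleave_s16_stereo(const int16_t* packed, int nb_samples,
                                    int16_t* left, int16_t* right)
{
    for (int i = 0; i < nb_samples; ++i) {
        left[i]  = packed[2 * i];     // channel 0 plane
        right[i] = packed[2 * i + 1]; // channel 1 plane
    }
}
// If the two planes sit back-to-back in one buffer (left plane first,
// then right), that buffer can still be handed to avcodec_fill_audio_frame
// with AV_SAMPLE_FMT_S16P.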
I know this question is old, but for posterity: I'm working on some audio resampling code, and after I arrived at audio sounding very similar to the MP3 the author linked, I identified the cause as a mismatch between the sampling rate the resampler expects for its input and that of the actual data.