I am having an issue again with ffmpeg, I'm a newbie with ffmpeg, and I can't find a good tutorial up to date...
This time, when I play a video with ffmpeg, it plays too fast, ffmpeg is ignoring the FPS, I don't want to handle that with a thread sleep, because the videos have differents FPS's.
I created a thread, there you can find the loop:
AVPacket framepacket;
while(av_read_frame(formatContext,&framepacket)>= 0){
pausecontrol.lock();
// Is it a video or audio frame¿?
if(framepacket.stream_index==gotVideoCodec){
int framereaded;
// Video? Ok
avcodec_decode_video2(videoCodecContext,videoFrame,&framereaded,&framepacket);
// Yeah, did we get it?
if(framereaded && doit){
AVRational millisecondbase = {1,1000};
int f_number = framepacket.dts;
int f_time = av_rescale_q(framepacket.dts,formatContext->streams[gotVideoCodec]->time_base,millisecondbase);
currentTime=f_time;
currentFrameNumber=f_number;
int stWidth = videoCodecContext->width;
int stHeight = videoCodecContext->height;
SwsContext *ctx = sws_getContext(stWidth, stHeight, videoCodecContext->pix_fmt, stWidth,
stHeight, PIX_FMT_RGB24, SWS_BICUBIC, NULL, NULL, NULL);
if(ctx!=0){
sws_scale(ctx,videoFrame->data,videoFrame->linesize,0,videoCodecContext->height,videoFrameRGB->data,videoFrameRGB->linesize);
QImage framecapsule=QImage(stWidth,stHeight,QImage::Format_RGB888);
for(int y=0;y<stHeight;y++){
memcpy(framecapsule.scanLine(y),videoFrameRGB->data[0]+y*videoFrameRGB->linesize[0],stWidth*3);
}
emit newFrameReady(framecapsule);
sws_freeContext(ctx);
}
}
}
if(framepacket.stream_index==gotAudioCodec){
// Audio? Ok
}
pausecontrol.unlock();
av_free_packet(&framepacket);
}
Any Idea?
The simplest solution is to use a delay based on the FPS value
firstFrame = true;
for(;;)
{
// decoding, color conversion, etc.
if (!firstFrame)
{
const double frameDuration = 1000.0 / frameRate;
duration_t actualDelay = get_local_time() - lastTime;
if (frameDuration > actualDelay)
sleep(frameDuration - actualDelay);
}
else
firstFrame = false;
emit newFrameReady(framecapsule);
lastTime = get_local_time();
}
get_local_time() and duration_t is abstract.
A more accurate method is to use a time stamp for each frame, but the idea is the same
Related
I want to make a common video player with opencv C++. Users should be able to freely move between frames of the video(commonly slider). So I simply wrote and tested two methods and both has problems.
My first solution:
int frameIdx = 0; /// This is accessed by other thread
cv::VideoCapture cap("video.mp4");
while (true) {
cv::Mat frame;
cap.set(cv::CAP_PROP_POS_FRAMES, frameIdx);
cap.read(frame);
showFrameToWindow(frame);
frameIdx++;
}
My second solution:
int frameIdx = 0; /// This is accessed by other thread
std::vector<cv::Mat> buffer;
cv::VideoCapture cap("video.mp4")
while (true) {
cv::Mat frame;
cap >> frame;
if (frame.empty()) break;
buffer.push_back(frame);
}
while (true) {
cv::Mat frame = buffer[frameIdx].clone();
showFrameToWindow(frame);
frameIdx++;
}
The first solution is too slow. I think there is an overhead cap.read(cv::Mat). It's not possible to play video more than 20~30fps in my computer.
The second solution, it is satisfactory for speed, but it requires a lot of memory space.
So I imagine what if I change std::vector to std::queue of the buffer, limit its size, and update it in other thread while playing the video.
I'm not sure it's gonna work, and I wonder there's an common algorithm to seek in large video file. Any comments will save me. Thanks.
I developed my second solution to limitied size of frame buffer and handling it with other thread.
/// These variables are accessed from all threads.
#define BUF_MAX 64
bool s_videoFileLoaded;
double s_videoFileHz; // milliseconds per frame
int s_videoFileFrameCount;
bool s_seek;
int s_seekingIndex; // indexing video file, binded with UI
std::queue<cv::Mat> s_frameBuffer;
std::queue<int> s_indexBuffer;
std::mutex s_mtx;
///
/// !!! When seeking slider in UI moved manually,
/// !!! #s_seek turns to True.
///
/// Start of main thread.
cv::VideoCapture cap("video.mp4");
s_videoFileLoaded = true;
s_videoFileFrameCount = cap.get(cv::CAP_PROP_FRAME_COUNT);
s_videoFileHz = 1000.0 / cap.get(cv::CAP_PROP_FPS);
s_seekingIndex = 0;
runThread( std::bind(&_videoHandlingThreadLoop, cap) );
Timer timer;
while (s_videoFileLoaded) {
timer.restart();
{
std::lock_guard<std::mutex> lock(s_mtx);
if (s_frameBuffer.empty())
continue;
cv::Mat frame = s_frameBuffer.front();
s_seekingIndex = s_indexBuffer.front();
s_frameBuffer.pop();
s_indexBuffer.pop();
showFrameToWindow(frame);
}
int remain = s_videoFileHz - timer.elapsed();
if (remain > 0) Sleep(remain);
}
void _videoHandlingThreadLoop(cv::VideoCapture& cap) {
s_seek = true;
int frameIndex = -1;
while (s_videoFileLoaded) {
if (s_frameBuffer.size() > BUF_MAX) {
Sleep(s_videoFileHz * BUF_MAX);
continue;
}
// Check whether it's time to 'seeking'.
if (s_seek) {
std::lock_guard<std::mutex> lock(s_mtx);
// Clear buffer
s_frameBuffer = std::queue<cv::Mat>();
s_indexBuffer = std::queue<int>();
frameIndex = s_seekingIndex;
cap.set(cv::CAP_PROP_POS_FRAMES, frameIndex);
s_seek = false;
}
// Read frame from the file and push to buffer.
cv::Mat frame;
if (cap.read(frame)) {
std::lock_guard<std::mutex> lock(s_mtx);
s_frameBuffer.push(frame);
s_indexBuffer.push(frameIndex);
frameIndex++;
}
// Check whether the frame is end of the file.
if (frameIndex >= s_videoFileFrameCount) {
s_seekingIndex = 0;
s_seek = true;
}
}
}
and this worked. I could play the video file with stable playback speed. But still had some lag when seeking manually.
I'm trying to encode images from a webcam with libx265 (libx264 tried earlier) ...The webcam can not shoot with stable FPS because of the different amount of light entering the matrix and, as a result, different delays. Therefore, I count the fps and dts of the incoming frame and set these values for the corresponding parameters of the x265_image object, and init the encoder fpsNum with 1000 and fpsDenom with 1 (for millisecond timebase).
The problem is that the encoder ignores pts and dts of input image and encodes at 1000 fps! The same trick with timebase produces smooth record with libvpx. Why it does not work with x264/x265 codecs?
Here is parameters initialization:
...
error = (x265_param_default_preset(param, "fast", "zerolatency") != 0);
if(!error){
param->sourceWidth = width;
param->sourceHeight = height;
param->frameNumThreads = 1;
param->fpsNum = 1000;
param->fpsDenom = 1;
// Intra refres:
param->keyframeMax = 15;
param->intraRefine = 1;
// Rate control:
param->rc.rateControlMode = X265_RC_CQP;
param->rc.rfConstant = 12;
param->rc.rfConstantMax = 48;
// For streaming:
param->bRepeatHeaders = 1;
param->bAnnexB = 1;
encoder = x265_encoder_open(param);
...
}
...
Here is frame adding function:
bool hevc::Push(unsigned char *data){
if(!error){
std::lock_guard<std::mutex> lock(m_framestack);
if( timer > 0){
framestack.back()->dts = clock() - timer;
timer+= framestack.back()->dts;
}
else{timer = clock();}
x265_picture *picture = x265_picture_alloc();
if( picture){
x265_picture_init(param, picture);
picture->height = param->sourceHeight;
picture->stride[0] = param->sourceWidth;
picture->stride[1] = picture->stride[2] = picture->stride[0] / 2;
picture->planes[0] = new char[ luma_size];
picture->planes[1] = new char[chroma_size];
picture->planes[2] = new char[chroma_size];
colorspaces::BGRtoI420(param->sourceWidth, param->sourceHeight, data, (byte*)picture->planes[0], (byte*)picture->planes[1], (byte*)picture->planes[2]);
picture->pts = picture->dts = 0;
framestack.emplace_back(picture);
}
else{error = true;}
}
return !error;
}
Global PTS is increasing right after x265_encoder_encode call:
pts+= pic_in->dts; and sets as a pts of new image from framestack queue when it comes to encoder.
Can the x265/x264 codecs encode at variable fps? How to configure it if yes?
I don't know about x265 but in x264 to encode variable frame rate (VFR) video you should enable x264_param_t.b_vfr_input option which was disabled by your zerolatency tuning (VFR encode need 1 frame latency). Also at least in x264 timebase should be in i_timebase_num/i_timebase_den and i_fps_num/i_fps_den to be average fps (or keep default 25/1 if you don't know fps) or you will broke ratecontrol.
I'm trying to encode a webcam frames with libx264 in realtime, and face with one problem - the resulting video length is exactly what I set, but camera is delays somtimes and the real capture time is more than video length. As a result the picture in video changes to fast.I think it is due to constant FPS in x264 settings, so I need to make it dynamic somehow. Is it possible? If I wrong about FPS, so what I need to do, to synchronize capturing and writing?
Also I would like to know what are the optimal encoder parameters for streaming via internet and for recording to disk (the client is streaming from camera or screen, and the server is recording)?
Here is console logs screenshot and my code:
#include <stdint.h>
#include "stringf.h"
#include "Capture.h"
#include "x264.h"
int main( int argc, char **argv ){
Camera instance;
if(!instance.Enable(0)){printf("Camera not available\n");return 1;}
// Initializing metrics and buffer of frame
unsigned int width, height, size = instance.GetMetrics(width, height);
unsigned char *data = (unsigned char *)malloc(size);
// Setting encoder (I'm not sure about all parameters)
x264_param_t param;
x264_param_default_preset(¶m, "ultrafast", "zerolatency");
param.i_threads = 1;
param.i_width = width;
param.i_height = height;
param.i_fps_num = 20;
param.i_fps_den = 1;
// Intra refres:
param.i_keyint_max = 8;
param.b_intra_refresh = 1;
// Rate control:
param.rc.i_rc_method = X264_RC_CRF;
param.rc.f_rf_constant = 25;
param.rc.f_rf_constant_max = 35;
// For streaming:
param.b_repeat_headers = 1;
param.b_annexb = 1;
x264_param_apply_profile(¶m, "baseline");
x264_t* encoder = x264_encoder_open(¶m);
int seconds, expected_time, operation_start, i_nals, frame_size, frames_count;
expected_time = 1000/param.i_fps_num;
operation_start = 0;
seconds = 1;
frames_count = param.i_fps_num * seconds;
int *Timings = new int[frames_count];
x264_picture_t pic_in, pic_out;
x264_nal_t* nals;
x264_picture_alloc(&pic_in, X264_CSP_I420, param.i_width, param.i_height);
// Capture-Encode-Write loop
for(int i = 0; i < frames_count; i++){
operation_start = GetTickCount();
size = instance.GrabBGR(&data);
instance.BGRtoI420(data, &pic_in.img.plane[0], &pic_in.img.plane[1], &pic_in.img.plane[2], param.i_width, param.i_height);
frame_size = x264_encoder_encode(encoder, &nals, &i_nals, &pic_in, &pic_out);
if( frame_size > 0){
stringf::WriteBufferToFile("test.h264",std::string(reinterpret_cast<char*>(nals->p_payload), frame_size),1);
}
Timings[i] = GetTickCount() - operation_start;
}
while( x264_encoder_delayed_frames( encoder ) ){ // Flush delayed frames
frame_size = x264_encoder_encode(encoder, &nals, &i_nals, NULL, &pic_out);
if( frame_size > 0 ){stringf::WriteBufferToFile("test.h264",std::string(reinterpret_cast<char*>(nals->p_payload), frame_size),1);}
}
unsigned int total_time = 0;
printf("Expected operation time was %d ms per frame at %u FPS\n",expected_time, param.i_fps_num);
for(unsigned int i = 0; i < frames_count; i++){
total_time += Timings[i];
printf("Frame %u takes %d ms\n",(i+1), Timings[i]);
}
printf("Record takes %u ms\n",total_time);
free(data);
x264_encoder_close( encoder );
x264_picture_clean( &pic_in );
return 0;
}
The capture takes 1453 ms and the output file plays exactly 1 sec.
So, in general, the video length must be the same as a capture time, but not as encoder "wants".How to do it?
I am trying to create a very simple webm(vp8/opus) encoder, however I can not get the audio to work.
ffprobe does detect the file format and duration
Stream #1:0(eng): Audio: opus, 48000 Hz, mono, fltp (default)
The video can be played just fine in VLC and Chrome, but with no audio, for some reason the audio input bitrate is always 0
Most of the audio encoding code was copied from
https://github.com/fnordware/AdobeWebM/blob/master/src/premiere/WebM_Premiere_Export.cpp
Here is the relevant code:
static const long long kTimeScale = 1000000000LL;
MkvWriter writer;
writer.Open("video.webm");
Segment mux_seg;
mux_seg.Init(&writer);
// VPX encoding...
int16_t pcm[SAMPLES];
uint64_t audio_track_id = mux_seg.AddAudioTrack(SAMPLE_RATE, 1, 0);
mkvmuxer::AudioTrack *audioTrack = (mkvmuxer::AudioTrack*)mux_seg.GetTrackByNumber(audio_track_id);
audioTrack->set_codec_id(mkvmuxer::Tracks::kOpusCodecId);
audioTrack->set_seek_pre_roll(80000000);
OpusEncoder *encoder = opus_encoder_create(SAMPLE_RATE, 1, OPUS_APPLICATION_AUDIO, NULL);
opus_encoder_ctl(encoder, OPUS_SET_BITRATE(64000));
opus_int32 skip = 0;
opus_encoder_ctl(encoder, OPUS_GET_LOOKAHEAD(&skip));
audioTrack->set_codec_delay(skip * kTimeScale / SAMPLE_RATE);
mux_seg.CuesTrack(audio_track_id);
uint64_t currentAudioSample = 0;
uint64_t opus_ts = 0;
while(has_frame) {
int bytes = opus_encode(encoder, pcm, SAMPLES, out, SAMPLES * 8);
opus_ts = currentAudioSample * kTimeScale / SAMPLE_RATE;
mux_seg.AddFrame(out, bytes, audio_track_id, opus_ts, true);
currentAudioSample += SAMPLES;
}
opus_encoder_destroy(encoder);
mux_seg.Finalize();
writer.Close();
Update #1:
It seems that the problem is that WebM requires the audio and video tracks to be interlaced.
However I can not figure out how to sync the audio.
Should I calculate the frame duration, then encode the equivalent audio samples?
The problem was that I was missing the OGG header data, and the audio frames timestamps were not accurate.
to complete the answer here is the pseudo code for the encoder.
const int kTicksPerSecond = 1000000000; // webm timescale
const int kTimeScale = kTicksPerSecond / FPS;
const int kTwoNanoSeconds = 1000000000;
init_opus_encoder();
audioTrack->set_seek_pre_roll(80000000);
audioTrack->set_codec_delay(opus_preskip);
audioTrack->SetCodecPrivate(ogg_header_data, ogg_header_size);
while(has_video_frame) {
encode_vpx_frame();
video_pts = frame_index * kTimeScale;
muxer_segment.addFrame(frame_packet_data, packet_length, video_track_id, video_pts, packet_flags);
// fill the video frames gap with OPUS audio samples
while(audio_pts < video_pts + kTimeScale) {
encode_opus_frame();
muxer_segment.addFrame(opus_frame_data, opus_frame_data_length, audio_track_id, audio_pts, true /* keyframe */);
audio_pts = curr_audio_samples * kTwoNanoSeconds / 48000;
curr_audio_samples += 960;
}
}
I'm currently using these settings with OpenAL and recording from a Mic:
BUFFERSIZE 4410
FREQ 22050 // Sample rate
CAP_SIZE 10000 // How much to capture at a time (affects latency)
AL_FORMAT_MONO16
Is it possible to go lower in recording quality? I've tried reducing the sample rate but the end result is a faster playback speed.
Alright, so this is some of the most hacky code I've ever written, and I truly hope no one in their right mind ever uses it in production... just sooooo many bad things.
But to answer your question, I've been able to get the quality down to 8bitMono recording at 11025. However, everything I've recorded from my mic comes with significant amounts of static, and I'm not entirely sure I know why. I've generated 8bit karplus-strong string plucks that sound fantastic, so it could just be my recording device.
#include <AL/al.h>
#include <AL/alc.h>
#include <conio.h>
#include <stdio.h>
#include <vector>
#include <time.h>
void sleep( clock_t wait )
{
clock_t goal;
goal = wait + clock();
while( goal > clock() )
;
}
#define BUFFERSIZE 4410
const int SRATE = 11025;
int main()
{
std::vector<ALchar> vBuffer;
ALCdevice *pDevice = NULL;
ALCcontext *pContext = NULL;
ALCdevice *pCaptureDevice;
const ALCchar *szDefaultCaptureDevice;
ALint iSamplesAvailable;
ALchar Buffer[BUFFERSIZE];
ALint iDataSize = 0;
ALint iSize;
// NOTE : This code does NOT setup the Wave Device's Audio Mixer to select a recording input
// or a recording level.
pDevice = alcOpenDevice(NULL);
pContext = alcCreateContext(pDevice, NULL);
alcMakeContextCurrent(pContext);
printf("Capture Application\n");
if (pDevice == NULL)
{
printf("Failed to initialize OpenAL\n");
//Shutdown code goes here
return 0;
}
// Check for Capture Extension support
pContext = alcGetCurrentContext();
pDevice = alcGetContextsDevice(pContext);
if (alcIsExtensionPresent(pDevice, "ALC_EXT_CAPTURE") == AL_FALSE){
printf("Failed to detect Capture Extension\n");
//Shutdown code goes here
return 0;
}
// Get list of available Capture Devices
const ALchar *pDeviceList = alcGetString(NULL, ALC_CAPTURE_DEVICE_SPECIFIER);
if (pDeviceList){
printf("\nAvailable Capture Devices are:-\n");
while (*pDeviceList)
{
printf("%s\n", pDeviceList);
pDeviceList += strlen(pDeviceList) + 1;
}
}
// Get the name of the 'default' capture device
szDefaultCaptureDevice = alcGetString(NULL, ALC_CAPTURE_DEFAULT_DEVICE_SPECIFIER);
printf("\nDefault Capture Device is '%s'\n\n", szDefaultCaptureDevice);
pCaptureDevice = alcCaptureOpenDevice(szDefaultCaptureDevice, SRATE, AL_FORMAT_MONO8, BUFFERSIZE);
if (pCaptureDevice)
{
printf("Opened '%s' Capture Device\n\n", alcGetString(pCaptureDevice, ALC_CAPTURE_DEVICE_SPECIFIER));
// Start audio capture
alcCaptureStart(pCaptureDevice);
// Wait for any key to get pressed before exiting
while (!_kbhit())
{
// Release some CPU time ...
sleep(1);
// Find out how many samples have been captured
alcGetIntegerv(pCaptureDevice, ALC_CAPTURE_SAMPLES, 1, &iSamplesAvailable);
printf("Samples available : %d\r", iSamplesAvailable);
// When we have enough data to fill our BUFFERSIZE byte buffer, grab the samples
if (iSamplesAvailable > (BUFFERSIZE / 2))
{
// Consume Samples
alcCaptureSamples(pCaptureDevice, Buffer, BUFFERSIZE / 2);
// Write the audio data to a file
//fwrite(Buffer, BUFFERSIZE, 1, pFile);
for(int i = 0; i < BUFFERSIZE / 2; i++){
vBuffer.push_back(Buffer[i]);
}
// Record total amount of data recorded
iDataSize += BUFFERSIZE / 2;
}
}
// Stop capture
alcCaptureStop(pCaptureDevice);
// Check if any Samples haven't been consumed yet
alcGetIntegerv(pCaptureDevice, ALC_CAPTURE_SAMPLES, 1, &iSamplesAvailable);
while (iSamplesAvailable)
{
if (iSamplesAvailable > (BUFFERSIZE / 2))
{
alcCaptureSamples(pCaptureDevice, Buffer, BUFFERSIZE / 2);
for(int i = 0; i < BUFFERSIZE/2; i++){
vBuffer.push_back(Buffer[i]);
}
iSamplesAvailable -= (BUFFERSIZE / 2);
iDataSize += BUFFERSIZE;
}
else
{
//TODO::Fix
alcCaptureSamples(pCaptureDevice, Buffer, iSamplesAvailable);
for(int i = 0; i < BUFFERSIZE/2; i++){
vBuffer.push_back(Buffer[i]);
}
iDataSize += iSamplesAvailable * 2;
iSamplesAvailable = 0;
}
}
alcCaptureCloseDevice(pCaptureDevice);
}
//TODO::Make less hacky
ALuint bufferID; // The OpenAL sound buffer ID
ALuint sourceID; // The OpenAL sound source
// Create sound buffer and source
alGenBuffers(1, &bufferID);
alGenSources(1, &sourceID);
alListener3f(AL_POSITION, 0.0f, 0.0f, 0.0f);
alSource3f(sourceID, AL_POSITION, 0.0f, 0.0f, 0.0f);
alBufferData(bufferID, AL_FORMAT_MONO8, &vBuffer[0], static_cast<ALsizei>(vBuffer.size()), SRATE);
// Attach sound buffer to source
alSourcei(sourceID, AL_BUFFER, bufferID);
// Finally, play the sound!!!
alSourcePlay(sourceID);
printf("Press any key to continue...");
getchar();
return 0;
}
As you can see from:
alBufferData(bufferID, AL_FORMAT_MONO8, &vBuffer[0], static_cast<ALsizei>(vBuffer.size()), SRATE);
I've verified that this is the case. For demonstration code I'm okay throwing this example out there, but I wouldn't ever use it in production.
I'm not sure but for me FREQ is the output frequency but not the sample rate.
define sampling-rate 48000
see this link : http://supertux.lethargik.org/wiki/OpenAL_Configuration