Screen recorder [closed] - C++

I'm interested in a library (for Windows) written in Delphi/Pascal or C++ that allows me to record the desktop screen to a video format. Requirements:
must be able to specify the frame rate, or at least be able to record at 5 fps;
must be open source or free;
the output format could be almost anything, but the quality must be good enough to be able to read text in the recording;
pluses, if possible:
option to record without colors (grayscale);
multiple-display aware;
cross-platform (Windows & Linux; other platforms would be nice as well, but not necessary).
If by any chance I didn't explain something right, please feel free to ask so I can rephrase or give more details.

FFmpeg supports screen capturing (casting) and is cross-platform.
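For example, FFmpeg's built-in capture devices can do this straight from the command line (a sketch, assuming a build with x11grab/gdigrab and libx264 enabled):

ffmpeg -f x11grab -framerate 5 -i :0.0 -c:v libx264 -preset ultrafast out.mp4     # Linux/X11
ffmpeg -f gdigrab -framerate 5 -i desktop -c:v libx264 -preset ultrafast out.mp4  # Windows

Adding -vf format=gray should cover the grayscale wish as well.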

You could try Windows Media Encoder (freeware, WMV/ASF only) or VLC (GPL, Win/OSX/Linux). Be aware that "hardware accelerated" views (Direct3D & OpenGL rendering, for example) will not be available, and some quality loss will be experienced due to video compression. How much you lose will depend on your settings (codec, bitrate, resolution, etc.).
Example: How to Stream your Desktop using VLC
vlc screen:// :screen-fps=30 :screen-caching=100 --sout '#transcode{vcodec=mp4v,vb=4096,acodec=mpga,ab=256,scale=1,width=1280,height=800}:rtp{dst=192.168.1.2,port=1234,access=udp,mux=ts}'
You can find more options in VLC documentation, for saving your stream as a file for example.
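For example, something along these lines should transcode the screen to an MP4 file instead of streaming it (untested sketch; sout module names and quoting vary slightly between VLC versions and shells):
vlc screen:// :screen-fps=5 --sout "#transcode{vcodec=h264,vb=1024}:standard{access=file,mux=mp4,dst=capture.mp4}"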

This is the one I use with Delphi; it's called "Professional Screen Camera Component". Admittedly I had to make some changes to support Unicode versions (replace PChar with PAnsiChar, replace Char with AnsiChar).
It'll happily record away at whatever framerate I set it to, will encode the video with whatever codec I specify (if I want it to), and allows you to specify the region you wish to record.
Comes with a demo project too!
Oh, and it's free/open source!

It is probably overkill for your needs, but the video grabber component from DataStead can also record screen activity and save the output as a video file. See http://www.datastead.com/products/tvideograbber/overview.html. I'm not associated with DataStead, but I have been a customer for a few years and it works great.

FFmpeg can be used to capture the screen.
Watch a demo video recorded with FFmpeg: https://www.youtube.com/watch?v=a31bBY3HuxE
Container format : MP4
Codec : MPEG4
Follow these steps to record the screen to video using FFmpeg and its libraries:
Initialize the required registers.
Use x11grab (for Linux) in av_find_input_format.
Specify the position on the screen to capture (e.g. ":0.0+10,250" in avformat_open_input).
Now go for the regular video parameter initialization and memory allocation.
Start capturing the frames and store them in a file.
Finally, release the allocated resources once completed.
The code below is written in C++ and targets Linux (Ubuntu); the output is in MP4 format.
// sample code to record the computer screen !
#ifndef SCREENRECORDER_H
#define SCREENRECORDER_H
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <cstring>
#include <math.h>
#include <string.h>
#define __STDC_CONSTANT_MACROS
//FFMPEG LIBRARIES
extern "C"
{
#include "libavcodec/avcodec.h"
#include "libavcodec/avfft.h"
#include "libavdevice/avdevice.h"
#include "libavfilter/avfilter.h"
#include "libavfilter/avfiltergraph.h"
#include "libavfilter/buffersink.h"
#include "libavfilter/buffersrc.h"
#include "libavformat/avformat.h"
#include "libavformat/avio.h"
// libavutil
#include "libavutil/opt.h"
#include "libavutil/common.h"
#include "libavutil/channel_layout.h"
#include "libavutil/imgutils.h"
#include "libavutil/mathematics.h"
#include "libavutil/samplefmt.h"
#include "libavutil/time.h"
#include "libavutil/opt.h"
#include "libavutil/pixdesc.h"
#include "libavutil/file.h"
// libswscale
#include "libswscale/swscale.h"
}
class ScreenRecorder
{
private:
AVInputFormat *pAVInputFormat;
AVOutputFormat *output_format;
AVCodecContext *pAVCodecContext;
AVFormatContext *pAVFormatContext;
AVFrame *pAVFrame;
AVFrame *outFrame;
AVCodec *pAVCodec;
AVCodec *outAVCodec;
AVPacket *pAVPacket;
AVDictionary *options;
AVOutputFormat *outAVOutputFormat;
AVFormatContext *outAVFormatContext;
AVCodecContext *outAVCodecContext;
AVStream *video_st;
AVFrame *outAVFrame;
const char *dev_name;
const char *output_file;
double video_pts;
int out_size;
int codec_id;
int value;
int VideoStreamIndx;
public:
ScreenRecorder();
~ScreenRecorder();
int openCamera();
int init_outputfile();
int collectFrames();
};
#endif
using namespace std;
ScreenRecorder::ScreenRecorder()
{
cout<<"\n\n Registering required functions...";
av_register_all();
avcodec_register_all();
avdevice_register_all();
cout<<"\n\n Registered successfully...";
}
ScreenRecorder::~ScreenRecorder()
{
avformat_close_input(&pAVFormatContext);
if( !pAVFormatContext )
{
cout<<"\n\n1.Success : avformat_close_input()";
}
else
{
cout<<"\n\nError : avformat_close_input()";
}
avformat_free_context(pAVFormatContext);
if( !pAVFormatContext )
{
cout<<"\n\n2.Success : avformat_free_context()";
}
else
{
cout<<"\n\nError : avformat_free_context()";
}
cout<<"\n\n---------------Successfully released all resources------------------\n\n\n";
cout<<endl;
cout<<endl;
cout<<endl;
}
int ScreenRecorder::collectFrames()
{
int flag;
int frameFinished;
// When you decode a single packet, you may still not have enough information to
// produce a frame (depending on the type of codec; for some of them you do). Only
// when you decode the GROUP of packets that represents a frame do you have a
// picture. That's why frameFinished lets you know you decoded enough to have a frame.
int frame_index = 0;
value = 0;
pAVPacket = (AVPacket *)av_malloc(sizeof(AVPacket));
av_init_packet(pAVPacket);
pAVFrame = av_frame_alloc();
if( !pAVFrame )
{
cout<<"\n\nError : av_frame_alloc()";
return -1;
}
outFrame = av_frame_alloc();//Allocate an AVFrame and set its fields to default values.
if( !outFrame )
{
cout<<"\n\nError : av_frame_alloc()";
return -1;
}
int video_outbuf_size;
int nbytes = av_image_get_buffer_size(outAVCodecContext->pix_fmt,outAVCodecContext->width,outAVCodecContext->height,32);
uint8_t *video_outbuf = (uint8_t*)av_malloc(nbytes);
if( video_outbuf == NULL )
{
cout<<"\n\nError : av_malloc()";
}
// Setup the data pointers and linesizes based on the specified image parameters and the provided array.
value = av_image_fill_arrays( outFrame->data, outFrame->linesize, video_outbuf , AV_PIX_FMT_YUV420P, outAVCodecContext->width,outAVCodecContext->height,1 ); // returns : the size in bytes required for src
if(value < 0)
{
cout<<"\n\nError : av_image_fill_arrays()";
}
SwsContext* swsCtx_ ;
// Allocate and return swsContext.
// a pointer to an allocated context, or NULL in case of error
// Deprecated : Use sws_getCachedContext() instead.
swsCtx_ = sws_getContext(pAVCodecContext->width,
pAVCodecContext->height,
pAVCodecContext->pix_fmt,
outAVCodecContext->width,
outAVCodecContext->height,
outAVCodecContext->pix_fmt,
SWS_BICUBIC, NULL, NULL, NULL);
int ii = 0;
int no_frames = 100;
cout<<"\n\nEnter No. of Frames to capture : ";
cin>>no_frames;
AVPacket outPacket;
int j = 0;
int got_picture;
while( av_read_frame( pAVFormatContext , pAVPacket ) >= 0 )
{
if( ii++ == no_frames )break;
if(pAVPacket->stream_index == VideoStreamIndx)
{
value = avcodec_decode_video2( pAVCodecContext , pAVFrame , &frameFinished , pAVPacket );
if( value < 0)
{
cout<<"Error : avcodec_decode_video2()";
}
if(frameFinished)// Frame successfully decoded :)
{
sws_scale(swsCtx_, pAVFrame->data, pAVFrame->linesize,0, pAVCodecContext->height, outFrame->data,outFrame->linesize);
av_init_packet(&outPacket);
outPacket.data = NULL; // packet data will be allocated by the encoder
outPacket.size = 0;
avcodec_encode_video2(outAVCodecContext , &outPacket ,outFrame , &got_picture);
if(got_picture)
{
if(outPacket.pts != AV_NOPTS_VALUE)
outPacket.pts = av_rescale_q(outPacket.pts, video_st->codec->time_base, video_st->time_base);
if(outPacket.dts != AV_NOPTS_VALUE)
outPacket.dts = av_rescale_q(outPacket.dts, video_st->codec->time_base, video_st->time_base);
printf("Write frame %3d (size= %2d)\n", j++, outPacket.size/1000);
if(av_write_frame(outAVFormatContext , &outPacket) != 0)
{
cout<<"\n\nError : av_write_frame()";
}
} // got_picture
av_packet_unref(&outPacket);
} // frameFinished
}
}// End of while-loop
value = av_write_trailer(outAVFormatContext);
if( value < 0)
{
cout<<"\n\nError : av_write_trailer()";
}
//THIS WAS ADDED LATER
av_free(video_outbuf);
return 0;
}
int ScreenRecorder::openCamera()
{
value = 0;
options = NULL;
pAVFormatContext = NULL;
pAVFormatContext = avformat_alloc_context();//Allocate an AVFormatContext.
pAVInputFormat = av_find_input_format("x11grab");
value = avformat_open_input(&pAVFormatContext, ":0.0+10,250", pAVInputFormat, NULL);
if(value != 0)
{
cout<<"\n\nError : avformat_open_input\n\nstopped...";
return -1;
}
value = av_dict_set( &options,"framerate","30",0 );
if(value < 0)
{
cout<<"\n\nError : av_dict_set(framerate , 30 , 0)";
return -1;
}
value = av_dict_set( &options, "preset", "medium", 0 );
if(value < 0)
{
cout<<"\n\nError : av_dict_set(preset , medium)";
return -1;
}
value = avformat_find_stream_info(pAVFormatContext, NULL); // the error check below depends on this call
if(value < 0)
{
cout<<"\n\nError : avformat_find_stream_info\nstopped...";
return -1;
}
VideoStreamIndx = -1;
for (unsigned int i = 0; i < pAVFormatContext->nb_streams; i++) // find the video stream position/index.
{
if( pAVFormatContext->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO )
{
VideoStreamIndx = i;
break;
}
} // End for-loop
if( VideoStreamIndx == -1)
{
cout<<"\n\nError : VideoStreamIndx = -1";
return -1;
}
// assign pAVFormatContext to VideoStreamIndx
pAVCodecContext = pAVFormatContext->streams[VideoStreamIndx]->codec;
pAVCodec = avcodec_find_decoder(pAVCodecContext->codec_id);
if( pAVCodec == NULL )
{
cout<<"\n\nError : avcodec_find_decoder()";
return -1;
}
value = avcodec_open2(pAVCodecContext , pAVCodec , NULL);//Initialize the AVCodecContext to use the given AVCodec.
if( value < 0 )
{
cout<<"\n\nError : avcodec_open2()";
return -1;
}
return 0;
}
int ScreenRecorder::init_outputfile()
{
outAVFormatContext = NULL;
value = 0;
output_file = "output.mp4";
avformat_alloc_output_context2(&outAVFormatContext, NULL, NULL, output_file);
if (!outAVFormatContext)
{
cout<<"\n\nError : avformat_alloc_output_context2()";
return -1;
}
/*Returns the output format in the list of registered output formats which best matches the provided parameters, or returns NULL if there is no match.
*/
output_format = av_guess_format(NULL, output_file ,NULL);
if( !output_format )
{
cout<<"\n\nError : av_guess_format()";
return -1;
}
video_st = avformat_new_stream(outAVFormatContext ,NULL);
if( !video_st )
{
cout<<"\n\nError : avformat_new_stream()";
return -1;
}
// With this legacy API the stream owns its codec context, so configure
// video_st->codec directly (allocating a separate context here would only leak).
outAVCodecContext = video_st->codec;
outAVCodecContext->codec_id = AV_CODEC_ID_MPEG4;// AV_CODEC_ID_MPEG4; // AV_CODEC_ID_H264 // AV_CODEC_ID_MPEG1VIDEO
outAVCodecContext->codec_type = AVMEDIA_TYPE_VIDEO;
outAVCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;
outAVCodecContext->bit_rate = 400000; // 2500000
outAVCodecContext->width = 1920;
outAVCodecContext->height = 1080;
outAVCodecContext->gop_size = 3;
outAVCodecContext->max_b_frames = 2;
outAVCodecContext->time_base.num = 1;
outAVCodecContext->time_base.den = 30; // 15fps
if (outAVCodecContext->codec_id == AV_CODEC_ID_H264)
{
av_opt_set(outAVCodecContext->priv_data, "preset", "slow", 0);
}
outAVCodec = avcodec_find_encoder(AV_CODEC_ID_MPEG4);
if( !outAVCodec )
{
cout<<"\n\nError : avcodec_find_encoder()";
return -1;
}
// Some container formats (like MP4) require global headers to be present
// Mark the encoder so that it behaves accordingly.
if ( outAVFormatContext->oformat->flags & AVFMT_GLOBALHEADER)
{
outAVCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
}
value = avcodec_open2(outAVCodecContext, outAVCodec, NULL);
if( value < 0)
{
cout<<"\n\nError : avcodec_open2()";
return -1;
}
if ( !(outAVFormatContext->flags & AVFMT_NOFILE) )
{
if( avio_open2(&outAVFormatContext->pb , output_file , AVIO_FLAG_WRITE ,NULL, NULL) < 0 )
{
cout<<"\n\nError : avio_open2()";
}
}
if(!outAVFormatContext->nb_streams)
{
cout<<"\n\nError : Output file dose not contain any stream";
return -1;
}
value = avformat_write_header(outAVFormatContext , &options);
if(value < 0)
{
cout<<"\n\nError : avformat_write_header()";
return -1;
}
cout<<"\n\nOutput file information :\n\n";
av_dump_format(outAVFormatContext , 0 ,output_file ,1);
return 0;
}
int main()
{
ScreenRecorder s_record;
s_record.openCamera();
s_record.init_outputfile();
s_record.collectFrames();
cout<<"\n\n---------EXIT_SUCCESS------------\n\n";
return 0;
}
/* to compile the code : g++ -Wno-format-zero-length -Wno-write-strings -L/home/abdullah/ffmpeg_build/lib/ -L/usr/lib/x86_64-linux-gnu/ -I/home/abdullah/ffmpeg_build/include/ -o ScreenRecorder ScreenRecorder.cpp -lavdevice -lavfilter -lswscale -lavformat -lavcodec -lavutil -lswresample -lm -lva -lpthread -lvorbis -lvpx -lopus -lz -lpostproc -ldl -lfdk-aac -lmp3lame -lvorbisenc -lvorbisfile -lx264 -ltheora -lx265 -ltheoraenc -ltheoradec -ldl -lrt -lbz2 -lasound -lSDL -lSDLmain -lSDL_ttf -lfreetype -lass -llzma -lftgl -lperl -lcrypto -lxcb -lxcb-shm -lxcb-xfixes -lao -lxcb-shape -lfftw3 */

I haven't done this myself before, but when I googled around (as I'm sure you have), I ran into this:
http://www.codeproject.com/KB/GDI/barry_s_screen_capture.aspx
It looks as if it should do what you're asking reasonably easily (for Windows), and it has no license associated with it (as confirmed at the bottom). I don't believe it's set up as a library, but I'm sure you could bind the interface to the sample WinCap functions into one with reasonable ease.

Use screen_capture_lite:
https://github.com/smasherprog/screen_capture_lite
It's a cross-platform C++ library.
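A minimal sketch of its callback-based API, roughly following the project README (exact signatures may differ between versions):

#include <ScreenCapture.h>
#include <chrono>
#include <thread>

int main()
{
    // Capture all monitors; onNewFrame delivers raw 32-bit BGRA pixels.
    auto grabber = SL::Screen_Capture::CreateCaptureConfiguration([]() {
            return SL::Screen_Capture::GetMonitors();
        })
        ->onNewFrame([](const SL::Screen_Capture::Image& img,
                        const SL::Screen_Capture::Monitor& monitor) {
            // encode or save the frame here
        })
        ->start_capturing();
    grabber->setFrameChangeInterval(std::chrono::milliseconds(200)); // ~5 fps
    std::this_thread::sleep_for(std::chrono::seconds(10));           // record for 10 s
}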

Related

Why does adding audio stream to ffmpeg's libavcodec output container cause a crash?

As it stands, my project correctly uses libavcodec to decode a video, where each frame is manipulated (it doesn't matter how) and output to a new video. I've cobbled this together from examples found online, and it works. The result is a perfect .mp4 of the manipulated frames, minus the audio.
My problem is, when I try to add an audio stream to the output container, I get a crash in mux.c that I can't explain. It's in static int compute_muxer_pkt_fields(AVFormatContext *s, AVStream *st, AVPacket *pkt). Where st->internal->priv_pts->val = pkt->dts; is attempted, priv_pts is nullptr.
I don't recall the version number, but this is from a November 4, 2020 ffmpeg build from git.
My MediaContentMgr is much bigger than what I have here. I'm stripping out everything to do with the frame manipulation, so if I'm missing anything, please let me know and I'll edit.
The code that, when added, triggers the nullptr exception, is called out inline
The .h:
#ifndef _API_EXAMPLE_H
#define _API_EXAMPLE_H
#include <glad/glad.h>
#include <GLFW/glfw3.h>
#include "glm/glm.hpp"
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>
}
#include "shader_s.h"
class MediaContainerMgr {
public:
MediaContainerMgr(const std::string& infile, const std::string& vert, const std::string& frag,
const glm::vec3* extents);
~MediaContainerMgr();
void render();
bool recording() { return m_recording; }
// Major thanks to "shi-yan" who helped make this possible:
// https://github.com/shi-yan/videosamples/blob/master/libavmp4encoding/main.cpp
bool init_video_output(const std::string& video_file_name, unsigned int width, unsigned int height);
bool output_video_frame(uint8_t* buf);
bool finalize_output();
private:
AVFormatContext* m_format_context;
AVCodec* m_video_codec;
AVCodec* m_audio_codec;
AVCodecParameters* m_video_codec_parameters;
AVCodecParameters* m_audio_codec_parameters;
AVCodecContext* m_codec_context;
AVFrame* m_frame;
AVPacket* m_packet;
uint32_t m_video_stream_index;
uint32_t m_audio_stream_index;
void init_rendering(const glm::vec3* extents);
int decode_packet();
// For writing the output video:
void free_output_assets();
bool m_recording;
AVOutputFormat* m_output_format;
AVFormatContext* m_output_format_context;
AVCodec* m_output_video_codec;
AVCodecContext* m_output_video_codec_context;
AVFrame* m_output_video_frame;
SwsContext* m_output_scale_context;
AVStream* m_output_video_stream;
AVCodec* m_output_audio_codec;
AVStream* m_output_audio_stream;
AVCodecContext* m_output_audio_codec_context;
};
#endif
And, the hellish .cpp:
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>
#include "media_container_manager.h"
MediaContainerMgr::MediaContainerMgr(const std::string& infile, const std::string& vert, const std::string& frag,
const glm::vec3* extents) :
m_video_stream_index(-1),
m_audio_stream_index(-1),
m_recording(false),
m_output_format(nullptr),
m_output_format_context(nullptr),
m_output_video_codec(nullptr),
m_output_video_codec_context(nullptr),
m_output_video_frame(nullptr),
m_output_scale_context(nullptr),
m_output_video_stream(nullptr)
{
// AVFormatContext holds header info from the format specified in the container:
m_format_context = avformat_alloc_context();
if (!m_format_context) {
throw "ERROR could not allocate memory for Format Context";
}
// open the file and read its header. Codecs are not opened here.
if (avformat_open_input(&m_format_context, infile.c_str(), NULL, NULL) != 0) {
throw "ERROR could not open input file for reading";
}
printf("format %s, duration %lldus, bit_rate %lld\n", m_format_context->iformat->name, m_format_context->duration, m_format_context->bit_rate);
//read avPackets (?) from the avFormat (?) to get stream info. This populates format_context->streams.
if (avformat_find_stream_info(m_format_context, NULL) < 0) {
throw "ERROR could not get stream info";
}
for (unsigned int i = 0; i < m_format_context->nb_streams; i++) {
AVCodecParameters* local_codec_parameters = NULL;
local_codec_parameters = m_format_context->streams[i]->codecpar;
printf("AVStream->time base before open coded %d/%d\n", m_format_context->streams[i]->time_base.num, m_format_context->streams[i]->time_base.den);
printf("AVStream->r_frame_rate before open coded %d/%d\n", m_format_context->streams[i]->r_frame_rate.num, m_format_context->streams[i]->r_frame_rate.den);
printf("AVStream->start_time %" PRId64 "\n", m_format_context->streams[i]->start_time);
printf("AVStream->duration %" PRId64 "\n", m_format_context->streams[i]->duration);
printf("duration(s): %lf\n", (float)m_format_context->streams[i]->duration / m_format_context->streams[i]->time_base.den * m_format_context->streams[i]->time_base.num);
AVCodec* local_codec = NULL;
local_codec = avcodec_find_decoder(local_codec_parameters->codec_id);
if (local_codec == NULL) {
throw "ERROR unsupported codec!";
}
if (local_codec_parameters->codec_type == AVMEDIA_TYPE_VIDEO) {
if (m_video_stream_index == -1) {
m_video_stream_index = i;
m_video_codec = local_codec;
m_video_codec_parameters = local_codec_parameters;
}
m_height = local_codec_parameters->height;
m_width = local_codec_parameters->width;
printf("Video Codec: resolution %dx%d\n", m_width, m_height);
}
else if (local_codec_parameters->codec_type == AVMEDIA_TYPE_AUDIO) {
if (m_audio_stream_index == -1) {
m_audio_stream_index = i;
m_audio_codec = local_codec;
m_audio_codec_parameters = local_codec_parameters;
}
printf("Audio Codec: %d channels, sample rate %d\n", local_codec_parameters->channels, local_codec_parameters->sample_rate);
}
printf("\tCodec %s ID %d bit_rate %lld\n", local_codec->name, local_codec->id, local_codec_parameters->bit_rate);
}
m_codec_context = avcodec_alloc_context3(m_video_codec);
if (!m_codec_context) {
throw "ERROR failed to allocate memory for AVCodecContext";
}
if (avcodec_parameters_to_context(m_codec_context, m_video_codec_parameters) < 0) {
throw "ERROR failed to copy codec params to codec context";
}
if (avcodec_open2(m_codec_context, m_video_codec, NULL) < 0) {
throw "ERROR avcodec_open2 failed to open codec";
}
m_frame = av_frame_alloc();
if (!m_frame) {
throw "ERROR failed to allocate AVFrame memory";
}
m_packet = av_packet_alloc();
if (!m_packet) {
throw "ERROR failed to allocate AVPacket memory";
}
}
MediaContainerMgr::~MediaContainerMgr() {
avformat_close_input(&m_format_context);
av_packet_free(&m_packet);
av_frame_free(&m_frame);
avcodec_free_context(&m_codec_context);
glDeleteVertexArrays(1, &m_VAO);
glDeleteBuffers(1, &m_VBO);
}
bool MediaContainerMgr::advance_frame() {
while (true) {
if (av_read_frame(m_format_context, m_packet) < 0) {
// Do we actually need to unref the packet if it failed?
av_packet_unref(m_packet);
continue;
//return false;
}
else {
if (m_packet->stream_index == m_video_stream_index) {
//printf("AVPacket->pts %" PRId64 "\n", m_packet->pts);
int response = decode_packet();
av_packet_unref(m_packet);
if (response != 0) {
continue;
//return false;
}
return true;
}
else {
printf("m_packet->stream_index: %d\n", m_packet->stream_index);
printf(" m_packet->pts: %lld\n", m_packet->pts);
printf(" mpacket->size: %d\n", m_packet->size);
if (m_recording) {
int err = 0;
//err = avcodec_send_packet(m_output_video_codec_context, m_packet);
printf(" encoding error: %d\n", err);
}
}
}
// We're done with the packet (it's been unpacked to a frame), so deallocate & reset to defaults:
/*
if (m_frame == NULL)
return false;
if (m_frame->data[0] == NULL || m_frame->data[1] == NULL || m_frame->data[2] == NULL) {
printf("WARNING: null frame data");
continue;
}
*/
}
}
int MediaContainerMgr::decode_packet() {
// Supply raw packet data as input to a decoder
// https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga58bc4bf1e0ac59e27362597e467efff3
int response = avcodec_send_packet(m_codec_context, m_packet);
if (response < 0) {
char buf[256];
av_strerror(response, buf, 256);
printf("Error while receiving a frame from the decoder: %s\n", buf);
return response;
}
// Return decoded output data (into a frame) from a decoder
// https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga11e6542c4e66d3028668788a1a74217c
response = avcodec_receive_frame(m_codec_context, m_frame);
if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
return response;
} else if (response < 0) {
char buf[256];
av_strerror(response, buf, 256);
printf("Error while receiving a frame from the decoder: %s\n", buf);
return response;
} else {
printf(
"Frame %d (type=%c, size=%d bytes) pts %lld key_frame %d [DTS %d]\n",
m_codec_context->frame_number,
av_get_picture_type_char(m_frame->pict_type),
m_frame->pkt_size,
m_frame->pts,
m_frame->key_frame,
m_frame->coded_picture_number
);
}
return 0;
}
bool MediaContainerMgr::init_video_output(const std::string& video_file_name, unsigned int width, unsigned int height) {
if (m_recording)
return true;
m_recording = true;
advance_to(0L); // I've deleted the implementation. Just seeks to beginning of vid. Works fine.
if (!(m_output_format = av_guess_format(nullptr, video_file_name.c_str(), nullptr))) {
printf("Cannot guess output format.\n");
return false;
}
int err = avformat_alloc_output_context2(&m_output_format_context, m_output_format, nullptr, video_file_name.c_str());
if (err < 0) {
printf("Failed to allocate output context.\n");
return false;
}
//TODO(P0): Break out the video and audio inits into their own methods.
m_output_video_codec = avcodec_find_encoder(m_output_format->video_codec);
if (!m_output_video_codec) {
printf("Failed to create video codec.\n");
return false;
}
m_output_video_stream = avformat_new_stream(m_output_format_context, m_output_video_codec);
if (!m_output_video_stream) {
printf("Failed to find video format.\n");
return false;
}
m_output_video_codec_context = avcodec_alloc_context3(m_output_video_codec);
if (!m_output_video_codec_context) {
printf("Failed to create video codec context.\n");
return(false);
}
m_output_video_stream->codecpar->codec_id = m_output_format->video_codec;
m_output_video_stream->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
m_output_video_stream->codecpar->width = width;
m_output_video_stream->codecpar->height = height;
m_output_video_stream->codecpar->format = AV_PIX_FMT_YUV420P;
// Use the same bit rate as the input stream.
m_output_video_stream->codecpar->bit_rate = m_format_context->streams[m_video_stream_index]->codecpar->bit_rate;
m_output_video_stream->avg_frame_rate = m_format_context->streams[m_video_stream_index]->avg_frame_rate;
avcodec_parameters_to_context(m_output_video_codec_context, m_output_video_stream->codecpar);
m_output_video_codec_context->time_base = m_format_context->streams[m_video_stream_index]->time_base;
//TODO(P1): Set these to match the input stream?
m_output_video_codec_context->max_b_frames = 2;
m_output_video_codec_context->gop_size = 12;
m_output_video_codec_context->framerate = m_format_context->streams[m_video_stream_index]->r_frame_rate;
//m_output_codec_context->refcounted_frames = 0;
if (m_output_video_stream->codecpar->codec_id == AV_CODEC_ID_H264) {
av_opt_set(m_output_video_codec_context, "preset", "ultrafast", 0);
} else if (m_output_video_stream->codecpar->codec_id == AV_CODEC_ID_H265) {
av_opt_set(m_output_video_codec_context, "preset", "ultrafast", 0);
} else {
av_opt_set_int(m_output_video_codec_context, "lossless", 1, 0);
}
avcodec_parameters_from_context(m_output_video_stream->codecpar, m_output_video_codec_context);
m_output_audio_codec = avcodec_find_encoder(m_output_format->audio_codec);
if (!m_output_audio_codec) {
printf("Failed to create audio codec.\n");
return false;
}
    // I've commented out all of the audio stream init beyond this next line,
    // because this is where the trouble begins. Creating this output stream
    // causes the null reference I mentioned. If I uncomment everything below
    // here, I still get the null deref. If I comment out this line, the deref
    // exception vanishes. (IOW, I commented out more and more code until I
    // found that this was the trigger that caused the problem.)
    // I assume that there's something I'm doing wrong in the rest of the
    // commented out code, that, when fixed, will fix the nullptr and give me
    // a working audio stream.
m_output_audio_stream = avformat_new_stream(m_output_format_context, m_output_audio_codec);
if (!m_output_audio_stream) {
printf("Failed to find audio format.\n");
return false;
}
/*
m_output_audio_codec_context = avcodec_alloc_context3(m_output_audio_codec);
if (!m_output_audio_codec_context) {
printf("Failed to create audio codec context.\n");
return(false);
}
m_output_audio_stream->codecpar->codec_id = m_output_format->audio_codec;
m_output_audio_stream->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
m_output_audio_stream->codecpar->format = m_format_context->streams[m_audio_stream_index]->codecpar->format;
m_output_audio_stream->codecpar->bit_rate = m_format_context->streams[m_audio_stream_index]->codecpar->bit_rate;
m_output_audio_stream->avg_frame_rate = m_format_context->streams[m_audio_stream_index]->avg_frame_rate;
avcodec_parameters_to_context(m_output_audio_codec_context, m_output_audio_stream->codecpar);
m_output_audio_codec_context->time_base = m_format_context->streams[m_audio_stream_index]->time_base;
*/
//TODO(P2): Free assets that have been allocated.
err = avcodec_open2(m_output_video_codec_context, m_output_video_codec, nullptr);
if (err < 0) {
printf("Failed to open codec.\n");
return false;
}
if (!(m_output_format->flags & AVFMT_NOFILE)) {
err = avio_open(&m_output_format_context->pb, video_file_name.c_str(), AVIO_FLAG_WRITE);
if (err < 0) {
printf("Failed to open output file.");
return false;
}
}
err = avformat_write_header(m_output_format_context, NULL);
if (err < 0) {
printf("Failed to write header.\n");
return false;
}
av_dump_format(m_output_format_context, 0, video_file_name.c_str(), 1);
return true;
}
//TODO(P2): make this a member. (Thanks to https://emvlo.wordpress.com/2016/03/10/sws_scale/)
void PrepareFlipFrameJ420(AVFrame* pFrame) {
for (int i = 0; i < 4; i++) {
if (i)
pFrame->data[i] += pFrame->linesize[i] * ((pFrame->height >> 1) - 1);
else
pFrame->data[i] += pFrame->linesize[i] * (pFrame->height - 1);
pFrame->linesize[i] = -pFrame->linesize[i];
}
}
// This is where we take an altered frame and write it to the output container.
// This works fine as long as we haven't set up an audio stream in the output
// container.
bool MediaContainerMgr::output_video_frame(uint8_t* buf) {
int err;
if (!m_output_video_frame) {
m_output_video_frame = av_frame_alloc();
m_output_video_frame->format = AV_PIX_FMT_YUV420P;
m_output_video_frame->width = m_output_video_codec_context->width;
m_output_video_frame->height = m_output_video_codec_context->height;
err = av_frame_get_buffer(m_output_video_frame, 32);
if (err < 0) {
printf("Failed to allocate output frame.\n");
return false;
}
}
if (!m_output_scale_context) {
m_output_scale_context = sws_getContext(m_output_video_codec_context->width, m_output_video_codec_context->height,
AV_PIX_FMT_RGB24,
m_output_video_codec_context->width, m_output_video_codec_context->height,
AV_PIX_FMT_YUV420P, SWS_BICUBIC, nullptr, nullptr, nullptr);
}
int inLinesize[1] = { 3 * m_output_video_codec_context->width };
sws_scale(m_output_scale_context, (const uint8_t* const*)&buf, inLinesize, 0, m_output_video_codec_context->height,
m_output_video_frame->data, m_output_video_frame->linesize);
PrepareFlipFrameJ420(m_output_video_frame);
//TODO(P0): Switch m_frame to be m_input_video_frame so I don't end up using the presentation timestamp from
// an audio frame if I threadify the frame reading.
m_output_video_frame->pts = m_frame->pts;
printf("Output PTS: %d, time_base: %d/%d\n", m_output_video_frame->pts,
m_output_video_codec_context->time_base.num, m_output_video_codec_context->time_base.den);
err = avcodec_send_frame(m_output_video_codec_context, m_output_video_frame);
if (err < 0) {
printf(" ERROR sending new video frame output: ");
switch (err) {
case AVERROR(EAGAIN):
printf("AVERROR(EAGAIN): %d\n", err);
break;
case AVERROR_EOF:
printf("AVERROR_EOF: %d\n", err);
break;
case AVERROR(EINVAL):
printf("AVERROR(EINVAL): %d\n", err);
break;
case AVERROR(ENOMEM):
printf("AVERROR(ENOMEM): %d\n", err);
break;
}
return false;
}
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = nullptr;
pkt.size = 0;
pkt.flags |= AV_PKT_FLAG_KEY;
int ret = 0;
if ((ret = avcodec_receive_packet(m_output_video_codec_context, &pkt)) == 0) {
static int counter = 0;
printf("pkt.key: 0x%08x, pkt.size: %d, counter:\n", pkt.flags & AV_PKT_FLAG_KEY, pkt.size, counter++);
uint8_t* size = ((uint8_t*)pkt.data);
printf("sizes: %d %d %d %d %d %d %d %d %d\n", size[0], size[1], size[2], size[2], size[3], size[4], size[5], size[6], size[7]);
av_interleaved_write_frame(m_output_format_context, &pkt);
}
printf("push: %d\n", ret);
av_packet_unref(&pkt);
return true;
}
bool MediaContainerMgr::finalize_output() {
if (!m_recording)
return true;
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = nullptr;
pkt.size = 0;
for (;;) {
avcodec_send_frame(m_output_video_codec_context, nullptr);
if (avcodec_receive_packet(m_output_video_codec_context, &pkt) == 0) {
av_interleaved_write_frame(m_output_format_context, &pkt);
printf("final push:\n");
} else {
break;
}
}
av_packet_unref(&pkt);
av_write_trailer(m_output_format_context);
if (!(m_output_format->flags & AVFMT_NOFILE)) {
int err = avio_close(m_output_format_context->pb);
if (err < 0) {
printf("Failed to close file. err: %d\n", err);
return false;
}
}
return true;
}
EDIT
The call stack on the crash (which I should have included in the original question):
avformat-58.dll!compute_muxer_pkt_fields(AVFormatContext * s, AVStream * st, AVPacket * pkt) Line 630 C
avformat-58.dll!write_packet_common(AVFormatContext * s, AVStream * st, AVPacket * pkt, int interleaved) Line 1122 C
avformat-58.dll!write_packets_common(AVFormatContext * s, AVPacket * pkt, int interleaved) Line 1186 C
avformat-58.dll!av_interleaved_write_frame(AVFormatContext * s, AVPacket * pkt) Line 1241 C
CamBot.exe!MediaContainerMgr::output_video_frame(unsigned char * buf) Line 553 C++
CamBot.exe!main() Line 240 C++
If I move the call to avformat_write_header so it's immediately before the audio stream initialization, I still get a crash, but in a different place. The crash happens on line 6459 of movenc.c, where we have:
/* Non-seekable output is ok if using fragmentation. If ism_lookahead
* is enabled, we don't support non-seekable output at all. */
if (!(s->pb->seekable & AVIO_SEEKABLE_NORMAL) && // CRASH IS HERE
(!(mov->flags & FF_MOV_FLAG_FRAGMENT) || mov->ism_lookahead)) {
av_log(s, AV_LOG_ERROR, "muxer does not support non seekable output\n");
return AVERROR(EINVAL);
}
The exception is a nullptr exception, where s->pb is NULL. The call stack is:
avformat-58.dll!mov_init(AVFormatContext * s) Line 6459 C
avformat-58.dll!init_muxer(AVFormatContext * s, AVDictionary * * options) Line 407 C
[Inline Frame] avformat-58.dll!avformat_init_output(AVFormatContext *) Line 489 C
avformat-58.dll!avformat_write_header(AVFormatContext * s, AVDictionary * * options) Line 512 C
CamBot.exe!MediaContainerMgr::init_video_output(const std::string & video_file_name, unsigned int width, unsigned int height) Line 424 C++
CamBot.exe!main() Line 183 C++
Please note that you should always try to provide a self-contained minimal working example to make it easier for others to help. With the actual code, the matching FFmpeg version, and an input video that triggers the segmentation fault (to be sure), the issue would be a matter of analyzing the control flow to identify why st->internal->priv_pts was not allocated. Without the full scenario, I have to resort to making assumptions that may or may not correspond to your actual code.
Based on your description, I attempted to reproduce the issue by cloning https://github.com/FFmpeg/FFmpeg.git and creating a new branch from commit b52e0d95 (November 4, 2020) to approximate your FFmpeg version.
I recreated your scenario using the provided code snippets by
including the avformat_new_stream() call for the audio stream
keeping the remaining audio initialization commented out
including the original avformat_write_header() call site (unchanged order)
With that scenario, the video write with MP4 video/audio input fails in avformat_write_header():
[mp4 @ 0x2b39f40] sample rate not set 0
The call stack of the error location:
#0 0x00007ffff75253d7 in raise () from /lib64/libc.so.6
#1 0x00007ffff7526ac8 in abort () from /lib64/libc.so.6
#2 0x000000000094feca in init_muxer (s=0x2b39f40, options=0x0) at libavformat/mux.c:309
#3 0x00000000009508f4 in avformat_init_output (s=0x2b39f40, options=0x0) at libavformat/mux.c:490
#4 0x0000000000950a10 in avformat_write_header (s=0x2b39f40, options=0x0) at libavformat/mux.c:514
[...]
In init_muxer(), the sample rate in the stream parameters is checked unconditionally:
case AVMEDIA_TYPE_AUDIO:
if (par->sample_rate <= 0) {
av_log(s, AV_LOG_ERROR, "sample rate not set %d\n", par->sample_rate); abort();
ret = AVERROR(EINVAL);
goto fail;
}
That condition has been in effect since 2014-06-18 at the very least (didn't go back any further) and still exists. With a version from November 2020, the check must be active and the parameter must be set accordingly.
If I uncomment the remaining audio initialization, the situation remains unchanged (as expected). So, to satisfy the condition, I added the missing parameter as follows:
m_output_audio_stream->codecpar->sample_rate =
m_format_context->streams[m_audio_stream_index]->codecpar->sample_rate;
With that, the check succeeds, avformat_write_header() succeeds, and the actual video write succeeds.
As you indicated in your question, the segmentation fault is caused by st->internal->priv_pts being NULL at this location:
#0 0x00000000009516db in compute_muxer_pkt_fields (s=0x2b39f40, st=0x2b3a580, pkt=0x7fffffffe2d0) at libavformat/mux.c:632
#1 0x0000000000953128 in write_packet_common (s=0x2b39f40, st=0x2b3a580, pkt=0x7fffffffe2d0, interleaved=1) at libavformat/mux.c:1125
#2 0x0000000000953473 in write_packets_common (s=0x2b39f40, pkt=0x7fffffffe2d0, interleaved=1) at libavformat/mux.c:1188
#3 0x0000000000953634 in av_interleaved_write_frame (s=0x2b39f40, pkt=0x7fffffffe2d0) at libavformat/mux.c:1243
[...]
In the FFmpeg code base, the allocation of priv_pts is handled by init_pts() for all streams referenced by the context. init_pts() has two call sites:
libavformat/mux.c:496:
if (s->oformat->init && ret) {
if ((ret = init_pts(s)) < 0)
return ret;
return AVSTREAM_INIT_IN_INIT_OUTPUT;
}
libavformat/mux.c:530:
if (!s->internal->streams_initialized) {
if ((ret = init_pts(s)) < 0)
goto fail;
}
In both cases, the calls are triggered by avformat_write_header() (indirectly via avformat_init_output() for the first, directly for the second). According to control flow analysis, there's no success case that would leave priv_pts unallocated.
Considering a high probability that our versions of FFmpeg are compatible in terms of behavior, I have to assume that 1) the sample rate must be provided for audio streams and 2) priv_pts is always allocated by avformat_write_header() in the absence of errors. Therefore, two possible root causes come to mind:
Your stream is not an audio stream (unlikely; the type is based on the codec, which in turn is based on the output file extension - assuming mp4)
You do not call avformat_write_header() (unlikely), or you do not handle the error in the caller of your C++ member function (the return value of avformat_write_header() is checked, but I do not have the code corresponding to the caller of the C++ member function; your actual code might differ significantly from the code provided, so this is the only plausible conclusion that can be drawn from the available data)
The solution: Ensure that processing does not continue if avformat_write_header() fails. By adding the audio stream, avformat_write_header() starts to fail unless you set the stream sample rate. If the error is ignored, av_interleaved_write_frame() triggers a segmentation fault by accessing the unallocated st->internal->priv_pts.
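A minimal sketch of that guard at the call site (the instance name media_mgr is hypothetical; the point is only that a false return must stop the recording path):

if (!media_mgr.init_video_output("out.mp4", width, height)) {
    // avformat_write_header() failed inside init_video_output(); never reach
    // output_video_frame(), and therefore never reach av_interleaved_write_frame().
    fprintf(stderr, "Output init failed; not writing any frames.\n");
    return 1;
}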
As mentioned initially, the scenario is incomplete. If you do call avformat_write_header() and stop processing in case of an error (meaning you do not call av_interleaved_write_frame()), more information is needed. As it stands now, that is unlikely. For further analysis, the executable output (stdout, stderr) is required to see your traces and FFmpeg log messages. If that does not reveal new information, a self-contained minimal working example and the video input are needed to get the full picture.

Xcode app for macOS. This is how I set up to get audio from USB mic input. It worked a year ago; now it doesn't. Why?

Here is my audio init code. My app responds when queue buffers are ready, but all the data in the buffer is zero. Checking sound in System Preferences shows that USB Audio CODEC is active in the sound input dialog. audioInit() is called right after the app launches.
{
#pragma mark user data struct
typedef struct MyRecorder
{
AudioFileID recordFile;
SInt64 recordPacket;
Float32 *pSampledData;
MorseDecode *pMorseDecoder;
} MyRecorder;
#pragma mark utility functions
void CheckError(OSStatus error, const char *operation)
{
if(error == noErr) return;
char errorString[20];
// see if it appears to be a 4 char code
*(UInt32*)(errorString + 1) = CFSwapInt32HostToBig(error);
if (isprint(errorString[1]) && isprint(errorString[2]) &&
isprint(errorString[3]) && isprint(errorString[4]))
{
errorString[0] = errorString[5] = '\'';
errorString[6] = '\0';
}
else
{
sprintf(errorString, "%d", (int)error);
}
fprintf(stderr, "Error: %s (%s)\n", operation, errorString);
}
OSStatus MyGetDefaultInputDeviceSampleRate(Float64 *outSampleRate)
{
OSStatus error;
AudioDeviceID deviceID = 0;
AudioObjectPropertyAddress propertyAddress;
UInt32 propertySize;
propertyAddress.mSelector = kAudioHardwarePropertyDefaultInputDevice;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(AudioDeviceID);
error = AudioObjectGetPropertyData(kAudioObjectSystemObject,
&propertyAddress,
0,
NULL,
&propertySize,
&deviceID);
if(error)
return error;
propertyAddress.mSelector = kAudioDevicePropertyNominalSampleRate;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(Float64);
error = AudioObjectGetPropertyData(deviceID,
&propertyAddress,
0,
NULL,
&propertySize,
outSampleRate);
return error;
}
static int MyComputeRecordBufferSize(const AudioStreamBasicDescription *format,
AudioQueueRef queue,
float seconds)
{
int packets, frames, bytes;
frames = (int)ceil(seconds * format->mSampleRate);
if(format->mBytesPerFrame > 0)
{
bytes = frames * format->mBytesPerFrame;
}
else
{
UInt32 maxPacketSize;
if(format->mBytesPerPacket > 0)
{
// constant packet size
maxPacketSize = format->mBytesPerPacket;
}
else
{
// get the largest single packet size possible
UInt32 propertySize = sizeof(maxPacketSize);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterPropertyMaximumOutputPacketSize,
&maxPacketSize,
&propertySize),
"Couldn't get queues max output packet size");
}
if(format->mFramesPerPacket > 0)
packets = frames / format->mFramesPerPacket;
else
// worst case scenario: 1 frame in a packet
packets = frames;
// sanity check
if(packets == 0)
packets = 1;
bytes = packets * maxPacketSize;
}
return bytes;
}
extern void bridgeToMainThread(MorseDecode *pDecode);
static int callBacks = 0;
// ---------------------------------------------
static void MyAQInputCallback(void *inUserData,
AudioQueueRef inQueue,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc)
{
MyRecorder *recorder = (MyRecorder*)inUserData;
Float32 *pAudioData = (Float32*)(inBuffer->mAudioData);
recorder->pMorseDecoder->pBuffer = pAudioData;
recorder->pMorseDecoder->bufferSize = inNumPackets;
bridgeToMainThread(recorder->pMorseDecoder);
CheckError(AudioQueueEnqueueBuffer(inQueue,
inBuffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
printf("packets = %ld, bytes = %ld\n",(long)inNumPackets,(long)inBuffer->mAudioDataByteSize);
callBacks++;
//printf("\ncallBacks = %d\n",callBacks);
//if(callBacks == 0)
//audioStop();
}
static AudioQueueRef queue = {0};
static MyRecorder recorder = {0};
static AudioStreamBasicDescription recordFormat;
void audioInit()
{
// set up format
memset(&recordFormat,0,sizeof(recordFormat));
recordFormat.mFormatID = kAudioFormatLinearPCM;
recordFormat.mChannelsPerFrame = 2;
recordFormat.mBitsPerChannel = 32;
recordFormat.mBytesPerPacket = recordFormat.mBytesPerFrame = recordFormat.mChannelsPerFrame * sizeof(Float32);
recordFormat.mFramesPerPacket = 1;
//recordFormat.mFormatFlags = kAudioFormatFlagsCanonical;
recordFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
MyGetDefaultInputDeviceSampleRate(&recordFormat.mSampleRate);
UInt32 propSize = sizeof(recordFormat);
CheckError(AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&propSize,
&recordFormat),
"AudioFormatProperty failed");
recorder.pMorseDecoder = MorseDecode::pInstance();
recorder.pMorseDecoder->m_sampleRate = recordFormat.mSampleRate;
// recorder.pMorseDecoder->setCircularBuffer();
//set up queue
CheckError(AudioQueueNewInput(&recordFormat,
MyAQInputCallback,
&recorder,
NULL,
kCFRunLoopCommonModes,
0,
&queue),
"AudioQueueNewInput failed");
UInt32 size = sizeof(recordFormat);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterCurrentOutputStreamDescription,
&recordFormat,
&size), "Couldn't get queue's format");
// set up buffers and enqueue
const int kNumberRecordBuffers = 3;
int bufferByteSize = MyComputeRecordBufferSize(&recordFormat, queue, AUDIO_BUFFER_DURATION);
for(int bufferIndex = 0; bufferIndex < kNumberRecordBuffers; bufferIndex++)
{
AudioQueueBufferRef buffer;
CheckError(AudioQueueAllocateBuffer(queue,
bufferByteSize,
&buffer),
"AudioQueueAllocateBuffer failed");
CheckError(AudioQueueEnqueueBuffer(queue,
buffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
}
}
void audioRun()
{
CheckError(AudioQueueStart(queue, NULL), "AudioQueueStart failed");
}
void audioStop()
{
CheckError(AudioQueuePause(queue), "AudioQueuePause failed");
}
}
This sounds like the new macOS 'microphone privacy' setting, which, if set to 'no access' for your app, will cause precisely this behaviour. So:
Open the System Preferences pane.
Click on 'Security and Privacy'.
Select the Privacy tab.
Click on 'Microphone' in the left-hand pane.
Locate your app in the right-hand pane and tick the checkbox next to it.
Then restart your app and test it.
Tedious, no?
Edit: As stated in the comments, you can't directly request microphone access, but you can detect whether it has been granted to your app or not by calling [AVCaptureDevice authorizationStatusForMediaType: AVMediaTypeAudio].
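A minimal detection sketch (Objective-C++, compiled as a .mm file with AVFoundation linked; the helper name is made up):

#import <AVFoundation/AVFoundation.h>

bool microphoneAccessGranted()
{
    // Reports whether the user has granted this app microphone access.
    AVAuthorizationStatus status =
        [AVCaptureDevice authorizationStatusForMediaType:AVMediaTypeAudio];
    return status == AVAuthorizationStatusAuthorized;
}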

How to read an h264 stream as a file from a USB webcam directly in C/C++ without using OpenCV?

I am able to read a video file in h264 format and do some machine learning inference on top of it. The code works absolutely fine for input from a file. The code below is sample code from the DeepStream SDK.
FileDataProvider(const char *szFilePath, simplelogger::Logger *logger)
: logger_(logger)
{
fp_ = fopen(szFilePath, "rb");
//fp_ = fopen("/dev/video0", "rb");
if (nullptr == fp_) {
LOG_ERROR(logger, "Failed to open file " << szFilePath);
exit(1);
}
pLoadBuf_ = new uint8_t[nLoadBuf_];
pPktBuf_ = new uint8_t[nPktBuf_];
assert(nullptr != pLoadBuf_);
}
~FileDataProvider() {
if (fp_) {
fclose(fp_);
}
if (pLoadBuf_) {
delete [] pLoadBuf_;
}
if (pPktBuf_) {
delete [] pPktBuf_;
}
}
What is the requirement?
Read from the Logitech C920 webcam instead of a video file.
I know how to read from the webcam using OpenCV, but I don't want to use OpenCV here.
My research
Using v4l we can get the stream and display it in VLC.
The camera supports the formats below.
#ubox:~$ v4l2-ctl --device=/dev/video1 --list-formats
ioctl: VIDIOC_ENUM_FMT
    Index       : 0
    Type        : Video Capture
    Pixel Format: 'YUYV'
    Name        : YUYV 4:2:2

    Index       : 1
    Type        : Video Capture
    Pixel Format: 'H264' (compressed)
    Name        : H.264

    Index       : 2
    Type        : Video Capture
    Pixel Format: 'MJPG' (compressed)
    Name        : Motion-JPEG
Reading output of a USB webcam in Linux
vlc v4l2:///dev/video1 --v4l2-chroma=h264 - this displays the video from the webcam.
How to do this?
Now, how do I feed this live stream into the sample code above so that it reads from the webcam rather than from a file?
[update-1]
In other words, does v4l have an option to write the video stream in h264 format, so that I can read that file as before (with the code above) while v4l is writing it to disk?
[update-2]
We can use ffmpeg instead of v4l. Are there any solutions for using ffmpeg to save the video stream to disk continuously, so that another program can read that file?
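For example, both tools appear able to dump the camera's H.264 stream straight to disk without re-encoding (untested sketches; device path and frame count are placeholders):
ffmpeg -f v4l2 -input_format h264 -i /dev/video1 -c:v copy output.h264
v4l2-ctl --device=/dev/video1 --set-fmt-video=pixelformat=H264 --stream-mmap --stream-to=output.h264 --stream-count=300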
Before using ioctl to capture frames from the camera, you need to set the format as below first.
fp_ = open("/dev/video0", O_RDWR);
struct v4l2_format fmt = {0};
fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_H264;
ioctl(fp_, VIDIOC_S_FMT, &fmt);
then, initialize and map buffer
struct Buffer
{
void *start;
unsigned int length;
unsigned int flags;
};
int buffer_count_ = 4;
Buffer *buffers_;
bool AllocateBuffer()
{
struct v4l2_requestbuffers req = {0};
req.count = buffer_count_;
req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
req.memory = V4L2_MEMORY_MMAP;
if (ioctl(fp_, VIDIOC_REQBUFS, &req) < 0)
{
perror("ioctl Requesting Buffer");
return false;
}
buffers_ = new Buffer[buffer_count_];
for (int i = 0; i < buffer_count_; i++)
{
struct v4l2_buffer buf = {0};
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
buf.index = i;
if (ioctl(fp_, VIDIOC_QUERYBUF, &buf) < 0)
{
perror("ioctl Querying Buffer");
return false;
}
buffers_[i].start = mmap(NULL, buf.length, PROT_READ | PROT_WRITE, MAP_SHARED, fp_, buf.m.offset);
buffers_[i].length = buf.length;
if (MAP_FAILED == buffers_[i].start)
{
printf("MAP FAILED: %d\n", i);
for (int j = 0; j < i; j++)
munmap(buffers_[j].start, buffers_[j].length);
return false;
}
if (ioctl(fp_, VIDIOC_QBUF, &buf) < 0)
{
perror("ioctl Queue Buffer");
return false;
}
}
return true;
}
STREAMON to start capturing
v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
ioctl(fp_, VIDIOC_STREAMON, &type);
Finally, read a frame from the mapped buffer. Generally, CaptureImage() will be called in a while loop.
Buffer CaptureImage()
{
fd_set fds;
FD_ZERO(&fds);
FD_SET(fp_, &fds);
struct timeval tv = {0};
tv.tv_sec = 1;
tv.tv_usec = 0;
int r = select(fp_ + 1, &fds, NULL, NULL, &tv);
if (r == 0)
{
// timeout
}
struct v4l2_buffer buf = {0};
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
buf.memory = V4L2_MEMORY_MMAP;
while (ioctl(fp_, VIDIOC_DQBUF, &buf) < 0)
{
perror("Retrieving Frame");
}
struct Buffer buffer = {.start = buffers_[buf.index].start,
.length = buf.bytesused,
.flags = buf.flags};
if (ioctl(fp_, VIDIOC_QBUF, &buf) < 0)
{
perror("Queue buffer");
}
return buffer;
}
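For the original question (feeding this into the file-based sample code), a minimal sketch of how the pieces above fit together: dump each dequeued H.264 buffer to a file that the other program can then read. fp_, Buffer, AllocateBuffer() and CaptureImage() are the definitions above; error handling is elided for brevity.

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>
#include <cstdio>

int main()
{
    fp_ = open("/dev/video1", O_RDWR);

    struct v4l2_format fmt = {0};
    fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_H264;   // ask the camera for H.264
    ioctl(fp_, VIDIOC_S_FMT, &fmt);

    AllocateBuffer();

    v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    ioctl(fp_, VIDIOC_STREAMON, &type);

    FILE *out = fopen("stream.h264", "wb");        // raw H.264 elementary stream
    for (int i = 0; i < 300; i++)                  // ~10 s at 30 fps
    {
        Buffer frame = CaptureImage();
        fwrite(frame.start, 1, frame.length, out);
        fflush(out);                               // let a concurrent reader follow along
    }
    fclose(out);
    return 0;
}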

AVerMedia Capture Card C985 didn't work with C++ and openCV

I bought an 'AVerMedia Capture Card (C985 LITE)' last week, and I connected a video camera to the capture card's HDMI input.
When I tested with AVerMedia's RECentral software, Amcap, ffmpeg, it worked.
But, when I tested with AVerMedia's AVerCapSDKDemo, VLC, Windows Movie maker, Windows directshow, it didn't work.
Then I tried to get camera frames (in real time) using sample code from the internet and my own C++ code (with and without OpenCV). All of the code works with a generic USB webcam, but not with this capture card.
The result showed that every C++ program can see the capture card, but can't see the camera connected to the card.
The conditions under which I tested (and it didn't work) are below:
1st PC Spec: Intel core i5, Ram 16 GB, HDD 1 TB, DirectX 11 with windows10 64 bit
2nd PC Spec: Intel core i7, Ram 8 GB, HDD 1 TB, DirectX 11 with windows7 64 bit
IDE: visual studio 2015
Camera: GoPro and SONY Handycam, both full HD with HDMI output
About my project: I want to track cars on the road in real time,
therefore I decided to use the C985 capture card, which supports full HD.
Does anyone have any advice?
Thank you very much.
Best regards,
--
Edit: Add Example Code
1. My code with OpenCV: this code always shows "error: frame not read from webcam\n".
#include<opencv2/core/core.hpp>
#include<opencv2/highgui/highgui.hpp>
#include<opencv2/imgproc/imgproc.hpp>
#include<iostream>
#include<conio.h>
int main() {
cv::VideoCapture capWebcam(0); // declare a VideoCapture object and associate to webcam, 0 => use 1st webcam
if (capWebcam.isOpened() == false) { // check if VideoCapture object was associated to webcam successfully
std::cout << "error: capWebcam not accessed successfully\n\n"; // if not, print error message to std out
_getch(); // may have to modify this line if not using Windows
return(0); // and exit program
}
char charCheckForEscKey = 0;
cv::Mat imgOriginal; // current frame
while (charCheckForEscKey != 27 && capWebcam.isOpened()) { // until the Esc key is pressed or webcam connection is lost
bool blnFrameReadSuccessfully = capWebcam.read(imgOriginal); // get next frame
if (!blnFrameReadSuccessfully || imgOriginal.empty()) { // if frame not read successfully
std::cout << "error: frame not read from webcam\n"; // print error message to std out
continue; // and skip to the next iteration
}
cv::namedWindow("imgOriginal", CV_WINDOW_NORMAL); // note: you can use CV_WINDOW_NORMAL which allows resizing the window
cv::imshow("imgOriginal", imgOriginal); // show windows
charCheckForEscKey = cv::waitKey(1); // delay (in ms) and get key press, if any
} // end while
return(0);
}
2. My code without OpenCV (using AForge): with this code, the image shows nothing.
private void Form1_Load(object sender, EventArgs e)
{
FilterInfoCollection videoDevices = new FilterInfoCollection(FilterCategory.VideoInputDevice);
for (int i = 0; i< videoDevices.Count; i++)
{
comboBox1.Items.Add(videoDevices[i].MonikerString);
}
// create video source
}
private void video_NewFrame(object sender, NewFrameEventArgs eventArgs)
{
Bitmap img = (Bitmap)eventArgs.Frame.Clone();
pictureBox1.Image = img;
}
private void button1_Click(object sender, EventArgs e)
{
VideoCaptureDeviceForm xx = new VideoCaptureDeviceForm();
xx.ShowDialog();
VideoCaptureDevice videoSource = new VideoCaptureDevice(xx.VideoDeviceMoniker);
//videoSource.Source = "AVerMedia HD Capture C985 Bus 2";
VideoInput input = videoSource.CrossbarVideoInput;
MessageBox.Show("" + videoSource.CheckIfCrossbarAvailable());
MessageBox.Show(" " + input.Index + " " + input.Type);
// set NewFrame event handler
videoSource.NewFrame += video_NewFrame;
foreach(var x in videoSource.AvailableCrossbarVideoInputs)
{
MessageBox.Show("AvailableCrossbarVideoInputs > " + x.Index);
}
videoSource.VideoSourceError += VideoSource_VideoSourceError;
// start the video source
videoSource.Start();
// signal to stop when you no longer need capturing
videoSource.SignalToStop();
videoSource.Start();
MessageBox.Show("AvailableCrossbarVideoInputs length :" + videoSource.AvailableCrossbarVideoInputs.Length);
input = videoSource.CrossbarVideoInput;
MessageBox.Show(" " + input.Index + " " + input.Type);
videoSource.SignalToStop();
videoSource.Start();
}
3. Code from the internet: I used the Code Project sample (Capture Live Video from various Video Devices) in the link below. It showed "can't detect Webcam".
https://www.codeproject.com/articles/7123/capture-live-video-from-various-video-devices
Hope my code can help. (I use the AVerMedia SDK + OpenCV 3: open the device through the DirectShow API, then get the video into Mat format.)
#include "stdafx.h"
#include "atlstr.h"
#include <iostream>
#include "AVerCapAPI_Pro.h"
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <windows.h>
using namespace std;
using namespace cv;
void ErrorMsg(DWORD ErrorCode)
{
printf("ErrorCode = %d\n", ErrorCode);
if (ErrorCode == CAP_EC_SUCCESS)
{
printf("CAP_EC_SUCCESS\n");
}
if (ErrorCode == CAP_EC_INIT_DEVICE_FAILED)
{
printf("CAP_EC_INIT_DEVICE_FAILED\n");
}
if (ErrorCode == CAP_EC_DEVICE_IN_USE)
{
printf("CAP_EC_DEVICE_IN_USE\n");
}
if (ErrorCode == CAP_EC_NOT_SUPPORTED)
{
printf("CAP_EC_NOT_SUPPORTED\n");
}
if (ErrorCode == CAP_EC_INVALID_PARAM)
{
printf("CAP_EC_INVALID_PARAM\n");
}
if (ErrorCode == CAP_EC_TIMEOUT)
{
printf("CAP_EC_TIMEOUT\n");
}
if (ErrorCode == CAP_EC_NOT_ENOUGH_MEMORY)
{
printf("CAP_EC_NOT_ENOUGH_MEMORY\n");
}
if (ErrorCode == CAP_EC_UNKNOWN_ERROR)
{
printf("CAP_EC_UNKNOWN_ERROR\n");
}
if (ErrorCode == CAP_EC_ERROR_STATE)
{
printf("CAP_EC_ERROR_STATE\n");
}
if (ErrorCode == CAP_EC_HDCP_PROTECTED_CONTENT)
{
printf("CAP_EC_HDCP_PROTECTED_CONTENT\n");
}
}
BOOL WINAPI CaptureVideo(VIDEO_SAMPLE_INFO VideoInfo, BYTE *pbData, LONG lLength, __int64 tRefTime, LONG lUserData);
BOOL bGetData = FALSE;
Mat ans2;
int main(int argc, char** argv)
{
LONG lRetVal;
DWORD dwDeviceNum;
DWORD dwDeviceIndex = 0;
HANDLE hAverCapturedevice[10];
//Device Control
//1. Get Device Number
lRetVal = AVerGetDeviceNum(&dwDeviceNum);
if (lRetVal != CAP_EC_SUCCESS) {
printf("\nAVerGetDeviceNum Fail");
ErrorMsg(lRetVal);
system("pause");
}
if (dwDeviceNum == 0) {
printf("NO device found\n");
system("pause");
}
else {
printf("Device Number = %d\n", dwDeviceNum);
}
//2. Create device representative object handle
for (DWORD dwDeviceIndex = 0; dwDeviceIndex < dwDeviceNum; dwDeviceIndex++) {
lRetVal = AVerCreateCaptureObjectEx(dwDeviceIndex, DEVICETYPE_ALL, NULL, &hAverCapturedevice[dwDeviceIndex]);
if (lRetVal != CAP_EC_SUCCESS) {
printf("\nAVerCreateCaptureObjectEx Fail\n");
ErrorMsg(lRetVal);
system("pause");
}
else
printf("\nAVerCreateCaptureObjectEx Success\n");
}
//3. Start Streaming//
//3.1 set video source
//lRetVal = AVerSetVideoSource(hAverCapturedevice[0], 3);
lRetVal = AVerSetVideoSource(hAverCapturedevice[0], 3);
//3.2 set Video Resolution & FrameRate
VIDEO_RESOLUTION VideoResolution = { 0 };
INPUT_VIDEO_INFO InputVideoInfo;
ZeroMemory(&InputVideoInfo, sizeof(InputVideoInfo));
InputVideoInfo.dwVersion = 2;
Sleep(500);
lRetVal = AVerGetVideoInfo(hAverCapturedevice[0], &InputVideoInfo);
VideoResolution.dwVersion = 1;
VideoResolution.dwVideoResolution = VIDEORESOLUTION_1280X720;
lRetVal = AVerSetVideoResolutionEx(hAverCapturedevice[0], &VideoResolution);
lRetVal = AVerSetVideoInputFrameRate(hAverCapturedevice[0], 6000);
//3.3 Start Streaming
lRetVal = AVerStartStreaming(hAverCapturedevice[0]);
if (lRetVal != CAP_EC_SUCCESS) {
printf("\AVerStartStreaming Fail\n");
ErrorMsg(lRetVal);
//system("pause");
}
else
{
printf("\AVerStartStreaming Success\n");
//system("pause");
}
//4. Capture Single Image
#if 0
CAPTURE_IMAGE_INFO m_CaptureImageInfo = { 0 };
char text[] = "E:\Lena.bmp";
wchar_t wtext[20];
#define _CRT_SECURE_NO_WARNINGS
#pragma warning( disable : 4996 )
mbstowcs(wtext, text, strlen(text) + 1);//Plus null
LPWSTR m_strSavePath = wtext;
CAPTURE_SINGLE_IMAGE_INFO pCaptureSingleImageInfo = { 0 };
pCaptureSingleImageInfo.dwVersion = 1;
pCaptureSingleImageInfo.dwImageType = 2;
pCaptureSingleImageInfo.bOverlayMix = FALSE;
pCaptureSingleImageInfo.lpFileName = m_strSavePath;
//pCaptureSingleImageInfo.rcCapRect = 0;
lRetVal = AVerCaptureSingleImage(hAverCapturedevice[0], &pCaptureSingleImageInfo);
printf("\AVerCaptureSingleImage\n");
ErrorMsg(lRetVal);
#endif
#if 1
//video capture
VIDEO_CAPTURE_INFO VideoCaptureInfo;
ZeroMemory(&VideoCaptureInfo, sizeof(VIDEO_CAPTURE_INFO));
VideoCaptureInfo.bOverlayMix = FALSE;
VideoCaptureInfo.dwCaptureType = CT_SEQUENCE_FRAME;
VideoCaptureInfo.dwSaveType = ST_CALLBACK_RGB24;
VideoCaptureInfo.lpCallback = CaptureVideo;
VideoCaptureInfo.lCallbackUserData = NULL;
lRetVal = AVerCaptureVideoSequenceStart(hAverCapturedevice[0], VideoCaptureInfo);
if (FAILED(lRetVal))
{
return lRetVal;
}
//system("pause");// hange up
#endif
int i;
scanf_s("%d", &i, 4); //must input any number in console !!
//5. Stop Streaming
lRetVal = AVerCaptureVideoSequenceStop(hAverCapturedevice[0]);
lRetVal = AVerStopStreaming(hAverCapturedevice[0]);
//printf("\AVerStopStreaming Success\n");
ErrorMsg(lRetVal);
return 0;
}
BOOL WINAPI CaptureVideo(VIDEO_SAMPLE_INFO VideoInfo, BYTE *pbData, LONG lLength, __int64 tRefTime, LONG lUserData)
{
if (!bGetData)
{
ans2 = Mat(VideoInfo.dwHeight, VideoInfo.dwWidth, CV_8UC3, (uchar*)pbData).clone();//single capture image
//ans2 = Mat(VideoInfo.dwHeight, VideoInfo.dwWidth, CV_8UC3, (uchar*)pbData); //sequence capture image
bGetData = TRUE;
}
imshow("ans2", ans2);
waitKey(1);
return TRUE;
}
Now it's solved: I formatted the computer and installed Windows 10 without updates.
I also wrote a program to call GraphEdit, which sets up the following filters:
[screenshot: GraphEdit filter graph]
Everything seemed to work fine until I updated Windows by mistake.

Losing quality when encoding with ffmpeg

I am using the C libraries of FFmpeg to read frames from a video and create an output file that is supposed to be identical to the input.
However, somewhere in this process some quality is lost and the result is "less sharp". My guess is that the problem is the encoding and that the frames are compressed too much (the size of the file also decreases quite significantly). Is there some parameter in the encoder that lets me control the quality of the result? I found that AVCodecContext has a compression_level member, but changing it does not seem to have any effect.
I post part of my code here in case it helps. I would say that something must be changed in the init function of OutputVideoBuilder where I set up the codec. The AVCodecContext that is passed to the method is the same one used by InputVideoHandler.
Here are the two main classes that I created to wrap the ffmpeg functionalities:
// This class opens the video files and sets the decoder
class InputVideoHandler {
public:
InputVideoHandler(char* name);
~InputVideoHandler();
AVCodecContext* getCodecContext();
bool readFrame(AVFrame* frame, int* success);
private:
InputVideoHandler();
void init(char* name);
AVFormatContext* formatCtx;
AVCodec* codec;
AVCodecContext* codecCtx;
AVPacket packet;
int streamIndex;
};
void InputVideoHandler::init(char* name) {
streamIndex = -1;
int numStreams;
if (avformat_open_input(&formatCtx, name, NULL, NULL) != 0)
throw std::exception("Invalid input file name.");
if (avformat_find_stream_info(formatCtx, NULL)<0)
throw std::exception("Could not find stream information.");
numStreams = formatCtx->nb_streams;
if (numStreams < 0)
throw std::exception("No streams in input video file.");
for (int i = 0; i < numStreams; i++) {
if (formatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
streamIndex = i;
break;
}
}
if (streamIndex < 0)
throw std::exception("No video stream in input video file.");
// find decoder using id
codec = avcodec_find_decoder(formatCtx->streams[streamIndex]->codec->codec_id);
if (codec == nullptr)
throw std::exception("Could not find suitable decoder for input file.");
// copy context from input stream
codecCtx = avcodec_alloc_context3(codec);
if (avcodec_copy_context(codecCtx, formatCtx->streams[streamIndex]->codec) != 0)
throw std::exception("Could not copy codec context from input stream.");
if (avcodec_open2(codecCtx, codec, NULL) < 0)
throw std::exception("Could not open decoder.");
}
// frame must be initialized with av_frame_alloc() before!
// Returns true if there are other frames, false if not.
// success == 1 if frame is valid, 0 if not.
bool InputVideoHandler::readFrame(AVFrame* frame, int* success) {
*success = 0;
if (av_read_frame(formatCtx, &packet) < 0)
return false;
if (packet.stream_index == streamIndex) {
avcodec_decode_video2(codecCtx, frame, success, &packet);
}
av_free_packet(&packet);
return true;
}
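To make the contract above concrete, here is a minimal caller sketch (consistent with the main loop shown further down; av_frame_free() is the matching cleanup call):
AVFrame* frame = av_frame_alloc(); // must be allocated before the first readFrame() call
int gotFrame = 0;
while (inputHandler->readFrame(frame, &gotFrame)) {
    if (gotFrame) {
        // frame now holds a decoded picture; gotFrame == 0 means the decoder needs more data
    }
}
av_frame_free(&frame);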
// This class opens the output file and writes frames to it
class OutputVideoBuilder{
public:
OutputVideoBuilder(char* name, AVCodecContext* inputCtx);
~OutputVideoBuilder();
void writeFrame(AVFrame* frame);
void writeVideo();
private:
OutputVideoBuilder();
void init(char* name, AVCodecContext* inputCtx);
void logMsg(AVPacket* packet, AVRational* tb);
AVFormatContext* formatCtx;
AVCodec* codec;
AVCodecContext* codecCtx;
AVStream* stream;
};
void OutputVideoBuilder::init(char* name, AVCodecContext* inputCtx) {
if (avformat_alloc_output_context2(&formatCtx, NULL, NULL, name) < 0)
throw std::exception("Could not determine file extension from provided name.");
codec = avcodec_find_encoder(inputCtx->codec_id);
if (codec == nullptr) {
throw std::exception("Could not find suitable encoder.");
}
codecCtx = avcodec_alloc_context3(codec);
if (avcodec_copy_context(codecCtx, inputCtx) < 0)
throw std::exception("Could not copy output codec context from input");
codecCtx->time_base = inputCtx->time_base;
codecCtx->compression_level = 0;
if (avcodec_open2(codecCtx, codec, NULL) < 0)
throw std::exception("Could not open encoder.");
stream = avformat_new_stream(formatCtx, codec);
if (stream == nullptr) {
throw std::exception("Could not allocate stream.");
}
stream->id = formatCtx->nb_streams - 1;
stream->codec = codecCtx;
stream->time_base = codecCtx->time_base;
av_dump_format(formatCtx, 0, name, 1);
if (!(formatCtx->oformat->flags & AVFMT_NOFILE)) {
if (avio_open(&formatCtx->pb, name, AVIO_FLAG_WRITE) < 0) {
throw std::exception("Could not open output file.");
}
}
if (avformat_write_header(formatCtx, NULL) < 0) {
throw std::exception("Error occurred when opening output file.");
}
}
void OutputVideoBuilder::writeFrame(AVFrame* frame) {
AVPacket packet = { 0 };
int success;
av_init_packet(&packet);
if (avcodec_encode_video2(codecCtx, &packet, frame, &success))
throw std::exception("Error encoding frames");
if (success) {
av_packet_rescale_ts(&packet, codecCtx->time_base, stream->time_base);
packet.stream_index = stream->index;
logMsg(&packet,&stream->time_base);
av_interleaved_write_frame(formatCtx, &packet);
}
av_free_packet(&packet);
}
This is the part of the main function that reads and write frames:
while (inputHandler->readFrame(frame,&gotFrame)) {
if (gotFrame) {
try {
outputBuilder->writeFrame(frame);
}
catch (const std::exception& e) {
std::cout << e.what() << std::endl;
return -1;
}
}
}
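One thing the posted code never shows is the teardown: with this (old) encode API, avcodec_encode_video2() may buffer frames internally, and av_write_trailer() must be called to close the container, otherwise the tail of the video can be lost. Below is a hedged sketch of a flush-and-close helper; the method name finish() is hypothetical (not part of the original classes) and assumes it is added as a member of OutputVideoBuilder so it can reach the private codecCtx, stream and formatCtx:
void OutputVideoBuilder::finish() {
    // Drain delayed frames only if the encoder buffers them (e.g. H.264 B-frames).
    if (codec->capabilities & CODEC_CAP_DELAY) {
        AVPacket packet = { 0 };
        int success = 1;
        while (success) {
            av_init_packet(&packet);
            // A NULL frame tells avcodec_encode_video2() to flush its internal queue.
            if (avcodec_encode_video2(codecCtx, &packet, NULL, &success))
                throw std::exception("Error flushing encoder.");
            if (success) {
                av_packet_rescale_ts(&packet, codecCtx->time_base, stream->time_base);
                packet.stream_index = stream->index;
                av_interleaved_write_frame(formatCtx, &packet);
            }
            av_free_packet(&packet);
        }
    }
    av_write_trailer(formatCtx); // matches the avformat_write_header() call in init()
}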
Your qmin/qmax answer is partially correct, but it misses the point: the quality indeed goes up, but the compression ratio (in terms of quality per bit) suffers significantly as you restrict the qmin/qmax range, i.e. you will spend many more bits to achieve the same quality than would be necessary if you used the encoder optimally.
To increase quality without hurting the compression ratio, you need to raise the actual quality target. How you do this differs a little depending on the codec, but you typically lower the target CRF value (lower CRF means higher quality) or raise the target bitrate. For command-line options, see e.g. the H264 docs; there are identical docs for HEVC/VP9 as well. To use these options in the C API, call av_opt_set() with the same option names/values.
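For example, a minimal sketch of the CRF route, assuming the encoder found is libx264 ("crf" and "preset" are x264 private options; av_opt_set() requires libavutil/opt.h, included inside the extern "C" block like the other FFmpeg headers, and must run before avcodec_open2()):
// Sketch: request a constant-quality target instead of clamping the QP range.
// For x264, CRF ranges 0-51 and lower means higher quality; 18 is close to visually lossless.
av_opt_set(codecCtx->priv_data, "crf", "18", 0);
av_opt_set(codecCtx->priv_data, "preset", "medium", 0); // speed vs. compression trade-off
if (avcodec_open2(codecCtx, codec, NULL) < 0)
    throw std::exception("Could not open encoder.");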
In case this is useful to someone else, I am adding the answer that damjeux suggested, which worked for me. AVCodecContext has two members, qmin and qmax, which control the QP (quantization parameter) of the encoder. In my case the defaults were qmin = 2 and qmax = 31. Setting qmax to a lower value improves the quality of the output.
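In code this amounts to two extra lines in OutputVideoBuilder::init, before avcodec_open2() is called; a minimal sketch (the value 10 for qmax is just an example to tune, not a recommendation from the original answer):
codecCtx->qmin = 2;  // lowest (best-quality) QP the rate control may pick
codecCtx->qmax = 10; // highest (worst-quality) QP allowed; the default of 31 compresses harder
if (avcodec_open2(codecCtx, codec, NULL) < 0)
    throw std::exception("Could not open encoder.");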