I want to open and read a video in the encoded domain, without decoding. The code I have written so far runs without errors, but av_read_frame() just outputs a series of zeros and then the same negative integer value over and over.
I'm not sure whether I'm passing the parameters to the method correctly. Please help.
void CFfmpegmethods::VideoRead(){
    av_register_all();
    const char *url = "H:\\Sanduni_projects\\ad_1.mp4";
    AVDictionary *options = NULL;
    AVFormatContext *s = avformat_alloc_context();
    AVPacket pkt;

    //open an input stream and read the header
    int ret = avformat_open_input(&s, url, NULL, NULL);
    //avformat_find_stream_info(s, &options); //finding the missing information
    if (ret < 0)
        abort();

    av_dict_set(&options, "video_size", "640x480", 0);
    av_dict_set(&options, "pixel_format", "rgb24", 0);
    if (avformat_open_input(&s, url, NULL, &options) < 0){
        abort();
    }
    av_dict_free(&options);

    AVDictionaryEntry *e;
    if (e = av_dict_get(options, "", NULL, AV_DICT_IGNORE_SUFFIX)) {
        fprintf(stderr, "Option %s not recognized by the demuxer.\n", e->key);
        abort();
    }

    while (1){
        //Split what is stored in the file into packets and return one for each call
        int frame = av_read_frame(s, &pkt);
        waitKey(30);
    }

    //free the packet
    av_packet_unref(&pkt);
    //Close the file after reading
    avformat_close_input(&s);
}
av_read_frame() outputs zeros while reading the packets and after that returns negative values. In my code the loop runs forever, so it prints an infinite number of negative values.
This is the modified code:
while (1){
    //Read the next packet of the stream; av_read_frame() returns 0 on
    //success and a negative AVERROR code at end of file or on error
    int frame = av_read_frame(s, &pkt);
    if (frame < 0) break; //check before touching pkt
    duration = pkt.duration;
    size = pkt.size;
    total_size = total_size + size;
    total_duration = total_duration + duration;
    i++;
    cout << "frame" << i << " " << size << " " << duration << endl;
    av_packet_unref(&pkt); //release the packet's buffer every iteration
}
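For completeness: av_read_frame() returns 0 for every packet it successfully reads, so the run of zeros is the normal success path, and the repeating negative value at the end is most likely AVERROR_EOF. A minimal sketch of telling end-of-file apart from a real error (using the same s and pkt as above):

char errbuf[AV_ERROR_MAX_STRING_SIZE];
int ret = av_read_frame(s, &pkt);
if (ret == 0) {
    //got a packet; it stays valid until av_packet_unref(&pkt)
} else if (ret == AVERROR_EOF) {
    //normal end of stream
} else {
    av_strerror(ret, errbuf, sizeof(errbuf));
    fprintf(stderr, "av_read_frame failed: %s\n", errbuf);
}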
Related
I can't seem to understand why this doesn't work. I'm trying to pass a sound sample to a given function.
I've based my code on an Objective-C version of the function, which works.
But the C++ code below doesn't: it is meant to pass a float buffer to the OXY_DecodeAudioBuffer function, which then looks for encoded data in the buffer.
Question: Am I passing the right buffer size and buffer contents to the function? I always get "no data found in buffer". Can anyone see the issue?
The hardware I'm using is a Raspberry Pi 2 with a USB microphone.
I've also included the function declaration with its description:
//OXY_DecodeAudioBuffer function: receives an audio buffer of the specified size and reports whether encoded data is found
//* Parameters:
// audioBuffer: float array of bufferSize size with audio data to be decoded
// size: size of audioBuffer
// oxyingObject: OXY object instance, created in OXY_Create()
//* Returns: -1 if no decoded data is found, -2 if start token is found, -3 if complete word has been decoded, positive number if character is decoded (number is the token idx)
OXY_DLLEXPORT int32_t OXY_DecodeAudioBuffer(float *audioBuffer, int size, void *oxyingObject);
The float_buffer output from the code below:
-0.00354004 -0.00369263 -0.00338745 -0.00354004 -0.00341797 -0.00402832
Program Code:
#include <stdio.h>
#include <stdlib.h>
#include <alsa/asoundlib.h>
#include <unistd.h>
#include <math.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include "Globals.h"
#include "OxyCoreLib_api.h"
void* mCore;
using namespace std;
void GetDecodedMode(){
std::cerr << "DECODE_MODE ---> " << OXY_GetDecodedMode(mCore) << std::endl << std::endl;
}
int main(void)
{
int i,j;
int err;
int mode = 3;
int16_t *buffer;
float* float_buffer;
// Allocate our own buffers (1 channel, 16 bits per sample, thus 16 bits per frame, thus 2 bytes per frame).
// In practice the buffers used contain 512 frames; if this changes it will be fixed in processAudio.
int buffer_frames = 512; //Not sure this is correct, but see the reasoning above
unsigned int rate = 44100;
float sampleRate = 44100.f; //to configure
snd_pcm_t *capture_handle;
snd_pcm_hw_params_t *hw_params;
snd_pcm_format_t format = SND_PCM_FORMAT_S16_LE;
if ((err = snd_pcm_open(&capture_handle, "hw:1,0", SND_PCM_STREAM_CAPTURE, 0)) < 0) {
fprintf(stderr, "cannot open audio device %s (%s)\n","device",snd_strerror(err));
exit(1);
} else {fprintf(stdout, "audio interface opened\n");}
if ((err = snd_pcm_hw_params_malloc(&hw_params)) < 0) {
fprintf(stderr, "cannot allocate hardware parameter structure (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "hw_params allocated\n"); }
if ((err = snd_pcm_hw_params_any(capture_handle, hw_params)) < 0) {
fprintf(stderr, "cannot initialize hardware parameter structure (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "hw_params initialized\n"); }
if ((err = snd_pcm_hw_params_set_access(capture_handle, hw_params, SND_PCM_ACCESS_RW_INTERLEAVED)) < 0) {
fprintf(stderr, "cannot set access type (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "hw_params access set\n"); }
if ((err = snd_pcm_hw_params_set_format(capture_handle, hw_params, format)) < 0) {
fprintf(stderr, "cannot set sample format (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "hw_params format set\n"); }
if ((err = snd_pcm_hw_params_set_rate_near(capture_handle, hw_params, &rate, 0)) < 0) {
fprintf(stderr, "cannot set sample rate (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "hw_params rate set\n"); }
if ((err = snd_pcm_hw_params_set_channels(capture_handle, hw_params, 1)) < 0) {
fprintf(stderr, "cannot set channel count (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "hw_params channels set\n"); }
if ((err = snd_pcm_hw_params(capture_handle, hw_params)) < 0) {
fprintf(stderr, "cannot set parameters (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "hw_params set\n"); }
snd_pcm_hw_params_free(hw_params);
fprintf(stdout, "hw_params freed\n");
if ((err = snd_pcm_prepare(capture_handle)) < 0) {
fprintf(stderr, "cannot prepare audio interface for use (%s)\n",
snd_strerror(err));
exit(1);
} else { fprintf(stdout, "audio interface prepared\n"); }
//allocate buffer of 16bit ints, as specified in PCM_FORMAT
//initialise
mCore = OXY_Create();
//Configure - Mode 3 inaudible, 44100, bufferSize
OXY_Configure(mode, sampleRate, buffer_frames, mCore);
//Debug to make sure
GetDecodedMode();
buffer = static_cast<int16_t*>(malloc(buffer_frames * snd_pcm_format_width(format) / 8 * 2));
//buffer = malloc(buffer_frames * snd_pcm_format_width(format) / 8 * 2);
float_buffer = static_cast<float*>(malloc(buffer_frames*sizeof(float)));
//float_buffer = malloc(buffer_frames*sizeof(float));
fprintf(stdout, "buffer allocated\n");
//where did 10000 come from? I doubt it's correct
for (i = 0; i < 10000; ++i) {
//read from audio device into buffer
if ((err = snd_pcm_readi(capture_handle, buffer, buffer_frames)) != buffer_frames) {
fprintf(stderr, "read from audio interface failed (%d: %s)\n",
err, snd_strerror(err));
exit(1);
}
//try to change buffer from short ints to floats for transformation
for (i = 0; i < buffer_frames; i++){
//norm
float_buffer[i] = (float)buffer[i]/32768.0;
//Example output of float_buffer
/*
-0.00354004
-0.00369263
-0.00338745
-0.00354004
-0.00341797
-0.00402832
-0.00341797
-0.00427246
-0.00375366
-0.00378418
-0.00408936
-0.00332642
-0.00369263
-0.00350952
-0.00369263
-0.00369263
-0.00344849
-0.00354004
*/
}
//send to float_to be tested
int ret = OXY_DecodeAudioBuffer(float_buffer, buffer_frames, mCore);
if (ret == -2)
{
std::cerr << "FOUND_TOKEN ---> -2 " << std::endl << std::endl;
}
else if(ret>=0)
{
std::cerr << "Decode started ---> -2 " << ret << std::endl << std::endl;
}
else if (ret == -3)
{
//int sizeStringDecoded = OXY_GetDecodedData(mStringDecoded, mCore);
std::cerr << "STRING DECODED ---> -2 " << std::endl << std::endl;
// ...
}
else
{
std::cerr << "No data found in this buffer" << std::endl << std::endl;
//no data found in this buffer
}
}
free(buffer);
snd_pcm_close(capture_handle);
std::cerr << "memory freed\n" << std::endl << std::endl;
//snd_pcm_close(capture_handle);
return(0);
//exit(0);
}
Working objective-c version using the same API:
//
// IosAudioController.m
//
#import "IosAudioController.h"
#import <AudioToolbox/AudioToolbox.h>
#import "OxyCoreLib_api.h"
#define kOutputBus 0
#define kInputBus 1
IosAudioController* iosAudio;
void checkStatus(int status){
if (status) {
printf("Status not 0! %d\n", status);
exit(1);
}
}
static OSStatus recordingCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData) {
if (iosAudio->mOxyObject->mDecoding == 0)
return noErr;
// Because of the way our audio format (setup below) is chosen:
// we only need 1 buffer, since it is mono
// Samples are 16 bits = 2 bytes.
// 1 frame includes only 1 sample
AudioBuffer buffer;
buffer.mNumberChannels = 1;
buffer.mDataByteSize = inNumberFrames * 2;
buffer.mData = malloc( inNumberFrames * 2 );
// Put buffer in a AudioBufferList
AudioBufferList bufferList;
bufferList.mNumberBuffers = 1;
bufferList.mBuffers[0] = buffer;
// Then:
// Obtain recorded samples
OSStatus status;
status = AudioUnitRender([iosAudio audioUnit],
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames,
&bufferList);
checkStatus(status);
// Now, we have the samples we just read sitting in buffers in bufferList
// Process the new data
[iosAudio processAudio:&bufferList];
//Now Decode Audio *******************
//convert from AudioBuffer format to *float buffer
iosAudio->floatBuffer = (float *)malloc(inNumberFrames * sizeof(float));
//UInt16 *frameBuffer = bufferList.mBuffers[0].mData;
SInt16 *frameBuffer = bufferList.mBuffers[0].mData;
for(int j=0;j<inNumberFrames;j++)
{
iosAudio->floatBuffer[j] = frameBuffer[j]/32768.0;
}
int ret = OXY_DecodeAudioBuffer(iosAudio->floatBuffer, inNumberFrames, (void*)iosAudio->mOxyObject->mOxyCore);
if (ret == -2)
{
// NSLog(@"BEGIN TOKEN FOUND!");
[iosAudio->mObject performSelector:iosAudio->mSelector withObject:[NSNumber numberWithInt:0]];
}
else if (ret >= 0)
{
NSLog(@"Decode started %@", @(ret).stringValue);
}
else if (ret == -3)
{
int sizeStringDecoded = OXY_GetDecodedData(iosAudio->mStringDecoded, (void*)iosAudio->mOxyObject->mOxyCore);
NSString *tmpString = [NSString stringWithUTF8String:iosAudio->mStringDecoded];
iosAudio->mOxyObject->mDecodedString = [NSString stringWithUTF8String:iosAudio->mStringDecoded];
if (sizeStringDecoded > 0)
{
iosAudio->mOxyObject->mDecodedOK = 1;
NSLog(@"Decoded OK! %@ ", tmpString);
[iosAudio->mObject performSelector:iosAudio->mSelector withObject:[NSNumber numberWithInt:1]];
}
else
{
iosAudio->mOxyObject->mDecodedOK = -1;
NSLog(@"END DECODING BAD! %@ ", tmpString);
[iosAudio->mObject performSelector:iosAudio->mSelector withObject:[NSNumber numberWithInt:2]];
}
}
else
{
//no data found in this buffer
}
// release the malloc'ed data in the buffer we created earlier
free(bufferList.mBuffers[0].mData);
free(iosAudio->floatBuffer);
return noErr;
}
static OSStatus playbackCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData) {
// Notes: ioData contains buffers (may be more than one!)
// Fill them up as much as you can. Remember to set the size value in each buffer to match how
// much data is in the buffer.
for (int i=0; i < ioData->mNumberBuffers; i++)
{ // in practice we will only ever have 1 buffer, since audio format is mono
AudioBuffer buffer = ioData->mBuffers[i];
// NSLog(@" Buffer %d has %d channels and wants %d bytes of data.", i, buffer.mNumberChannels, buffer.mDataByteSize);
// copy temporary buffer data to output buffer
UInt32 size = min(buffer.mDataByteSize, [iosAudio tempBuffer].mDataByteSize); // dont copy more data than we have, or than fits
memcpy(buffer.mData, [iosAudio tempBuffer].mData, size);
buffer.mDataByteSize = size; // indicate how much data we wrote in the buffer
// uncomment to hear random noise
/*UInt16 *frameBuffer = buffer.mData;
for (int j = 0; j < inNumberFrames; j++)
frameBuffer[j] = rand();*/
// Play encoded buffer
if (iosAudio->mOxyObject->mEncoding > 0)
{
int sizeSamplesRead;
float audioBuffer[2048];
sizeSamplesRead = OXY_GetEncodedAudioBuffer(audioBuffer, (void*)iosAudio->mOxyObject->mOxyCore);
if (sizeSamplesRead == 0)
iosAudio->mOxyObject->mEncoding = 0;
SInt16 *frameBuffer = buffer.mData;
for(int j=0;j<sizeSamplesRead;j++)
{
frameBuffer[j] = audioBuffer[j]*32768.0;
}
}
else
{
SInt16 *frameBuffer = buffer.mData;
for (int j = 0; j < inNumberFrames; j++)
frameBuffer[j] = 0;
}
}
return noErr;
}
@implementation IosAudioController
@synthesize audioUnit, tempBuffer;
- (id) init {
self = [super init];
OSStatus status;
// Describe audio component
AudioComponentDescription desc;
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO;
desc.componentFlags = 0;
desc.componentFlagsMask = 0;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;
// Get component
AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);
// Get audio units
status = AudioComponentInstanceNew(inputComponent, &audioUnit);
checkStatus(status);
// Enable IO for recording
UInt32 flag = 1;
status = AudioUnitSetProperty(audioUnit,
kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Input,
kInputBus,
&flag,
sizeof(flag));
checkStatus(status);
// Enable IO for playback
status = AudioUnitSetProperty(audioUnit,
kAudioOutputUnitProperty_EnableIO,
kAudioUnitScope_Output,
kOutputBus,
&flag,
sizeof(flag));
checkStatus(status);
// Describe format
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100.0;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 1;
audioFormat.mBitsPerChannel = 16;
audioFormat.mBytesPerPacket = 2;
audioFormat.mBytesPerFrame = 2;
// Apply format
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Output,
kInputBus,
&audioFormat,
sizeof(audioFormat));
checkStatus(status);
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_StreamFormat,
kAudioUnitScope_Input,
kOutputBus,
&audioFormat,
sizeof(audioFormat));
checkStatus(status);
// Set input callback
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = recordingCallback;
callbackStruct.inputProcRefCon = (__bridge void *)self;
status = AudioUnitSetProperty(audioUnit,
kAudioOutputUnitProperty_SetInputCallback,
kAudioUnitScope_Global,
kInputBus,
&callbackStruct,
sizeof(callbackStruct));
checkStatus(status);
// Set output callback
callbackStruct.inputProc = playbackCallback;
callbackStruct.inputProcRefCon = (__bridge void *)self;
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Global,
kOutputBus,
&callbackStruct,
sizeof(callbackStruct));
checkStatus(status);
// Disable buffer allocation for the recorder (optional - do this if we want to pass in our own)
flag = 0;
status = AudioUnitSetProperty(audioUnit,
kAudioUnitProperty_ShouldAllocateBuffer,
kAudioUnitScope_Output,
kInputBus,
&flag,
sizeof(flag));
// Allocate our own buffers (1 channel, 16 bits per sample, thus 16 bits per frame, thus 2 bytes per frame).
// In practice the buffers used contain 512 frames; if this changes it will be fixed in processAudio.
tempBuffer.mNumberChannels = 1;
int size = 512;
#if (TARGET_OS_SIMULATOR)
size = 256; //TODO check this value!! depends on play/record callback buffer size
#else
size = 512; //TODO check this value!! depends on play/record callback buffer size
#endif
tempBuffer.mDataByteSize = size * 2;
tempBuffer.mData = malloc( size * 2);
// Initialise
status = AudioUnitInitialize(audioUnit);
checkStatus(status);
return self;
}
- (void) start {
OSStatus status = AudioOutputUnitStart(audioUnit);
checkStatus(status);
}
- (void) stop {
OSStatus status = AudioOutputUnitStop(audioUnit);
checkStatus(status);
}
- (void) processAudio: (AudioBufferList*) bufferList{
AudioBuffer sourceBuffer = bufferList->mBuffers[0];
// fix tempBuffer size if it's the wrong size
if (tempBuffer.mDataByteSize != sourceBuffer.mDataByteSize) {
free(tempBuffer.mData);
tempBuffer.mDataByteSize = sourceBuffer.mDataByteSize;
tempBuffer.mData = malloc(sourceBuffer.mDataByteSize);
}
// copy incoming audio data to temporary buffer
memcpy(tempBuffer.mData, bufferList->mBuffers[0].mData, bufferList->mBuffers[0].mDataByteSize);
}
- (void) dealloc {
AudioUnitUninitialize(audioUnit);
free(tempBuffer.mData);
}
- (void) setOxyObject: (OxyCore*) oxyObject
{
mOxyObject = oxyObject;
}
- (void) setListenCallback:(id)object withSelector:(SEL)selector
{
mObject = object;
mSelector = selector;
}
@end
One problem that I can see is that you are using two nested loops with the same iteration variable. The outer loop is for (i = 0; i < 10000; ++i) and the inner one is for (i = 0; i < buffer_frames; i++); if buffer_frames >= 10000 - 1 the outer loop will execute once and exit, otherwise the program enters an infinite loop.
I have two more remarks regarding the following line:
buffer = static_cast<int16_t*>(malloc(buffer_frames * snd_pcm_format_width(format) / 8 * 2));
According to the API reference, snd_pcm_format_width(format) returns the number of bits per sample. As you have 16 bits per sample and each frame contains only one sample, you should allocate buffer_frames * snd_pcm_format_width(format) / 8 bytes of memory (the 2 in your multiplication represents the number of channels, which in your case is 1). Also, I suggest changing the buffer type to char*, as it is the only type that is not prone to violating the strict aliasing rule. Thus, the line becomes:
char *buffer = static_cast<char*>(malloc(buffer_frames * (snd_pcm_format_width(format) / 8)));
and when you do the trick of converting from short ints to floats, the second for loop becomes:
int16_t *sint_buffer = reinterpret_cast<int16_t*>(buffer);
for (j = 0; j < buffer_frames; ++j){
    float_buffer[j] = (float)sint_buffer[j]/32768.0;
    // everything else goes here
}
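Putting both fixes together, a minimal sketch of the corrected capture loop (pass is an illustrative name; everything else comes from the question's code):

for (int pass = 0; pass < 10000; ++pass) {   // outer counter no longer reuses i
    // read one period from the audio device into buffer
    if ((err = snd_pcm_readi(capture_handle, buffer, buffer_frames)) != buffer_frames) {
        fprintf(stderr, "read from audio interface failed (%d: %s)\n", err, snd_strerror(err));
        exit(1);
    }
    int16_t *sint_buffer = reinterpret_cast<int16_t*>(buffer);
    for (int j = 0; j < buffer_frames; ++j)
        float_buffer[j] = sint_buffer[j] / 32768.0f;   // normalize to [-1, 1)
    int ret = OXY_DecodeAudioBuffer(float_buffer, buffer_frames, mCore);
    // ... handle ret exactly as in the original code ...
}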
I am trying to decode a video stream from the browser using the FFmpeg API. The stream is produced by the webcam and recorded with MediaRecorder in webm format. What I ultimately need is a vector of OpenCV cv::Mat objects for further processing.
I have written a C++ webserver using the uWebsocket library. The video stream is sent via websocket from the browser to the server once per second. On the server, I append the received data to my custom buffer and decode it with the FFmpeg API.
If I just save the data to disk and later play it with a media player, it works fine. So whatever the browser sends is a valid video.
I don't think I correctly understand how the custom IO should behave with network streaming, as nothing seems to be working.
The custom buffer:
struct Buffer
{
std::vector<uint8_t> data;
int currentPos = 0;
};
The readAVBuffer method for custom IO
int MediaDecoder::readAVBuffer(void* opaque, uint8_t* buf, int buf_size)
{
MediaDecoder::Buffer* mbuf = (MediaDecoder::Buffer*)opaque;
int count = 0;
for(int i=0;i<buf_size;i++)
{
int index = i + mbuf->currentPos;
if(index >= (int)mbuf->data.size())
{
break;
}
count++;
buf[i] = mbuf->data.at(index);
}
if(count > 0) mbuf->currentPos+=count;
std::cout << "read : "<<count<<" "<<mbuf->currentPos<<", buff size:"<<mbuf->data.size() << std::endl;
if(count <= 0) return AVERROR(EAGAIN); //is this error that should be returned? It cannot be EOF since we're not done yet, most likely
return count;
}
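On the question in the inline comment: recent versions of the avio_alloc_context documentation say the read callback must never return 0; it should return the number of bytes read or a proper AVERROR code such as AVERROR_EOF. A compacted sketch of the same callback honouring that contract (returning AVERROR_EOF only makes sense once no more data will ever arrive; see the threaded approach at the end of this question for the streaming case):

int MediaDecoder::readAVBuffer(void* opaque, uint8_t* buf, int buf_size)
{
    MediaDecoder::Buffer* mbuf = (MediaDecoder::Buffer*)opaque;
    int avail = (int)mbuf->data.size() - mbuf->currentPos;
    if (avail <= 0)
        return AVERROR_EOF;   // never return 0; signal EOF with an AVERROR code
    int count = std::min(buf_size, avail);
    memcpy(buf, mbuf->data.data() + mbuf->currentPos, count);
    mbuf->currentPos += count;
    return count;
}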
The big decode method, which is supposed to return whatever frames it can read:
std::vector<cv::Mat> MediaDecoder::decode(const char* data, size_t length)
{
std::vector<cv::Mat> frames;
//add data to the buffer
for(size_t i=0;i<length;i++) {
buf.data.push_back(data[i]);
}
//do not invoke the decoders until we have 1MB of data
if(((buf.data.size() - buf.currentPos) < 1*1024*1024) && !initializedCodecs) return frames;
std::cout << "decoding data length "<<length<<std::endl;
if(!initializedCodecs) //initialize ffmpeg objects. Custom I/O, format, decoder, etc.
{
//these are just members of the class
avioCtxPtr = std::unique_ptr<AVIOContext,avio_context_deleter>(
avio_alloc_context((uint8_t*)av_malloc(4096),4096,0,&buf,&readAVBuffer,nullptr,nullptr),
avio_context_deleter());
if(!avioCtxPtr)
{
std::cerr << "Could not create IO buffer" << std::endl;
return frames;
}
fmt_ctx = std::unique_ptr<AVFormatContext,avformat_context_deleter>(avformat_alloc_context(),
avformat_context_deleter());
fmt_ctx->pb = avioCtxPtr.get();
fmt_ctx->flags |= AVFMT_FLAG_CUSTOM_IO ;
//fmt_ctx->max_analyze_duration = 2 * AV_TIME_BASE; // read 2 seconds of data
{
AVFormatContext *fmtCtxRaw = fmt_ctx.get();
if (avformat_open_input(&fmtCtxRaw, "", nullptr, nullptr) < 0) {
std::cerr << "Could not open movie" << std::endl;
return frames;
}
}
if (avformat_find_stream_info(fmt_ctx.get(), nullptr) < 0) {
std::cerr << "Could not find stream information" << std::endl;
return frames;
}
if((video_stream_idx = av_find_best_stream(fmt_ctx.get(), AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0)) < 0)
{
std::cerr << "Could not find video stream" << std::endl;
return frames;
}
AVStream *video_stream = fmt_ctx->streams[video_stream_idx];
AVCodec *dec = avcodec_find_decoder(video_stream->codecpar->codec_id);
video_dec_ctx = std::unique_ptr<AVCodecContext,avcodec_context_deleter> (avcodec_alloc_context3(dec),
avcodec_context_deleter());
if (!video_dec_ctx)
{
std::cerr << "Failed to allocate the video codec context" << std::endl;
return frames;
}
avcodec_parameters_to_context(video_dec_ctx.get(),video_stream->codecpar);
video_dec_ctx->thread_count = 1;
/* video_dec_ctx->max_b_frames = 0;
video_dec_ctx->frame_skip_threshold = 10;*/
AVDictionary *opts = nullptr;
av_dict_set(&opts, "refcounted_frames", "1", 0);
av_dict_set(&opts, "deadline", "1", 0);
av_dict_set(&opts, "auto-alt-ref", "0", 0);
av_dict_set(&opts, "lag-in-frames", "1", 0);
av_dict_set(&opts, "rc_lookahead", "1", 0);
av_dict_set(&opts, "drop_frame", "1", 0);
av_dict_set(&opts, "error-resilient", "1", 0);
int width = video_dec_ctx->width;
videoHeight = video_dec_ctx->height;
if(avcodec_open2(video_dec_ctx.get(), dec, &opts) < 0)
{
std::cerr << "Failed to open the video codec context" << std::endl;
return frames;
}
AVPixelFormat pFormat = AV_PIX_FMT_BGR24;
img_convert_ctx = std::unique_ptr<SwsContext,swscontext_deleter>(sws_getContext(width, videoHeight,
video_dec_ctx->pix_fmt, width, videoHeight, pFormat,
SWS_BICUBIC, nullptr, nullptr,nullptr),swscontext_deleter());
frame = std::unique_ptr<AVFrame,avframe_deleter>(av_frame_alloc(),avframe_deleter());
frameRGB = std::unique_ptr<AVFrame,avframe_deleter>(av_frame_alloc(),avframe_deleter());
int numBytes = av_image_get_buffer_size(pFormat, width, videoHeight,32 /*https://stackoverflow.com/questions/35678041/what-is-linesize-alignment-meaning*/);
std::unique_ptr<uint8_t,avbuffer_deleter> imageBuffer((uint8_t *) av_malloc(numBytes*sizeof(uint8_t)),avbuffer_deleter());
av_image_fill_arrays(frameRGB->data,frameRGB->linesize,imageBuffer.get(),pFormat,width,videoHeight,32);
frameRGB->width = width;
frameRGB->height = videoHeight;
initializedCodecs = true;
}
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = nullptr;
pkt.size = 0;
int read_frame_return = 0;
while ( (read_frame_return=av_read_frame(fmt_ctx.get(), &pkt)) >= 0)
{
readFrame(&frames,&pkt,video_dec_ctx.get(),frame.get(),img_convert_ctx.get(),
videoHeight,frameRGB.get());
//if(cancelled) break;
}
avioCtxPtr->eof_reached = 0;
avioCtxPtr->error = 0;
//flush
// readFrame(frames.get(),nullptr,video_dec_ctx.get(),frame.get(),
// img_convert_ctx.get(),videoHeight,frameRGB.get());
avioCtxPtr->eof_reached = 0;
avioCtxPtr->error = 0;
if(frames.size() <= 0)
{
std::cout << "buffer pos: "<<buf.currentPos<<", buff size:"<<buf.data.size()
<<",read_frame_return:"<<read_frame_return<< std::endl;
}
return frames;
}
What I would expect is a continuous extraction of cv::Mat frames as I feed it more and more data. What actually happens is that after the buffer is fully read I see:
[matroska,webm @ 0x507b450] Read error at pos. 1278266 (0x13813a)
[matroska,webm @ 0x507b450] Seek to desired resync point failed. Seeking to earliest point available instead.
And then no more bytes are read from the buffer, even if I later grow it.
I'm doing something terribly wrong here and I don't understand what.
What I ended up doing was moving the reading of the incoming data and the actual decoding to a different thread. The read method now simply blocks if no more bytes are available, waiting until something arrives.
When new bytes arrive, they are added to the buffer and a condition_variable signals the waiting thread to wake up and resume reading data from the buffer.
It works well enough.
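A sketch of that approach, extending the Buffer struct from above with synchronization members (mtx, cv, and finished are my names; <mutex>, <condition_variable>, <algorithm> and <cstring> are assumed included):

struct Buffer
{
    std::vector<uint8_t> data;
    int currentPos = 0;
    std::mutex mtx;
    std::condition_variable cv;
    bool finished = false;                    // set when the websocket closes
};

int MediaDecoder::readAVBuffer(void* opaque, uint8_t* buf, int buf_size)
{
    MediaDecoder::Buffer* mbuf = (MediaDecoder::Buffer*)opaque;
    std::unique_lock<std::mutex> lock(mbuf->mtx);
    // Block until more bytes arrive, or until no more data will ever come
    mbuf->cv.wait(lock, [mbuf]{
        return mbuf->currentPos < (int)mbuf->data.size() || mbuf->finished;
    });
    if (mbuf->currentPos >= (int)mbuf->data.size())
        return AVERROR_EOF;                   // the stream is really over
    int count = std::min(buf_size, (int)mbuf->data.size() - mbuf->currentPos);
    memcpy(buf, mbuf->data.data() + mbuf->currentPos, count);
    mbuf->currentPos += count;
    return count;
}

// Producer side, when a websocket chunk arrives:
// { std::lock_guard<std::mutex> lock(buf.mtx);
//   buf.data.insert(buf.data.end(), data, data + length); }
// buf.cv.notify_one();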
I am trying to compress and decompress raw PCM (16-bit) audio using Opus.
Below is my code for opus_encoder.c. If I remove my decoder.c, the buffer works just fine, in that the microphone is able to take in raw PCM data. However, once I implemented my decoder class, I got a lot of errors, such as memory allocation failures, heap corruption and so on. Here are some of the errors:
std::bad_alloc at memory location 0x0031D4BC
Stack overflow (parameters: 0x00000000, 0x05122000)
Access violation reading location 0x04A40000.
Based on my understanding, I think the decoder side cannot allocate memory properly. Can you take a look at my code and see what went wrong?
Opus_encoder.c
#include "opusencoder.h"
#include <QtConcurrent/QtConcurrent>
opusencoder::opusencoder(){
}
opusencoder::~opusencoder(){
}
OpusEncoder *enc;
int error;
unsigned char *compressedbuffer;
opus_uint32 enc_final_range;
short pcm = 0;
unsigned char *opusencoder::encodedata(const char *audiodata, const unsigned int& size) {
if (size == 0)
return false;
enc = (OpusEncoder *)malloc(opus_encoder_get_size(1));
enc = opus_encoder_create(8000, 1, OPUS_APPLICATION_VOIP, &error);
if (enc == NULL)
{
exit;
}
opus_int32 rate;
opus_encoder_ctl(enc, OPUS_GET_BANDWIDTH(&rate));
this->encoded_data_size = rate;
int len;
for (int i = 0; i < size / 2; i++)
{
//combine pairs of bytes in the original data into two-byte number
//convert const char to short
pcm= audiodata[2 * i] << 8 | audiodata[(2 * i) + 1];
}
qDebug() << "audiodata: " << pcm << endl;
compressedbuffer = new (unsigned char[this->encoded_data_size]);
len = opus_encode(enc, &pcm, 320, compressedbuffer, this->encoded_data_size);
len = opus_packet_unpad(compressedbuffer, len);
len++;
if (len < 0)
{
qDebug() << "Failure to compress";
return NULL;
}
qDebug() << "COmpressed buffer:" << compressedbuffer << endl;
qDebug() << "opus_encode() ................................ OK.\n" << endl;
}
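For comparison, a minimal standalone encode sketch with the stock libopus API (illustrative names; note that opus_encode consumes a whole frame of samples, not a single short, and its return value is the packet length in bytes):

#include <opus/opus.h>

// Encode one 320-sample mono frame captured at 8 kHz (a sketch, not the poster's class).
int encode_frame(const char *audiodata, unsigned char *out, opus_int32 out_capacity)
{
    int error = 0;
    OpusEncoder *enc = opus_encoder_create(8000, 1, OPUS_APPLICATION_VOIP, &error);
    if (error != OPUS_OK || enc == NULL)
        return -1;

    opus_int16 pcm[320];
    // Combine byte pairs into 16-bit samples, as the question's loop does
    for (int i = 0; i < 320; ++i)
        pcm[i] = (opus_int16)((audiodata[2 * i] << 8) | (audiodata[2 * i + 1] & 0xFF));

    int len = opus_encode(enc, pcm, 320, out, out_capacity); // bytes written, or < 0
    opus_encoder_destroy(enc);
    return len;
}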
Opus_decoder.c
#include "opusdecoder.h"
#include <QtConcurrent/QtConcurrent>
#define OPUS_CLEAR(dst, n) (memset((dst), 0, (n)*sizeof(*(dst))))
int num_channels = 1;
opusdecoder::opusdecoder(){
}
opusdecoder::~opusdecoder(){
}
opus_int16* opusdecoder::decodedata(int frame_size, const unsigned char *data)
{
dec = opus_decoder_create(8000, 1, &err);
if (dec == NULL)
{
exit;
}
opus_int32 rate;
opus_decoder_ctl(dec, OPUS_GET_BANDWIDTH(&rate));
rate = decoded_data_size;
this->num_channels = num_channels;
int decodedatanotwo;
opus_int16 *decompress = new (opus_int16[frame_size * this->num_channels]);
opus_packet_get_nb_channels(data);
decodedatanotwo= opus_decode(dec, data, this->decoded_data_size, decompress, 320, 0);
if (decodedatanotwo < 0)
{
qDebug() << "Failure to decompress";
return NULL;
}
qDebug() << "opus_decode() ................................ OK.\n" << decodedatanotwo << endl;
if (decodedatanotwo != frame_size)
{
exit;
}
}
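For contrast, a minimal standalone decode sketch using the stock libopus API (illustrative names; the len argument must be the real encoded packet length, not a bandwidth value from opus_decoder_ctl):

#include <opus/opus.h>

// Decode one Opus packet back into 16-bit PCM (8 kHz mono, as in the question).
int decode_packet(const unsigned char *data, opus_int32 packet_len,
                  opus_int16 *pcm, int max_frame_size)
{
    int err = 0;
    OpusDecoder *dec = opus_decoder_create(8000, 1, &err);
    if (err != OPUS_OK || dec == NULL)
        return -1;
    // Returns the number of samples decoded per channel, or a negative error
    int samples = opus_decode(dec, data, packet_len, pcm, max_frame_size, 0);
    opus_decoder_destroy(dec);
    return samples;
}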
I'm reading a UDP MJPEG stream with the FFmpeg API. When I read and display the stream on an ARM processor, I have two problems:
1. The application is too slow and there is a big delay between the network camera and the displayed video.
2. Memory usage increases every time I call av_read_frame().
The source code:
const char *cam1_url = "udp://192.168.1.1:1234";
AVCodec *pCodec;
AVFrame *pFrame, *pFrameRGB;
AVCodecContext *pCodecCon;
AVDictionary *pUdpStreamOptions = NULL;
AVInputFormat *pMjpegFormat = av_find_input_format("mjpeg");
av_dict_set(&pUdpStreamOptions, "fifo_size", "5000000", 0);
av_register_all();
avdevice_register_all();
avcodec_register_all();
avformat_network_init();
AVFormatContext *pFormatCont = avformat_alloc_context();
if(avformat_open_input(&pFormatCont,cam1_url,pMjpegFormat,&pUdpStreamOptions) < 0)
{
cout << "!! Error !! - avformat_open_input(): failed to open input URL" << endl;
}
if(avformat_find_stream_info(pFormatCont,NULL) < 0)
{
cout << "!! Error !! - avformat_find_stream_info(), Failed to retrieve stream info" << endl;
}
av_dump_format(pFormatCont, 0, cam1_url, 0);
int videoStream;
for(int i=0; i< pFormatCont->nb_streams; i++)
{
if(pFormatCont->streams[i]->codec->codec_type==AVMEDIA_TYPE_VIDEO)
{
videoStream=i;
cout << " videoStream = " << videoStream << endl;
}
}
pCodecCon = pFormatCont->streams[videoStream]->codec;
pCodec = avcodec_find_decoder(pCodecCon->codec_id);
if(NULL == pCodec)
{
cout << "couldnt find codec" << endl;
return EXIT_FAILURE;
}
if(avcodec_open2(pCodecCon,pCodec,NULL) < 0)
{
cout << "!! Error !! - in avcodec_open2()" << endl;
return EXIT_FAILURE;
}
uint8_t *frameBuffer;
int numRxBytes = 0;
AVPixelFormat pFormat =AV_PIX_FMT_BGR24;
int width_rgb = (int)((float)pCodecCon->width);
int height_rgb = (int)((float)pCodecCon->height);
numRxBytes = avpicture_get_size(pFormat,width_rgb,height_rgb);
frameBuffer = (uint8_t *) av_malloc(numRxBytes*sizeof(uint8_t));
avpicture_fill((AVPicture *) pFrameRGB, frameBuffer, pFormat,width_rgb,height_rgb);
AVPacket rx_pkt; // received packet
int frameFinished = 0;
struct SwsContext *imgConvertCtx;
av_init_packet(&rx_pkt);
while(av_read_frame(pFormatCont, &rx_pkt) >= 0)
{
if(rx_pkt.stream_index == videoStream)
{
av_frame_free(&pFrame);
pFrame = av_frame_alloc();
av_frame_free(&pFrameRGB);
pFrameRGB = av_frame_alloc();
avcodec_decode_video2(pCodecCon, pFrame, &frameFinished,&rx_pkt);
if(frameFinished)
{
imgConvertCtx = sws_getCachedContext(NULL, pFrame->width,pFrame->height, AV_PIX_FMT_YUVJ420P,width_rgb,height_rgb,AV_PIX_FMT_BGR24, SWS_BICUBIC, NULL, NULL,NULL);
sws_scale(imgConvertCtx, ((AVPicture*)pFrame)->data, ((AVPicture*)pFrame)->linesize, 0, pCodecCon->height, ((AVPicture *)pFrameRGB)->data, ((AVPicture *)pFrameRGB)->linesize);
av_frame_unref(pFrame);
av_frame_unref(pFrameRGB);
}
}
av_free_packet(&rx_pkt);
av_packet_unref(&rx_pkt);
}
//cvDestroyWindow("Cam1Video");
av_free_packet(&rx_pkt);
avcodec_close(pCodecCon);
av_free(pFrame);
av_free(pFrameRGB);
avformat_close_input(&pFormatCont);
I have read that the reason could be that the FFmpeg libraries cache the incoming frames while the ARM processor isn't fast enough to process them. After about 4 minutes the system crashes.
How can I solve this problem?
One option could be to tell ffmpeg to act as a frame grabber and read frames in real time, as the "-re" flag does on the command line. How can I set this flag in the C++ source code? Or can anybody help me solve this problem?
Thank you very much.
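As a general note on the FFmpeg API (a sketch of the usual pattern, not a verified fix for this exact setup): frames are normally allocated once before the loop, each packet is released exactly once per av_read_frame() iteration, and av_frame_unref() is used between decodes instead of freeing and reallocating the frames; calling both av_free_packet() and av_packet_unref() on the same packet, as above, releases it twice.

AVFrame *pFrame = av_frame_alloc();      // allocate the frames once, outside the loop
AVFrame *pFrameRGB = av_frame_alloc();
AVPacket rx_pkt;
while (av_read_frame(pFormatCont, &rx_pkt) >= 0)
{
    if (rx_pkt.stream_index == videoStream)
    {
        avcodec_decode_video2(pCodecCon, pFrame, &frameFinished, &rx_pkt);
        if (frameFinished)
        {
            // ... sws_scale into pFrameRGB as in the question ...
            av_frame_unref(pFrame);      // drop the frame's buffers, keep the frame itself
        }
    }
    av_packet_unref(&rx_pkt);            // exactly once per successful av_read_frame()
}
av_frame_free(&pFrame);                  // free the frames once, at the end
av_frame_free(&pFrameRGB);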
I get choppy audio when I try to capture audio from a live stream.
Another essential symptom, which could explain the problem, is that the WAV file created is twice as long as the capture time.
The audio is perfect when I play the AVS input file with ffplay, so the AVS is fine; the problem comes later, either in the capture or in the WAV writing.
To capture:
av_read_frame(pFormatCtx, &packet)
if(packet.stream_index == mAudioStream)
{
int buff_size = sizeof(mAudioBuffer);
std::cout << "Buff_size " << buff_size << std::endl;
len = avcodec_decode_audio3(pAudioCodecCtx,(int16_t*)mAudioBuffer, &buff_size,&packet);
if(len < 0){
qDebug("Extractor - Audio isEnd = -1;");
mAudioBufferSize = 0;
isEnd = ERROR_;
return isEnd;
}
// Set packet result type
mFrameType = AUDIO_PKT;
mAudioBufferSize = buff_size;
//store audio synchronization informations:
if(packet.pts != AV_NOPTS_VALUE) {
mAudioPts_ = av_q2d(pFormatCtx->streams[mAudioStream]->time_base);
mAudioPts_ *= packet.pts;
}
}
// store a copy of current audio frame in _frame
_frame.audioFrame = new decoded_frame_t::audio_frame_t();
_frame.audioFrame->sampleRate = mediaInfos.audioSampleRate;
_frame.audioFrame->sampleSize = mediaInfos.audioSampleSize;
_frame.audioFrame->nbChannels = mediaInfos.audioNbChannels;
_frame.audioFrame->nbSamples = mAudioBufferSize / ((mediaInfos.audioSampleSize/8) * mediaInfos.audioNbChannels);
_frame.audioFrame->buf.resize(mAudioBufferSize);
memcpy(&_frame.audioFrame->buf[0],mAudioBuffer,mAudioBufferSize);
Then I store it in a WAV file using libsndfile:
SNDFILE* fd;
SF_INFO sfInf;
sfInf.frames = 0;
sfInf.channels = p_capt->ui_nbChannels;
sfInf.samplerate = p_capt->ui_sampleRate;
sfInf.format = SF_FORMAT_WAV | SF_FORMAT_PCM_U8;
sfInf.sections = 0;
sfInf.seekable = 0;
if (sf_format_check(&sfInf) == FALSE)
std::cout << "Format parameter are uncorrect ! Exit saving !" << std::endl;
else
{
fd = sf_open(fileName.toStdString().c_str(), SFM_WRITE, &sfInf);
if (fd == NULL)
{
std::cout << "Unable to open the file " << fileName.toStdString() << std::endl;
return GRAB_ST_NOK;
}
//little trick because v_buf is a uint8_t vector
sf_count_t l = sf_write_short(fd, (const short *)(&(p_capt->v_buf[0])), p_capt->v_buf.size()/2);
if (l != p_capt->v_buf.size()/2)
{
std::cout << "sf_write didn't write the right amount of items " << l << " != " << p_capt->v_buf.size()/2 << std::endl;
ret = GRAB_ST_NOK;
}
else
{
sf_write_sync(fd);
sf_close(fd);
ret = GRAB_ST_OK;
}
}
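Incidentally, sf_write_short writes 16-bit items, while the SF_INFO above requests SF_FORMAT_PCM_U8. The usual pairing for 16-bit PCM capture would be (a sketch, assuming the capture really is 16-bit):

SF_INFO sfInf;
memset(&sfInf, 0, sizeof(sfInf));                // zero every field libsndfile may inspect
sfInf.channels = p_capt->ui_nbChannels;
sfInf.samplerate = p_capt->ui_sampleRate;
sfInf.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16; // matches sf_write_short's 16-bit items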
I hope it's understandable. Waiting for remarks.
Kurt
OK, problem solved.
There were two main problems:
resize DOES add n elements; it is not just preparing the vector for further push_back calls (that would be reserve).
the buff_size output of avcodec_decode_audio3 is a length in bytes, but the data is copied into an int16_t array, which can be confusing.
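To make the two fixes concrete, a small sketch (variable names follow the question's code):

std::vector<uint8_t> v_buf;
v_buf.reserve(mAudioBufferSize); // capacity only: size() stays 0
v_buf.resize(mAudioBufferSize);  // size() becomes mAudioBufferSize (zero-filled elements are added)

// avcodec_decode_audio3 reports buff_size in bytes; when counting 16-bit samples, divide:
int nb_samples = mAudioBufferSize / (int)sizeof(int16_t);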