How to use x264 for encoding with ffmpeg? - c++

I tried to use ffmpeg for encoding video, but it fails on initialization of the AVCodecContext and AVCodec.
What I do:
_codec = avcodec_find_encoder(CODEC_ID_H264);
_codecContext = avcodec_alloc_context3(_codec);
_codecContext->coder_type = 0;
_codecContext->me_cmp|= 1;
_codecContext->me_method=ME_HEX;
_codecContext->me_subpel_quality = 0;
_codecContext->me_range = 16;
_codecContext->gop_size = 12;
_codecContext->scenechange_threshold = 40;
_codecContext->i_quant_factor = 0.71;
_codecContext->b_frame_strategy = 1;
_codecContext->qcompress = 0.5;
_codecContext->qmin = 2;
_codecContext->qmax = 31;
_codecContext->max_qdiff = 4;
_codecContext->max_b_frames = 3;
_codecContext->refs = 3;
_codecContext->trellis = 1;
_codecContext->width = format.biWidth;
_codecContext->height = format.biHeight;
_codecContext->time_base.num = 1;
_codecContext->time_base.den = 30;
_codecContext->pix_fmt = PIX_FMT_YUV420P;
_codecContext->chromaoffset = 0;
_codecContext->thread_count =1;
_codecContext->bit_rate = (int)(128000.f * 0.80f);
_codecContext->bit_rate_tolerance = (int) (128000.f * 0.20f);
int error = avcodec_open2(_codecContext, _codec, NULL);
if (error < 0)
{
    std::cout << "Open codec fail. Error " << error << "\n";
    return NULL;
}
This way it fails in avcodec_open2() with:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xae1fdb70 (LWP 30675)]
0xb2eb2cbb in x264_param_default () from /usr/lib/libx264.so.120
If I comment out all the AVCodecContext parameter settings, I get:
[libx264 # 0xac75edd0] invalid width x height (0x0)
And avcodec_open2 returns a negative value. Which of my steps are wrong?
Thanks for any help. (ffmpeg 0.10 and yesterday's libx264 daily snapshot)

In my experience you should give FFmpeg as little information as possible when initialising your codec. This may seem counter-intuitive, but it means FFmpeg will use its default settings, which are more likely to work than your own guesses. See what I would include below:
AVStream *st;
m_video_codec = avcodec_find_encoder(AV_CODEC_ID_H264);
st = avformat_new_stream(_outputCodec, m_video_codec);
_outputCodecContext = st->codec;
_outputCodecContext->codec_id = m_fmt->video_codec;
_outputCodecContext->bit_rate = m_AVIMOV_BPS; //Bits Per Second
_outputCodecContext->width = m_AVIMOV_WIDTH; //Note Resolution must be a multiple of 2!!
_outputCodecContext->height = m_AVIMOV_HEIGHT; //Note Resolution must be a multiple of 2!!
_outputCodecContext->time_base.den = m_AVIMOV_FPS; //Frames per second
_outputCodecContext->time_base.num = 1;
_outputCodecContext->gop_size = m_AVIMOV_GOB; // Intra frames per x P frames
_outputCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;//Do not change this, H264 needs YUV format not RGB
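After that minimal setup you would open the codec and check the return value; here is a small sketch using the same variable names as above (av_strerror and AV_ERROR_MAX_STRING_SIZE are standard libavutil facilities):
int ret = avcodec_open2(_outputCodecContext, m_video_codec, NULL);
if (ret < 0) {
    char errbuf[AV_ERROR_MAX_STRING_SIZE];
    av_strerror(ret, errbuf, sizeof(errbuf)); // turn the error code into readable text
    fprintf(stderr, "Could not open codec: %s\n", errbuf);
}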
As in previous answers, here is a working example of the FFmpeg library encoding RGB frames to an H264 video:
http://www.imc-store.com.au/Articles.asp?ID=276
An extra thought on your code though:
Have you called the register-all functions like below?
avcodec_register_all();
av_register_all();
If you don't call these two functions near the start of your code, your subsequent calls to FFmpeg will fail and you'll most likely segfault.
Have a look at the linked example; I tested it on VC++ 2010 and it works perfectly.

Related

FFMPEG/C++ How to simply write out/passthrough h264 stream?

I'm trying to learn the basics of ffmpeg writing (reading already works), so I'm just trying to take in an input .ts file and write out/pass through the exact same h264 stream to a new output file. I don't get any compilation errors, but for some reason I can't figure out why my output file's framerate is very wrong. Also, when I read my output file back in, I get printouts saying "Packet corrupt (stream = 0, dts = #)".
I followed the instructions in the ffmpeg library comments, so I'm not sure what I'm missing. I call initOutStream(), then initH264encoder(), and then during reading/decoding readH264Packet() is called repeatedly. (Code removed for readability's sake; the relevant sections are left below.)
Edit: If I put my output file through the actual ffmpeg command-line app, the framerate issue seems to get fixed. I wonder where I'm messing up.
void test::initOutStream() {
//create muxing context
outstreamContext = avformat_alloc_context();
//oformat
AVOutputFormat *guessFormat; //Populate oformat
guessFormat = av_guess_format(NULL, inputVideoUrl.c_str(), NULL);
outstreamContext->oformat = guessFormat;
outstreamContext->oformat->video_codec = AV_CODEC_ID_H264;
//outstreamContext->bit_rate = 400000; //No affect;
//pb
AVIOContext *outAVIOContext = nullptr;
//int result = avio_open(&outAVIOContext, outputVideoUrl.c_str(), AVIO_FLAG_WRITE);
int result = avio_open2(&outAVIOContext, outputVideoUrl.c_str(), AVIO_FLAG_WRITE, NULL, NULL); //Documentation said to use this method
outstreamContext->pb = outAVIOContext;
}
.
void test::initH264encoder() { //Frame -> packet
int result;
h264OutCodec = avcodec_find_encoder(AV_CODEC_ID_H264);
h264OutStream = avformat_new_stream(outstreamContext, h264OutCodec);
h264OutStream->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
h264OutStream->codecpar->codec_id = AV_CODEC_ID_H264;
h264OutStream->codecpar->width = 640;
h264OutStream->codecpar->height = 480;
h264OutStream->id = H264_STREAM_ID;
h264OutStream->codecpar->color_range = AVCOL_RANGE_MPEG;
//h264OutStream->codecpar->bit_rate = 400000;
h264OutContext = avcodec_alloc_context3(h264OutCodec);
h264OutContext->width = 640;
h264OutContext->height = 480;
h264OutContext->time_base = (AVRational){1,static_cast<int>(29.97)};
h264OutContext->pix_fmt = AV_PIX_FMT_YUV420P;
result = avcodec_open2(h264OutContext, h264OutCodec, nullptr);
//Alloc packet + finish
outPacket = av_packet_alloc();
//Write header
result = avformat_write_header(outstreamContext, NULL);
}
Assume that reading was set up correctly
void test::readH264Packet(__unused uint64_t tick) {
    //...av_read_frame(streamContext, inPacket);
    //...avcodec_send_packet(h264Context, inPacket);
    //...avcodec_receive_frame(h264Context, yuvFrame)
    //My passthrough:
    if(shouldOutputH264Stream){
        result = avcodec_send_frame(h264OutContext, yuvFrame); //1. Encode frame to packet
        result = avcodec_receive_packet(h264OutContext, outPacket2); //2. get encoded packet
        result = av_interleaved_write_frame(outstreamContext, outPacket2); //3. write packet
        //Write trailer and free happens later
    }
}
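One thing this passthrough never does is rescale the packet timestamps from the encoder's time base into the output stream's time base before writing, which is a common cause of a wrong frame rate in the muxed file. A hedged sketch of that step, reusing the names from the code above (av_packet_rescale_ts is a standard libavcodec helper):
//Sketch: convert pts/dts/duration from the encoder time base to the muxer stream's time base
av_packet_rescale_ts(outPacket2, h264OutContext->time_base, h264OutStream->time_base);
outPacket2->stream_index = h264OutStream->index;
result = av_interleaved_write_frame(outstreamContext, outPacket2);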

Problems in my sample C code using FFmpeg API

I've been trying to change one of FFmpeg's example codes HERE to call other filters using its C API. Say the filter is freezedetect=n=-60dB:d=8, which normally runs like this:
ffmpeg -i small.mp4 -vf "freezedetect=n=-60dB:d=8" -map 0:v:0 -f null -
And prints outputs like this:
[freezedetect # 0x25b91c0] lavfi.freezedetect.freeze_start: 5.005
[freezedetect # 0x25b91c0] lavfi.freezedetect.freeze_duration: 2.03537
[freezedetect # 0x25b91c0] lavfi.freezedetect.freeze_end: 7.04037
However, the original example displays frames, not this metadata. How can I change the code to print this metadata (and not the frames)?
I've been trying to change the display_frame function below into a display_metadata function. It looks like the frame variable has a metadata dictionary, which looks promising, but my attempts to use it failed. I'm also new to the C language.
Original display_frame function:
static void display_frame(const AVFrame *frame, AVRational time_base)
{
    int x, y;
    uint8_t *p0, *p;
    int64_t delay;

    if (frame->pts != AV_NOPTS_VALUE) {
        if (last_pts != AV_NOPTS_VALUE) {
            /* sleep roughly the right amount of time;
             * usleep is in microseconds, just like AV_TIME_BASE. */
            delay = av_rescale_q(frame->pts - last_pts,
                                 time_base, AV_TIME_BASE_Q);
            if (delay > 0 && delay < 1000000)
                usleep(delay);
        }
        last_pts = frame->pts;
    }

    /* Trivial ASCII grayscale display. */
    p0 = frame->data[0];
    puts("\033c");
    for (y = 0; y < frame->height; y++) {
        p = p0;
        for (x = 0; x < frame->width; x++)
            putchar(" .-+#"[*(p++) / 52]);
        putchar('\n');
        p0 += frame->linesize[0];
    }
    fflush(stdout);
}
My new display_metadata function that needs to be completed:
static void display_metadata(const AVFrame *frame)
{
    // printf("%d\n",frame->height);
    AVDictionary* dic = frame->metadata;
    printf("%d\n",*(dic->count));
    // fflush(stdout);
}
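For what it's worth, here is a minimal sketch of how that dictionary could be walked: AVDictionary is opaque, so its fields cannot be dereferenced directly, but av_dict_get (from libavutil/dict.h) with an empty key and AV_DICT_IGNORE_SUFFIX iterates every entry:
static void display_metadata(const AVFrame *frame)
{
    AVDictionaryEntry *entry = NULL;
    /* An empty key plus AV_DICT_IGNORE_SUFFIX matches every entry in turn. */
    while ((entry = av_dict_get(frame->metadata, "", entry, AV_DICT_IGNORE_SUFFIX)))
        printf("%s: %s\n", entry->key, entry->value);
}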

iOS AVCaptureSession stream microphone audio Objective-C++

I am currently working with iOS and Objective-C++ for the first time. I'm coming from C/C++, so please excuse my bad coding in the examples below.
I am trying to live-stream the microphone audio of my iOS device over TCP; the iOS device acts as the server and sends the data to all clients that connect.
To do so, I am first using AVCaptureDevice and requestAccessForMediaType:AVMediaTypeAudio to request access to the microphone (along with the needed entry in the Info.plist).
Then I create an AVCaptureSession* using the function below:
AVCaptureSession* createBasicARecordingSession(aReceiver* ObjectReceivingAudioFrames){
    AVCaptureSession* s = [[AVCaptureSession alloc] init];
    AVCaptureDevice* aDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
    AVCaptureDeviceInput* aInput = NULL;

    if([aDevice lockForConfiguration:NULL] == YES && aDevice){
        aInput = [AVCaptureDeviceInput deviceInputWithDevice:aDevice error:nil];
        [aDevice unlockForConfiguration];
    }
    else if(!aDevice){
        fprintf(stderr, "[d] could not create device. (%p)\n", aDevice);
        return NULL;
    }
    else{
        fprintf(stderr, "[d] could not lock device.\n");
        return NULL;
    }

    if(!aInput){
        fprintf(stderr, "[d] could not create input.\n");
        return NULL;
    }

    AVCaptureAudioDataOutput* aOutput = [[AVCaptureAudioDataOutput alloc] init];
    dispatch_queue_t aQueue = dispatch_queue_create("aQueue", NULL);

    if(!aOutput){
        fprintf(stderr, "[d] could not create output.\n");
        return NULL;
    }

    [aOutput setSampleBufferDelegate:ObjectReceivingAudioFrames queue:aQueue];

    // the below line does only work on macOS
    //aOutput.audioSettings = settings;

    [s beginConfiguration];
    if([s canAddInput:aInput]){
        [s addInput:aInput];
    }
    else{
        fprintf(stderr, "[d] could not add input.\n");
        return NULL;
    }
    if([s canAddOutput:aOutput]){
        [s addOutput:aOutput];
    }
    else{
        fprintf(stderr, "[d] could not add output.\n");
        return NULL;
    }
    [s commitConfiguration];

    return s;
}
The aReceiver* class (?) is defined below and receives the audio frames provided by the AVCaptureAudioDataOutput* object. The frames are stored inside a std::vector.
(I'm adding the code as an image as I could not get it formatted right...)
Then I start the AVCaptureSession* using [audioSession start].
When a TCP client connects, I first create an AudioConverterRef and two AudioStreamBasicDescriptions to convert the audio frames to AAC; see below:
AudioStreamBasicDescription asbdIn, asbdOut;
AudioConverterRef converter;
asbdIn.mFormatID = kAudioFormatLinearPCM;
//asbdIn.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
asbdIn.mFormatFlags = 12;
asbdIn.mSampleRate = 44100;
asbdIn.mChannelsPerFrame = 1;
asbdIn.mFramesPerPacket = 1;
asbdIn.mBitsPerChannel = 16;
//asbdIn.mBytesPerFrame = (asbdIn.mBitsPerChannel / 8) * asbdIn.mBitsPerChannel;
asbdIn.mBytesPerFrame = 2;
asbdIn.mBytesPerPacket = asbdIn.mBytesPerFrame;
asbdIn.mReserved = 0;
asbdOut.mFormatID = kAudioFormatMPEG4AAC;
asbdOut.mFormatFlags = 0;
asbdOut.mSampleRate = 44100;
asbdOut.mChannelsPerFrame = 1;
asbdOut.mFramesPerPacket = 1024;
asbdOut.mBitsPerChannel = 0;
//asbdOut.mBytesPerFrame = (asbdOut.mBitsPerChannel / 8) * asbdOut.mBitsPerChannel;
asbdOut.mBytesPerFrame = 0;
asbdOut.mBytesPerPacket = asbdOut.mBytesPerFrame;
asbdOut.mReserved = 0;
OSStatus err = AudioConverterNew(&asbdIn, &asbdOut, &converter);
Then I create an AudioBufferList* to store the encoded frames:
while(audioInput.locked){ // audioInput is my aReceiver*
usleep(0.2 * 1000000);
}
audioInput.locked = true;
UInt32 RequestedPackets = 8192;
//AudioBufferList* aBufferList = (AudioBufferList*)malloc(sizeof(AudioBufferList));
AudioBufferList* aBufferList = static_cast<AudioBufferList*>(calloc(1, offsetof(AudioBufferList, mBuffers) + (sizeof(AudioBuffer) * 1)));
aBufferList->mNumberBuffers = 1;
aBufferList->mBuffers[0].mNumberChannels = asbdIn.mChannelsPerFrame;
aBufferList->mBuffers[0].mData = static_cast<void*>(calloc(RequestedPackets, asbdIn.mBytesPerFrame));
aBufferList->mBuffers[0].mDataByteSize = asbdIn.mBytesPerFrame * RequestedPackets;
Then I go through the frames stored in the std::vector mentioned earlier and pass them to AudioConverterFillComplexBuffer(). After conversion, I concatenate all encoded frames into one NSMutableData, which I then write() to the socket connected to the client.
long aBufferListSize = audioInput.aBufferList.size();
while(aBufferListSize > 0){
    err = AudioConverterFillComplexBuffer(converter, feedAFrames, static_cast<void*>(&audioInput.aBufferList[audioInput.aBufferList.size() - aBufferListSize]), &RequestedPackets, aBufferList, NULL);

    NSMutableData* encodedData = [[NSMutableData alloc] init];
    long encodedDataLen = 0;
    for(int i = 0; i < aBufferList->mNumberBuffers; i++){
        Float32* frame = (Float32*)aBufferList->mBuffers[i].mData;
        [encodedData appendBytes:frame length:aBufferList->mBuffers[i].mDataByteSize];
        encodedDataLen += aBufferList->mBuffers[i].mDataByteSize;
    }
    unsigned char* encodedDataBytes = (unsigned char*)[encodedData bytes];
    fprintf(stderr, "[d] got %li encoded bytes to send...\n", encodedDataLen);

    long bytes = write(Client->GetFD(), encodedDataBytes, encodedDataLen);
    fprintf(stderr, "[d] written %li of %li bytes.\n", bytes, encodedDataLen);

    usleep(0.2 * 1000000);
    aBufferListSize--;
}
audioInput.aBufferList.clear();
audioInput.locked = false;
Below is the feedAFrames() callback used in the AudioConverterFillComplexBuffer() call:
(Again, this is an image of the code, for the same reason as above.)
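Since that callback is only shown as an image, here is a purely hypothetical sketch of the shape such an AudioConverterComplexInputDataProc takes; the container type and field names below are assumptions standing in for however aReceiver actually stores its PCM chunks:
#include <AudioToolbox/AudioToolbox.h>

// Hypothetical stand-in for one stored PCM chunk from the capture delegate.
struct StoredPCMBuffer { void* data; size_t size; };

static OSStatus feedAFrames(AudioConverterRef inConverter,
                            UInt32* ioNumberDataPackets,
                            AudioBufferList* ioData,
                            AudioStreamPacketDescription** outPacketDesc,
                            void* inUserData)
{
    StoredPCMBuffer* pcm = static_cast<StoredPCMBuffer*>(inUserData); // hypothetical type
    if (pcm->size == 0) {
        *ioNumberDataPackets = 0; // nothing left to feed for this conversion
        return noErr;
    }
    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0].mNumberChannels = 1;              // mono, matching asbdIn above
    ioData->mBuffers[0].mData = pcm->data;
    ioData->mBuffers[0].mDataByteSize = (UInt32)pcm->size;
    *ioNumberDataPackets = (UInt32)(pcm->size / 2);       // 2 bytes per packet for 16-bit mono PCM
    pcm->size = 0;                                        // mark the chunk as consumed
    return noErr;
}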
Steps 5 to 7 are repeated until the TCP connection is closed.
Each step runs without any noticeable error (I know I could include much better error handling here), and I do get data out of steps 3 and 7. However, what comes out at the end does not seem to be AAC.
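For what it's worth, raw packets from an AAC AudioConverter carry no framing at all, so a reader looking at the byte stream will not recognise them as AAC unless something like an ADTS header is prepended to each packet. A hedged sketch of building that 7-byte header, assuming AAC-LC at 44.1 kHz mono to match asbdOut above (makeADTSHeader is a hypothetical helper):
#include <cstdint>
#include <cstddef>

// Sketch: 7-byte ADTS header for one raw AAC-LC packet (no CRC).
static void makeADTSHeader(uint8_t hdr[7], size_t aacLen, int sampleRateIndex, int channelConfig)
{
    const size_t frameLen = aacLen + 7;                  // frame length includes the header itself
    hdr[0] = 0xFF;                                       // syncword, high 8 bits
    hdr[1] = 0xF1;                                       // syncword low bits, MPEG-4, layer 0, no CRC
    hdr[2] = (uint8_t)((1 << 6) |                        // profile: AAC LC (object type 2, stored as 2-1)
                       (sampleRateIndex << 2) |          // 4 == 44100 Hz
                       (channelConfig >> 2));
    hdr[3] = (uint8_t)(((channelConfig & 3) << 6) | ((frameLen >> 11) & 0x03));
    hdr[4] = (uint8_t)((frameLen >> 3) & 0xFF);
    hdr[5] = (uint8_t)(((frameLen & 0x07) << 5) | 0x1F); // buffer fullness high bits (VBR)
    hdr[6] = 0xFC;                                       // buffer fullness low bits, one raw data block
}
Each encoded packet would then be written to the socket as header followed by payload.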
As I'm rather new to all of this, I'm really not sure where my error is; I'm sure there are several things I got wrong. It is hard to find suitable example code for what I am trying to do, and the above is the best I could come up with so far from what I have found, paired with the Apple developer documentation.
I hope someone might take some time to explain what I did wrong and how I can get this to work. Thanks for reading this far!

Get frame time in ffmpeg

I am trying to make a little video player that has a seek bar (with ffmpeg, of course). For that I need a function that will, using data from the frame and/or packet, give me the current time in the video that should be set in the seek slider.
It should work like this:
my_time = get_cur_time()
seek(my_time + 10)
assert(my_time+10 == get_cur_time())
seek(my_time - 10)
assert(my_time-10 == get_cur_time())
I do understand that ffmpeg does not support precise seeking, so equality here means "something reasonably close".
Here is the code I have used for this so far:
frame_time = frame->pts*av_q2d(video_dec_ctx->time_base) * 1000;
where frame is AVFrame and video_dec_ctx is AVCodecContext.
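For comparison, here is a sketch of the same computation against the stream's time base, which is the time base the demuxed packet timestamps (and therefore the decoded frame's pts) are normally expressed in; best_effort_timestamp is used as a fallback when pts is unset:
AVRational stream_tb = fmt_ctx->streams[video_stream->index]->time_base;
int64_t ts = frame->best_effort_timestamp;               // falls back when pts is AV_NOPTS_VALUE
double frame_time_ms = ts * av_q2d(stream_tb) * 1000.0;  // position in milliseconds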
And for seeking:
int fn = ffmpeg::av_rescale(tsms, fmt_ctx->streams[video_stream->index]->time_base.den,
                            fmt_ctx->streams[video_stream->index]->time_base.num);
int frame = fn/1000;
printf("\t avformat_seek_file to %d\n",frame);
int flags = AVSEEK_FLAG_FRAME;
if (frame < this->frame->pts)
    flags |= AVSEEK_FLAG_BACKWARD;
if(ffmpeg::av_seek_frame(fmt_ctx,video_stream->index,frame,flags))
{
    printf("\nFailed to seek for time %d",frame);
    return false;
}
avcodec_flush_buffers(video_dec_ctx);
int got_frame = 0;
do
    if (av_read_frame(fmt_ctx, &pkt) >= 0) {
        decode_packet_ro(&got_frame, 0);
        av_free_packet(&pkt);
    }
    else
    {
        read_cache = true;
        pkt.data = NULL;
        pkt.size = 0;
        break;
    }
while(!(got_frame && this->frame->pts >= frame));
The code does forward seeking passably, but after any attempt at backward seeking my second assertion fails. After seeking to a previous position, my method of getting the time does not return a position less than the one before seeking. That causes my seek slider to work grossly incorrectly.
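For reference, here is a hedged sketch of a millisecond-based seek that converts the target into the stream's time base with av_rescale_q and always passes AVSEEK_FLAG_BACKWARD, so the demuxer lands on the keyframe at or before the target and decoding then walks forward; the helper itself is hypothetical, only the library calls are standard:
// Hypothetical helper: seek to an absolute position given in milliseconds.
bool seek_to_ms(AVFormatContext* fmt_ctx, AVCodecContext* video_dec_ctx,
                int stream_index, int64_t target_ms)
{
    AVRational ms_tb = {1, 1000};
    AVRational st_tb = fmt_ctx->streams[stream_index]->time_base;
    int64_t target_ts = av_rescale_q(target_ms, ms_tb, st_tb); // ms -> stream time base
    // BACKWARD lands on the keyframe at or before the target; decode forward from there.
    if (av_seek_frame(fmt_ctx, stream_index, target_ts, AVSEEK_FLAG_BACKWARD) < 0)
        return false;
    avcodec_flush_buffers(video_dec_ctx);
    return true;
}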

OSX AUGraph recreation causes badComponentType error

On OSX I'm creating an AUGraph for my audio system like so:
OSStatus result = NewAUGraph(&mGraph);
AUNode outputNode;
AudioComponentDescription outputDesc;
outputDesc.componentType = kAudioUnitType_Output;
outputDesc.componentSubType = kAudioUnitSubType_DefaultOutput;
outputDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
outputDesc.componentFlags = 0;
outputDesc.componentFlagsMask = 0;
result = AUGraphAddNode(mGraph, &outputDesc, &outputNode);
AUNode converterNode;
AudioComponentDescription converterDesc;
converterDesc.componentType = kAudioUnitType_FormatConverter;
converterDesc.componentSubType = kAudioUnitSubType_AUConverter;
converterDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
converterDesc.componentFlags = 0;
converterDesc.componentFlagsMask = 0;
result = AUGraphAddNode(mGraph, &converterDesc, &converterNode);
result = AUGraphConnectNodeInput(mGraph, converterNode, 0, outputNode, 0);
result = AUGraphOpen(mGraph);
...initialize graph, start graph, etc...
This all works fine; I can hear sound, etc. Later the system is shut down:
unsigned char isRunning = false;
AUGraphIsRunning(mGraph, &isRunning);
if (isRunning)
AUGraphStop(mGraph);
OSStatus result;
unsigned char isInitialized = false;
AUGraphIsInitialized(mGraph, &isInitialized);
if (isInitialized)
{
result = AUGraphUninitialize(mGraph);
}
result = DisposeAUGraph(mGraph);
Again, no problems here. However, a short while later the first code block gets executed again when the system is restarted. On:
result = AUGraphOpen(mGraph);
"result" comes out as -2005 (badComponentType). Anyone know what causes this?
Calling AUGraphClose in the shutdown fixed this. I guess you can't have two open graphs with the same output unit?
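A sketch of the shutdown sequence with that fix folded in (stop, uninitialize, close, then dispose), using the same mGraph as above:
Boolean isRunning = false;
AUGraphIsRunning(mGraph, &isRunning);
if (isRunning)
    AUGraphStop(mGraph);

Boolean isInitialized = false;
AUGraphIsInitialized(mGraph, &isInitialized);
if (isInitialized)
    AUGraphUninitialize(mGraph);

AUGraphClose(mGraph);   // releases the opened audio units so a new graph can be created later
DisposeAUGraph(mGraph);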