FFmpeg audio frame from DirectShow SampleCB IMediaSample - C++

I use the ISampleGrabber SampleCB callback to get audio samples. I can get the buffer and its length from the IMediaSample, and I call avcodec_fill_audio_frame(frame, ost->enc->channels, ost->enc->sample_fmt, (uint8_t *)buffer, length, 0) to make an AVFrame, but this frame does not produce any audio in my muxed file! I think the length is much smaller than frame_size.
Can anyone help me, please? Or give me an example if possible.
Thank you.
This is my SampleCB code:
HRESULT AudioSampleGrabberCallBack::SampleCB(double Time, IMediaSample *pSample)
{
    BYTE *pBuffer;
    pSample->GetPointer(&pBuffer);
    long BufferLen = pSample->GetActualDataLength();
    muxer->PutAudioFrame(pBuffer, BufferLen);
    return S_OK;
}
And this is the Sample Grabber pin media type:
AM_MEDIA_TYPE pmt2;
ZeroMemory(&pmt2, sizeof(AM_MEDIA_TYPE));
pmt2.majortype = MEDIATYPE_Audio;
pmt2.subtype = FOURCCMap(0x1602); // 0x1602 = WAVE_FORMAT_MPEG_LOAS, i.e. AAC in LATM/LOAS
pmt2.formattype = FORMAT_WaveFormatEx;
hr = pSampleGrabber_audio->SetMediaType(&pmt2);
After that I use the FFmpeg muxing example to process frames, and I think I only need to change the signal-generating part of the code:
AVFrame *Muxing::get_audio_frame(OutputStream *ost, BYTE *buffer, long length)
{
    // note: the buffer and length parameters are never actually used below
    AVFrame *frame = ost->tmp_frame;
    int buffer_size = av_samples_get_buffer_size(NULL, ost->enc->channels,
                                                 ost->enc->frame_size,
                                                 ost->enc->sample_fmt, 0);
    // uint8_t *sample = (uint8_t *) av_malloc(buffer_size);
    av_samples_alloc(&frame->data[0], frame->linesize, ost->enc->channels,
                     ost->enc->frame_size, ost->enc->sample_fmt, 1);
    avcodec_fill_audio_frame(frame, ost->enc->channels, ost->enc->sample_fmt,
                             frame->data[0], buffer_size, 1);
    frame->pts = ost->next_pts;
    ost->next_pts += frame->nb_samples;
    return frame;
}

The code snippets suggest you are getting AAC data using the Sample Grabber and trying to write it into a file using FFmpeg's libavformat. This can work.
You initialize your Sample Grabber to get audio data in WAVE_FORMAT_AAC_LATM format. This format is not widespread, and you should review your filter graph to make sure the upstream connection on the Sample Grabber is what you expect. There is a chance that a strange chain of filters pretends to produce AAC-LATM while the data is actually invalid (or never even reaches the grabber callback). So you need to review the filter graph (see Loading a Graph From an External Process and Understanding Your DirectShow Filter Graph), then step through your callback with a debugger to make sure you are getting the data and that it makes sense.
Next, you are expected to initialize the AVFormatContext and AVStream to indicate that you will be writing data in AAC-LATM format. The provided code does not show that you are doing this correctly; the sample you are referring to uses default codecs.
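By way of illustration, here is a minimal sketch (mine, not from the question) of that initialization with the old-style API the muxing example uses; the container, codec id, and audio parameters are assumptions you would need to match to your actual grabber output:
AVFormatContext *oc = NULL;
avformat_alloc_output_context2(&oc, NULL, NULL, "out.mp4");
AVStream *st = avformat_new_stream(oc, NULL);
st->codec->codec_type  = AVMEDIA_TYPE_AUDIO;
st->codec->codec_id    = AV_CODEC_ID_AAC_LATM; // or AV_CODEC_ID_AAC, depending on the payload
st->codec->sample_rate = 44100;                // must match the grabber's WAVEFORMATEX
st->codec->channels    = 2;
st->time_base = AVRational{ 1, st->codec->sample_rate };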
Related reading: Support LATM AAC in MP4 container
Then, you need to make sure that the incoming data and your FFmpeg output setup agree on whether the data carries ADTS headers; the provided code does not shed any light on this.
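As a sanity check (my own sketch, not part of the original answer), you can peek at the first bytes of the grabbed buffer: ADTS frames start with a 12-bit 0xFFF sync word, while LATM/LOAS payloads do not:
bool LooksLikeAdts(const BYTE *p, long len)
{
    // ADTS sync word: twelve 1-bits at the start of the frame
    return len >= 2 && p[0] == 0xFF && (p[1] & 0xF0) == 0xF0;
}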
Furthermore, I am afraid you might be preparing your audio data incorrectly. The sample in question generates raw audio data and applies an encoder to produce compressed content using avcodec_encode_audio2; a packet with compressed audio is then sent for writing using av_interleaved_write_frame. The way you attached your code snippets to the question makes me think you are doing it wrong. For starters, you still don't show the relevant code, which makes me think you have trouble identifying exactly which code is relevant. Then, you are treating your AAC data as if it were raw PCM audio in the get_audio_frame code snippet, whereas you should review the FFmpeg sample code with the thought in mind that you already have compressed AAC data; the sample reaches this point after returning from the avcodec_encode_audio2 call. This is where you are supposed to merge your code and the sample.
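To make the merge point concrete, here is a rough sketch (mine, with assumed member names oc, audio_st and next_pts) that bypasses the raw-frame/encoder path and feeds the already-compressed AAC from SampleCB straight to the muxer:
void Muxing::PutAudioFrame(BYTE *buffer, long length)
{
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = buffer;            // already-compressed AAC from SampleCB
    pkt.size = (int)length;
    pkt.stream_index = audio_st->index;
    pkt.pts = pkt.dts = next_pts; // timestamps counted in samples
    next_pts += 1024;             // AAC frame length in samples
    av_packet_rescale_ts(&pkt, AVRational{ 1, audio_st->codec->sample_rate },
                         audio_st->time_base);
    av_interleaved_write_frame(oc, &pkt);
}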

Related

Get composition time C++

I made my own RTMP server using libav and FFmpeg. I receive as input either an FLV file or an RTMP stream "containing" an FLV file.
Since I manipulate the FLV file and the composition time of each frame, I would like to know if there is a way to get this composition time.
I thought that, given my AVPacket, I could analyze the raw buffer to extract the right information, since I know that the FLV header is 11 bytes and that in the next 16 bytes I should find the composition time.
But it doesn't work.
This is a rough example of the code:
AVPacket pkt;
AVFormatContext *ifmt_ctx; // assumed to be opened elsewhere
int ret;
while (true)
{
    AVStream *in_stream, *out_stream;
    ret = av_read_frame(ifmt_ctx, &pkt);
    // get the composition time
}
AVPacket needs to be able to represent the data found in all media formats. Some formats (like MP4 and FLV) have a decode time and a composition time; others (like transport streams) have a decode time and a presentation time. To make it easier for the programmer, AVPacket chose one method of storing the information and converts when needed. Luckily, it is easy to convert back:
auto cts = pkt.pts - pkt.dts;
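For example, a minimal loop (my sketch; ifmt_ctx is assumed to be an already-opened input context) that recovers the composition time offset per packet:
AVPacket pkt;
while (av_read_frame(ifmt_ctx, &pkt) >= 0)
{
    if (pkt.pts != AV_NOPTS_VALUE && pkt.dts != AV_NOPTS_VALUE)
    {
        int64_t cts = pkt.pts - pkt.dts; // composition offset, in stream time_base units
    }
    av_free_packet(&pkt);
}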

FFmpeg C++ API: encode MPEG-TS with KLV data stream

I need to encode an MPEG-TS video using the FFmpeg C++ API. The output video shall have two streams: the first shall be of type AVMEDIA_TYPE_VIDEO; the second shall be of type AVMEDIA_TYPE_DATA and shall contain a set of KLV data.
I have written my own KLV library to manage the KLV format.
However, I'm not able to create a new video "from scratch" by combining the two streams. Following the implementation in FFMPEG C api h.264 encoding / MPEG2 ts streaming problems, I can successfully encode an MPEG-TS video with a single video stream.
However, I'm not able to add a new AVMEDIA_TYPE_DATA stream to the output video: as soon as I add a new data stream using methods like avformat_new_stream(...), neither the data stream nor the video stream is produced, and the output file is empty.
Can anyone suggest a tutorial page or a sample on how to properly add a data stream to my output video in MPEG-TS format?
Thanks a lot!
I was able to get a KLV stream added to a muxed output by starting with the "muxing.c" example that comes with the FFmpeg source and modifying it as follows.
First, I created the AVStream, where "oc" is the AVFormatContext (muxer) variable:
AVStream *klv_stream = avformat_new_stream(oc, NULL);
klv_stream->codec->codec_type = AVMEDIA_TYPE_DATA;
klv_stream->codec->codec_id = AV_CODEC_ID_TIMED_ID3;
klv_stream->time_base = AVRational{ 1, 30 };
klv_stream->id = oc->nb_streams - 1;
Then, during the encoding/muxing loop:
AVPacket pkt;
av_init_packet(&pkt);
pkt.data = (uint8_t*)GetKlv(pkt.size);
auto res = write_frame(oc, &video_st.st->time_base, klv_stream, &pkt);
free(pkt.data);
(The GetKlv() function returns a malloc()'ed array of binary data, which you would replace with whatever you're using to get your encoded KLV. It sets pkt.size to the length of the data.)
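For reference, a hypothetical stand-in for GetKlv() consistent with that contract (the 06 0E 2B 34 prefix is the real SMPTE universal label prefix; everything else here is a placeholder, not real KLV):
#include <cstdlib>
#include <cstring>

uint8_t *GetKlv(int &size)
{
    // 16-byte universal label key, BER short-form length, then the value bytes
    static const uint8_t unit[] = {
        0x06, 0x0E, 0x2B, 0x34, 0x02, 0x0B, 0x01, 0x01,
        0x0E, 0x01, 0x03, 0x01, 0x01, 0x00, 0x00, 0x00,
        0x02,       // length: 2 value bytes follow
        0xDE, 0xAD  // placeholder payload
    };
    size = (int)sizeof(unit);
    uint8_t *data = (uint8_t *)malloc(size);
    memcpy(data, unit, size);
    return data;
}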
With this modification, and specifying a ".ts" target file, I get a three-stream file that plays just fine in VLC. The KLV stream has a stream_type of 0x15, indicating synchronous KLV.
Note the codec_id value of AV_CODEC_ID_TIMED_ID3. According to the libavformat source file "mpegtsenc.c", a value of AV_CODEC_ID_OPUS should result in stream_type 6, for asynchronous KLV (no accompanying PTS or DTS). This is actually important for my application, but I'm unable to get it to work -- the call to avformat_write_header() throws a division by zero error. If I get that figured out, I'll add an update here.

IMediaSample returned by Sample Grabber has unexpected buffer size

I was working on an audio/video capture library for Windows using Media Foundation. However, I faced the issue described in this post with some webcams on Windows 8.1. I have thus decided to make another implementation using DirectShow, to support in my app the webcams whose drivers haven't been updated yet.
The library works quite well, but I have noticed a problem with some webcams for which the sample (IMediaSample) returned does not have the expected size according to the format that was set before starting the camera.
For example, I have a case where the format set has a subtype of MEDIASUBTYPE_RGB24 (3 bytes per pixel) and the frame size is 640x480. The biSizeImage (from BITMAPINFOHEADER) is indeed 640*480*3 = 921600 when applying the format.
The IAMStreamConfig::SetFormat() method succeeds in applying the format.
hr = pStreamConfig->SetFormat(pmt);
I also set the format on the Sample Grabber interface as follows:
hr = pSampleGrabberInterface->SetMediaType(pmt);
I applied the format before starting the graph.
However, in the callback (ISampleGrabberCB::SampleCB), I'm receiving a sample of size 230400, which might be a buffer for a frame of size 320x240 (320*240*3 = 230400).
HRESULT MyClass::SampleCB(double sampleTime, IMediaSample *pSample)
{
    unsigned char *pBuffer = 0;
    HRESULT hr = pSample->GetPointer(&pBuffer);
    if (SUCCEEDED(hr)) {
        long bufSize = pSample->GetSize();
        // bufSize = 230400
    }
    return S_OK;
}
I tried to investigate the media type returned by the IMediaSample::GetMediaType() method, but the media type is NULL, which, according to the documentation of GetMediaType, means that the media type has not changed (so I guess it is still the media type I applied successfully using the IAMStreamConfig::SetFormat() function).
HRESULT hr = pSample->GetMediaType(&pType);
if (SUCCEEDED(hr)) {
    if (pType == NULL) {
        // it enters here => the media type has not changed!
    }
}
Why is the returned sample buffer size not the expected one in this case? How can I solve this issue?
Thanks in advance!
The Sample Grabber callback will always return the "correct" size, in the sense that it matches the actual data size and the formats used in the streaming pipeline.
If you are seeing a mismatch, it means that your filter graph topology differs from what you expect it to be. You need to review the graph (especially using a remote connection from GraphEdit), inspect the media types, and check why the graph was built incorrectly. For example, you could be applying the format of interest after connecting the pins, which is too late.
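As a rough sketch of the right ordering (my illustration; pConfig, pGraph, pCapOut and pGrabberIn are assumed to be obtained elsewhere):
// Set the capture pin format before any connection is made
hr = pConfig->SetFormat(pmt); // IAMStreamConfig on the capture output pin
if (SUCCEEDED(hr))
{
    // Only now connect the capture pin to the Sample Grabber input pin
    hr = pGraph->ConnectDirect(pCapOut, pGrabberIn, NULL);
}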
See also:
How can I reverse engineer a DirectShow graph?

Get encoder name from SinkWriter or ICodecAPI or IMFTransform

I'm using the SinkWriter in order to encode video using Media Foundation.
After I initialize the SinkWriter, I would like to get the underlying encoder it uses and print out its name. (In my case, the encoder is most probably the H.264 Video Encoder included in MF.)
I can get references to the encoder's ICodecAPI and IMFTransform interfaces (using pSinkWriter->GetServiceForStream), but I don't know how to get the encoder's friendly name from those interfaces.
Does anyone know how to get the encoder's friendly name from the SinkWriter? Or from its ICodecAPI or IMFTransform interface?
This is far from an effective solution, and I am not 100% sure it works, but what could be done is:
1) At start-up, enumerate all the codecs that could be used (as I understand it, in this case H.264 encoders) and subscribe to a setting-change event:
MFT_REGISTER_TYPE_INFO TransformationOutput = { MFMediaType_Video, MFVideoFormat_H264 };
DWORD nFlags = MFT_ENUM_FLAG_ALL;
UINT32 nCount = 0;
CLSID *pClsids;
MFTEnum(MFT_CATEGORY_VIDEO_ENCODER, nFlags, NULL, &TransformationOutput, NULL, &pClsids, &nCount);
// OK, here we assume nCount is 1 and we got the MS encoder
ICodecAPI *pMsEncoder;
hr = CoCreateInstance(pClsids[0], NULL, CLSCTX_INPROC_SERVER, __uuidof(ICodecAPI), (void**)&pMsEncoder);
// nCodecIds is supposed to be an array of identifiers to distinguish the sender
hr = pMsEncoder->RegisterForEvent(&CODECAPI_AVEncVideoOutputFrameRate, (LONG_PTR)&nCodecIds[0]);
2) I'm not 100% sure whether the frame-rate setting is also set when the input media type for the stream is set, but in any case you can try to set the same property on the ICodecAPI you retrieved from the SinkWriter. Then, after getting the event, you should be able to identify the codec by comparing lParam1 to the value passed. Still, this is very fragile, since it relies on all the encoders supporting the event notification, and it requires an unneeded parameter change if my hypothesis about the event being generated on stream construction is wrong.
Having an IMFTransform alone does not give you a friendly name for the encoder.
One option is to check the transform's output type and compare it to well-known GUIDs to identify the encoder; in particular, you are going to have a subtype of MFVideoFormat_H264 with the H264 Encoder MFT.
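A minimal sketch of that check (mine, not from the answer; pTransform is the IMFTransform obtained via GetServiceForStream):
IMFMediaType *pType = NULL;
if (SUCCEEDED(pTransform->GetOutputCurrentType(0, &pType))) // stream id 0 assumed
{
    GUID subtype = GUID_NULL;
    pType->GetGUID(MF_MT_SUBTYPE, &subtype);
    if (subtype == MFVideoFormat_H264)
    {
        // very likely an H.264 encoder, e.g. the built-in H264 Encoder MFT
    }
    pType->Release();
}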
Another option is to obtain the CLSID of the encoder (IMFTransform does not get you it, but you might have it otherwise, such as via IMFActivate, by querying the MFT_TRANSFORM_CLSID_Attribute attribute, or via the IPersist* interfaces). Then you could look up the registry for a friendly name, or enumerate transforms and find yours in that list by comparing CLSIDs.
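And a sketch of the CLSID route (again mine; note that the attribute may simply be absent on a given transform):
IMFAttributes *pAttrs = NULL;
GUID clsid = GUID_NULL;
if (SUCCEEDED(pTransform->GetAttributes(&pAttrs)) && pAttrs)
{
    if (SUCCEEDED(pAttrs->GetGUID(MFT_TRANSFORM_CLSID_Attribute, &clsid)))
    {
        // Format the CLSID and read the default value of
        // HKEY_CLASSES_ROOT\CLSID\{...} for a friendly name, or enumerate
        // MFTs with MFTEnumEx and match by CLSID.
    }
    pAttrs->Release();
}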

How to use FFmpeg to play an H.264 stream from NAL units that are stored as video AVPackets

I am writing a client-server system that uses the FFmpeg library to parse an H.264 stream into NAL units on the server side, then uses channel coding to send them over the network to the client side, where my application must be able to play the video.
The question is how to play the received AVPackets (NAL units) in my application as a video stream.
I have found this tutorial helpful and used it as the base for both the server and client sides.
Some sample code or resources related to playing video not from a file, but from data inside the program, using the FFmpeg library would be very helpful.
I am sure that the received information is sufficient to play the video, because I tried saving the received data as a .h264 or .mp4 file and it can be played by VLC.
From what I understand of your question, you have the AVPackets and want to play a video. In reality this is two problems: 1. decoding your packets, and 2. playing the video.
For decoding your packets with FFmpeg, you should take a look at the documentation for AVPacket, AVCodecContext and avcodec_decode_video2 to get some ideas; the general idea is that you want to do something along the lines of the following (just wrote this in the browser, take it with a grain of salt):
// the context; set this up appropriately based on your video (see the links above)
AVCodecContext *decoder_context;
std::vector<AVPacket> packets; // assume this has your packets
...
AVFrame *decoded_frame = av_frame_alloc();
int ret = -1;
int got_frame = 0;
for (AVPacket &packet : packets)
{
    avcodec_get_frame_defaults(decoded_frame);
    ret = avcodec_decode_video2(decoder_context, decoded_frame, &got_frame, &packet);
    if (ret <= 0) {
        // had an error decoding the current packet or couldn't decode the packet
        break;
    }
    if (got_frame)
    {
        // send to whatever video player queue you're using / do whatever with the frame
        ...
    }
    got_frame = 0;
    av_free_packet(&packet);
}
It's a pretty rough sketch, but that's the general idea for your problem of decoding the AVPackets. As for your problem of playing the video, you have many options, which will likely depend more on your clients. What you're asking about is a pretty large problem; I'd advise familiarizing yourself with the FFmpeg documentation and the examples provided at the FFmpeg site. Hope that makes sense.