C++ FFmpeg: how to continue encoding after flushing?

I write the received packets to binary files. When recording of the first file is complete, I flush the encoder:
avcodec_send_frame(context, NULL);
This is the signal to end the stream. But when I then send a new frame to the encoder, the function returns AVERROR_EOF (per the docs: the encoder has been flushed, and no new frames can be sent to it). What can I do to make the encoder accept frames again after flushing?
Example: when decoding, you can call:
avcodec_flush_buffers(context);
This function lets you switch to a new stream, but it only works for decoding.
Is there an analogous function for encoding?
Ideas:
1) Do not flush at all. But the encoder buffers frames internally and only emits some packets after flushing (I use H.264 with B-frames), so some packets would end up in the next file.
2) Recreate the codec context?
Details: Windows 7, Qt 5.10, FFmpeg 4.0.2.

The correct answer is that you should create a new codec context for each file, or headaches will follow. The extra expense of additional headers and key frames should be small unless you are doing something very exotic.
B-frames can refer to both previous and future frames; how would you even split such a beast across files?
In theory you could probably force a keyframe and hope for the best, but then there is really no point in not starting a new context, unless the hundred or so bytes of H.264 init data are a problem.
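For illustration, here is a minimal sketch of that per-file approach using the send/receive API, assuming a caller-supplied write_packet() helper that appends packets to the current output file (that helper is not part of FFmpeg): drain the old context, free it, and open a fresh one for the next file (grab the old settings before freeing, or keep a separate template context).

extern "C" {
#include <libavcodec/avcodec.h>
}

// Drain all packets still buffered in the encoder, then destroy the context.
static void drain_and_close(AVCodecContext *&ctx, void (*write_packet)(const AVPacket *))
{
    AVPacket *pkt = av_packet_alloc();
    avcodec_send_frame(ctx, nullptr);               // enter draining mode
    while (avcodec_receive_packet(ctx, pkt) == 0) {
        write_packet(pkt);                          // last packets of this file
        av_packet_unref(pkt);
    }
    av_packet_free(&pkt);
    avcodec_free_context(&ctx);                     // the flushed context is of no further use
}

// Open a fresh context for the next file, copying the settings of the previous one.
static AVCodecContext *open_next(const AVCodecContext *tmpl)
{
    const AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_H264);
    AVCodecContext *ctx = avcodec_alloc_context3(codec);
    ctx->width        = tmpl->width;
    ctx->height       = tmpl->height;
    ctx->pix_fmt      = tmpl->pix_fmt;
    ctx->time_base    = tmpl->time_base;
    ctx->gop_size     = tmpl->gop_size;
    ctx->max_b_frames = tmpl->max_b_frames;
    avcodec_open2(ctx, codec, nullptr);             // check the return value in real code
    return ctx;
}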

Related

Slow motion effect when decoding OPUS audio stream

I'm capturing the audio stream of a voice chat program (it is proprietary, closed-source and I have no control over it) which is encoded with the OPUS Codec, and I want to decode it into raw PCM audio (Opus Decoder doc).
What I'm doing is:
Create an OPUS decoder: opusDecoder = opus_decoder_create(48000, 1, &opusResult);
Decode the stream: opusResult = opus_decode(opusDecoder, voicePacketBuffer, voicePacketLength, pcm, 9600, 0);
Save it to a file: pcmFile.write(pcm, opusResult * sizeof(opus_int16));
Read the file with Audacity (File > Import > Raw Data...)
Here comes the problem: sometimes it works perfectly well (I can hear the decoded PCM audio without glitch and with the original speed) but sometimes, the decoded audio stream is in "slow motion" (sometimes a little slower than normal, sometimes much slower).
I can't figure out why, because I don't change my program: the decoding settings remain the same. Yet, sometimes it works, sometimes it doesn't. Also, opus_decode() is always able to decode the data; it doesn't return an error code.
I read that the decoder has a "state" (opus_decoder_ctl() doc). I thought maybe the time between opus_decode() calls is important?
Can you think of any parameter, be it explicit (like the parameters given to the functions) or implicit (time between two function calls), that might cause this effect?
"Slow motion" audio is almost always mismatch of sampling rate (recorded on high rate but played in low rate). For example if you record audio on 48kHz but play it as 8kHz.
Another possible reason of "slow motion" is more than one stream decoded by the same decoder. But in this case you also get distorted slow audio.
As for OPUS:
It always decodes at the rate you specified in the create parameters.
Internally it is pure math (no timers or other realtime-related things), so it does not matter when you call the decode function.
So, some troubleshooting advice:
Make sure you do not create decoders with different sampling rates
Make sure that when you import the raw file into Audacity you always import it as 48 kHz mono
If none of the above helps, check how many samples you receive from the decoder for each packet in the normal and slow-motion cases. For normal audio streams (with uniform inter-packet timing) you always get the same number of raw audio samples per packet.
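As a minimal sketch of that last check (next_packet() is a hypothetical source yielding the captured Opus packets), the decoder is created once at 48 kHz mono and the per-packet sample count is printed, so a sudden change between the normal and slow-motion cases is easy to spot:

#include <opus/opus.h>
#include <cstdio>
#include <vector>

bool next_packet(const unsigned char **data, int *len);      // hypothetical packet source

int main()
{
    int err = 0;
    OpusDecoder *dec = opus_decoder_create(48000, 1, &err);   // one rate, one channel, one decoder
    if (err != OPUS_OK) return 1;

    std::vector<opus_int16> pcm(5760);                        // max Opus frame at 48 kHz (120 ms)
    const unsigned char *packet = nullptr;
    int packetLen = 0;
    while (next_packet(&packet, &packetLen)) {
        int samples = opus_decode(dec, packet, packetLen, pcm.data(), (int)pcm.size(), 0);
        if (samples < 0) break;                               // decode error
        std::printf("packet: %d bytes -> %d samples\n", packetLen, samples);
        // write pcm.data(), samples * sizeof(opus_int16) bytes, to the raw PCM file
    }
    opus_decoder_destroy(dec);
    return 0;
}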

h.264 I-frame loss handling in rtsp streaming

I am developing a player which opens an RTSP stream using Live555 and uses FFmpeg to decode the video stream. I am stuck at a point where an IDR frame gets lost over the network, so that after decoding the B/P frames that follow it, the video shows a jittering effect. It makes the video quality very bad.
So my question is: how can I handle I-frame packet loss? I would like to know if there is any strategy/algorithm to handle packet loss, so that the video stays smooth and clear.
Any help will be appreciated.
Thank You.
If this is a first approach, I guess you decode frames synchronously, i.e. the Live555 afterGetting callback directly calls FFmpeg's avcodec_decode_video2.
In that case the receiving socket is not read while decoding, so packets are buffered until the buffer overflows.
You can try different workarounds like increasing the socket buffer or using RTP over TCP, but a real solution needs to be more asynchronous: for instance, afterGetting can push data to a FIFO and the decoding thread can pull from it, as sketched below.
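A rough sketch of that idea, assuming the frames arrive via the afterGetting callback (the queue type and the on_frame_received()/decode_loop() names are made up for illustration): the network callback only enqueues and returns, while decoding runs in its own thread at its own pace.

#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <vector>

struct PacketQueue {
    std::queue<std::vector<uint8_t>> q;
    std::mutex m;
    std::condition_variable cv;

    void push(const uint8_t *data, size_t size) {
        { std::lock_guard<std::mutex> lock(m); q.emplace(data, data + size); }
        cv.notify_one();
    }
    std::vector<uint8_t> pop() {                 // blocks the decoder thread, never the network thread
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return !q.empty(); });
        std::vector<uint8_t> pkt = std::move(q.front());
        q.pop();
        return pkt;
    }
};

PacketQueue g_queue;

// Called from Live555's afterGetting: just enqueue, never decode here.
void on_frame_received(const uint8_t *data, unsigned size) {
    g_queue.push(data, size);
}

// Decoder thread: pull packets and feed them to FFmpeg at its own pace.
void decode_loop() {
    for (;;) {
        std::vector<uint8_t> pkt = g_queue.pop();
        // feed pkt to the decoder here (e.g. avcodec_send_packet / avcodec_receive_frame)
    }
}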
Well, once an I-frame is lost, it's lost; you can't really do anything on the client side. The only way we could attack this problem was to configure the server (i.e. the streamer) so that it sends I-frames either more frequently (MORE I-frames in the stream) or less frequently (FEWER I-frames in the stream); if you use ffmpeg/libx264, when to send I-frames can be fine-tuned to an incredible level of precision.
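For what it's worth, a small sketch of tuning the I-frame interval on the encoder side through FFmpeg's libx264 wrapper (the numbers are placeholders, not recommendations):

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/opt.h>
}

// Assumes enc was allocated for the libx264 encoder, e.g. via avcodec_find_encoder_by_name("libx264").
void configure_keyframes(AVCodecContext *enc)
{
    enc->gop_size = 30;                                  // at most 30 frames between I-frames
    // Force a fixed GOP via x264's own options: keyint == min-keyint, scene-cut detection off.
    av_opt_set(enc->priv_data, "x264-params", "keyint=30:min-keyint=30:scenecut=0", 0);
}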

How can arbitrary frames be discarded from an MP4 video using Media Foundation?

I am currently trying to implement an algorithm which can quickly discard unwanted frames from an MP4 video when encoding to another MP4 (using Media Foundation).
The encoding part doesn't seem too bad: the "Source Reader plus Sink Writer" approach is nice and fast. You basically just have to create an IMFSourceReader and an IMFSinkWriter, set the source's native media type on the writer, yada, yada, yada, and just loop: source.ReadSample(&sample) --> writer.WriteSample(&sample). The WriteSample() calls can be conditioned on whether the sample is to be discarded or not.
That naive approach is no good once you consider that the samples read will often be "predicted frames", a.k.a. P-frames, in the H.264 encoded stream. Dropping any preceding "intra-coded picture frame" (I-frame or key frame) will result in garbled video.
So, my question is, is it possible to introduce an I-frame (somehow) into the sink writer before resuming the sample writing in the sink writer?
Doing something with the MFSampleExtension_CleanPoint attribute doesn't seem to help. I could manually create an IMFSample (via MFCreateSample), but getting it in the right H.264 format might be tricky.
Any ideas? Or thoughts on other approaches to dropping frames during encoding?
I think this is not possible without re-encoding the video! The references between P- and I-frames are in the H.264 bitstream, not in the container (MP4). You can only safely skip frames which are not referenced by other frames:
the last P-frames of a GOP (before the next I-frame)
B-frames
Normally these frames are not referenced, but they can be! That depends on the encoder settings used to create the H.264 stream.
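To make that concrete, here is a rough sketch of the Source Reader / Sink Writer loop restricted to the only cut that is safe without re-encoding: the keep/drop decision changes only at samples marked as clean points (key frames), so whole GOPs are dropped and every emitted P/B-frame keeps its references. ShouldDiscard() is a hypothetical predicate describing the unwanted range.

#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>

bool ShouldDiscard(LONGLONG timestamp);   // hypothetical: which part of the timeline to drop

HRESULT CopyWithGopDrops(IMFSourceReader *reader, IMFSinkWriter *writer, DWORD outStream)
{
    bool writing = true;
    for (;;) {
        DWORD streamIndex = 0, flags = 0;
        LONGLONG timestamp = 0;
        IMFSample *sample = nullptr;
        HRESULT hr = reader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM,
                                        0, &streamIndex, &flags, &timestamp, &sample);
        if (FAILED(hr)) return hr;
        if (flags & MF_SOURCE_READERF_ENDOFSTREAM) break;
        if (!sample) continue;

        UINT32 cleanPoint = 0;
        sample->GetUINT32(MFSampleExtension_CleanPoint, &cleanPoint);

        // Only flip the keep/drop decision at a key frame.
        if (cleanPoint)
            writing = !ShouldDiscard(timestamp);

        if (writing)
            writer->WriteSample(outStream, sample);
        sample->Release();
    }
    return S_OK;
}

Note that dropped GOPs leave gaps in the timestamps, which you may want to rebase before writing.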

MPEG4 out of Raw RTP Payload

Okay I got the following problem:
I have an IP Camera which is able to stream MPEG4 data over RTP
I am able to connect to this camera via RTSP
I can receive the raw RTP data.
So what problems do I have now?
1. Extract Data
What is the data I actually want? I know that I have to strip the RTP header, but is there anything else I need to cut from the RTP packets?
2. Packetization Mode
I read that I should expect a Packetization Mode field in my SDP; well, it's not there. Does that mean I have to assume some kind of default packetization mode?
3. Depacketization
If I got it right, I need to buffer all incoming packets with the marker bit = false until I get a packet with the marker bit = true, to obtain a complete MPEG4 frame. What exactly should I understand by "MPEG4 frame"? A keyframe plus the data until the next keyframe?
4. Decode
Do I have to decode the data any further then? In other threads I saw that people used a decoder afterwards, but what is there left to decode? I mean, the camera should already send the data MPEG4-coded?
5. Libraries
If I really need to decode the data, are there any open libraries I could use for that? Or maybe there is even a library with functions where I can just dump my RTP data, magic happens, and I get my mp4. (But I assume there is nothing like that...)
Note: Everything I want to do should be part of my own application, meaning for example, I can't use an external software to parse the data.
Well, long story short: I really need some kind of step-by-step explanation for this. I know this is a broad question, but I can't get any further on my own. I also looked into the RFCs, but I couldn't extract much information from them.
Also I already looked up these two Questions:
How to process raw UDP packets so that they can be decoded by a decoder filter in a directshow source filter
MPEG4 extract from RTP payload
But also the long answer from the first question could not make everything clear to me.
UPDATE: Well, I read up a bit further and now I don't know where to look anymore. It seems that all the packetization stuff etc. is actually not needed for my purpose. I also recorded a stream with openRTSP. When I open those files in a hex editor, I see 16 bytes which I can't identify, followed by the config part from the SDP. Then the frame starts with the usual 00 00 01 B6. Also, openRTSP adds some kind of tail to the MP4. I actually don't know what I really need and what is just "extra" stuff that isn't mandatory.
I know that I have to strip the RTP header, but is there anything else I need to cut from the RTP packets?
The RTP packets might carry data packed from a file format (such as MP4), or they might carry the payload directly, packetized based on RFC 3640 or something similar. You need to find that out.
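As a small sketch of the header-stripping part (RFC 3550): the fixed RTP header is 12 bytes, plus 4 bytes per CSRC entry and an optional header extension if the X bit is set; what follows is the payload. Padding handling is omitted here.

#include <cstddef>
#include <cstdint>

// Returns a pointer to the RTP payload inside pkt, or nullptr on a malformed packet.
const uint8_t *rtp_payload(const uint8_t *pkt, size_t len, size_t *payloadLen, bool *marker)
{
    if (len < 12) return nullptr;
    unsigned cc = pkt[0] & 0x0F;              // CSRC count
    bool x      = (pkt[0] & 0x10) != 0;       // extension bit
    *marker     = (pkt[1] & 0x80) != 0;       // marker bit: last packet of this frame
    size_t off  = 12 + 4 * cc;
    if (x) {                                   // optional header extension
        if (len < off + 4) return nullptr;
        unsigned extWords = (pkt[off + 2] << 8) | pkt[off + 3];
        off += 4 + 4 * extWords;
    }
    if (off > len) return nullptr;
    *payloadLen = len - off;
    return pkt + off;
}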
What exactly should I understand by "MPEG4 frame"? A keyframe plus the data until the next keyframe? Do I have to decode the data any further then? In other threads I saw that people used a decoder afterwards, but what is there left to decode? I mean, the camera should already send the data MPEG4-coded?
You should explore the basics of MPEG compression to appreciate this fully. Depacketization only gives you a string of bits. That is compressed data; you need to decompress (decode) it to see it on the screen.
are there any open libraries I could use for that?
try ffmpeg or MPEG4IP
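For the decoding step, a minimal sketch with FFmpeg's libavcodec, assuming the depacketized data is fed in one complete MPEG-4 frame at a time (decoded_frame() is a hypothetical consumer; a real player would convert and display the picture):

extern "C" {
#include <libavcodec/avcodec.h>
}

void decoded_frame(const AVFrame *frame);     // hypothetical consumer of the raw pictures

AVCodecContext *open_mpeg4_decoder()
{
    const AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_MPEG4);
    AVCodecContext *ctx = avcodec_alloc_context3(codec);
    // The "config" part from the SDP (VOL/decoder config headers) can be copied into
    // ctx->extradata before avcodec_open2() if the stream itself does not repeat it.
    avcodec_open2(ctx, codec, nullptr);        // check the return value in real code
    return ctx;
}

void decode_mpeg4(AVCodecContext *ctx, AVFrame *frame, AVPacket *pkt, const uint8_t *data, int size)
{
    pkt->data = const_cast<uint8_t *>(data);   // one complete depacketized frame
    pkt->size = size;
    if (avcodec_send_packet(ctx, pkt) < 0) return;
    while (avcodec_receive_frame(ctx, frame) == 0)
        decoded_frame(frame);                  // raw YUV picture, ready for conversion/display
}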

Service a live OpenCV H.264 stream through Live555 on Windows

Totally new to this! As the title says, I'm trying to serve a stream from OpenCV through Live555 using H.264 that is captured from a webcam.
I've tried something like:
#define LOCALADDRESS "rtsp://localhost:8081" // Address media is served
#define FOURCCCODEC CV_FOURCC('H','2','6','4') // H.264 codec
#define FPS 25 // Frame rate things run at
m_writer = cvCreateVideoWriter(LOCALADDRESS, FOURCCCODEC, FPS, cvSize(VIDEOWIDTH, VIDEOHEIGHT));
as reading an RTSP stream is done similarly:
CvCapture *capture = cvCreateFileCapture(LOCALADDRESS);
which doesn't work, so I'm turning to Live555. How do I feed a CvCapture, encoded in H.264, to be served by Live555? There doesn't seem to be a straightforward way to pass a bytestream from one to the other, or perhaps I'm missing something.
There really isn't a straight-forward way I know of; certainly nothing that will happen in anything less than a few hundred lines of code.
I'm assuming you want to use an on-demand RTSP server (this is where the server's just sitting there, waiting for a client to connect, and then it starts streaming when the client establishes a connection and makes a request)? If so, this item in the Live555 FAQ applies.
However, Live555 is a weird (possibly misguided?) library, so it's unfortunately a bit more complicated than that. Live555 uses a single thread of operation with an event loop, so what you'll have to do is shove your raw bytestream into a buffer or queue, and then in your subsession class for streaming H.264, you'll check and see if there's available data in the queue and if so, pass it along. If not, schedule another check in a few milliseconds. You'll also need to strip off any NALU identifiers before you pass them along to live555.
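A rough sketch of that pattern, modeled on Live555's DeviceSource example (try_pop_nal() stands in for whatever thread-safe queue your capture/encode side fills with H.264 NAL units, start codes already removed):

#include "FramedSource.hh"
#include "GroupsockHelper.hh"   // provides gettimeofday() on Windows
#include <cstdint>
#include <cstring>
#include <vector>

bool try_pop_nal(std::vector<uint8_t> &out);   // hypothetical non-blocking queue accessor

class QueueSource : public FramedSource {
public:
    QueueSource(UsageEnvironment &env) : FramedSource(env) {}

protected:
    void doGetNextFrame() override {
        std::vector<uint8_t> nal;
        if (!try_pop_nal(nal)) {
            // Nothing available yet: poll again in ~5 ms without blocking the event loop.
            envir().taskScheduler().scheduleDelayedTask(5000, retry, this);
            return;
        }
        fFrameSize = nal.size() > fMaxSize ? fMaxSize : (unsigned)nal.size();
        fNumTruncatedBytes = (unsigned)nal.size() - fFrameSize;
        std::memcpy(fTo, nal.data(), fFrameSize);
        gettimeofday(&fPresentationTime, nullptr);
        FramedSource::afterGetting(this);       // hand the NAL unit to the sink
    }

    static void retry(void *clientData) {
        ((QueueSource *)clientData)->doGetNextFrame();
    }
};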