Custom Media Foundation sink never receives samples - c++

I have my own MediaSink in Windows Media Foundation with one stream. In the OnClockStart method, I instruct the stream to queue (i) MEStreamSinkStarted and (ii) MEStreamSinkRequestSample on itself. For the queue implementation I use an IMFMediaEventQueue, and using the mtrace tool I can also see that someone dequeues the events.
The problem is that ProcessSample on my stream is never actually called. This also means that no further samples are requested, because new requests are only issued after a sample has been processed, as in https://github.com/Microsoft/Windows-classic-samples/tree/master/Samples/DX11VideoRenderer.
Is the described approach the right way to implement the sink? If not, what would be the right way? If so, where could I search for the problem?
Some background info: the sink is an RTSP sink based on live555. Since the latter is also sink-driven, I thought it would be a good idea to queue a MEStreamSinkRequestSample whenever live555 requests more data from me. This works as intended.
However, this solution has the problem that new samples are only requested as long as a client is connected to live555. If I add a tee before the sink, e.g. to show a local preview, the system gets out of control, because the tee accumulates samples on the output towards my sink which are never fetched. I then started experimenting with discardable samples (cf. https://social.msdn.microsoft.com/Forums/sharepoint/en-US/5065a7cd-3c63-43e8-8f70-be777c89b38e/mixing-rate-sink-and-rateless-sink-on-a-tee-node?forum=mediafoundationdevelopment), but the problem is that either the stream does not start, the queues grow, or the frame rate of the faster sink is artificially limited, depending on which side is discardable.
Therefore, the next idea was to rewrite my sink so that it always requests a new sample once it has processed the current one, and puts all samples into a ring buffer for live555, so that whenever clients are connected they can retrieve their data from there, and otherwise the samples are simply discarded. This does not work at all: now my sink does not receive anything, even without the tee.
The observation is: if I just request a lot of samples (as in the original approach), at some point I get data. However, if I request only one (I also tried moderately larger numbers, up to 5), ProcessSample is simply not called, so no subsequent requests can be generated. I send MEStreamSinkStarted once the clock is started or restarted, exactly as described on https://msdn.microsoft.com/en-us/library/windows/desktop/ms701626, and after that I request the first sample. In my understanding, a MEStreamSinkRequestSample should not get lost, so I should get something even on a single request. Is that a misunderstanding? Should I keep requesting until I get something?
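For concreteness, this is roughly what the start/request sequence described above looks like. It is a minimal sketch: MyStreamSink, m_pEventQueue, and Start() are hypothetical names, the event queue being a standard IMFMediaEventQueue created via MFCreateEventQueue.

    #include <mfidl.h>
    #include <mferror.h>

    // Called from the media sink's OnClockStart for this stream.
    HRESULT MyStreamSink::Start(MFTIME /*hnsSystemTime*/)
    {
        // (i) Tell the pipeline the stream sink has started.
        HRESULT hr = m_pEventQueue->QueueEventParamVar(
            MEStreamSinkStarted, GUID_NULL, S_OK, nullptr);

        // (ii) Ask for the first sample; ProcessSample() should follow,
        // and each processed sample queues the next MEStreamSinkRequestSample.
        if (SUCCEEDED(hr))
        {
            hr = m_pEventQueue->QueueEventParamVar(
                MEStreamSinkRequestSample, GUID_NULL, S_OK, nullptr);
        }
        return hr;
    }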

Related

Make GStreamer pipeline drop erroneous buffers

My pipeline splits in the middle to be sent over an unreliable connection. This results in some buffers having bit errors that break the pipeline if I do not account for them. To solve this, I have an appsink that parses buffers for their critical information (timestamps, duration, data, and data size), serializes them, and then sends that over the unreliable channel with a CRC. If the receiving pipeline reads a buffer from the unreliable channel and detects a bit error via the CRC, the buffer is dropped. Most decoders are able to recover fine from a dropped buffer, aside from some temporary visual artifacts.
Is there a GStreamer plugin that does this automatically? I looked into the GDPPay and GDPDepay plugins, which appeared to meet my needs due to their serialization of buffers and inclusion of CRCs for header and payload; however, the plugin assumes that the data is sent over a reliable channel (why this assumption alongside the inclusion of CRCs, I do not know).
I am tempted to take the time to write a plugin, or make a pull request against the GDP plugins, that just drops bad buffers instead of pausing the pipeline with a GST_FLOW_ERROR.
Any suggestions would be greatly appreciated. Ideally it would also be tolerant to either pipeline crashing/restarting. (The plugin also expects the caps information to be the first buffer sent, which in my case I do not need to send, as I have a fixed purpose and can hard-code both ends to know what to expect. This is only a problem if the receiver restarts while the sender is already sending data: the receiver will wait for the caps data that the sender already sent.)
When faced with a similar issue (but for GstEvents), I used a GstProbe. You'll probably need to install it for GST_PAD_PROBE_TYPE_BUFFER and return GST_PAD_PROBE_DROP for the buffers that don't satisfy your conditions. It is easier than writing a plugin, and it is definitely easier to modify (a GstProbe can be created and handled from your own code, so changing the dropping logic is simpler). Caveat: I haven't done it for buffers, but it should be doable; a sketch follows.
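Something along these lines, assuming a validate_crc() helper that implements your application-specific framing (the probe itself uses only standard GStreamer API):

    #include <gst/gst.h>

    static gboolean validate_crc(GstBuffer *buf)
    {
        GstMapInfo info;
        gboolean ok = FALSE;
        if (gst_buffer_map(buf, &info, GST_MAP_READ)) {
            // Application-specific: verify your CRC over info.data / info.size.
            // Accept everything here as a placeholder.
            ok = TRUE;
            gst_buffer_unmap(buf, &info);
        }
        return ok;
    }

    static GstPadProbeReturn drop_bad_buffers(GstPad *pad, GstPadProbeInfo *info,
                                              gpointer user_data)
    {
        GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER(info);
        if (buf != NULL && !validate_crc(buf))
            return GST_PAD_PROBE_DROP;   // silently discard corrupted buffers
        return GST_PAD_PROBE_OK;
    }

    // Attach it to the receiver side, e.g. on the parser's sink pad:
    // gst_pad_add_probe(pad, GST_PAD_PROBE_TYPE_BUFFER,
    //                   drop_bad_buffers, NULL, NULL);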
Let me know if it worked!

Can the mic setup (multiple channels) in 'transcribe_streaming_infinite.py' affect a response's arrival time?

I have tried to use the [transcribe_streaming_infinite.py] module with multiple mics. The first one is built into my PC (MacBook Pro) and the other is an external one (Jabra EVOLVE 20). Through Audio MIDI Setup I made an aggregate device (Jabra on channel #1, Mac on #2).
To use these mics, I modified the code: I set ResumableMicrophoneStream._num_channels to 2, added audio_channel_count=2 and enable_separate_recognition_per_channel=True to the RecognitionConfig, and set the language to ja-JP.
The code at least works (it is able to recognize each channel), but the problem is that in a certain case the responses arrive too late.
The case is when I switch from one mic to the other. For example, when I try to use the mic on channel #1 (Jabra) right after using the mic on channel #2, I do not get the response in time but about 15000 ms later.
When I checked the mics in Audio MIDI Setup, their sample rates were different (16 kHz and 44.1 kHz respectively), so I came up with the hypothesis that this affects how the library (e.g. PyAudio) processes the audio input streams, and that this ultimately causes the late request and response. It may be a silly hypothesis.
So I want to know, as the title says: can this late-response problem be fixed with a good mic setup, or is there another good way of solving this case?
A common cause of latency is the API not detecting the end of the audio; it will then continue to listen and process audio until either the stream is closed directly or the stream's length limit is exceeded. You can avoid this with single_utterance, which indicates whether the request should automatically end after speech is no longer detected. Also, if you are using noise filtering, it should be removed so that Cloud sees the raw audio and can properly detect isFinal. A sketch of the relevant configuration follows.
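The same fields exist across the client libraries; as a sketch, here they are using the C++ protos (google-cloud-cpp assumed, the helper name is illustrative):

    #include <google/cloud/speech/v1/cloud_speech.pb.h>

    google::cloud::speech::v1::StreamingRecognitionConfig MakeConfig()
    {
        google::cloud::speech::v1::StreamingRecognitionConfig streaming_config;
        auto& config = *streaming_config.mutable_config();
        config.set_language_code("ja-JP");
        config.set_sample_rate_hertz(16000);  // both mics should feed one rate
        config.set_audio_channel_count(2);
        config.set_enable_separate_recognition_per_channel(true);
        // End the request automatically once speech stops, reducing latency:
        streaming_config.set_single_utterance(true);
        return streaming_config;
    }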
If the latency issue only occurs when mics are changed and you are following the best practices, you can reach the STT team through the public issue tracker.

GStreamer: access video before an event

I have a piece of software that performs some video analysis as soon as an event (alarm) happens.
Since I do not have enough space on my embedded board, I should start recording the video only when an alarm happens.
The algorithm works on a video stored offline (it is not a real-time algorithm), so the video has to be stored; it does not suffice to attach to the video stream.
At present I am able to attach to the video and store it as soon as I detect the alarm condition.
However I would like to analyze the data 10 seconds before the event happens.
Is it possible to pre-record up to 10 seconds as a FIFO queue, without storing the whole stream on disk?
I found something similar to my requirements here:
https://developer.ridgerun.com/wiki/index.php/GStreamer_pre-record_element#Video_pre-recording_example
but I would like to know whether there is some way to achieve the same result without using the RidgeRun tool.
Best regards
Giovanni
I think I mixed up my ideas, and both of them seem to be similar.
What I suggest is the following:
Have an element that behaves like a ring buffer, through which you can reach back in time. A good example to try out might be the queue element; have a look at time-shift buffering.
Then store the contents to a file on alarm, and use another pipeline that reads from it. For example, use tee or output-selector, roughly like this:
                          |-> ring-buffer
src -> output-selector -> |
                          |-> (on alarm) ring-buffer + live-src -> file-sink
From your question I understand that your src might be a live camera, which makes this tricky. You might well have to implement your own plugin, as the RidgeRun team did; otherwise this solution is more of a hack than a clean design. Sadly there aren't many references for such a solution, so you may have to experiment. A rough sketch of the leaky-queue variant follows.
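For illustration only, and using stock elements: the assumption here is that a leaky queue capped at about 10 s can act as the ring buffer, with a blocking pad probe keeping the file branch closed until the alarm fires. Source, encoder, and muxer are placeholders for your setup, not a tested recipe.

    #include <gst/gst.h>

    static GstPadProbeReturn keep_blocked(GstPad *pad, GstPadProbeInfo *info,
                                          gpointer user_data)
    {
        return GST_PAD_PROBE_OK;  // stay blocked until the probe is removed
    }

    int main(int argc, char **argv)
    {
        gst_init(&argc, &argv);

        GError *err = NULL;
        // matroskamux because a truncated MP4 without a clean EOS is unplayable.
        GstElement *pipeline = gst_parse_launch(
            "v4l2src ! videoconvert ! x264enc tune=zerolatency ! h264parse "
            "! queue name=ring max-size-time=10000000000 "   /* ~10 s window */
            "  max-size-buffers=0 max-size-bytes=0 leaky=downstream "
            "! matroskamux ! filesink location=alarm.mkv", &err);
        if (pipeline == NULL) {
            g_printerr("parse error: %s\n", err->message);
            return 1;
        }

        // Block the ring queue's output: the queue fills up and, being leaky
        // downstream, drops its oldest data, so it always holds the most
        // recent ~10 s of the stream.
        GstElement *ring = gst_bin_get_by_name(GST_BIN(pipeline), "ring");
        GstPad *src = gst_element_get_static_pad(ring, "src");
        gulong block_id = gst_pad_add_probe(
            src, GST_PAD_PROBE_TYPE_BLOCK_DOWNSTREAM, keep_blocked, NULL, NULL);

        gst_element_set_state(pipeline, GST_STATE_PLAYING);

        // ... when the alarm fires, unblock: the buffered ~10 s flush to the
        // file first, followed by live data:
        // gst_pad_remove_probe(src, block_id);

        // run a main loop, send EOS, and clean up (omitted for brevity)
        (void)block_id;
        return 0;
    }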

Using the FFmpeg libs to UDP-stream MPEG-2 TS video: delay / initial connection problems

I am currently using the libs from FFmpeg to stream MPEG-2 TS (H.264-encoded) video. The streaming is done via UDP multicast.
I am currently facing two main issues. There is a long initial connection time before the video shows (the stream also contains metadata, and that stream is detected by my media tool immediately).
Once the video gets going, things are fine, but it is always delayed by that initial connection time.
I am trying to get as near to LIVE streaming as possible.
I am currently using the av_dict_set(&dict, "tune", "zerolatency", 0) and "profile" -> "baseline" options.
GOP size = 12;
At first I thought it was an I-frame issue, but the initial delay is there whether the GOP size is 12 or the default 250. Sometimes the video connects quickly but is immediately dropped; the delay occurs, then it starts back up and is good from that point on.
According to the documentation, the zerolatency option should send many I-frames to limit initial syncing delays.
I am starting to think it's a buffering issue: when I close the application and leave the media player up, it fast-forwards through the delay until it reaches basically the point where the file stops streaming.
So while I don't completely understand what was wrong, I at least fixed the problem I was having.
The issue came from using av_interleaved_write_frame() instead of the regular av_write_frame() (the latter works for live streaming) when writing out the video frames. I'll have to dig into the differences a bit more to fully understand them, but it's funny how you sometimes figure out the problem you've been having on a total whim after bashing your face against it for days.
I can get pretty good live-ish video streaming with the "zerolatency" tune option set. A sketch of both pieces is below.
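The function names are real FFmpeg API; the surrounding structure is illustrative, with setup/teardown and error handling omitted:

    extern "C" {
    #include <libavcodec/avcodec.h>
    #include <libavformat/avformat.h>
    #include <libavutil/dict.h>
    }

    void open_encoder(AVCodecContext *enc, const AVCodec *codec)
    {
        AVDictionary *opts = nullptr;
        av_dict_set(&opts, "tune", "zerolatency", 0);  // disables x264 lookahead
        av_dict_set(&opts, "profile", "baseline", 0);  // no B-frames
        enc->gop_size = 12;                            // frequent I-frames
        avcodec_open2(enc, codec, &opts);
        av_dict_free(&opts);
    }

    void write_packet(AVFormatContext *fmt, AVPacket *pkt)
    {
        // av_interleaved_write_frame() buffers packets internally so it can
        // interleave multiple streams, which adds latency. With a single live
        // stream going out over UDP, av_write_frame() sends each packet
        // immediately.
        av_write_frame(fmt, pkt);
    }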

Playing an incoming video stream

I am writing an application which is a kind of video streamer. The client receives a video stream over a UDP socket. As I receive the stream, I want to play it simultaneously. This is different from playing a local video file on your hard disk, which can be as simple as running the file with system("vlc filename"). Here, many issues are involved: there can be delays in receiving, and the player will have to wait for incoming data. I have come to know that VLC can play a video stream. Can you please elaborate the steps for playing the stream using VLC? I am implementing my application in C++.
EDIT: Can somebody give me some idea regarding the VLC API, which could be used to stream a given video to a particular destination, receive that stream at the other end, and play it?
with regards,
Mawia
Well, you can always take a look at VideoLAN's own homepage.
Other than that, streaming is quite straightforward:
Decide on a video codec that supports streaming (OK, obvious and probably already done).
Choose appropriate packet size.
Choose appropriate video quality.
At the client side: pre-buffer at least 2 secs of video and audio.
Numbers 2 and 3 sound strange, but they are worth thinking about:
If you have a broadband connection, you can afford to pump big packets over to the client. Note: packets here means consistent units of data that the client needs to receive completely before it can decode the next bit of video. If you send big packets, say 4 seconds of video, you risk lag while waiting for the complete data unit of, well, a full 4 seconds, whereas small 0.5-second packets would get you laggy but still recognizable and relatively fluent video on a bad connection.
The same goes for quality. Pixelated and artifact-ridden video is bad; stuttering video and desyncing sound are worse. Rather, switch down to a lower quality / higher compression setting.
If your question is purely about the getting-it-done part, points 1 and 4 should do for you.
You might ask:
"If I want to do real time live video?"
All of the advice above still applies, but all of it has to be done smarter. First things first: you cannot do real time over bad connections; that is just reality. If your connection is fat enough, you can get close to real time: just pump each image and a small sound sample out without much processing or any buffering at all. It is possible to get a good client experience from that, but connections like that are highly unlikely. The trick here usually is to transmit a video quality slightly lower than the connection would allow in theory, and still wiggle caching and packet reordering in there... have fun. It is hard.
Unfortunately, really the only API VLC has is the command line or the equivalent of the command line (you can start player instances, passing them essentially what you would pass on the command line). You can use libvlc if you need multiple instances or callbacks, but it's still pretty opaque... A minimal example is below.
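A minimal libvlc sketch for playing an incoming UDP stream from C++, assuming libvlc 3.x (link with -lvlc); the MRL and port are examples:

    #include <vlc/vlc.h>
    #include <chrono>
    #include <thread>

    int main()
    {
        libvlc_instance_t *vlc = libvlc_new(0, nullptr);
        if (vlc == nullptr) return 1;

        // MRL for a UDP stream arriving on port 1234.
        libvlc_media_t *media = libvlc_media_new_location(vlc, "udp://@:1234");
        libvlc_media_player_t *player =
            libvlc_media_player_new_from_media(media);
        libvlc_media_release(media);  // the player keeps its own reference

        libvlc_media_player_play(player);
        std::this_thread::sleep_for(std::chrono::minutes(10));  // keep playing

        libvlc_media_player_stop(player);
        libvlc_media_player_release(player);
        libvlc_release(vlc);
        return 0;
    }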