Increasing the update rate of gst_element_query_position in GStreamer

I am busy creating a wrapper library (for node.js) around GStreamer. I have a working player, and I am using an interval to request the pipeline position (with percent formatting) every 200ms.
My issue, however, is that I only receive an updated value every 1000ms.
I am calling gst_element_query_position (and have also tried querying the pad with gst_pad_query_position, with the same result).
Is there any way to increase the update rate of the value retrieved by gst_element_query_position in order to present more granular time information?

When calling gst_element_query_position, use GST_FORMAT_TIME over GST_FORMAT_PERCENT. GST_FORMAT_PERCENT is basically deprecated.
http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/gstreamer-GstFormat.html#GstFormat
We use the position in combination with the duration (from gst_element_query_duration) and it seems to work fine for more granular usage.
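
Roughly, the poll against the C API could look like the sketch below (a sketch only, with hypothetical names; the node.js wrapper would surface the same values to JavaScript):

    #include <gst/gst.h>

    /* Called every 200 ms from the GLib main loop; user_data is the pipeline. */
    static gboolean poll_position(gpointer user_data)
    {
        GstElement *pipeline = GST_ELEMENT(user_data);
        gint64 pos = 0, dur = 0;

        if (gst_element_query_position(pipeline, GST_FORMAT_TIME, &pos) &&
            gst_element_query_duration(pipeline, GST_FORMAT_TIME, &dur) &&
            dur > 0) {
            gdouble percent = 100.0 * (gdouble) pos / (gdouble) dur;
            g_print("%" GST_TIME_FORMAT " / %" GST_TIME_FORMAT " (%.2f%%)\n",
                    GST_TIME_ARGS(pos), GST_TIME_ARGS(dur), percent);
        }
        return G_SOURCE_CONTINUE;   /* keep the timeout running */
    }

    /* After the pipeline reaches PLAYING: */
    /* g_timeout_add(200, poll_position, pipeline); */

The duration rarely changes, so it can also be cached and only re-queried when a duration-changed message arrives on the bus.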

Related

Can the mic setup (multiple channels) in 'transcribe_streaming_infinite.py' affect how long responses take to arrive?

I have been trying to use the transcribe_streaming_infinite.py module with multiple mics. The first is the built-in mic of my PC (MacBook Pro) and the other is an external one (Jabra EVOLVE 20). Through Audio MIDI Setup I created an aggregate device (Jabra on channel #1, the Mac mic on channel #2).
To use these mics I modified the code: I set ResumableMicrophoneStream._num_channels to 2 and added two extra lines to the RecognitionConfig, audio_channel_count=2 and enable_separate_recognition_per_channel=True. The language is ja-JP.
When I tried it, the code does at least work (it recognizes each channel), but the problem is that in one particular case the responses arrive too late.
That case is when I switch from one mic to the other. For example, when I try to use the mic on channel #1 (Jabra) right after using the mic on channel #2, I do not get the response in time but only about 15000 ms later.
When I checked the mics in Audio MIDI Setup, their sample rates were different (16 kHz and 44.1 kHz respectively), so one possibility I came up with is that this affects the library processing the audio input streams (PyAudio) and ultimately causes the late requests and responses. It is probably a naive hypothesis.
So, as the title says, I want to know whether this problem (the late responses) can be fixed with a better mic setup, or whether there is another good way to solve this case.
A common cause of latency is the API not detecting the end of the audio, so it continues to listen and process audio until either the stream is closed directly or the stream's length limit has been exceeded. You can avoid this by using single_utterance, which indicates whether the request should automatically end after speech is no longer detected. Also, if you are using noise filtering, it should be removed so that Cloud sees the raw audio and can properly detect isFinal.
If the latency issue only occurs when the mics are changed and you are following the best practices, you can reach the STT team through the public issue tracker.
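
The question's code is Python, but purely as an illustration of the relevant v1 request fields (the header path and helper name below are assumptions, not part of the sample), the configuration described above corresponds roughly to:

    #include <google/cloud/speech/v1/cloud_speech.pb.h>  // assumed generated-proto path

    namespace speech = google::cloud::speech::v1;

    speech::StreamingRecognitionConfig MakeStreamingConfig() {
        speech::RecognitionConfig config;
        config.set_encoding(speech::RecognitionConfig::LINEAR16);
        config.set_sample_rate_hertz(16000);
        config.set_language_code("ja-JP");
        config.set_audio_channel_count(2);
        config.set_enable_separate_recognition_per_channel(true);

        speech::StreamingRecognitionConfig streaming_config;
        *streaming_config.mutable_config() = config;
        // End the request automatically once speech is no longer detected,
        // instead of waiting for the stream to close or hit its length limit.
        streaming_config.set_single_utterance(true);
        streaming_config.set_interim_results(true);
        return streaming_config;
    }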

Flink RocksDB Performance issues

I have a Flink job (Scala) that basically reads from a Kafka topic (Kafka 1.0), aggregates data (a 1-minute event-time tumbling window, using a fold function, which I know is deprecated but is easier to implement than an aggregate function), and writes the result to 2 different Kafka topics.
The question is: when I'm using the FS state backend, everything runs smoothly, checkpoints take 1-2 seconds, with an average state size of 200 MB - that is, until the state size increases (while closing a gap, for example).
I figured I would try RocksDB (over HDFS) for checkpoints, but the throughput is SIGNIFICANTLY lower than with the FS state backend. As I understand it, Flink does not need to serialize/deserialize on every state access when using the FS state backend, because the state is kept in memory (on the heap), whereas RocksDB DOES, and I guess that accounts for the slowdown (and the backpressure, and checkpoints taking MUCH longer, sometimes timing out after 10 minutes).
Still, there are times when the state cannot fit in memory, and I am basically trying to figure out how to make the RocksDB state backend perform "better".
Is it because of the deprecated fold function? Do I need to fine-tune some parameters that are not easily found in the documentation? Any tips?
Each state backend holds the working state somewhere, and then durably persists its checkpoints in a distributed filesystem. The RocksDB state backend holds its working state on disk, and this can be a local disk, hopefully faster than HDFS.
Try setting state.backend.rocksdb.localdir (see https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/state_backends.html#rocksdb-state-backend-config-options) to somewhere on the fastest local filesystem on each taskmanager.
Turning on incremental checkpointing could also make a large difference.
Also see Tuning RocksDB.
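
For reference, the relevant flink-conf.yaml entries could look roughly like this (the paths are placeholders):

    # RocksDB keeps its working state on local disk; point it at the fastest
    # local filesystem available on each taskmanager.
    state.backend: rocksdb
    state.backend.rocksdb.localdir: /mnt/fast-local-disk/flink/rocksdb

    # Durable checkpoints still go to the distributed filesystem.
    state.checkpoints.dir: hdfs:///flink/checkpoints

    # Incremental checkpoints upload only the changes since the last checkpoint.
    state.backend.incremental: true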

Custom Media Foundation sink never receives samples

I have my own MediaSink in Windows Media Foundation with one stream. In the OnClockStart method, I instruct the stream to queue (i) MEStreamStarted and (ii) MEStreamSinkRequestSample on itself. For implementing the queue, I use the IMFMediaEventQueue, and using the mtrace tool, I can also see that someone dequeues the event.
The problem is that ProcessSample of my stream is actually never called. This also has the effect that no further samples are requested, because this is done after processing a sample like in https://github.com/Microsoft/Windows-classic-samples/tree/master/Samples/DX11VideoRenderer.
Is the described approach the right way to implement the sink? If not, what would be the right way? If so, where could I search for the problem?
Some background info: the sink is an RTSP sink based on live555. Since the latter is also sink-driven, I thought it would be a good idea to queue an MEStreamSinkRequestSample whenever live555 requests more data from me. This is working as intended.
However, this solution has the problem that new samples are only requested as long as a client is connected to live555. If I now add a tee before the sink, e.g. to show a local preview, the system gets out of control, because the tee accumulates samples on the output of my sink which are never fetched. I then started playing around with discardable samples (cf. https://social.msdn.microsoft.com/Forums/sharepoint/en-US/5065a7cd-3c63-43e8-8f70-be777c89b38e/mixing-rate-sink-and-rateless-sink-on-a-tee-node?forum=mediafoundationdevelopment), but the problem is either that the stream does not start, queues keep growing, or the frame rate of the faster sink is artificially limited, depending on which side is discardable.
Therefore, the next idea was rewriting my sink such that it always requests a new sample when it has processed the current one and puts all samples in a ring buffer for live555 such that whenever clients are connected, they can retrieve their data from there, and otherwise, the samples are just discarded. This does not work at all. Now, my sink does not get anything even without the tee.
The observation is: if I just request a lot of samples (as in the original approach), at some point I get data. However, if I request only one (I also tried moderately larger numbers, up to 5), ProcessSample is just not called, so no subsequent requests can be generated. I send MEStreamStarted once the clock is started or restarted, exactly as described on https://msdn.microsoft.com/en-us/library/windows/desktop/ms701626, and after that, I request the first sample. In my understanding, MEStreamSinkRequestSample should not get lost, so I should get something even on a single request. Is that a misunderstanding? Should I keep requesting until I get something?
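For reference, a minimal sketch of the queueing pattern described above (class and member names are hypothetical, and the sink/stream split is collapsed for brevity; the stream-sink start event constant is MEStreamSinkStarted):

    #include <mfidl.h>
    #include <mfobjects.h>

    // m_pEventQueue is an IMFMediaEventQueue* created earlier with MFCreateEventQueue.
    HRESULT CMyStreamSink::OnClockStart(MFTIME hnsSystemTime, LONGLONG llClockStartOffset)
    {
        // Tell the pipeline the stream has (re)started.
        HRESULT hr = m_pEventQueue->QueueEventParamVar(
            MEStreamSinkStarted, GUID_NULL, S_OK, nullptr);

        // Ask for the first sample; further MEStreamSinkRequestSample events
        // would normally be queued from ProcessSample after each sample is consumed.
        if (SUCCEEDED(hr))
        {
            hr = m_pEventQueue->QueueEventParamVar(
                MEStreamSinkRequestSample, GUID_NULL, S_OK, nullptr);
        }
        return hr;
    }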

DirectShow: How to syncronize stream time to system time for video capture devices

I am creating a program where I show some graphical content, and I record the face of the viewer with the webcam using DirectShow. It is very important that I know the time difference between what's on the screen to when the webcam records a frame.
I don't care at all about reducing latency or anything like that, it can be whatever it's going to be, but I need to know the capture latency as accurately as possible.
When frames come in, I can get the stream times of the frames, but all those frames are relative to some particular stream start time. How can I access the stream start time for a capture device? That value is obviously somewhere in the bowels of DirectShow, because the filter graph computes it for every frame, but how can I get at it? I've searched through the docs but haven't found its secret yet.
I've created my own IBaseFilter- and IReferenceClock-implementing classes, which do little more than report tons of debugging info. Those seem to be doing what they need to be doing, but they don't provide enough information.
For what it is worth, I have tried to investigate this by inspecting the DirectShow Event Queue, but no events concerning the starting of the filter graph seem to be triggered, even when I start the graph.
The following image recorded using the test app might help understand what I'm doing. The graphical content right now is just a timer counting seconds.
The webcam is recording the screen. At the particular moment that frame was captured, the system time was about 1.35 seconds or so. The time of the sample recorded in DirectShow was 1.1862 seconds (ignore the caption in the picture). How can I account for the difference of .1637 seconds in this example? The stream start time is key to deriving that value.
The system clock and the reference clock are both using the QueryPerformanceCounter() function, so I would not expect it to be timer wonkiness.
Thank you.
Filters in the graph share a reference clock (unless you remove it, which is not what you want anyway), and stream times are relative to a certain base start time of this reference clock. The start time corresponds to a stream time of zero.
Normally, the controlling application does not have access to this start time: the filter graph manager chooses the value itself internally and passes it to every filter in the graph as a parameter in the IBaseFilter::Run call. If you have at least one filter of your own in the graph, you can get the value.
Getting the absolute capture time in this case is a matter of simple math: frame time is base time + stream time, and you can always call IReferenceClock::GetTime to check the current effective time.
If you don't have access to the start time and you don't want to add your own filter to the graph, there is a trick you can employ to define the base start time yourself. This is what the filter graph manager does anyway.
Starting the graphs in sync means using IMediaFilter::Run instead of IMediaControl::Run... Call IMediaFilter::Run on all graphs, passing this time... as the parameter.
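In code, the trick looks roughly like this (a sketch with hypothetical names, error handling omitted; pGraph is an already-built graph):

    #include <dshow.h>
    #include <atlbase.h>    // CComPtr

    // Run the graph with a base start time chosen by the application,
    // so absolute capture times can be derived later.
    REFERENCE_TIME RunWithKnownBaseTime(IGraphBuilder *pGraph)
    {
        CComPtr<IMediaFilter> pMediaFilter;
        pGraph->QueryInterface(IID_PPV_ARGS(&pMediaFilter));

        CComPtr<IReferenceClock> pClock;
        pMediaFilter->GetSyncSource(&pClock);          // the graph's reference clock

        REFERENCE_TIME rtNow = 0;
        pClock->GetTime(&rtNow);

        // Start slightly in the future, as the filter graph manager would.
        REFERENCE_TIME rtBase = rtNow + 500 * 10000;   // +500 ms in 100-ns units
        pMediaFilter->Run(rtBase);

        // Later: absolute time of a sample = rtBase + its stream time.
        return rtBase;
    }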
Try IReferenceClock::GetTime.
Reference Clocks: https://msdn.microsoft.com/en-us/library/dd377506(v=vs.85).aspx
For more information, see:
https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/1dc4123a-05cf-4036-a17e-a7648ca5db4e/how-do-i-know-current-time-stamp-referencetime-on-directshow-source-filter?forum=windowsdirectshowdevelopment

DirectShow IReferenceClock implementation

How exactly are you meant to implement an IReferenceClock that can be set via IMediaFilter::SetSyncSource?
I have a system that implements GetTime and AdviseTime, UnadviseTime. When a stream starts playing it sets a base time via AdviseTime and then increases Stream Time for each subsequent advise.
However how am I supposed to know when a new graph has run? I need to set a zero point for a given reference clock. Otherwise if I create a reference clock and then, 10 seconds later, I start the graph I am now in the position that I don't know whether I should be 10 seconds down the playback or whether I should be starting from 0. Obviously the base time will say that I am starting from 0 but have I just stalled for 10 seconds and do I need to drop a bunch of frames?
I really can't seem to figure out how to write a proper IReferenceClock so any hints or ideas would be hugely appreciated.
Edit: One example of a problem I am having is that I have 2 graphs and 2 videos. The audio from both videos is going to a null renderer, the video to a standard CLSID_VideoRenderer. Now if I set the same reference clock on both and then run graph 1, all seems to be fine. However, if 10 seconds down the line I run graph 2, then it runs as though the SetSyncSource is NULL for the first 10 seconds or so, until it has caught up with the other video.
Obviously if the graphs called GetTime to get their "base time" this would solve the problem, but this is not what I'm seeing happening. Both videos end up with a base time of 0 because that's the point I run them from.
It's worth noting that if I set no clock at all (or call SetDefaultSyncSource) then both graphs run as fast as they can. I assume this is due to the lack of an audio renderer ...
However how am I supposed to know when a new graph has run?
The clock runs on its own; it is the graph that aligns its operation against the clock, not the other way around. The graph receives the outer Run call, then checks the current clock time and assigns a base time, which is distributed among the filters, as "current clock time + some time for things to take off". The clock itself doesn't have to have the faintest idea about any of this; its task is to keep running and keep incrementing time.
In particular, clock time does not have to reset to zero at any time.
From documentation:
The clock's baseline—the time from which it starts counting—depends on the implementation, so the value returned by GetTime is not inherently meaningful. What matters is the delta from when the graph started running.
When an application calls IMediaControl::Run to run the filter graph, the Filter Graph Manager calls IMediaFilter::Run on each filter. To compensate for the slight amount of time it takes for the filters to start running, the Filter Graph Manager specifies a start time slightly in the future.
BaseClasses offer CBaseReferenceClock class, which you can use as reference implementation (in refclock.*).
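For illustration only, a free-running GetTime in the spirit of CBaseReferenceClock could look like the sketch below (the class and member names m_csClock and m_rtLast are hypothetical):

    #include <streams.h>    // DirectShow BaseClasses: CCritSec, CAutoLock

    // Assumed members on the clock class: CCritSec m_csClock; REFERENCE_TIME m_rtLast;
    STDMETHODIMP CMyReferenceClock::GetTime(REFERENCE_TIME *pTime)
    {
        if (pTime == nullptr)
            return E_POINTER;

        LARGE_INTEGER freq, now;
        QueryPerformanceFrequency(&freq);
        QueryPerformanceCounter(&now);

        // Convert QPC ticks to 100-ns REFERENCE_TIME units.
        REFERENCE_TIME rt = (REFERENCE_TIME)
            ((double) now.QuadPart * 10000000.0 / (double) freq.QuadPart);

        CAutoLock lock(&m_csClock);
        if (rt < m_rtLast)          // GetTime must never go backwards
            rt = m_rtLast;
        m_rtLast = rt;

        *pTime = rt;
        return S_OK;
    }

Note that the value never resets to zero when a graph runs; the graph simply samples it and derives its base time from it.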
Comment to your edit:
You are obviously not describing the case in full and are omitting important details. There is a simple test: you can instantiate the standard clock (CLSID_SystemClock) and use it on two regular graphs - they WILL run fine, even with time-separated Run times.
I suspect that you are doing some syncing or matching between the graphs and you are time-stamping the samples, also using the clock. Presumably you are doing something wrong at that point and then have a hard time fixing it through the clock.