Multiple applications using GStreamer

I want to write (but first I want to understand how to do it) multiple applications based on the GStreamer framework that would share the same hardware resource at the same time.
For example: there is hardware with HW acceleration for video decoding. I want to start two applications simultaneously that are able to decode different video streams using HW acceleration. Of course I assume that the HW is able to handle such requests and that there is an appropriate driver (but no GStreamer element) for doing this, but how do I write a GStreamer element that would support such resource sharing between separate processes?
I would appreciate any links, suggestions where to start...

You have h/w that can be accessed concurrently. Hence two GStreamer elements accessing it concurrently should work! There is nothing GStreamer-specific here.
Say you want to write a decoding element: it is like any other decoding element, as long as you access your hardware correctly. Your drivers should take care of the concurrent access.
The starting place is the GStreamer Plugin Writer's Guide.

So you need a single process that controls the HW decoder and decodes streams from multiple sources.
I would recommend building a daemon, possibly itself based on GStreamer. The gdppay and gdpdepay elements provide a fairly simple way to pass data through sockets to the daemon and back. The daemon would wait for connections on a specified port (or unix socket) and open a virtual decoder for each connection. The video decoder elements in the separate applications would internally connect to the daemon and get back the decoded video, roughly as sketched below.
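A minimal sketch of the two halves, using gst_parse_launch for brevity; the ports, the input file and the "hwvideodec" element name are made up, and in reality the client and daemon pipelines would live in separate processes:

    // Sketch: client application payloads the encoded stream with GDP and
    // ships it to the daemon over TCP.
    #include <gst/gst.h>

    int main(int argc, char** argv) {
      gst_init(&argc, &argv);

      GstElement* client = gst_parse_launch(
          "filesrc location=input.h264 ! h264parse ! gdppay ! "
          "tcpclientsink host=127.0.0.1 port=5000", nullptr);

      // Daemon (normally a separate process): accept the connection, strip GDP,
      // decode on the hardware ("hwvideodec" stands in for the real element)
      // and serve the decoded video back on another port.
      GstElement* daemon = gst_parse_launch(
          "tcpserversrc host=127.0.0.1 port=5000 ! gdpdepay ! hwvideodec ! "
          "gdppay ! tcpserversink host=127.0.0.1 port=5001", nullptr);

      gst_element_set_state(daemon, GST_STATE_PLAYING);
      gst_element_set_state(client, GST_STATE_PLAYING);

      g_main_loop_run(g_main_loop_new(nullptr, FALSE));
      return 0;
    }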

Related

What are command, data and comp ring in vmxnet3 PMD in DPDK

I am starting to work with and understand the basics of DPDK and its interaction with VMware (the VMXNET3 PMD). I started browsing through the code base and found references to three ring structures in vmxnet3_tx_queue_t (in vmxnet3_ring.h), namely cmd_ring, data_ring and comp_ring.
I tried searching to understand their use and how they work, but didn't quite find documentation on them, or was unable to understand it.
Any pointers / direction would be of great help.
The vmxnet3 PMD is described pretty decently in the DPDK NIC documentation:
http://doc.dpdk.org/guides/nics/vmxnet3.html
On receive, the driver pre-allocates the packet buffers and loads the command ring descriptors in advance. The hypervisor fills those packet buffers on packet arrival and writes completion ring descriptors, which are eventually pulled by the PMD. After reception, the DPDK application frees the descriptors and loads new packet buffers for the packets to come.
In the transmit routine, the DPDK application fills packet buffer pointers into the descriptors of the command ring and notifies the hypervisor. In response, the hypervisor takes the packets and passes them to the vSwitch, then writes into the completion descriptor ring. The rings are read by the PMD on the next call of the transmit routine, and the buffers and descriptors are freed from memory.
Not sure, though, whether those details count as the "basics of DPDK", as those low-level queues are abstracted away by the DPDK Poll Mode Driver API:
https://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html
So you had better refer to that document and use that API, as you won't be able to use the vmxnet3 rings directly in your app anyway...
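For reference, using that API from application code looks roughly like this (EAL initialization, rte_eth_dev_configure and the RX/TX queue setup are assumed to have been done already; the loop just echoes received packets back out):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    static void forward_loop(uint16_t port_id) {
      struct rte_mbuf* bufs[32];

      for (;;) {
        // Pull up to 32 packets that the hypervisor has completed on the RX rings.
        const uint16_t nb_rx = rte_eth_rx_burst(port_id, 0 /* queue */, bufs, 32);
        if (nb_rx == 0)
          continue;

        // Hand the packets to the TX command ring; the PMD reclaims descriptors
        // from the completion ring on later calls.
        const uint16_t nb_tx = rte_eth_tx_burst(port_id, 0 /* queue */, bufs, nb_rx);

        // Free anything the TX ring could not take.
        for (uint16_t i = nb_tx; i < nb_rx; i++)
          rte_pktmbuf_free(bufs[i]);
      }
    }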

Recording both rendering and recording device

I'm writing a program in C++, on Windows. I need to support Windows Vista+.
I want to record both the microphone and speaker simultaneously.
I'm using WASAPI and can record the microphone and speaker separately, but I would like to have just one stream supplying me the input from both (for example, to record a client playing the guitar along with the music he hears on his headphones), instead of merging the two buffers together somehow (which I guess will lead me to timing issues).
Is there a way to do this?
I'm actually working on a library which can do exactly that: merge streams from multiple devices. You might want to give it a try: see xt-audio.com. If you're implementing this yourself, here are some things to consider:
If you're capturing the speakers through a WASAPI loopback interface you're operating in shared mode; in that case latency might be unacceptable for live performance. If possible, stick to exclusive mode and use a loopback cable or a hardware loopback device if you have one (e.g. the old-fashioned "stereo mix" devices).
If you're merging buffers then yes, you're going to have timing issues. This is generally unavoidable when syncing independent devices. Pops/clicks can largely be avoided using a secondary intermediate buffer, which introduces additional latency, but eventually you're going to have to pad/drop some samples to keep the streams in sync.
Do NOT use separate threads for each independent stream. This will increase context switches and thereby increase the minimum achievable latency. Instead, designate one device as the master device, wait for that device's event to be raised, then read input from all devices whether they are "ready" or not (this is where dropping/padding comes into play).
In general you can get really decent performance from WASAPI exclusive mode, even running multiple streams together. But for something as critical as live performance you might want to consider a pro audio interface with ASIO drivers where everything just ticks off the same clock, or synchronization is at least handled at the driver level.
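To make the master-device idea concrete, here is a rough sketch; the Stream struct and capture_loop are hypothetical, and device/client setup (event-driven mode via AUDCLNT_STREAMFLAGS_EVENTCALLBACK and IAudioClient::SetEventHandle) is assumed to have been done elsewhere:

    #include <windows.h>
    #include <audioclient.h>
    #include <vector>

    struct Stream {
      IAudioCaptureClient* capture; // one per device (mic, loopback, ...)
    };

    void capture_loop(HANDLE masterEvent, std::vector<Stream>& streams) {
      for (;;) {
        // Block only on the master device's buffer-ready event.
        WaitForSingleObject(masterEvent, INFINITE);

        for (Stream& s : streams) {
          BYTE* data = nullptr;
          UINT32 frames = 0;
          DWORD flags = 0;
          // Read whatever each device has, whether or not its own event fired.
          while (s.capture->GetNextPacketSize(&frames) == S_OK && frames > 0) {
            s.capture->GetBuffer(&data, &frames, &flags, nullptr, nullptr);
            // ... copy 'frames' frames into the merged output here, padding or
            //     dropping samples as needed to keep the streams in sync ...
            s.capture->ReleaseBuffer(frames);
          }
        }
      }
    }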

Tee/passthrough DirectShow data as video source

I have an application that gets video samples from a frame grabber card via DirectShow. The application then does some processing and sends the video signal over a network. I now want to duplicate this video signal such that another DirectShow-enabled software (like Skype) can use the original input signal, too.
I know that you can create tee filters in DirectShow, like the one used to split a video signal for recording and preview. However, as I understand it, such a filter is only useful within a single graph, i.e. I cannot use it to forward the video from my process to e.g. Skype.
I also know that I could write my own video source, but this would run in the process of the consuming application. The problem is that I cannot put the logic of my original application in such a video source filter.
The only solution I could think of is my application writing the frames to a shared memory block and a video source filter reading them from there. Synchronisation would be done using a shared mutex or similar. Could that work? I specifically do not like the synchronisation part.
And, more importantly, is there a better solution to this problem?
The APIs work as you identified: a video capture application such as Skype requests a video stream without interprocess communication in mind; there is no IPC involved to consume output generated in another process. Your challenge here is to provide this IPC yourself, so that one application generates the data and another extends the existing API (a virtual video source device), picks up that data, and delivers it as if it were generated there.
With video you have a relatively big stream of data, and you are interested in avoiding excessive copying of it. File mappings (AKA shared memory) are the right thing to do: you put bytes in one process and they are immediately visible in the other. You can synchronize access to the data using named events and mutexes which both processes use collaboratively: to signal availability of a new buffer of data, to indicate that a buffer is no longer in use, etc.
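A rough sketch of that scheme, with made-up object names and a fixed frame size; the virtual source filter on the other side opens the same names with OpenFileMapping/OpenEvent, waits for frameReady, wraps the bytes into a media sample and signals frameConsumed:

    #include <windows.h>
    #include <cstring>

    const DWORD kFrameBytes = 1920 * 1080 * 3 / 2; // e.g. one NV12 1080p frame

    // Producer side (the capture/processing application):
    HANDLE mapping = CreateFileMappingW(INVALID_HANDLE_VALUE, nullptr, PAGE_READWRITE,
                                        0, kFrameBytes, L"Local\\MyFrameBuffer");
    void* view = MapViewOfFile(mapping, FILE_MAP_WRITE, 0, 0, kFrameBytes);
    HANDLE frameReady    = CreateEventW(nullptr, FALSE, FALSE, L"Local\\MyFrameReady");
    HANDLE frameConsumed = CreateEventW(nullptr, FALSE, TRUE,  L"Local\\MyFrameConsumed");

    void publish_frame(const BYTE* frame) {
      WaitForSingleObject(frameConsumed, INFINITE); // wait until the reader is done
      std::memcpy(view, frame, kFrameBytes);        // the one unavoidable copy
      SetEvent(frameReady);                         // tell the virtual source filter
    }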

Can I write Ethernet based network programs in C++?

I would like to write a program and run it on two machines, and send some data from one machine to another in an Ethernet frame.
Typically application data sits at layer 7 of the OSI model. Is there anything like a kernel restriction or API restriction that would stop me from writing a program in which I specify a destination MAC address and have some data sent to that MAC as the Ethernet payload? And then writing a program to listen for incoming frames, grab the frames from a specified source MAC address, and extract the payload of data from each frame?
(So I don't want any other overhead like IP or TCP/UDP headers, I don't want to go higher than layer 2).
Can this be done in C++, or must all communication happen at the IP layer? And can this be done on Ubuntu? Extra love for pointing to or providing examples! :D
My problem is obviously that I'm new to network programming in C++. As far as I know, if I want to communicate across a network I have to use a socket() call or similar, which works at the IP layer. So can I write a C++ program to work at OSI layer 2? Are there APIs for this? Does the Linux kernel even allow it?
Since you already mentioned sockets, you probably just want to use a raw socket. Maybe this page with C example code is of some help.
In case you are looking for an idea for a program only using Ethernet while still being useful:
Wake on LAN in its original form is quite simple. Note, however, that most current implementations actually send UDP packets (exploiting the fact that the receiver does not parse packet headers etc., but just looks for a specific string in the packet's payload).
Also, the use of raw sockets is usually restricted to privileged users. You might need to either
run your program as root,
or have it owned by root with the setuid bit set,
or set the capability for creating raw sockets using setcap CAP_NET_RAW+ep /path/to/your/program-file.
The last option gives more fine-grained privileges (just raw sockets, not write access to your whole file system etc.) than the other two. It is still less widely known, however, since it is "only" supported from kernel 2.6.24 on (which came with Ubuntu 8.04).
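For example, here is a minimal sketch of sending a single raw Ethernet frame from C++ with an AF_PACKET socket; the interface name ("eth0"), the MAC addresses and the payload are placeholders, and EtherType 0x88B5 is one of the values reserved for local experiments:

    #include <cstring>
    #include <arpa/inet.h>
    #include <linux/if_ether.h>
    #include <linux/if_packet.h>
    #include <net/if.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main() {
      int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)); // needs CAP_NET_RAW
      if (fd < 0) return 1;

      unsigned char dst[6] = {0x11, 0x22, 0x33, 0x44, 0x55, 0x66};
      unsigned char src[6] = {0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff};

      // Build the frame: 14-byte Ethernet header followed by the payload.
      unsigned char frame[ETH_FRAME_LEN] = {};
      std::memcpy(frame, dst, 6);
      std::memcpy(frame + 6, src, 6);
      frame[12] = 0x88; frame[13] = 0xb5; // local experimental EtherType
      const char* payload = "hello over layer 2";
      std::memcpy(frame + 14, payload, std::strlen(payload));

      // Address the frame to a specific interface.
      sockaddr_ll addr = {};
      addr.sll_family = AF_PACKET;
      addr.sll_ifindex = if_nametoindex("eth0");
      addr.sll_halen = 6;
      std::memcpy(addr.sll_addr, dst, 6);

      sendto(fd, frame, 14 + std::strlen(payload), 0,
             reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
      close(fd);
      return 0;
    }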
Yes, actually Linux has a very nice feature that makes it easy to deal with layer 2 packets. You can use a TAP device, which allows your userspace program to read/write Ethernet traffic through the kernel.
http://www.kernel.org/pub/linux/kernel/people/marcelo/linux-2.4/Documentation/networking/tuntap.txt
http://en.wikipedia.org/wiki/TUN/TAP
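For example, a minimal sketch of attaching a program to a TAP device, following the tuntap documentation linked above; the interface name is up to you, and creating the device requires CAP_NET_ADMIN unless it has been made persistent beforehand:

    #include <cstring>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/if.h>
    #include <linux/if_tun.h>

    int open_tap(const char* name) {
      int fd = open("/dev/net/tun", O_RDWR);
      if (fd < 0) return -1;

      ifreq ifr = {};
      ifr.ifr_flags = IFF_TAP | IFF_NO_PI;   // TAP = layer 2 frames, no extra header
      std::strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

      if (ioctl(fd, TUNSETIFF, &ifr) < 0) {  // attach this fd to the interface
        close(fd);
        return -1;
      }
      return fd; // read()/write() on fd now exchanges raw Ethernet frames
    }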

WaitCommEvent compatible with pipes?

I am working with legacy C++/MFC/Win32 code. The project multiplexes various serial protocols over separate physical serial ports, one per client system, to a common front end data repository.
Since the program was originally designed to communicate over serial ports, there are many assumptions in the code as far as setup and management of serial events go: ACK/NAK transport verification, inter-byte delay checks, etc.
The existing architecture leverages overlapped reads and writes with event notification via WaitCommEvent.
I have been tasked with adding another client interface, using a single-client pipe server which, like the serial ports, will support one client per "file".
In reading the docs for WaitCommEvent, it looks like it was designed to work with OS-abstracted physical communications devices, like serial ports.
The simple question: can I leverage the existing serial-skewed "wait" model to work with a pipe, or should I go ahead and virtualize it so that it can be overridden with specific pipe logic?
Thanks to those (the minority for sure) of developers who know what I am asking.
I can't find a good reference right now, but it is my understanding that WaitCommEvent only works with communications resources and that a pipe is not defined to be a communications resource in the same sense as e.g. a serial port. WaitCommEvent waits for the underlying driver to set certain bit-flags, like when new characters arrive, and I don't believe pipes (or files) work that way internally.
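If you do virtualize it, the pipe-specific override usually ends up being a plain overlapped ReadFile whose event handle can be waited on next to the serial ports' events. A rough sketch, assuming the pipe handle was created/opened with FILE_FLAG_OVERLAPPED:

    #include <windows.h>

    void wait_for_pipe_data(HANDLE pipe) {
      OVERLAPPED ov = {};
      ov.hEvent = CreateEventW(nullptr, TRUE, FALSE, nullptr); // manual-reset event

      BYTE buffer[512];
      DWORD bytesRead = 0;
      if (!ReadFile(pipe, buffer, sizeof(buffer), nullptr, &ov) &&
          GetLastError() == ERROR_IO_PENDING) {
        // ov.hEvent is signaled when data arrives, so it can sit in the same
        // WaitForMultipleObjects() call as the serial ports' overlapped events.
        WaitForSingleObject(ov.hEvent, INFINITE);
        GetOverlappedResult(pipe, &ov, &bytesRead, FALSE);
      }

      CloseHandle(ov.hEvent);
      // ... 'bytesRead' bytes of data are now available in 'buffer' ...
    }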