I have my x264 encoder, producing NALUs from a raw video stream. I need to send those NALUs over the network. What is the best way of doing so?
The encoder is inserted into a DirectShow graph as a transform filter, and downstream I have a filter that handles the networking. Can I pass the NALUs created by the transform filter directly to the network "renderer" filter? Will that create any memory issues?
I would also like to know how the memory allocated for NALUs is handled inside x264 - who is responsible for freeing it? And can I simply serialize a NALU to a bitstream manually and then rebuild it the same way on the receiving side?
I need to send those NALUs over the network. What is the best way of doing so?
"Best" needs clarification: easiest to do, best in terms of compatibility, compatible to specific counterpart implementation etc.
Can I pass NALUs, created by transform filter directly to network "render" filter? Will it create some memory issues?
There is no stock network renderer; you should read up on how it needs to be done with the specific renderer you are going to use.
I would like to know how memory allocated for NALUs is handled inside x264 - who is responsible for freeing it?
x264 manages the buffers it fills. x264_encoder_encode returns references to those buffers, so you don't need to free the data - just be sure to copy it out promptly, since it will be invalidated by the next call. Don't forget x264_encoder_close afterwards - it releases all resources managed internally.
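For illustration, a minimal sketch of that copy-out step; the encoder handle, input picture and the queue_for_network() helper are placeholders for whatever your transform filter already has:

#include <stdint.h>
#include <x264.h>

/* Copy each NALU out of x264's internal buffers before the next encode call.
   queue_for_network() stands in for however your filter hands data downstream. */
void queue_for_network(const uint8_t *data, int size);

void encode_and_ship(x264_t *encoder, x264_picture_t *pic_in)
{
    x264_nal_t *nals = NULL;
    x264_picture_t pic_out;
    int num_nals = 0;

    int frame_size = x264_encoder_encode(encoder, &nals, &num_nals, pic_in, &pic_out);
    if (frame_size > 0) {
        for (int i = 0; i < num_nals; ++i) {
            /* p_payload points into memory x264 will reuse on the next call,
               so copy it now (e.g. into a DirectShow media sample). */
            queue_for_network(nals[i].p_payload, nals[i].i_payload);
        }
    }
    /* when the stream ends: x264_encoder_close(encoder); frees all internal buffers */
}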
Also I'm wondering if I can just serialize NALU to a bit stream manually and then rebuild it in the same way?
Yes, you can do that. If your pair of network filters can reproduce the same stream on the receiving side of their connection, it is going to work out fine. The best network protocol in terms of interoperability with H.264 is RTP. It is, however, fairly complicated compared to simple accept/send/receive/reproduce handling of a raw bitstream.
RTP: A Transport Protocol for Real-Time Applications (RFC 3550)
RTP Payload Format for H.264 Video (RFC 6184)
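If you go the simple custom-framing route instead of RTP, one common scheme is to prefix each NALU with its length in network byte order so the receiver knows where each unit ends. A minimal sketch, assuming POSIX sockets and an already-connected TCP socket (send_nalu is just an illustrative name):

#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>

/* Frame one NALU as [4-byte big-endian length][payload bytes]. */
void send_nalu(int sock, const uint8_t *payload, uint32_t size)
{
    uint32_t len_be = htonl(size);     /* length prefix in network byte order */
    send(sock, &len_be, sizeof len_be, 0);
    send(sock, payload, size, 0);      /* NALU bytes exactly as x264 produced them */
    /* in production, check send()'s return value and loop on partial sends */
}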
The best way to send NALUs out onto the network is through an RTP stream. Look at RFC 6184 for details on RTP packetization for H.264. I think you can safely pass NALUs to your renderer, provided your media buffers are large enough to hold your NALUs.
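For orientation, the fixed part of an RTP header (RFC 3550) is only 12 bytes; the H.264-specific rules from RFC 6184 (single NAL unit packets, FU-A fragmentation for NALUs larger than the path MTU, etc.) apply to the payload that follows it. A rough sketch of the layout:

#include <stdint.h>

/* Fixed 12-byte RTP header (RFC 3550); all multi-byte fields are big endian. */
typedef struct {
    uint8_t  v_p_x_cc;    /* version (2 bits), padding, extension, CSRC count */
    uint8_t  m_pt;        /* marker bit + payload type (dynamic PT for H.264) */
    uint16_t seq;         /* sequence number, incremented per packet */
    uint32_t timestamp;   /* 90 kHz clock for H.264 video */
    uint32_t ssrc;        /* identifies the stream source */
} rtp_header_t;           /* the RFC 6184 payload (NAL unit or fragment) follows */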
My pipeline splits in the middle to be sent over an unreliable connection. This results in some buffers having bit errors that break the pipeline if I do not account for them. To solve this, I have an appsink that parses buffers for their critical information (timestamps, duration, data, and data size), serializes them, and then sends that over the unreliable channel along with a CRC. If the receiving pipeline reads a buffer from the unreliable channel and detects a bit error via the CRC, the buffer is dropped. Most decoders recover fine from a dropped buffer, aside from some temporary visual artifacts.
Is there a GStreamer plugin that does this automatically? I looked into the GDPPay and GDPDepay plugins, which appeared to meet my needs due to their serialization of buffers and inclusion of CRCs for the header and payload; however, the plugin assumes that the data is being sent over a reliable channel (why that assumption is made alongside the inclusion of CRCs, I do not know).
I am tempted to take the time to write a plugin, or submit a pull request against the GDP plugins, that simply drops bad buffers instead of halting the pipeline with a GST_FLOW_ERROR.
Any suggestions would be greatly appreciated. Ideally it would also be tolerant of either pipeline crashing and restarting. (The plugin also expects the caps information to be the first buffer sent, which in my case I do not need, since I have a fixed purpose and can hard-code both ends to know what to expect. This is only a problem if the receiver restarts while the sender is already sending data: the receiver will never get the data, because it is waiting for the caps buffer that the sender already sent.)
When faced with a similar issue (but for GstEvents), I used a GstProbe. You'll probably need to install it for GST_PAD_PROBE_TYPE_BUFFER and return GST_PAD_PROBE_DROP for the buffers that don't satisfy your conditions. It is easier than writing a plugin, and it is definitely easier to modify (a GstProbe can be created and handled from your own code, so changing the dropping logic is simpler). Caveat: I haven't done it for buffers, but it should be doable.
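A rough sketch of what that could look like for buffers (GStreamer 1.x; validate_buffer is a placeholder for your own CRC/sanity check):

#include <gst/gst.h>

gboolean validate_buffer (const guint8 *data, gsize size);  /* your CRC check */

static GstPadProbeReturn
drop_bad_buffers (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
  GstMapInfo map;
  gboolean ok = FALSE;

  if (gst_buffer_map (buf, &map, GST_MAP_READ)) {
    ok = validate_buffer (map.data, map.size);
    gst_buffer_unmap (buf, &map);
  }
  return ok ? GST_PAD_PROBE_OK : GST_PAD_PROBE_DROP;   /* drop corrupted buffers */
}

/* Install on the pad right after the element that receives the unreliable data. */
void install_drop_probe (GstPad *srcpad)
{
  gst_pad_add_probe (srcpad, GST_PAD_PROBE_TYPE_BUFFER, drop_bad_buffers, NULL, NULL);
}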
Let me know if it worked!
I have an application that gets video samples from a frame grabber card via DirectShow. The application then does some processing and sends the video signal over a network. I now want to duplicate this video signal such that another DirectShow-enabled software (like Skype) can use the original input signal, too.
I know that you can create Tee filters in DirectShow, like the one used to split a video signal for recording and preview. However, as I understand it, such a filter is only useful within a single graph, i.e. I cannot use it to forward the video from my process to, e.g., Skype.
I also know that I could write my own video source, but this would run in the process of the consuming application. The problem is that I cannot put the logic of my original application in such a video source filter.
The only solution I could think of is my application writing the frames to a shared memory block and a video source filter reading them from there. Synchronisation would be done using a shared mutex or similar. Could that work? I specifically do not like the synchronisation part.
And more importantly, is there a better solution to solve this problem?
The APIs work the way you identified: a video capture application such as Skype requests a video stream without interprocess communication in mind; there is no IPC involved in consuming output generated in another process. Your challenge is to provide this IPC yourself, so that one application generates the data while the other extends the existing API (a virtual video source device), picks up that data, and delivers it as if it were generated locally.
With video you have a relatively large stream of data, and you are interested in avoiding excessive copying. File mappings (a.k.a. shared memory) are the right tool: you put bytes in one process and they are immediately visible in the other. You can synchronize access to the data using named events and mutexes that both processes use collaboratively - to signal availability of a new buffer of data, to indicate that a buffer is no longer in use, etc.
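As an illustration, a minimal Win32 sketch of the producer side; the object names, frame size and frame_data source are placeholders. The consumer (your virtual source filter) would open the same objects by name with OpenFileMapping/OpenEvent and wait on the event before copying a frame out:

#include <windows.h>
#include <string.h>

#define FRAME_BYTES (640 * 480 * 4)   /* size of one uncompressed frame, adjust to your format */

static HANDLE g_mapping, g_frame_ready;
static void  *g_view;

void init_shared_frame(void)
{
    g_mapping = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                   0, FRAME_BYTES, L"Local\\MyVideoFrame");
    g_view = MapViewOfFile(g_mapping, FILE_MAP_WRITE, 0, 0, FRAME_BYTES);
    g_frame_ready = CreateEventW(NULL, FALSE, FALSE, L"Local\\MyVideoFrameReady");
}

void publish_frame(const void *frame_data)
{
    memcpy(g_view, frame_data, FRAME_BYTES);  /* bytes become visible in the other process */
    SetEvent(g_frame_ready);                  /* signal the consumer that a new frame is ready */
}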
I was taking a look at using PF_RING for sending and receiving in my application.
If I plan to use PF_RING for maintaining a TCP connection, it looks like I'll need to manually "forge" the IP and TCP messages myself, as pfring_send sends raw packets. Does this mean I'll have to manually reimplement TCP on top of PF_RING?
I understand there is a clear advantage for receiving using PF_RING, has anyone tried sending data with PF_RING? Is there a clear advantage over normal send calls?
note: I am not using DNA (Direct NIC Access), I am just using the kernel partial bypass with NIC aware drivers.
To answer your first question, yes, you will have to manually build the TCP/IP messages from the ground up, MAC address and all. For an example take a look at pfsend.c from ntop.org.
ntop.org has also made a PF_RING user guide available that contains explanations.
As for sending data using PF_RING, it is absolutely possible; the idea is to bypass any notion of what the data on the wire actually is and send as fast as possible, see the wire-speed traffic generation examples from ntop.org. The only advantages it has over normal send calls going through the kernel TCP/IP stack are that you can (1) send data faster and (2) put completely unformatted data onto the wire. The second point can be handy, for example, when you want to replay previously captured packets onto the network.
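A minimal sketch of the transmit path; the device name and build_raw_frame() are placeholders, and the frame must already contain valid Ethernet, IP and TCP headers with checksums, which is exactly what pfsend.c demonstrates:

#include <pfring.h>

int build_raw_frame(unsigned char *buf, int max_len);  /* your own packet-forging code */

int main(void)
{
    pfring *ring = pfring_open("eth1", 1500 /* caplen */, PF_RING_PROMISC);
    if (ring == NULL)
        return 1;
    pfring_enable_ring(ring);

    unsigned char frame[1514];
    int len = build_raw_frame(frame, sizeof frame);
    pfring_send(ring, (char *) frame, len, 1 /* flush right away */);

    pfring_close(ring);
    return 0;
}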
Unless you have a specific use case that requires access to the raw underlying data without kernel intervention, there is no good reason to use PF_RING at all. Your best bet would be to use standard sockets; in most cases the performance you can achieve with them is more than adequate.
What specific use case did you have in mind?
I want to write (but first I want to understand how to do it) applications (more than one) based on GStreamer framework that would share the same hardware resource at the same time.
For example: there is a hardware with HW acceleration for video decoding. I want to start simultaneously two applications that are able to decode different video streams, using HW acceleration. Of course I assume that HW is able to handle such requests, there is appropriate driver (but not GStreamer element) for doing this, but how to write GStreamer element that would support such resource sharing between separate processes?
I would appreciate any links, suggestions where to start...
You have h/w that can be accessed concurrently, hence two GStreamer elements accessing it concurrently should work; there is nothing GStreamer-specific here.
Say you wanted to write a decoding element: it is like any other decoding element, as long as you access your hardware correctly. Your drivers should take care of the concurrent access.
The starting place is the GStreamer Plugin Writer's Guide.
So you need a single process that controls the HW decoder and decodes streams from multiple sources.
I would recommend building a daemon, possibly itself based on GStreamer. The gdppay and gdpdepay elements provide quite a simple way to pass data through sockets to the daemon and back. The daemon would wait for connections on a specified port (or unix socket) and open a virtual decoder for each connection. The video decoder elements in the separate applications would internally connect to the daemon and get the decoded video back.
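A rough sketch of the daemon side using gst_parse_launch; the ports and the "myhwdec" element name are placeholders for your setup:

#include <gst/gst.h>

int main(int argc, char *argv[])
{
    gst_init(&argc, &argv);

    GError *err = NULL;
    GstElement *pipeline = gst_parse_launch(
        "tcpserversrc port=5000 ! gdpdepay ! myhwdec ! gdppay ! tcpserversink port=5001",
        &err);                                  /* myhwdec = your hardware decoder element */
    if (err != NULL) {
        g_printerr("Failed to build pipeline: %s\n", err->message);
        return 1;
    }

    gst_element_set_state(pipeline, GST_STATE_PLAYING);
    g_main_loop_run(g_main_loop_new(NULL, FALSE));  /* serve until killed */
    return 0;
}

Note that this handles a single connection; a real daemon would spawn a pipeline (a "virtual decoder") per client connection.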
I am trying to receive a file (audio, .CAF) from a socket in C (C++ solution ok as well). I have the socket communication working, having tested it with strings. My problem is I don't know what to supply to the 2nd arg in recv(socket, buffer, buffer_size, 0). What type should I make "buffer"? I basically want to receive an audio file, and will then play it. But don't know how to receive the audio file itself.
Any thoughts?
Thanks,
Robin
Typically, you'll have the audio encoded in some format. For simplicity, let's assume it's Wave format.
One way of doing things would be to encapsulate chunks of the audio (Say, 50 ms chunks) for sending over the network.
However, you can't blindly send data over the network and expect it to work. On your computer, data may be organized one way (little or big endian), and it could be organized in the opposite way on the other computer.
In that case, the client will get data that it interprets as something completely different from what you intended. So you'll need to properly serialize it somehow.
Once you properly serialize the data (or not, if both computers use the same endianness!), you can just send() it and recv() it, then pass it off to a decoder to deal with.
I'd love to offer more information, but that's about the extent of my knowledge on the subject. It really depends on what exactly you're doing, so it's hard to give any more information without some more specifics as to what you're doing (with regards to audio format, for one).
Some more information:
http://en.wikipedia.org/wiki/Serialization
http://en.wikipedia.org/wiki/Endianness
Edit: As pointed out in the comments of this answer, you should probably not worry about serialization at all if you're using a standard format. Just pass it over the network in chunks that are usable by your decoder (Send a frame at a time, for example) then decode those individual frames (or possibly multiple frames) on the client side.
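For example, a minimal sender that just streams the file over an already-connected POSIX socket in fixed-size chunks might look like this (error handling kept to a minimum):

#include <stdio.h>
#include <sys/socket.h>

void send_file(int sock, const char *path)
{
    FILE *f = fopen(path, "rb");
    char chunk[4096];
    size_t n;

    if (f == NULL)
        return;
    while ((n = fread(chunk, 1, sizeof chunk, f)) > 0) {
        send(sock, chunk, n, 0);   /* in production, check the return value and loop on partial sends */
    }
    fclose(f);
}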
buffer is going to be a pointer to an array you've allocated to store the data that comes across the wire.
It depends on the socket library you're using, but usually it expects void* (which is just a generic pointer type).
You might do something like this:
uint8_t myBuffer[1000];
recv(sock, myBuffer, sizeof(myBuffer), 0);
It gets tricky because this only gives you enough room for 1,000 bytes, which might not be enough to hold your audio file, so you'll have to handle multiple recv() calls until you get the entire audio file. Also note that recv() returns how many bytes it actually read, which may be fewer than you asked for.
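A minimal sketch of such a loop, writing everything received into a local file (POSIX sockets assumed; error handling kept short):

#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

void receive_file(int sock, const char *path)
{
    FILE *out = fopen(path, "wb");
    uint8_t buffer[4096];
    ssize_t n;

    if (out == NULL)
        return;
    while ((n = recv(sock, buffer, sizeof buffer, 0)) > 0) {
        fwrite(buffer, 1, (size_t) n, out);  /* append this chunk to the file */
    }
    fclose(out);   /* n == 0 means the peer closed the connection; n < 0 means an error */
}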