C++ threading in UDP for real-time video transmission - c++

I am currently working on a project for real-time camera video transmission. I have implemented camera capture using OpenCV and real-time encoding and decoding using FFmpeg.
My problem now is on the decoder side: I want the decoder to keep playing the decoded video while it keeps receiving encoded packets at the same time. My supervisor told me I need to use threading to achieve this.
Any idea how I should do this, or is there an example program?
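One common way to structure this is a producer/consumer pair: a receiver thread pushes encoded packets into a thread-safe queue while a second thread pops, decodes and displays them. The following is only a minimal sketch of that pattern using the C++ standard library. The names PacketQueue and run_pipeline are hypothetical, and the socket receive and FFmpeg decode calls are replaced by placeholder comments, since the original program is not shown.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// One encoded packet as received from the network (hypothetical type).
using Packet = std::vector<std::uint8_t>;

// Thread-safe FIFO shared by the receiver thread and the decoder thread.
class PacketQueue {
public:
    // Called by the receiver thread for every packet that arrives.
    void push(Packet p) {
        {
            std::lock_guard<std::mutex> lock(m_);
            assert(!closed_ && "push after close");
            q_.push_back(std::move(p));
        }
        cv_.notify_one();
    }

    // Called by the receiver thread when the stream ends.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until a packet is available. Returns false once the queue
    // has been closed and fully drained.
    bool pop(Packet& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop_front();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Packet> q_;
    bool closed_ = false;
};

// Runs a receiver and a decoder concurrently and returns how many packets
// the decoder thread consumed.
std::size_t run_pipeline(std::size_t packet_count) {
    PacketQueue queue;
    std::size_t decoded = 0;

    std::thread receiver([&] {
        for (std::size_t i = 0; i < packet_count; ++i) {
            // Placeholder: a real program would fill the packet from the UDP socket.
            queue.push(Packet(16, static_cast<std::uint8_t>(i)));
        }
        queue.close();
    });

    std::thread decoder([&] {
        Packet p;
        while (queue.pop(p)) {
            // Placeholder: decode the packet and display the frame here.
            ++decoded;
        }
    });

    receiver.join();
    decoder.join();
    return decoded;  // safe to read: the decoder thread has finished
}
```

Because the decoder blocks in pop() only while the queue is empty, playback continues from buffered packets even when the network briefly stalls. For real-time use you would probably also cap the queue length and drop old packets, which this sketch does not do.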

Related

Live streaming and processing with opencv

I am having a hard time figuring out a seemingly simple problem: my aim is to send a video stream to a server, process it using OpenCV, then send back the processed feed to be displayed.
I am thinking of using Kafka to send and receive the feed since I already have some experience with it. However, this raises a problem: OpenCV processes video streams using VideoCapture, which is different from just reading a single image using the read method.
If I stream my video feed frame by frame, will I be able to process the feed on the server as a video rather than as a single image at a time? And when I get back the processed frames, can I display them again as a video?
I am sure I misunderstood some concepts so please let me know if you need further explanations.
Apologies for the late response. I have built a live-streaming project with basic analytics (face detection) using Kafka and OpenCV.
The publisher application uses OpenCV to access the live video from a webcam, IP camera or USB camera. As you mentioned, VideoCapture.read(frame) fetches a continuous stream of frames (images) of the video, each as a Mat. Each Mat is then converted into a String (JSON) and published to Kafka.
Consumers can then transform these objects as required (into a BufferedImage for a live-streaming application) or work with the raw form (for a face detection application). This is a good design because it is reusable: one publisher application produces data for multiple consumers.

How to use Intel Hardware MJPEG Decoder MFT in MediaFoundation SourceReader for a Windows Desktop application?

I'm developing a USB camera streaming desktop application using the MediaFoundation SourceReader technique. The camera supports USB 3.0 and delivers 60 fps at 1080p in MJPG format.
I used the Software MJPEG Decoder MFT to convert MJPG to YUY2 frames and then converted those into RGB32 frames to draw on the window. With this software decoder I can render only 30 fps on the window instead of 60 fps. I posted a question on this site and received a suggestion to use the Intel Hardware MJPEG Decoder MFT to solve the frame drop issue.
I then hit error 0xC00D36B5 (MF_E_NOTACCEPTING) when calling the IMFTransform::ProcessInput() method. To solve this error, MSDN suggested using the IMFTransform interface asynchronously, so I used the IMFMediaEventGenerator interface to call GetEvent for every input/output sample. I can successfully process only one input sample; after that, IMFMediaEventGenerator::GetEvent() continuously returns the MF_E_NO_EVENTS_AVAILABLE error (GetEvent() is synchronous).
I have tried to configure an asynchronous callback for the SourceReader as well as for the IMFTransform, but the MFAsyncCallback::Invoke method is never called, hence I planned to use the GetEvent method.
Am I missing anything? If so, could someone guide me on how to use the Intel Hardware Decoder in my project?
The Intel Hardware MJPEG Decoder MFT is an asynchronous MFT, and if you are managing it directly, you are responsible for applying the asynchronous model. You seem to be doing this, but you don't provide information that allows nailing the problem down. Yes, you are supposed to use the event model described in the ProcessInput and ProcessOutput sections of the article linked above. Since you do get the first frame, you should debug further to make it work with smooth continuous processing.
When you use APIs like Media Session or Source Reader, Media Foundation itself deals with the MFTs. It is capable of doing synchronous and asynchronous consumption as appropriate. In that case, however, you don't make IMFTransform calls yourself, and even from your vague description it appears you are doing it the wrong way.
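To illustrate the event model referred to above, here is a small simulation in plain C++. It does not call Media Foundation at all: MockAsyncTransform and decode_all are hypothetical names, and the two enum values merely stand in for the METransformNeedInput and METransformHaveOutput events. The point it tries to show is the contract: feed exactly one input per "need input" event and pull exactly one output per "have output" event, rather than calling ProcessInput whenever a sample happens to be ready.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Stand-ins for the METransformNeedInput / METransformHaveOutput events.
enum class Event { NeedInput, HaveOutput };

// Hypothetical mock of an asynchronous transform. It announces when it can
// accept input and when an output sample is ready, as an async MFT does.
class MockAsyncTransform {
public:
    // Plays the role of the start-of-stream notification: after it, the
    // transform requests its first input sample.
    void start() { events_.push(Event::NeedInput); }

    // Non-blocking event fetch; returns false when no event is queued.
    bool get_event(Event& e) {
        if (events_.empty()) return false;
        e = events_.front();
        events_.pop();
        return true;
    }

    // Only valid in response to a NeedInput event.
    void process_input(int sample) {
        pending_.push(sample);
        events_.push(Event::HaveOutput);
    }

    // Only valid in response to a HaveOutput event.
    int process_output() {
        assert(!pending_.empty());
        int decoded = pending_.front();
        pending_.pop();
        events_.push(Event::NeedInput);  // ready for the next sample
        return decoded;
    }

private:
    std::queue<Event> events_;
    std::queue<int> pending_;
};

// Drives the mock with the event loop: one call per event, never more.
std::vector<int> decode_all(const std::vector<int>& inputs) {
    MockAsyncTransform mft;
    std::vector<int> outputs;
    std::size_t next = 0;
    mft.start();
    Event e;
    while (mft.get_event(e)) {
        if (e == Event::NeedInput) {
            if (next == inputs.size()) break;  // no more samples to feed
            mft.process_input(inputs[next++]);
        } else {
            outputs.push_back(mft.process_output());
        }
    }
    return outputs;
}
```

In the mock, the next NeedInput event appears only after the pending output has been collected. A real asynchronous MFT may order its events differently, so treat the sequencing here as an assumption of the sketch, not as documented decoder behaviour.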

how to get raw mjpg stream from webcam

I have a Logitech webcam, which streams 1080p@30fps using MJPG compression via USB 2.0. I need to write this raw stream to a hard drive or send it over the network. I do NOT need to decompress it. OpenCV gives me decompressed frames, so I need to compress them again, which wastes a lot of CPU. How can I get the raw MJPEG stream as it comes from the camera instead? (Windows 7, Visual Studio, C++)
The Windows native video capture APIs, DirectShow and Media Foundation, let you capture video from a webcam in its original format. It is a natural task for these APIs and is done in a straightforward way (specifically, if a web camera delivers a hardware-compressed M-JPEG feed, you can obtain it programmatically).
About Video Capture in DirectShow
Audio/Video Capture in Media Foundation
You are free to do whatever you want with the data afterwards: decompress, send over network, compose a Motion JPEG over HTTP response feed etc.
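As an illustration of the last option, Motion JPEG over HTTP is commonly served as a single multipart/x-mixed-replace response in which every part is one complete JPEG image. The sketch below only builds the response header and the per-frame part as strings. The function names and the boundary token "frame" are arbitrary choices for this example, and the capture and socket code is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Boundary token separating the JPEG parts (arbitrary choice for this sketch).
const std::string kBoundary = "frame";

// Sent once, right after the client connects.
std::string mjpeg_response_header() {
    return "HTTP/1.1 200 OK\r\n"
           "Content-Type: multipart/x-mixed-replace; boundary=" + kBoundary + "\r\n"
           "\r\n";
}

// Sent once per captured frame; jpeg holds the compressed image exactly as
// delivered by the camera, without re-encoding.
std::string mjpeg_part(const std::vector<std::uint8_t>& jpeg) {
    assert(!jpeg.empty());
    std::string part = "--" + kBoundary + "\r\n"
                       "Content-Type: image/jpeg\r\n"
                       "Content-Length: " + std::to_string(jpeg.size()) + "\r\n"
                       "\r\n";
    part.append(jpeg.begin(), jpeg.end());
    part += "\r\n";
    return part;
}
```

A server would write mjpeg_response_header() once and then mjpeg_part(frame) for each M-JPEG sample obtained from DirectShow or Media Foundation.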

DXGI Desktop Duplication: encoding frames to send them over the network

I'm trying to write an app which will capture a video stream of the screen and send it to a remote client. I've found out that the best way to capture a screen on Windows is to use DXGI Desktop Duplication API (available since Windows 8). Microsoft provides a neat sample which streams duplicated frames to screen. Now, I've been wondering what is the easiest, but still relatively fast way to encode those frames and send them over the network.
The frames come from AcquireNextFrame with a surface that contains the desktop bitmap and metadata which contains dirty and move regions that were updated. From here, I have a couple of options:
1. Extract a bitmap from the DirectX surface and then use an external library like FFmpeg to encode the series of bitmaps to H.264 and send it over RTSP. While straightforward, I fear that this method will be too slow, as it doesn't take advantage of any native Windows methods. Converting a D3D texture to an FFmpeg-compatible bitmap seems like unnecessary work.
2. From this answer: convert the D3D texture to an IMFSample and use MediaFoundation's SinkWriter to encode the frame. I found this tutorial on video encoding, but I haven't yet found a way to get each encoded frame immediately and send it, instead of dumping all of them to a video file.
Since I haven't done anything like this before, I'm asking if I'm moving in the right direction. In the end, I want to have a simple, preferably low latency desktop capture video stream, which I can view from a remote device.
Also, I'm wondering if I can make use of the dirty and move regions provided by Desktop Duplication. Instead of encoding the frame, I could send those regions over the network and do the processing on the client side, but this means that my client has to have DirectX 11.1 or higher available, which is impossible if I want to stream to a mobile platform.
You can use the IMFTransform interface for H.264 encoding. Once you get an IMFSample from the ID3D11Texture2D, just pass it to IMFTransform::ProcessInput and get the encoded IMFSample from IMFTransform::ProcessOutput.
Refer to this example for encoding details.
Once you get the encoded IMFSamples you can send them one by one over the network.
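When the encoded samples go over a byte stream such as TCP, the receiver needs a way to tell where one sample ends and the next begins. One simple approach, shown here purely as a sketch, is to prefix each sample with its length. The names frame_sample and next_sample and the 4-byte big-endian prefix are assumptions of this example, not part of Media Foundation. Copying the bytes out of the IMFSample is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Prefix one encoded sample with its size as a 4-byte big-endian integer.
Bytes frame_sample(const Bytes& sample) {
    assert(sample.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(sample.size());
    Bytes out = {
        static_cast<std::uint8_t>(n >> 24),
        static_cast<std::uint8_t>(n >> 16),
        static_cast<std::uint8_t>(n >> 8),
        static_cast<std::uint8_t>(n),
    };
    out.insert(out.end(), sample.begin(), sample.end());
    return out;
}

// Try to remove one complete sample from the front of the receive buffer.
// Returns false when more bytes have to arrive first.
bool next_sample(Bytes& buffer, Bytes& sample) {
    if (buffer.size() < 4) return false;
    const std::uint32_t n = (std::uint32_t(buffer[0]) << 24) |
                            (std::uint32_t(buffer[1]) << 16) |
                            (std::uint32_t(buffer[2]) << 8) |
                            std::uint32_t(buffer[3]);
    if (buffer.size() - 4 < n) return false;
    sample.assign(buffer.begin() + 4, buffer.begin() + 4 + n);
    buffer.erase(buffer.begin(), buffer.begin() + 4 + n);
    return true;
}
```

The sender writes frame_sample(encoded) to the socket for each sample. The receiver appends whatever it reads to a buffer and calls next_sample in a loop until it returns false.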

C++ ffmpeg real-time video transmission

I am a student currently working on my final project. Our project focuses on research into a new type of network coding. My task is to build a real-time video transmission to test the network coding. I have learned a little FFmpeg and OpenCV and have finished a C++ program that splits the video into frames and sends it frame by frame. However, done this way, the transmitted data (the frames) is much larger than the original video file. My professor advised me to find the keyframes and inter-frame differences of the video (MJPEG format), so that I transmit only the keyframes and inter-frame differences instead of all the frames with their large amount of redundancy, and thereby reduce the transmitted data. I have no idea how to do this in C++ with FFmpeg or OpenCV. Can anyone give any advice?
For my old program, please refer here: C++ Video streaming and transmission
I would recommend against using ffmpeg/libav* at all. I would recommend using libx264 directly. By using x264 you can have greater control of NALU slice sizes as well as lower encoder latency by utilizing callbacks.
Two questions that may already help you:
How are you interfacing from C++ to ffmpeg? ffmpeg generally refers to the command-line tool; from C++ you generally use the individual libraries that are part of FFmpeg. You should use libavcodec to encode your frames and possibly libavformat to packetize them into a container format.
Which codec do you use?
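To make the keyframe plus inter-frame difference idea from the question concrete, here is a deliberately naive sketch on raw pixel bytes. It is not how libx264 or libavcodec work internally: real inter-frame codecs such as H.264 add motion compensation, transforms and entropy coding. MJPEG, by contrast, encodes every frame as an independent JPEG, so it has no inter frames to extract. The type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Frame = std::vector<std::uint8_t>;  // raw pixel bytes of one frame

// Difference between the current frame and the previous one (wraps modulo 256).
Frame encode_delta(const Frame& previous, const Frame& current) {
    assert(previous.size() == current.size());
    Frame delta(current.size());
    for (std::size_t i = 0; i < current.size(); ++i) {
        delta[i] = static_cast<std::uint8_t>(current[i] - previous[i]);
    }
    return delta;
}

// Rebuild the current frame on the receiver from the previous frame and the delta.
Frame apply_delta(const Frame& previous, const Frame& delta) {
    assert(previous.size() == delta.size());
    Frame current(delta.size());
    for (std::size_t i = 0; i < delta.size(); ++i) {
        current[i] = static_cast<std::uint8_t>(previous[i] + delta[i]);
    }
    return current;
}

// Number of bytes that changed between the two frames.
std::size_t changed_bytes(const Frame& delta) {
    std::size_t n = 0;
    for (std::uint8_t d : delta) n += (d != 0);
    return n;
}
```

The sender transmits the first frame in full (the keyframe) and then only deltas. A delta has the same length as a frame, so the saving only appears once the mostly-zero delta is compressed. In practice, letting an inter-frame encoder handle this is the more realistic route, as the answers above suggest.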