Capturing real time images from a network camera - c++

What is the best way to capture streamed MJPEG from a network IP camera?
I'd like to get frames and process them, using C++ (or Python extended with C++).
Is OpenCV my best option?

Apart from OpenCV, you can use mplayer with -vo yuv4mpeg redirected to a pipe to get a stream of uncompressed YUV images. You can create both the mplayer process and the pipe from C++.
Another way is to use an RTSP library (your IP camera probably uses RTSP as its protocol).
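To illustrate the mplayer route, here is a minimal sketch of reading a YUV4MPEG2 stream from any std::istream, such as one wrapped around the pipe. The names Y4mHeader, readY4mHeader and readY4mFrame are hypothetical, and the code assumes 4:2:0 planar frames (a full Y plane followed by quarter-size U and V planes). Creating the pipe itself is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Stream parameters taken from the YUV4MPEG2 header line (hypothetical type).
struct Y4mHeader {
    int width = 0;
    int height = 0;
};

// Parse the header line, e.g. "YUV4MPEG2 W640 H480 F25:1 Ip A1:1".
// Returns false if the magic word is missing or W/H are absent.
bool readY4mHeader(std::istream& in, Y4mHeader& hdr) {
    std::string line;
    if (!std::getline(in, line)) return false;
    std::istringstream fields(line);
    std::string token;
    if (!(fields >> token) || token != "YUV4MPEG2") return false;
    while (fields >> token) {
        if (token.size() < 2) continue;
        if (token[0] == 'W') hdr.width = std::stoi(token.substr(1));
        else if (token[0] == 'H') hdr.height = std::stoi(token.substr(1));
    }
    return hdr.width > 0 && hdr.height > 0;
}

// Read one frame: a "FRAME" line followed by the raw planar pixel data.
// Assumption: 4:2:0 subsampling, so the frame is Y + U/4 + V/4 bytes.
bool readY4mFrame(std::istream& in, const Y4mHeader& hdr,
                  std::vector<std::uint8_t>& frame) {
    assert(hdr.width > 0 && hdr.height > 0);
    std::string line;
    if (!std::getline(in, line)) return false;
    if (line.compare(0, 5, "FRAME") != 0) return false;
    const std::size_t w = static_cast<std::size_t>(hdr.width);
    const std::size_t h = static_cast<std::size_t>(hdr.height);
    const std::size_t chroma = ((w + 1) / 2) * ((h + 1) / 2);
    frame.resize(w * h + 2 * chroma);
    in.read(reinterpret_cast<char*>(frame.data()),
            static_cast<std::streamsize>(frame.size()));
    // A short read means the stream ended in the middle of a frame.
    return static_cast<std::size_t>(in.gcount()) == frame.size();
}
```

Call readY4mHeader once, then call readY4mFrame in a loop until it returns false. The first width*height bytes of each frame are the luma plane, which is often enough for grayscale processing.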

Related

How to make an MJPEG stream from a pixel array

I want to build a Raspberry Pi based network camera, but WITHOUT the original Raspberry Pi camera. I have a camera that gives me single frames, which I store in a 2D array at 1 byte per pixel because the image is grayscale.
What steps do I need to take to convert this into a stream that I can watch via an HTTP request, a socket stream, or RTSP?
How do I need to convert the single frames into something that I can stream over the network?
Do I take a single frame and convert it to JPEG (or MJPEG), or something different?
The programming language is C or C++, nothing else.
Thanks in advance.

How to get raw MJPG stream from webcam

I have a Logitech webcam which streams 1080p@30fps using MJPG compression via USB 2.0. I need to write this raw stream to the hard drive or send it over the network. I do NOT need to decompress it. OpenCV gives me decompressed frames, so I would have to compress them again, which wastes a lot of CPU time. How can I get the raw MJPEG stream as it comes from the camera instead? (Windows 7, Visual Studio, C++)
The native Windows video capture APIs, DirectShow and Media Foundation, let you capture video from a webcam in its original format. This is a natural task for these APIs and is done in a straightforward way (specifically, if a web camera provides a hardware-compressed M-JPEG feed, you can obtain it programmatically).
About Video Capture in DirectShow
Audio/Video Capture in Media Foundation
You are free to do whatever you want with the data afterwards: decompress it, send it over the network, compose a Motion JPEG over HTTP response feed, etc.
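As a rough illustration of the Motion JPEG over HTTP option, the sketch below builds the bytes a server would write to a connected client: one multipart/x-mixed-replace response header, then one part per JPEG frame. The function names and the boundary token "frame" are hypothetical choices, and the socket handling is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Any token that does not occur at the start of a line in the payload works;
// "frame" is an arbitrary choice for this sketch.
const std::string kBoundary = "frame";

// Sent once, right after the client's GET request has been read.
std::string mjpegResponseHeader() {
    return "HTTP/1.0 200 OK\r\n"
           "Content-Type: multipart/x-mixed-replace; boundary=" + kBoundary + "\r\n"
           "\r\n";
}

// Sent once per frame. `jpeg` holds one complete JPEG image, for example
// a compressed frame taken from the camera without re-encoding.
std::string mjpegPart(const std::vector<std::uint8_t>& jpeg) {
    assert(!jpeg.empty());
    std::string part = "--" + kBoundary + "\r\n"
                       "Content-Type: image/jpeg\r\n"
                       "Content-Length: " + std::to_string(jpeg.size()) + "\r\n"
                       "\r\n";
    part.append(jpeg.begin(), jpeg.end());  // raw image bytes
    part += "\r\n";                         // terminates this part
    return part;
}
```

A server would write mjpegResponseHeader() once and then mjpegPart(frame) for each new frame for as long as the client stays connected. Browsers typically render such a response as a continuously updating image.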

DXGI Desktop Duplication: encoding frames to send them over the network

I'm trying to write an app which will capture a video stream of the screen and send it to a remote client. I've found out that the best way to capture the screen on Windows is to use the DXGI Desktop Duplication API (available since Windows 8). Microsoft provides a neat sample which streams duplicated frames to the screen. Now I'm wondering what the easiest, but still relatively fast, way is to encode those frames and send them over the network.
The frames come from AcquireNextFrame with a surface that contains the desktop bitmap and metadata which contains dirty and move regions that were updated. From here, I have a couple of options:
Extract a bitmap from a DirectX surface and then use an external library like ffmpeg to encode a series of bitmaps to H.264 and send it over RTSP. While straightforward, I fear that this method will be too slow, as it doesn't take advantage of any native Windows methods. Converting a D3D texture to an ffmpeg-compatible bitmap seems like unnecessary work.
From this answer: convert the D3D texture to an IMFSample and use Media Foundation's SinkWriter to encode the frame. I found this tutorial on video encoding, but I haven't yet found a way to get each encoded frame immediately and send it, instead of dumping all of them to a video file.
Since I haven't done anything like this before, I'm asking if I'm moving in the right direction. In the end, I want to have a simple, preferably low latency desktop capture video stream, which I can view from a remote device.
Also, I'm wondering if I can make use of the dirty and move regions provided by Desktop Duplication. Instead of encoding the frame, I could send the regions over the network and do the processing on the client side, but this means that my client must have DirectX 11.1 or higher available, which is impossible if I want to stream to a mobile platform.
You can use the IMFTransform interface for H.264 encoding. Once you have an IMFSample from the ID3D11Texture2D, just pass it to IMFTransform::ProcessInput and get the encoded IMFSample from IMFTransform::ProcessOutput.
Refer to this example for encoding details.
Once you have the encoded IMFSamples, you can send them one by one over the network.
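Sending samples one by one over a byte stream such as TCP requires some way to mark where each sample ends. The sketch below shows one possible convention, a 4-byte big-endian length prefix. This wire format and the names frameSample and popSample are assumptions made for illustration, not part of Media Foundation, and the bytes of each encoded sample are assumed to have been copied out of the IMFSample already.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Sender side: prepend a 4-byte big-endian length to one encoded sample.
Bytes frameSample(const Bytes& sample) {
    assert(sample.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(sample.size());
    Bytes out = {
        static_cast<std::uint8_t>(n >> 24),
        static_cast<std::uint8_t>(n >> 16),
        static_cast<std::uint8_t>(n >> 8),
        static_cast<std::uint8_t>(n)
    };
    out.insert(out.end(), sample.begin(), sample.end());
    return out;
}

// Receiver side: `rx` accumulates whatever the socket delivered so far.
// Pops one complete sample off the front and returns true, or returns
// false and leaves `rx` untouched if more data is needed.
bool popSample(Bytes& rx, Bytes& sample) {
    if (rx.size() < 4) return false;
    const std::uint32_t n = (static_cast<std::uint32_t>(rx[0]) << 24) |
                            (static_cast<std::uint32_t>(rx[1]) << 16) |
                            (static_cast<std::uint32_t>(rx[2]) << 8) |
                             static_cast<std::uint32_t>(rx[3]);
    if (rx.size() - 4 < n) return false;
    const std::ptrdiff_t end = 4 + static_cast<std::ptrdiff_t>(n);
    sample.assign(rx.begin() + 4, rx.begin() + end);
    rx.erase(rx.begin(), rx.begin() + end);
    return true;
}
```

The receiver appends every chunk it reads from the socket to rx and calls popSample in a loop, passing each returned sample to the decoder. For a low-latency stream, a protocol such as RTP over UDP may suit better than this simple TCP framing.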

How to use live555 streaming media forwarding

I use a Live555 H.264 stream client to receive frame packets from an IP camera, decode the buffer with ffmpeg, and analyze the frames with OpenCV. (The pipeline is based on the testRTSPClient sample; I decode the H.264 frame buffer in DummySink::afterGettingFrame() with ffmpeg.)
Now I want to stream the frames to another (remote) client in OnDemand mode and in real time. The frames may have analysis results added (bounding boxes, text, etc.). How can I use Live555 to achieve this?
Well, your best bet is to re-encode the resultant frame (with bounding boxes etc.) and pass it to an RTSPServer process, which will allow you to connect to it using an RTSP URL and stream the encoded data to any compatible RTSP client. The FAQ has a good reference on how to do this, http://www.live555.com/liveMedia/faq.html#liveInput, which walks you through the steps and provides example source code that you can modify for your needs.

Getting a snapshot from an rtsp video stream from an IP camera

Normally, I can get a still snapshot from an IP camera with a vendor-provided URL. However, the JPEGs served this way are not of good enough quality, and the vendor says there is no facility for serving snapshots in other image formats or with lower or lossless compression.
I noticed that when I open an RTSP H.264 stream from the camera with VLC and then manually take a screenshot, the resulting image has none of the JPEG artifacts observed previously.
The question is, how would I obtain these superior snapshots from an H.264 stream with a C++ program? I need to perform multiple operations on the image (annotations, cropping, face recognition), but those have to come after getting an initial image of the highest possible quality.
(Note that this is related to my previous question. I obtained JPEG images with CURL but would now like to replace the snapshot getter with this new one if possible. I am again running on Linux, Fedora 11.)
You need an RTSP client implementation to connect to the camera, start receiving the video feed, and defragment/depacketize the video frames; you can then save, process, or present them as needed.
You might want to look at the live555 library, a well-known RTSP library/implementation.
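To give an idea of what the defragment/depacketize step involves for H.264, here is a minimal sketch of reassembling NAL units from RTP payloads as described in RFC 6184. It handles only single NAL unit packets and FU-A fragments, ignores sequence numbers and packet loss, and uses a hypothetical class name. A library such as live555 does this work for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical helper: turns RTP payloads into complete H.264 NAL units.
class H264Depacketizer {
public:
    // Feed one RTP payload (the RTP header must already be stripped).
    // Returns true when `nal` holds a complete NAL unit.
    bool push(const std::uint8_t* payload, std::size_t len, Bytes& nal) {
        assert(payload != nullptr || len == 0);
        if (len == 0) return false;
        const std::uint8_t type = payload[0] & 0x1F;
        if (type >= 1 && type <= 23) {            // single NAL unit packet
            nal.assign(payload, payload + len);
            partial_.clear();
            return true;
        }
        if (type != 28 || len < 2) return false;  // only FU-A is handled here
        const bool start = (payload[1] & 0x80) != 0;
        const bool end = (payload[1] & 0x40) != 0;
        if (start) {
            partial_.clear();
            // Rebuild the NAL header: F and NRI bits come from the FU
            // indicator, the NAL type comes from the FU header.
            partial_.push_back(static_cast<std::uint8_t>(
                (payload[0] & 0xE0) | (payload[1] & 0x1F)));
        } else if (partial_.empty()) {
            return false;                         // start fragment was missed
        }
        partial_.insert(partial_.end(), payload + 2, payload + len);
        if (!end) return false;
        nal.swap(partial_);
        partial_.clear();
        return true;
    }

private:
    Bytes partial_;  // fragments of the NAL unit currently being rebuilt
};
```

The resulting NAL units are still compressed. To get a snapshot, they have to be passed to an H.264 decoder (ffmpeg, for example), and the decoded picture can then be saved in a lossless format.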