Effective motion detection with OpenCV with stream received from IP Camera - c++

I have two questions that I have been struggling to find answers to online for more than a week.
I'm writing a Windows service in Visual C++ 2017 that connects to Axis IP cameras on our network and requests MJPEG streams over regular sockets. It successfully parses the streams and decodes the JPEG images. Decoding is done with OpenCV:
frame = cv::imdecode(data, IMREAD_GRAYSCALE);
Q1. Although OpenCV claims to use a high-performance JPEG library, build-libjpeg-turbo (ver 1.5.3-62), decoding is surprisingly slower than .NET's System.Drawing.Image.FromStream(ms). Do you have any recommendation for really fast JPEG decompression?
Q2. All I need to do with the received JPEGs is check whether there is motion in certain "regions of interest". These are production lines in a factory. The factory runs 24 hours a day, six days a week, so lighting conditions will change. Sometimes there won't be any light at all, so the JPEGs will contain plenty of noise. Which OpenCV operations and algorithms would you recommend applying to the frames to tell whether there is motion in an ROI? Of course you can chain plenty of operations on the matrices one after another, but I need the shortest and most effective way, to keep resource requirements low, because this will run for many cameras and ROIs at the same time.
My system has an NVIDIA video card (so I can use CUDA), an Intel i7-7700, and 16 GB of RAM.
Thank you!

This is not exactly an answer to your question, but it may even be a better approach.
Axis IP cameras have long had an on-board motion detection engine that can be configured both via the camera web UI (on old camera models/firmware versions this may require Internet Explorer and an embedded ActiveX control) and via VAPIX, the Axis HTTP camera API.
The same VAPIX HTTP API also has commands to receive the motion level and threshold for each motion area/window configured on the camera.
If you don't have a recent model that supports VAPIX version 3, you may still rely on VAPIX version 2. You can try issuing an HTTP GET request such as:
to get an HTTP multipart stream of the motion level and threshold data (i.e. for motion areas 0 and 1).
For more detailed information, you can download the relevant VAPIX PDF documentation from the Axis website (may require an account and login).
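If the frames do have to be analysed on the host, as Q2 asks, one cheap approach is plain frame differencing restricted to the ROI. The following is only a minimal sketch under stated assumptions, not a tuned detector: it works on raw 8-bit grayscale buffers (such as the data of a decoded grayscale frame), subtracts the change in mean ROI brightness to tolerate global lighting changes, and uses two thresholds to ignore sensor noise. The names GrayFrame, Roi, changedFraction, hasMotion, pixelThreshold and minFraction are all hypothetical. OpenCV's cv::absdiff, cv::threshold and cv::countNonZero could be combined to the same effect.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for this sketch; they are not OpenCV types.
struct GrayFrame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // row-major, one byte per pixel
};

struct Roi {
    int x, y, width, height;
};

// Mean brightness inside the ROI.
double roiMean(const GrayFrame& f, const Roi& r) {
    long long sum = 0;
    for (int y = r.y; y < r.y + r.height; ++y) {
        for (int x = r.x; x < r.x + r.width; ++x) {
            sum += f.pixels[static_cast<std::size_t>(y) * f.width + x];
        }
    }
    return static_cast<double>(sum) / (static_cast<double>(r.width) * r.height);
}

// Fraction of ROI pixels whose brightness changed by more than pixelThreshold
// once the change in mean ROI brightness (a crude lighting model) is removed.
double changedFraction(const GrayFrame& prev, const GrayFrame& curr,
                       const Roi& r, double pixelThreshold) {
    assert(prev.width == curr.width && prev.height == curr.height);
    assert(curr.pixels.size() ==
           static_cast<std::size_t>(curr.width) * curr.height);
    assert(prev.pixels.size() == curr.pixels.size());
    assert(r.x >= 0 && r.y >= 0 && r.width > 0 && r.height > 0);
    assert(r.x + r.width <= curr.width && r.y + r.height <= curr.height);

    const double offset = roiMean(curr, r) - roiMean(prev, r);
    std::size_t changed = 0;
    for (int y = r.y; y < r.y + r.height; ++y) {
        for (int x = r.x; x < r.x + r.width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * curr.width + x;
            const double diff =
                static_cast<double>(curr.pixels[i]) - prev.pixels[i] - offset;
            if (std::fabs(diff) > pixelThreshold) {
                ++changed;
            }
        }
    }
    return static_cast<double>(changed) /
           (static_cast<double>(r.width) * r.height);
}

// Motion is reported only when enough pixels changed, so that isolated
// noisy pixels in dark frames do not trigger it.
bool hasMotion(const GrayFrame& prev, const GrayFrame& curr, const Roi& r,
               double pixelThreshold, double minFraction) {
    return changedFraction(prev, curr, r, pixelThreshold) >= minFraction;
}
```

Both thresholds would have to be tuned per camera and per ROI. Comparing against a slowly updated background frame instead of the previous frame is a common variation when objects move slowly.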


Feed GStreamer sink into OpenPose

I have a custom USB camera with a custom driver on a custom NVIDIA Jetson TX2 board, and it is not detected by the OpenPose examples. I access the data using a custom GStreamer source. I currently pull frames into a cv::Mat, color-convert them, and feed them into OpenPose one picture at a time. This works fine, but it is 30-40% slower than a comparable video stream from a plug-and-play camera. Since I'm trying to maximize the fps, I would like to explore features such as tracking that are available for streams. I believe the stream feed is faster because it makes better (continuous) use of the GPU.
In particular, the speedup would come at the expense of confidence, which would be addressed later: one frame goes through pose estimation, and the next 3-4 frames just track the object with decreasing confidence levels. I tried that with a plug-and-play camera and the OpenPose example, and the results were somewhat satisfactory.
Where I am stuck is that I can get the video stream into a cv::VideoCapture, but I do not know how to hand that VideoCapture to OpenPose for processing.
If there is a better way to do it, I am happy to try different things, but the bottom line is that the custom camera stays (I know ;/). Solutions to the issue described, or different ideas, are welcome.
Things I already tried:
Lowering the camera resolution (below a certain resolution the camera crops instead of binning, so I can't really go below 1920x1080; it's a 40+ megapixel video camera, by the way)
Using CUDA to shrink the image before feeding it to OpenPose (shrink + pose estimation took virtually the same time as pose estimation on the original image)
Since the camera view is static, checking for changes between frames, cropping the image down to the area that changed, and running pose estimation on that section (10% speedup, high risk of missing something)
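For reference, the third item (cropping to the area that changed) can be sketched roughly as below. This is an illustrative sketch only, not the code actually used: it assumes 8-bit grayscale frames stored row-major in plain vectors, and the names ChangeBox, changedBoundingBox, threshold and margin are hypothetical. The margin pads the box so that a subject is less likely to be cut off at the edge of the crop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical result type: bounding box of all changed pixels.
struct ChangeBox {
    bool found;  // false when no pixel changed by more than the threshold
    int x, y, width, height;
};

// Scans two equally sized grayscale frames and returns the smallest box,
// padded by `margin` pixels and clamped to the frame, that contains every
// pixel whose absolute difference exceeds `threshold`.
ChangeBox changedBoundingBox(const std::vector<std::uint8_t>& prev,
                             const std::vector<std::uint8_t>& curr,
                             int width, int height,
                             int threshold, int margin) {
    assert(width > 0 && height > 0 && margin >= 0);
    assert(prev.size() == static_cast<std::size_t>(width) * height);
    assert(curr.size() == prev.size());

    int minX = width, minY = height, maxX = -1, maxY = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            const int diff =
                std::abs(static_cast<int>(curr[i]) - static_cast<int>(prev[i]));
            if (diff > threshold) {
                minX = std::min(minX, x);
                maxX = std::max(maxX, x);
                minY = std::min(minY, y);
                maxY = std::max(maxY, y);
            }
        }
    }
    if (maxX < 0) {
        return ChangeBox{false, 0, 0, 0, 0};
    }
    const int x0 = std::max(0, minX - margin);
    const int y0 = std::max(0, minY - margin);
    const int x1 = std::min(width - 1, maxX + margin);
    const int y1 = std::min(height - 1, maxY + margin);
    return ChangeBox{true, x0, y0, x1 - x0 + 1, y1 - y0 + 1};
}
```

The returned box could then be used to crop the color frame before pose estimation. A single noisy pixel far from the subject inflates the box, which is one reason the gain from this approach can be small.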

How to turn any camera into a Depth Camera?

I want to build a depth camera that can pick out anything in the image at a particular distance. I have already read the following link.
But I couldn't clearly understand which hardware is required and how to integrate it all together.
Certainly, a depth sensor needs an IR sensor, just like the Kinect, the Asus Xtion, and other available cameras that provide a depth or range image. However, Microsoft has come up with machine-learning techniques and algorithmic modifications, described in research you can find here. Also, here is a video link showing a mobile camera that has been modified to produce depth renderings. Some hardware changes might still be necessary if you want to turn a standalone 2D camera into such a device, so I would suggest looking at the hardware design of existing devices on the market as well.
One way or the other, you need two views of the same points to get depth. So search for depth sensors and examples, e.g. Kinect with ROS or OpenCV, or here.
You could also convert two camera streams into a point cloud, but that's another story.
Here's what I know:
3D Cameras
RGB-D and stereoscopic cameras are popular for these applications but are not always practical or available. I've prototyped with Kinects (v1, v2) and Intel cameras (R200, D435). Those are certainly still the preferred option today.
2D Cameras
If you want to use RGB data for depth information, you need an algorithm that does the math for each frame; try an RGB SLAM. A good algorithm will not process all the data every frame: it processes all the data once and then looks for clues that indicate changes in your scene. A number of big companies have already done this (it's not that difficult if you have a big team with big money); think Google, Apple, Microsoft, etc.
Good luck out there, make something amazing!

OpenCV vs FFmpeg performance comparison for webcam feed recording

I'm working on an academic project in which I have to record video from a webcam.
While searching on Google, I found that FFmpeg uses pipelining to record video from the camera, while OpenCV and AVbin use ctypes.
I don't know the pros and cons of either method, and I wonder if you can help me decide which one to choose for writing this kind of program on Linux.
I need to record 1024x768 video at 30 Hz without latency or lag issues. Performance is the top priority.

How to connect two Kinect v2 sensors to one computer

I'm updating an application that uses three Kinect v1 sensors with SDK 1.8, all connected to the same computer.
I am now moving the application to Kinect v2 to improve the performance of my system. The latest version of the Microsoft SDK, 2.0, does not support multiple sensors on one machine.
The only working solution I have tried is to use three different PCs, one for each Kinect v2, and exchange data over an Ethernet connection.
The problem with this solution is that it is too expensive.
The minimum specs for Kinect v2 require an expensive PC, whereas I was hoping to use this approach with small, cheap computers like the Raspberry Pi 2.
My questions are:
Do you know of any hack that allows multiple Kinect v2 sensors to be connected to the same computer?
Do you know of any low-cost, Raspberry-Pi-like solution that meets the minimum Kinect v2 requirements? (http://www.microsoft.com/en-us/kinectforwindows/purchase/sensor_setup.aspx)
If you only need the video and depth data, you could look into using https://github.com/OpenKinect/libfreenect2
On such hardware, it would be understandable if the maximum frame rate were a bit lower than what you get on an Intel i5 system with USB 3.0.
The rest of the high requirements are also needed for skeleton tracking, so that won't be available in this setup; it is not part of libfreenect2 either.

FCam - low light / HDR photos

I am using FCam to take pictures and right now without modification, the pictures are par for a smartphone camera. FCam advertises HDR and low-light performance, but I don't see any examples of how to use that when taking pictures.
How do I take HDR pictures? From my experience with SLRs, you normally take 3 pictures, 1 under, 1 over, and 1 exposed properly for the scene.
Since I will be taking many pictures, how should I blend those pixels together? An average?
The FCam project page includes a complete camera app, FCamera (see the last item on the FCam Download Page). For "HDR Viewfinder" it simply averages a long-exposure and a short-exposure image together, and for "HDR Capture" it automatically records a burst of suitably exposed shots. See src/CameraThread.cpp in the sources; I'm not sure how appropriate it is to quote from that, but you'll find both pieces in CameraThread::run().
It doesn't average the HDR images for you; it records them as a sequence. I guess that's intentional: much of the "HDR" appeal comes from carefully tuning the tone mapping after the averaging process, i.e. adjusting exactly how the dynamic range is compressed back to 8 bits. If you did that in a hardcoded way on the camera, you would restrict the photographer's options for achieving the optimal output. The MPI has a research group on HDR imaging techniques that provides source code for this purpose.
In short, a "poor man's HDR" would just be an average. A "proper HDR" will never be an 8-bit JPEG because that throws away the "high" bit in "high dynamic range". For that reason, the conversion from HDR (which will have 16 bits per color channel or even more) to e.g. JPEG is usually done as a postprocessing (off-camera) step, from the HDR image sequence.
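As a rough illustration of the "poor man's HDR" average, here is a minimal sketch, assuming every exposure is available as an 8-bit buffer of the same size and that the shots are already aligned. The function name averageExposures is hypothetical and is not part of FCam. The result is kept as float so the extra precision is not discarded before tone mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Averages N equally sized 8-bit exposures sample by sample.
// The output stays in floating point; quantizing back to 8 bits is
// left to a later tone-mapping step.
std::vector<float> averageExposures(
        const std::vector<std::vector<std::uint8_t>>& exposures) {
    assert(!exposures.empty());
    const std::size_t sampleCount = exposures.front().size();

    std::vector<float> result(sampleCount, 0.0f);
    for (const auto& exposure : exposures) {
        assert(exposure.size() == sampleCount);
        for (std::size_t i = 0; i < sampleCount; ++i) {
            result[i] += static_cast<float>(exposure[i]);
        }
    }
    const float count = static_cast<float>(exposures.size());
    for (float& value : result) {
        value /= count;
    }
    return result;
}
```

With an under-exposed, a normal and an over-exposed shot this gives one float value per sample. A weighted average that trusts mid-range pixels more than clipped ones would be a natural refinement, but that goes beyond the plain average described above.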
Note on HDR video
For HDR video, if you're recording with a single sensor on a hand-held device, you'll normally introduce motion between the images that form the "HDR sequence" (your total exposure time equals the sum of all sub-exposures, plus latency from sensor data reads and camera controller reprogramming).
That means image registration should be attempted before the actual overlay and final tone mapping operation, unless you're OK with the blur. Registration is somewhat compute-intensive, which is another good reason to record the image stream first and create the HDR video later (with some manual adjustment offered). The OpenCV library provides registration / matching functions.
The above-mentioned MPI software is PFSTools, particularly the Tone Mapping operators (PFStmo) library. The research papers by one of the authors provide a good starting point. As to your question on how to perform the postprocessing: PFSTools are command-line utilities that interoperate and pass data via UNIX pipes. On Maemo / the N900, their use is straightforward thanks to the full Linux environment; just spawn a shell script via system().