Creating mosaic of camera feeds in opencv without significant delay

I'm working on a project with OpenCV and C++. The OpenCV version is 3.1. The hardware is an Nvidia gt460, an Intel i7 3820 and 64 GB of RAM. I'm trying to build a multi-camera setup where all camera feeds are merged into one big mosaic, 4x4 in the early stages and maybe bigger later. After that I will analyze this mosaic and track multiple objects.
The problem is that when I open a camera feed with the capture command in OpenCV, store the frame in a matrix, analyze it and show it, there is already a big FPS problem with just two camera feeds. I have tested three USB feeds as well as multiple UDP and RTSP streams. With USB, delay is not the biggest problem, but the FPS seems to be split between the feeds. With network streams I get low FPS and high delay (around 15 seconds). I also noticed that the delay differs between camera feeds even when the cameras are pointed at the same thing.
Can anybody help me, or has anybody solved a similar problem?
Is it a limitation of OpenCV that it cannot analyze several live feeds simultaneously?
Here's my merging code:
merged_frame = Mat(Size(1280, 960), CV_8UC3);
roi = Mat(merged_frame, Rect(0, 0, 640, 480));
cameraFeed.copyTo(roi);
roi = Mat(merged_frame, Rect(640, 0, 640, 480));
cameraFeed2.copyTo(roi);
roi = Mat(merged_frame, Rect(0, 480, 640, 480));
cameraFeed3.copyTo(roi);
roi = Mat(merged_frame, Rect(640, 480, 640, 480));
cameraFeed4.copyTo(roi);
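
The four ROI rectangles above follow a regular grid, so for a larger mosaic (such as the 4x4 case mentioned) they could be computed instead of written out by hand. The following is a minimal sketch of that arithmetic only, using the standard library. `TileRect`, `tileRect` and `mosaicLayout` are hypothetical names that are not part of OpenCV, and `TileRect` merely stands in for `cv::Rect`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type; plays the role of cv::Rect in the code above.
struct TileRect {
    int x, y, width, height;
};

// Position and size of tile `index` in a row-major grid with `cols` columns.
constexpr TileRect tileRect(int index, int cols, int tileW, int tileH) {
    return TileRect{(index % cols) * tileW, (index / cols) * tileH, tileW, tileH};
}

// All tile rectangles of a rows x cols mosaic, in row-major order.
std::vector<TileRect> mosaicLayout(int rows, int cols, int tileW, int tileH) {
    assert(rows > 0 && cols > 0);
    std::vector<TileRect> tiles;
    tiles.reserve(static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows * cols; ++i) {
        tiles.push_back(tileRect(i, cols, tileW, tileH));
    }
    return tiles;
}
```

With `mosaicLayout(2, 2, 640, 480)` this yields the same four rectangles as the hand-written code. Each one would then be turned into a `Rect` and used as the ROI that the corresponding camera frame is copied into.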

There exist two functions, hconcat and vconcat, that are not in the documentation.
You can see an example of their use here. They are quite easy to use if all your camera feeds provide frames with the same resolution.
This will probably require you to create temporary Mat objects to store intermediate results, but I think it's a more intuitive way to create a mosaic of frames.

Related

OpenCV: Difference between cap.set(CAP_PROP_FRAME_WIDTH or CAP_PROP_FRAME_HEIGHT) and resize()

I am working on OpenCV (4.3.0) and trying to understand how I can change the resolution of the image.
To my understanding there are 2 ways that I can do this,
Using "cap.set()" function
cv::VideoCapture cap(0);
cap.set(CAP_PROP_FRAME_WIDTH, 320);//Setting the width of the video
cap.set(CAP_PROP_FRAME_HEIGHT, 240);//Setting the height of the video
Using "resize()" function,
int up_width = 600;
int up_height = 400;
Mat resized_up;
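// fx and fy are 0 because the target size is given explicitly; the interpolation flag is the sixth argument
resize(image, resized_up, Size(up_width, up_height), 0, 0, INTER_LINEAR);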
I wanted to understand if they both are the same or if they are different. What are the exact differences between them?

CAP here stands for capture: CAP_PROP_FRAME_WIDTH and CAP_PROP_FRAME_HEIGHT are properties of the capture device. We request the resolution before capturing from the camera, and the request only takes effect if the camera supports that resolution. For example, a webcam might support a fixed set of resolutions such as 1280x720 and 640x480, and we can only set the camera to one of those.
resize is an interpolation function (bilinear, bicubic or another method) that resizes (upscales or downscales) a frame to any desired size.

Here is an answer to one of your questions. CAP_PROP_FRAME_WIDTH and CAP_PROP_FRAME_HEIGHT are capture properties. The documentation says:
Reading / writing properties involves many layers. Some unexpected
result might happens along this chain. Effective behaviour depends
from device hardware, driver and API Backend.
So if your camera's backend is one of the backends OpenCV supports, you will be able to change the resolution of your camera (provided the camera itself supports more than one resolution). For example, a camera may support both 640x480 and 1920x1080. If OpenCV supports this camera's backend, you can switch between those resolutions with:
cap.set(CAP_PROP_FRAME_WIDTH, 640);
cap.set(CAP_PROP_FRAME_HEIGHT, 480);
or
cap.set(CAP_PROP_FRAME_WIDTH, 1920);
cap.set(CAP_PROP_FRAME_HEIGHT, 1080);
What about resize?
resize() is a completely different concept from the one discussed above. Video properties are hardware-based: if you use a camera with a 640x480 resolution, the camera sensor has a dedicated photosensitive cell for each pixel. resize(), however, works in software on the resulting image. It interpolates the image to a new height and width. In other words, the result looks as if you were looking at the scene from very close (zoom in) or from very far away (zoom out).
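
To make the "in software" point concrete, here is a minimal sketch of bilinear resizing on a plain grayscale buffer, using only the standard library. It is an illustration of the general idea and not OpenCV's actual implementation. `GrayImage` and `resizeBilinear` are hypothetical names, and the pixel-centre mapping used here is an assumption chosen for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal image type: row-major grayscale pixels.
struct GrayImage {
    int width;
    int height;
    std::vector<float> pixels;
};

// Software bilinear resize: every output pixel is a weighted mix of the
// (up to) four nearest input pixels. No camera or driver is involved.
GrayImage resizeBilinear(const GrayImage& src, int dstW, int dstH) {
    assert(src.width > 0 && src.height > 0 && dstW > 0 && dstH > 0);
    GrayImage dst{dstW, dstH, std::vector<float>(static_cast<std::size_t>(dstW) * dstH)};
    const float scaleX = static_cast<float>(src.width) / dstW;
    const float scaleY = static_cast<float>(src.height) / dstH;
    // Read one source pixel by column and row.
    auto at = [&src](int x, int y) {
        return src.pixels[static_cast<std::size_t>(y) * src.width + x];
    };
    for (int y = 0; y < dstH; ++y) {
        // Map the output pixel centre back into source coordinates, clamped to the image.
        const float sy = std::min(std::max((y + 0.5f) * scaleY - 0.5f, 0.0f), src.height - 1.0f);
        const int y0 = static_cast<int>(sy);
        const int y1 = std::min(y0 + 1, src.height - 1);
        const float wy = sy - y0;
        for (int x = 0; x < dstW; ++x) {
            const float sx = std::min(std::max((x + 0.5f) * scaleX - 0.5f, 0.0f), src.width - 1.0f);
            const int x0 = static_cast<int>(sx);
            const int x1 = std::min(x0 + 1, src.width - 1);
            const float wx = sx - x0;
            // Interpolate horizontally on both rows, then vertically between them.
            const float top = at(x0, y0) * (1.0f - wx) + at(x1, y0) * wx;
            const float bottom = at(x0, y1) * (1.0f - wx) + at(x1, y1) * wx;
            dst.pixels[static_cast<std::size_t>(y) * dstW + x] = top * (1.0f - wy) + bottom * wy;
        }
    }
    return dst;
}
```

The function only ever reads pixels that were already captured, which is why it can produce any output size, while cap.set() can only select what the device and driver offer.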

using OpenCV to capture images, not video

I'm using OpenCV4 to read from a camera. Similar to a webcam. Works great, code is somewhat like this:
cv::VideoCapture cap(0);
cap.set(cv::CAP_PROP_FRAME_WIDTH , 1600);
cap.set(cv::CAP_PROP_FRAME_HEIGHT, 1200);
while (true)
{
    cv::Mat mat;
    // wait for some external event here so I know it is time to take a picture...
    cap >> mat;
    process_image(mat);
}
Problem is, this gives many video frames, not a single image. This is important because in my case I don't want nor need to be processing 30 FPS. I actually have specific physical events that trigger reading the image from the camera at certain times. Because OpenCV is expecting the caller to want video -- not surprising considering the class is called cv::VideoCapture -- it has buffered many seconds of frames.
What I see in the image is always from several seconds ago.
So my questions:
Is there a way to flush the OpenCV buffer?
Or to tell OpenCV to discard the input until I tell it to take another image?
Or to get the most recent image instead of the oldest one?
The other option I'm thinking of investigating is using V4L2 directly instead of OpenCV. Will that let me take individual pictures or only stream video like OpenCV?
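
No answer to this question is included here. One commonly used pattern, offered as an assumption and not as a confirmed fix for this setup, is to let a background thread read from the camera continuously so that stale frames never pile up, and to keep only the newest frame in a single slot. The sketch below shows just that slot, using the standard library. `LatestFrameSlot` is a hypothetical name, and `Frame` stands in for `cv::Mat`.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical single-slot mailbox: the producer keeps overwriting the slot,
// so the consumer only ever sees the most recent frame.
template <typename Frame>
class LatestFrameSlot {
public:
    // Called by the capture thread for every frame; the previous frame is dropped.
    void publish(Frame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(frame);
        hasFrame_ = true;
    }

    // Called when the external trigger fires. Returns false if no new frame
    // arrived since the last call; otherwise moves the newest frame into `out`.
    bool take(Frame& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!hasFrame_) {
            return false;
        }
        out = std::move(latest_);
        hasFrame_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    Frame latest_{};
    bool hasFrame_ = false;
};
```

In the loop above, the capture thread would call `publish` after every `cap >> mat` (with a deep copy of the frame), and the event handler would call `take` before `process_image`. Whether this removes the several-second lag depends on the backend and driver.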

Feed GStreamer sink into OpenPose

I have a custom USB camera with a custom driver on a custom Nvidia Jetson TX2 board, and it is not detected by the OpenPose examples. I access the data using a custom GStreamer source. I currently pull frames into a cv::Mat, color convert them and feed them into OpenPose one picture at a time. This works fine but is 30-40% slower than a comparable video stream from a plug-and-play camera. I would like to explore things like tracking, which is available for streams, since I'm trying to maximize the FPS. I believe the stream feed is superior due to better (continuous) use of the GPU.
The speedup would come at the expense of confidence, which I would address later. One frame goes through pose estimation and the 3-4 subsequent frames just track the object with decreasing confidence levels. I tried that with a plug-and-play camera and the OpenPose example, and the results were somewhat satisfactory.
Where I got stuck is that I can put the video stream into a cv::VideoCapture, but I do not know how to provide that VideoCapture to OpenPose for processing.
If there is a better way to do it, I am happy to try different things but the bottom line is that the custom camera stays (I know ;/). Solutions to the issue described or different ideas are welcome.
Things I already tried:
Lowering the resolution of the camera (the camera crops below a certain resolution instead of binning, so I can't really go below 1920x1080; it's a 40+ megapixel video camera, by the way)
Using CUDA to shrink the image before feeding it to OpenPose (the shrink plus pose estimation time was virtually equal to the pose estimation time on the original image)
Since the camera view is static, checking for changes between frames, cropping the image down to the area that changed and running pose estimation on that section (10% speedup, high risk of missing something)

Why does a full-screen window in OpenCV (Banana Pi, Raspbian) slow down the camera footage and make it lag?

Currently I’m working on a project to mirror a camera for a blind spot.
The camera outputs a 640 x 480 NTSC signal.
The output screen is 854 x 480 NTSC.
I grab the camera with an EasyCAP video grabber.
On the Banana Pi I installed OpenCV 2.4.9.
The critical point of this project is that the video on the display needs to be real time.
Whenever I comment out the line that puts the window into fullscreen, a small window pops up and the footage runs without delay or lag.
But when I set the video to full screen, the footage becomes slow and lags.
Part of the code:
namedWindow("window",0);
setWindowProperty("window",CV_WND_PROP_FULLSCREEN,CV_WINDOW_FULLSCREEN);
while(1){
    cap>>image;
    flip(image, destination,1);
    imshow("window",destination);
    waitKey(33); //delay 33 ms
}
How can I fill the screen with the camera footage without losing speed and frames?
Is it possible to output the footage directly to the composite output?

The problem is that upscaling and drawing are done in software here. The Banana Pi processor is not powerful enough to sustain the required throughput at 30 frames per second.
This is an educated guess on my side, as even desktop systems can run into lag problems when processing and simultaneously displaying video.
A common solution in the computer vision community for this problem is to use OpenGL for display. Here, the upscaling and display are offloaded to the graphics processor. You can do the same thing on a Banana Pi.
If you compiled OpenCV with OpenGL support, you can try it like this:
namedWindow("window", WINDOW_OPENGL);
imshow("window", destination);
Note that if you use OpenGL, you can also save the flip operation by using an appropriate modelview matrix. For this, however, you probably need to dive into GL code yourself instead of using imshow.

I fixed the whole problem by using:
namedWindow("window",1);
Flag 1 stands for WINDOW_AUTOSIZE.
The footage is more real-time now.
I’m using a small monitor, so the window size is nearly the same as the monitor.

Combining Direct3D, Axis to make multiple IP camera GUI

Right now, I'm trying to make a new GUI, essentially a piece of software using DirectX (more exactly, Direct3D), that displays streaming images from Axis IP cameras.
For the time being I figured that the flow for the entire program would be like this:
1. Get the Axis program to get streaming images
2. Pass the images to the Direct3D program.
3. Display the program on the screen.
Currently I have made a somewhat basic Direct3D app that loads and displays video frames from AVI videos (for testing). I don't know how to load images directly from videos using DirectX, so I used OpenCV to save frames from the video and have DirectX upload them. This is very slow.
Right now I have some unclear things:
1. How to get an Axis program that works in C++ (I'm going to look up examples later, probably no big deal).
2. How to upload images directly from the Axis IP camera program.
So, do you have any recommendations or suggestions on how to make my program work more efficiently? Anything at all, just let me know.

Well, you may find it faster to use DirectShow and add a custom renderer at the far end that copies the decompressed video data directly to a Direct3D texture.
It's well worth double buffering that texture, i.e. have texture 0 displaying and texture 1 being uploaded to, and then swap the two over when a new frame is available (i.e. display texture 1 while uploading to texture 0).
This way you can decouple the video frame rate from the rendering frame rate, which makes dropped frames a little easier to handle.
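
The double-buffering idea described above can be sketched independently of Direct3D. The following minimal example uses only the standard library. `DoubleBuffer` is a hypothetical name and `Texture` stands in for whatever texture handle the renderer uses. The sketch is not thread-safe on its own, so it assumes upload and swap are coordinated by the caller.

```cpp
#include <array>
#include <cassert>

// Hypothetical double buffer: one slot is displayed while the other is written.
template <typename Texture>
class DoubleBuffer {
public:
    // Slot the renderer should draw from.
    const Texture& front() const { return slots_[frontIndex_]; }

    // Slot the video decoder should upload the next frame into.
    Texture& back() { return slots_[1 - frontIndex_]; }

    // Call once a complete new frame has been written to back().
    void swap() { frontIndex_ = 1 - frontIndex_; }

private:
    std::array<Texture, 2> slots_{};
    int frontIndex_ = 0;
};
```

The renderer can call `front()` at its own rate. If no new video frame has arrived it simply draws the same texture again, which is what decouples the two frame rates.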

I use in-place updates of Direct3D textures (using IDirect3DTexture9::LockRect) and it works very fast. Which part of your program is slow?

To capture images from Axis cams you may use the iPSi C++ library: http://sourceforge.net/projects/ipsi/
It can be used for capturing images and control camera zoom and rotation (if available).
