CUDA equivalent of estimateRigidTransform in OpenCV 3 - c++

I'm working on a video stabilisation project using OpenCV, and I've got a CPU implementation working but the performance needs improvement so I'm trying to move most of the processing to the GPU.
The current implementation primarily uses these four OpenCV functions:
So far I've found the following equivalents on the GPU:
Is there a CUDA equivalent of estimateRigidTransform?

OpenCV doesn't have an implementation of estimateRigidTransform for CUDA.
There is an OpenCV-based project on GitHub that has functions for computing homographies and estimating rigid transforms:
Here is the function you need:


Fast Panorama Image Compositing

I am working on a live panorama algorithm in C++. Basically, I took Stitching_detailed.cpp from OpenCV as a reference and started to modify it according to my needs. I have been carefully studying the stitching pipeline on which it is based (as detailed in Images stitching by OpenCV and Automatic Panoramic Image Stitching using Invariant Features).
My main problem now is the execution time. I tried to implement as much as I could in CUDA. However, I ran into problems with the compositing block: there seems to be no CUDA implementation of the Graph Cut Seam Finder algorithm.
I am aiming to stitch the images from 6 different cameras in ~60 ms (~15 FPS). However, in my current implementation the CPU version of the GraphCut Seam Finder takes about 90 ms for only 3 images (0.1 MP each).
I have tried other, less computationally expensive seam finders such as Voronoi and DP, but unfortunately the result is an unpleasant stitched image.
I am kind of lost here now. What could I do to speed up this part? Are there any other seam-finding or blending techniques I could make use of?
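For readers unfamiliar with the DP option mentioned above, here is a minimal sketch of the general dynamic-programming seam idea in plain C++. It is not OpenCV's seam finder; findVerticalSeam and the cost grid are hypothetical, with the grid assumed to be rectangular and to hold something like the per-pixel colour difference inside the overlap of two images.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// cost[r][c]: assumed per-pixel mismatch between two overlapping images.
// Returns, for every row, the column through which the cheapest
// top-to-bottom seam passes (moving at most one column per row).
std::vector<std::size_t> findVerticalSeam(
        const std::vector<std::vector<double>>& cost) {
    const std::size_t rows = cost.size();
    if (rows == 0 || cost[0].empty()) return {};
    const std::size_t cols = cost[0].size();

    // acc[r][c] = cheapest total cost of any seam ending at (r, c).
    std::vector<std::vector<double>> acc(cost);
    for (std::size_t r = 1; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            double best = acc[r - 1][c];
            if (c > 0)        best = std::min(best, acc[r - 1][c - 1]);
            if (c + 1 < cols) best = std::min(best, acc[r - 1][c + 1]);
            acc[r][c] = cost[r][c] + best;
        }
    }

    // Start from the cheapest cell in the last row and walk back up.
    std::vector<std::size_t> seam(rows);
    const std::vector<double>& last = acc[rows - 1];
    seam[rows - 1] = static_cast<std::size_t>(
        std::min_element(last.begin(), last.end()) - last.begin());
    for (std::size_t r = rows - 1; r > 0; --r) {
        const std::size_t c = seam[r];
        std::size_t bestC = c;
        if (c > 0 && acc[r - 1][c - 1] < acc[r - 1][bestC]) bestC = c - 1;
        if (c + 1 < cols && acc[r - 1][c + 1] < acc[r - 1][bestC]) bestC = c + 1;
        seam[r - 1] = bestC;
    }
    return seam;
}
```

The work is linear in the number of overlap pixels, which is why this family of methods is cheap. The seam is also restricted to moving one column per row, so it cannot route around objects as freely as a graph cut.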

dlib vs opencv which one to use when

I am currently learning the OpenCV API with Python and it's all good. I am making decent progress. Part of that comes from the simplicity of Python's syntax compared with using OpenCV from C++, which I haven't attempted yet. I have come to realize that I will have to get my hands dirty with the C++ API for OpenCV at some point if I intend to do anything production quality.
Just recently I came across dlib, which also claims to do all the things OpenCV does and more. It's written in C++ and offers a Python API too (surprise). Can anybody vouch for dlib based on their own implementation experience?
I have used both OpenCV and dlib extensively for face detection and face recognition, and dlib is much more accurate than OpenCV's Haar-based face detector. (Note that OpenCV now has a DNN module, which provides deep-learning-based face detector and face recognizer models.)
I'm in the middle of comparing the OpenCV-DNN vs Dlib for face detection / recognition. Will post the results once I'm done with it.
There are many useful functions available in dlib, but I prefer OpenCV for any other CV tasks.
EDIT : As promised, I have made a detailed comparison of OpenCV vs Dlib Face Detection methods.
Here is my conclusion :
General Case
In most applications, we won’t know the size of the face in the image beforehand. Thus, it is better to use the OpenCV-DNN method, as it is pretty fast and very accurate, even for small faces. It also detects faces at various angles. We recommend using OpenCV-DNN in most
For medium to large image sizes
Dlib HoG is the fastest method on CPU, but it does not detect small faces (< 70x70). So, if you know that your application will not be dealing with very small faces (for example, a selfie app), then the HoG-based face detector is a better option. Also, if you can use a GPU, then the MMOD face detector is the best option, as it is very fast on GPU and also provides detection at various angles.
For more details, you can have a look at this blog

Isn't there an OpenCV CUDA function similar to findContours?

There are several OpenCV CPU functions that have a direct CUDA counterpart, such as cv::cvtColor and cv::cuda::cvtColor.
But I found no direct or indirect CUDA (GPU) counterpart for the CPU function cv::findContours.
Isn't there an OpenCV CUDA function similar to findContours? Or does findContours work on both cv::Mat and cv::cuda::GpuMat?
Unfortunately not, not even in the latest OpenCV version, 3.2.0. But they do have this update, as shown here:
findContours can now find contours on a 32-bit integer image of labels (not only on a black-and-white 8-bit image). This is a step towards more convenient connected component analysis.
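To make the "32-bit integer image of labels" concrete, here is a minimal plain C++ sketch of connected-component labelling that produces such a label image from a binary one. It is not OpenCV's implementation (OpenCV offers cv::connectedComponents for this); labelComponents and the row-major buffer layout are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Labels 4-connected foreground regions of a row-major binary image
// (non-zero = foreground; binary.size() is assumed to be width * height).
// Writes 0 for background and 1..N for the components; returns N.
int labelComponents(const std::vector<std::uint8_t>& binary,
                    int width, int height,
                    std::vector<std::int32_t>& labels) {
    labels.assign(binary.size(), 0);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    int next = 0;

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * width + x;
            if (binary[idx] == 0 || labels[idx] != 0) continue;

            // New component: flood-fill it breadth-first.
            labels[idx] = ++next;
            std::queue<std::pair<int, int>> pending;
            pending.push({x, y});
            while (!pending.empty()) {
                const int cx = pending.front().first;
                const int cy = pending.front().second;
                pending.pop();
                for (int k = 0; k < 4; ++k) {
                    const int nx = cx + dx[k], ny = cy + dy[k];
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    const std::size_t nidx =
                        static_cast<std::size_t>(ny) * width + nx;
                    if (binary[nidx] != 0 && labels[nidx] == 0) {
                        labels[nidx] = next;
                        pending.push({nx, ny});
                    }
                }
            }
        }
    }
    return next;
}
```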
No. OpenCV 4.6.0 does not have it.
Nobody has dared to implement this with CUDA for years.

Does OpenGL display image faster than OpenCV?

I am using OpenCV to show an image on a projector. But it seems cv::imshow is not fast enough, or maybe the data transfer from my CPU to the GPU and then to the projector is slow, so I wonder if there is a faster way to display images than OpenCV.
I considered OpenGL: since it uses the GPU directly, its commands may be faster than going through the CPU, as OpenCV does. Correct me if I am wrong.
OpenCV already supports OpenGL for image output by itself. No need to write this yourself!
See the documentation:
Create the window first with namedWindow, where you can pass the WINDOW_OPENGL flag.
Then you can even use OpenGL buffers or GPU matrices as input to imshow (the data never leaves the GPU). But it will also use OpenGL to show regular matrix data.
Please note:
To enable OpenGL support, configure OpenCV using CMake with WITH_OPENGL=ON. Currently OpenGL is supported only with WIN32, GTK and Qt backends on Windows and Linux (MacOS and Android are not supported). For GTK backend gtkglext-1.0 library is required.
Note that this is OpenCV 2.4.8 and this functionality has changed quite recently. I know there was OpenGL support in earlier versions in conjunction with the Qt backend, but I don't remember when it was introduced.
About the performance: It is a quite popular optimization in the CV community to output images using OpenGL, especially when outputting video sequences.
OpenGL is optimised for rendering images, so it's likely faster. It really depends on whether the OpenCV implementation uses any GPU acceleration AND on whether the bottleneck is on the rendering side of things.
Have you tried GPU-accelerated OpenCV?
How big is the image you are displaying? How long does it take to display the image using cv::imshow now?
I know it's an old question, but I happened to have exactly the same problem. And from my observations I've concluded that the root of the problem is the projector's own latency, especially if one is using an older model.
How have I concluded it?
I displayed the same video sequence with cv::imshow() on the laptop monitor and on the projector. Then I waved my hand. It was obvious that the projector introduces significant latency.
To double-check, I opened a webcam video, waved my hand in front of it and observed the difference between the monitor and the projector. The webcam feed involves no processing and no OpenCV operations, so in my understanding the only thing that could explain the latency is the projector itself.

SIFT, HOG and SURF c++, opencv

I have a simple question: what libraries are available that give good results for implementing SIFT, HOG (Histogram of Oriented Gradients) and SURF in C++ or OpenCV?
So: 1- Please give me a link to the code if you can; I would really appreciate it.
2- If you know of one of them, or have any information that could lead me to what I want, I would appreciate that as well.
check these:
- great article
- great source, I tried it on the iPhone
- fast - fast corner detection library
Example of SURF code in OpenCV
Not sure if this is still relevant, but you also get two implementations for computing HOG descriptors in OpenCV, i.e. both GPU and CPU versions of the HOG code.
For the CPU version you can check this blog post.
However, in the CPU version you would need to write your own logic for sliding windows.
The GPU version is fairly straightforward; you can read the documentation here.
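The sliding-window logic mentioned for the CPU version can be sketched roughly as follows, in plain C++ without OpenCV. Window and slidingWindows are hypothetical names; in practice each rectangle would be cropped from the image and handed to the HOG descriptor computation, and the whole loop repeated over a few image scales.

```cpp
#include <vector>

// Hypothetical rectangle type: top-left corner plus size, in pixels.
struct Window { int x, y, width, height; };

// Enumerates every window of the given size that fits entirely inside
// the image, stepping `stride` pixels horizontally and vertically.
std::vector<Window> slidingWindows(int imageWidth, int imageHeight,
                                   int winWidth, int winHeight, int stride) {
    std::vector<Window> windows;
    if (stride <= 0 || winWidth <= 0 || winHeight <= 0) return windows;
    for (int y = 0; y + winHeight <= imageHeight; y += stride) {
        for (int x = 0; x + winWidth <= imageWidth; x += stride) {
            windows.push_back({x, y, winWidth, winHeight});
        }
    }
    return windows;
}
```

The 64x128 window commonly used for pedestrian HOG with a stride of 8 pixels is one plausible choice of parameters; smaller strides give denser coverage at a proportionally higher cost.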
Might help you to know that SIFT and SURF implementations are already integrated into OpenCV.
Be careful with the OpenCV implementations, because recent versions of OpenCV have classified the SIFT and SURF implementations as nonfree.
You can still use them, but they are probably subject to licensing and cannot be used in commercial solutions.
This one uses descriptors based on HoG, Sobel and Lab channels for detection: Class-Specific Hough Forests for Object Detection (OpenCV/C source code).
Rather than performing detection at every possible location, this approach calculates a vote for each descriptor; put together, the votes produce a voting cloud whose maximum corresponds to the most probable location of the target. When combined with cvGoodFeaturesToTrack, it can produce very good results, even with a small training database.