What algorithm does OpenCV's Bayer conversion use? - c++

I would like to implement a GPU Bayer to RGB image conversion algorithm, and I was wondering what algorithm the OpenCV cvtColor function uses. Looking at the source I see what appears to be a variable number of gradients algorithm and a basic algorithm that could maybe be bilinear interpolation? Does anyone have experience with this that they could share with me, or perhaps know of GPU code to convert from Bayer to BGR format?
The source code is in imgproc/src/color.cpp. I'm looking for a link to it. Bayer2RGB_ and Bayer2RGB_VNG_8u are the functions I'm looking at.
Edit: Here's a link to the source.
http://code.opencv.org/projects/opencv/repository/revisions/master/entry/modules/imgproc/src/color.cpp
I've already implemented a bilinear interpolation algorithm, but it doesn't seem to work very well for my purposes. The picture looks ok, but I want to compute HOG features from it and in that respect it doesn't seem like a good fit.

Default is 4way linear interpolation or variable number of gradients if you specify the VNG version.
see ..\modules\imgproc\src\color.cpp for details.
I submitted a simple linear CUDA Bayer->RGB(A) to opencv, haven't followed if it's been accepted but it should be in the bugs tracker.
It's based on the code in Cuda Bayer/CFA demosaicing example.
Here is a sample of howto use cv::GPU in your own code.
/*-------RG ccd BGRA output ----------------------------*/
__global__ void bayerRG(const cv::gpu::DevMem2Db in, cv::gpu::PtrStepb out)
{
// Note called for every pair, so x/y are for start of cell so need x+1,Y+1 for right/bottom pair
// R G
// G B
// src
int x = 2 * ((blockIdx.x*blockDim.x) + threadIdx.x);
int y = 2 * ((blockIdx.y*blockDim.y) + threadIdx.y);
uchar r,g,b;
// 'R'
r = (in.ptr(y)[x]);
g = (in.ptr(y)[x-1]+in.ptr(y)[x+1]+(in.ptr(y-1)[x]+in.ptr(y+1)[x]))/4;
b = (in.ptr(y-1)[x-1]+in.ptr(y-1)[x+1]+(in.ptr(y+1)[x-1]+in.ptr(y+1)[x+1]))/4;
((uchar4*)out.ptr(y))[x] = make_uchar4( b,g,r,0xff);
// 'G' in R
r = (in.ptr(y)[x]+in.ptr(y)[x+2])/2;
g = (in.ptr(y)[x+1]);
b = (in.ptr(y-1)[x+1]+in.ptr(y+1)[x+1])/2;
((uchar4*)out.ptr(y))[x+1] = make_uchar4( b,g,r,0xff);
// 'G' in B
r = (in.ptr(y)[x]+in.ptr(y+2)[x])/2;
g = (in.ptr(y+1)[x]);
b = (in.ptr(y+1)[x-1]+in.ptr(y+1)[x+2])/2;
((uchar4*)out.ptr(y+1))[x] = make_uchar4( b,g,r,0xff);
// 'B'
r = (in.ptr(y)[x]+in.ptr(y)[x+2]+in.ptr(y+2)[x]+in.ptr(y+2)[x+2])/4;;
g = (in.ptr(y+1)[x]+in.ptr(y+1)[x+2]+in.ptr(y)[x+1]+in.ptr(y+2)[x+1])/4;
b = (in.ptr(y+1)[x+1]);
((uchar4*)out.ptr(y+1))[x+1] = make_uchar4( b,g,r,0xff);
}
/* called from */
extern "C" void cuda_bayer(const cv::gpu::DevMem2Db& img, cv::gpu::PtrStepb out)
{
dim3 threads(16,16);
dim3 grid((img.cols/2)/(threads.x), (img.rows/2)/(threads.y));
bayerGR2<<<grid,threads>>>(img,out);
cudaThreadSynchronize();
}

Currently, to my knowledge, the best debayer out there is DFPD (directional filtering with posteriori decision) as explained in this paper. The paper is quite explanatory and you can easily prototype this approach on Matlab. Here's a blog post comparing the results of DFPD to debayer based on linear approach. You can visibly see the improvement in artifacts, colors and sharpness.

As far as I know at this point it is using adaptive homogeneity directed demosaicing. Explained in a paper by Hirakawa and many other sources on the web.

Related

Adjust OpenCV's SGBM parameters to avoid striped ground

I am building a disparity map with OpenCV's SGBM implementation. In particular, I'm using the OpenCV version 2.4.9. Currently, I am working with the following parameters:
int numberOfDisparities = 64;
sgbm.preFilterCap = 25;
sgbm.SADWindowSize = 14;
sgbm.P1 = 8*sgbm.SADWindowSize*sgbm.SADWindowSize;
sgbm.P2 = 32*sgbm.SADWindowSize*sgbm.SADWindowSize;
sgbm.minDisparity = 0;
sgbm.numberOfDisparities = numberOfDisparities;
sgbm.uniquenessRatio = 10;
sgbm.speckleWindowSize = 100;
sgbm.speckleRange = 2;
sgbm.disp12MaxDiff = 1;
sgbm.fullDP = false;
However, the result is not as good as I need:
As you can see, the ground appears striped, which causes problems with the obstacle detection algorithm I am using, which is an implementation of the paper Fast and reliable obstacle detection and segmentation for cross-country navigation. The problem is that, as these strips have a rather constant disparity, which means that teorically they are almost perpendicular to the ground, the algorithm clasifies them as obstacles.
I played a little bit with the smoothing parameters P1 and specially P2, testing if increasing P2 improved it, but it doesn't seem to solve the problem, so I just left as recommended by the OpenCV sample.
The images are from the KITTI dataset and I downloaded them already rectified, so I discarded a bad calibration or camera alignment.
I wonder if it can be due to an untextured ground, although I hope not, since if that's the case it means I would have to change the obstacle detection algorithm.
Any ideas?

CImg - Saliency with spectral approach

I am trying to implement the spectral approach to get the saliency of an image with CImg, but I'm having trouble getting there.
This might seems like a repost from this question (spectral residual saliency detection in C++ with CImg) but I think I got right the two mistakes from this question (atan2 and FFT arguments).
Here's my code:
int main(int argc, char * argv[]) {
const char * input_file = "img/pic.png";
CImg<float> input = CImg<float>(input_file);
const CImg<float> mask(3,3,1,1,1.0f/9.0f);
resize_fft(input); // Resize for fft
CImg<float> gray = any2gray(input); // to single channel grayscale
CImgList<float> fft = gray.get_FFT();
CImg<float> amp = (fft[0].get_pow(2) + fft[1].get_pow(2)).get_sqrt();
CImg<float> amp_log = (amp + 1.0f).get_log().get_normalize(0, 255);
CImg<float> phase = fft[1].get_atan2(fft[0]);
CImg<float> residual = amp_log - amp_log.get_convolve(mask);
CImg<float> real = residual.get_exp();
CImg<float>::FFT(real, phase, true);
real.save("img/001.png");
real.normalize(0, 255).save("img/002.png");
return 1;
}
Both save pictures 001 and 002 end up being noise-like picture, like still in the frequency space.
I don't what's something wrong with what I'm doing, if yuo guys can help me ?
Thanks.
Firstly, It is obvious that you forget to smooth real with Gaussian filter.
Secondly, the line CImg<float>::FFT(real, phase, true); is suspectable. I don't know about CImg library but I can understand what you are expressing. When you do inverse fft, I think that both the real part and the imaginary part are wrong. The formulae in the paper are kind of misleading, reading the matlab code is clearer.
If you are familiar with complex number, you will find that getting variable phase is not necessary here.
The pseudo code replacing the line are here:
fft[0] = fft[0] ./ amp .* residual;
fft[1] = fft[1] ./ amp .* residual;
//Inverse Fourier Transform
CImg<float>::FFT(fft[0], fft[1], true);
real = fft[0].get_pow(2) + fft[1].get_pow(2);
real.get_convolve(Gaussian filter with sigma = 8)
All the operators with left dot mean element-wise operation.

ITK Fast Marching output

I'm using ITK to do some preprocessing and I wanted to test something with the Fast Marching filter and the Geodesic Active Contour filter.
I'm following the algorithm described in the ITK software guide, section 9.3.3.
However, I'm not getting the expected results. I'm working with a 3D image.
Here is my code:
AnisotropicDiffusionFilter::Pointer anisotropic_filter = AnisotropicDiffusionFilter::New();
anisotropic_filter->SetInput(itk_image_in);
anisotropic_filter->SetTimeStep(0.0625);
anisotropic_filter->SetNumberOfIterations(5);
anisotropic_filter->SetConductanceParameter(3.0);
anisotropic_filter->Update();
GradientFilter::Pointer gradient_filter = GradientFilter::New();
gradient_filter->SetInput(anisotropic_filter->GetOutput());
gradient_filter->SetSigma(0.5);
gradient_filter->Update();
SigmoidFilter::Pointer sigmoid_filter = SigmoidFilter::New();
sigmoid_filter->SetInput(gradient_filter->GetOutput());
sigmoid_filter->SetOutputMinimum(0.0);
sigmoid_filter->SetOutputMaximum(1.0);
sigmoid_filter->SetAlpha(-1.5);
sigmoid_filter->SetBeta(4.0);
sigmoid_filter->Update();
FastMarchingFilter::Pointer fast_marching = FastMarchingFilter::New();
NodeContainer::Pointer seeds = NodeContainer::New();
Node node;
const double seedValue = -50.0;
node.SetValue(seedValue);
seeds->Initialize();
vector<GeoVec3s>::iterator it = m_clicks_.begin();
int i=0;
for(; it != m_clicks_.end(); it++)
{
itkIndex index;
index[0] = (*it)[0];
index[1] = (*it)[1];
index[2] = (*it)[2];
node.SetIndex(index);
seeds->InsertElement(i++, node);
}
fast_marching->SetTrialPoints(seeds);
fast_marching->SetSpeedConstant(1.0);
fast_marching->SetStoppingValue(100);
//fast_marching->SetInput(sigmoid_filter->GetOutput());
fast_marching->SetOutputSize(sigmoid_filter->GetOutput()->GetBufferedRegion().GetSize());
fast_marching->Update();
GeodesicFilter::Pointer geodesic_filter = GeodesicFilter::New();
geodesic_filter->SetInput(fast_marching->GetOutput());
geodesic_filter->SetFeatureImage(sigmoid_filter->GetOutput());
geodesic_filter->SetPropagationScaling(0.5);
geodesic_filter->SetCurvatureScaling(5.0);
geodesic_filter->SetAdvectionScaling(1.0);
geodesic_filter->SetMaximumRMSError( 0.02 );
geodesic_filter->Update();
BinaryThresholdFilter::Pointer thresholder = BinaryThresholdFilter::New();
thresholder->SetLowerThreshold(-1000);
thresholder->SetUpperThreshold(0);
thresholder->SetOutsideValue(0);
thresholder->SetInsideValue(255);
thresholder->SetInput( geodesic_filter->GetOutput() );
I'm using metrics described in this paper which goal is the same as mine.
I have a few questions:
The fast marching filter should output a distance map right? Instead, when I output my volume to a series of png (between values 0 and 4095) I have a binary image (pixels are either 0 or 4095). I think I should get a greyscale volume indicating the time needed for each pixel to be attained from the seeds.
Following the procedure described by Suzuki I succeeded to make the algorithm work more or less however I changed the values of the parameters of the geodesic filter. I don't remember the exact values but it wasn't close to those described in the paper. As we are working with the sigmoid input which is normalized between 0 and 1, what is happening?
Should I rather use a constant speed function for the fast marching filter or the sigmoid image? When should either method be preferred?
I'm using a re-scaler to output my float images (output from the filters). Could this be the reason for the inconsistencies I'm seeing?
Any advice on what I could be doing wrong?
Thanks.
Ok so I found my problem. The Fast Marching filter does output a time crossing map (distance map) but as I specified a stopping value in the algorithm all the pixels that weren't visited had a high value (1.7e+38 as it is half the max value of the type used for the output image which were float in my case 3.4e+38). So it squeezed all my image dynamic when I used the rescale filter and the result was an binary image.
I think better results are achieved with a sigmoid image as input for the fast marching filter.
Thanks #nav for the advice.

How to correctly use cv::triangulatePoints()

I am trying to triangulate some points with OpenCV and I found this cv::triangulatePoints() function. The problem is that there is almost no documentation or examples of it.
I have some doubts about it.
What method does it use?
I've making a small research about triangulations and there are several methods (Linear, Linear LS, eigen, iterative LS, iterative eigen,...) but I can't find which one is it using in OpenCV.
How should I use it? It seems that as an input it needs a projection matrix and 3xN homogeneous 2D points. I have them defined as std::vector<cv::Point3d> pnts, but as an output it needs 4xN arrays and obviously I can't create a std::vector<cv::Point4d> because it doesn't exist, so how should I define the output vector?
For the second question I tried: cv::Mat pnts3D(4,N,CV_64F); and cv::Mat pnts3d;, neither seems to work (it throws an exception).
1.- The method used is Least Squares. There are more complex algorithms than this one. Still it is the most common one, as the other methods may fail in some cases (i.e. some others fails if points are on plane or on infinite).
The method can be found in Multiple View Geometry in Computer Vision by Richard Hartley and Andrew Zisserman (p312)
2.-The usage:
cv::Mat pnts3D(1,N,CV_64FC4);
cv::Mat cam0pnts(1,N,CV_64FC2);
cv::Mat cam1pnts(1,N,CV_64FC2);
Fill the 2 chanel point Matrices with the points in images.
cam0 and cam1 are Mat3x4 camera matrices (intrinsic and extrinsic parameters). You can construct them by multiplying A*RT, where A is the intrinsic parameter matrix and RT the rotation translation 3x4 pose matrix.
cv::triangulatePoints(cam0,cam1,cam0pnts,cam1pnts,pnts3D);
NOTE: pnts3D NEEDs to be a 4 channel 1xN cv::Mat when defined, throws exception if not, but the result is a cv::Mat(4,N,cv_64FC1) matrix. Really confusing, but it is the only way I didn't got an exception.
UPDATE: As of version 3.0 or possibly earlier, this is no longer true, and pnts3D can also be of type Mat(4,N,CV_64FC1) or may be left completely empty (as usual, it is created inside the function).
A small addition to #Ander Biguri's answer. You should get your image points on a non-undistorted image, and invoke undistortPoints() on the cam0pnts and cam1pnts, because cv::triangulatePoints expects the 2D points in normalized coordinates (independent from the camera) and cam0 and cam1 should be only [R|t^T] matricies you do not need to multiple it with A.
Thanks to Ander Biguri! His answer helped me a lot. But I always prefer the alternative with std::vector, I edited his solution to this:
std::vector<cv::Point2d> cam0pnts;
std::vector<cv::Point2d> cam1pnts;
// You fill them, both with the same size...
// You can pick any of the following 2 (your choice)
// cv::Mat pnts3D(1,cam0pnts.size(),CV_64FC4);
cv::Mat pnts3D(4,cam0pnts.size(),CV_64F);
cv::triangulatePoints(cam0,cam1,cam0pnts,cam1pnts,pnts3D);
So you just need to do emplace_back in the points. Main advantage: you do not need to know the size N before start filling them. Unfortunately, there is no cv::Point4f, so pnts3D must be a cv::Mat...
I tried cv::triangulatePoints, but somehow it calculates garbage. I was forced to implement a linear triangulation method manually, which returns a 4x1 matrix for the triangulated 3D point:
Mat triangulate_Linear_LS(Mat mat_P_l, Mat mat_P_r, Mat warped_back_l, Mat warped_back_r)
{
Mat A(4,3,CV_64FC1), b(4,1,CV_64FC1), X(3,1,CV_64FC1), X_homogeneous(4,1,CV_64FC1), W(1,1,CV_64FC1);
W.at<double>(0,0) = 1.0;
A.at<double>(0,0) = (warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,0) - mat_P_l.at<double>(0,0);
A.at<double>(0,1) = (warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,1) - mat_P_l.at<double>(0,1);
A.at<double>(0,2) = (warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,2) - mat_P_l.at<double>(0,2);
A.at<double>(1,0) = (warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,0) - mat_P_l.at<double>(1,0);
A.at<double>(1,1) = (warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,1) - mat_P_l.at<double>(1,1);
A.at<double>(1,2) = (warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,2) - mat_P_l.at<double>(1,2);
A.at<double>(2,0) = (warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,0) - mat_P_r.at<double>(0,0);
A.at<double>(2,1) = (warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,1) - mat_P_r.at<double>(0,1);
A.at<double>(2,2) = (warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,2) - mat_P_r.at<double>(0,2);
A.at<double>(3,0) = (warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,0) - mat_P_r.at<double>(1,0);
A.at<double>(3,1) = (warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,1) - mat_P_r.at<double>(1,1);
A.at<double>(3,2) = (warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,2) - mat_P_r.at<double>(1,2);
b.at<double>(0,0) = -((warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,3) - mat_P_l.at<double>(0,3));
b.at<double>(1,0) = -((warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,3) - mat_P_l.at<double>(1,3));
b.at<double>(2,0) = -((warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,3) - mat_P_r.at<double>(0,3));
b.at<double>(3,0) = -((warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,3) - mat_P_r.at<double>(1,3));
solve(A,b,X,DECOMP_SVD);
vconcat(X,W,X_homogeneous);
return X_homogeneous;
}
the input parameters are two 3x4 camera projection matrices and a corresponding left/right pixel pair (x,y,w).
Additionally to Ginés Hidalgo comments,
if you did a stereocalibration and could estimate exactly Fundamental Matrix from there, which was calculated based on checkerboard.
Use correctMatches function refine detected keypoints
std::vector<cv::Point2f> pt_set1_pt_c, pt_set2_pt_c;
cv::correctMatches(F,pt_set1_pt,pt_set2_pt,pt_set1_pt_c,pt_set2_pt_c)

Finding extrinsics between cameras

I'm in the situation where I need to find the relative camera poses between two/or more cameras based on image correspondences (so the cameras are not in the same point). To solve this I tried the same approach as described here (code below).
cv::Mat calibration_1 = ...;
cv::Mat calibration_2 = ...;
cv::Mat calibration_target = calibration_1;
calibration_target.at<float>(0, 2) = 0.5f * frame_width; // principal point
calibration_target.at<float>(1, 2) = 0.5f * frame_height; // principal point
auto fundamental_matrix = cv::findFundamentalMat(left_matches, right_matches, CV_RANSAC);
fundamental_matrix.convertTo(fundamental_matrix, CV_32F);
cv::Mat essential_matrix = calibration_2.t() * fundamental_matrix * calibration_1;
cv::SVD svd(essential_matrix);
cv::Matx33f w(0,-1,0,
1,0,0,
0,0,1);
cv::Matx33f w_inv(0,1,0,
-1,0,0,
0,0,1);
cv::Mat rotation_between_cameras = svd.u * cv::Mat(w) * svd.vt; //HZ 9.19
But in most of my cases I get extremly weird results. So my next thought was using a full fledged bundle adjuster (which should do what i am looking for?!). Currently my only big dependency is OpenCV and they only have a undocumented bundle adjustment implementation.
So the question is:
Is there a bundle adjuster which has no dependencies and uses a licence which allows commerical use?
Are there other easy way to find the extrinsics?
Are objects with very different distances to the cameras a problem? (heavy parallax)
Thanks in advance
I'm also working on same problem and facing slimier issues.
Here are some suggestions -
Modify Essential Matrix Before Decomposition:
Modify Essential matrix before decomposition [U W Vt] = SVD(E), and new E' = diag(s,s,0) where s = W(0,0) + W(1,1) / 2
2-Stage Fundamental Matrix Estimation:
Recalculate the fundamental matrix with the RANSAC inliers
These steps should make the Rotation estimation more susceptible to noise.
you have to get 4 different solutions and select the one with the most # points having positive Z coordinates. The solution are generated by inverting the sign of the fundamental matrix an substituting w with w_inv which you did not do though you calculated w_inv. Are you reusing somebody else code?