OpenCV::LMSolver getting a simple example to run - c++

PROBLEM: The documentation for cv::LMSolver at opencv.org is very thin, to say the least, and I could not find any useful examples on the internet either.
APPROACH: So I wrote some simple code:
#include <opencv2/calib3d.hpp>
#include <iostream>

using namespace cv;
using namespace std;

struct Easy : public LMSolver::Callback {
    Easy() = default;

    virtual bool compute(InputArray f_param, OutputArray f_error, OutputArray f_jacobian) const override
    {
        Mat param = f_param.getMat();
        if( f_error.empty() ) f_error.create(1, 1, CV_64F);              // dim(error) = 1
        Mat error = f_error.getMat();
        vector<double> x{param.at<double>(0,0), param.at<double>(1,0)};  // dim(param) = 2
        double error0 = calc(x);
        error.at<double>(0,0) = error0;
        if( ! f_jacobian.needed() ) return true;
        else if( f_jacobian.empty() ) f_jacobian.create(1, 2, CV_64F);
        Mat jacobian = f_jacobian.getMat();
        double e = 1e-10; // estimate derivatives in epsilon environment
        jacobian.at<double>(0, 0) = (calc({x[0] + e, x[1]    }) - error0) / e; // d/dx0 (error)
        jacobian.at<double>(0, 1) = (calc({x[0],     x[1] + e}) - error0) / e; // d/dx1 (error)
        return true;
    }

    double calc(const vector<double> x) const { return x[0]*x[0] + x[1]*x[1]; }
};

int main(int argc, char** argv)
{
    Ptr<Easy> callback = makePtr<Easy>();
    Ptr<LMSolver> solver = LMSolver::create(callback, 100000, 1e-37);
    Mat parameters = (Mat_<double>(2,1) << 5, 100);
    solver->run(parameters);
    cout << parameters << endl;
}
QUESTIONS:
What does the return value of LMSolver::Callback::compute() report to the caller?
Currently, it finds the minimum at (-9e-07,4e-5), instead of the expected (0.0, 0.0). How can the precision be improved?

What does the return value of LMSolver::Callback::compute() report to the caller?
Thankfully, OpenCV is open source, so we should be able to figure this out simply by checking the code.
Looking at the source code on GitHub, I found that all of the calls to compute() look like:
if( !cb->compute(x, r, J) )
return -1;
Returning false simply causes the solver to bail out, so it seems the return value of the callback's compute() simply reports whether the generation of the Jacobian was successful or not.
Currently, it finds the minimum at (-9e-07,4e-5). How can the precision be improved?
If anything, you should at least compare the return value of run() against your maximum iteration count to make sure that it did, in fact, converge as much as it could.
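For example (a minimal sketch along those lines; it assumes run() reports the number of iterations performed, with -1 signalling the callback failure discussed above):
const int maxIters = 100000;
Ptr<Easy> callback = makePtr<Easy>();
Ptr<LMSolver> solver = LMSolver::create(callback, maxIters, 1e-37);
Mat parameters = (Mat_<double>(2,1) << 5, 100);
int iters = solver->run(parameters);
if( iters < 0 )
    cout << "callback reported failure" << endl;                                  // compute() returned false
else if( iters >= maxIters )
    cout << "iteration budget exhausted, result may not have converged" << endl;
cout << parameters << endl;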

I suppose the OP wants to minimize x^2 + y^2 with respect to [x, y].
Because the Levenberg-Marquardt method solves a least-squares problem, the error should be defined as the vector [x, y], so that the solver minimizes || [x, y] ||^2 = x^2 + y^2.
Another suggestion is that the Jacobian matrix should be provided analytically whenever possible, although this is not crucial in this particular case.
struct Easy : public LMSolver::Callback {
    Easy() = default;

    virtual bool compute(InputArray f_param, OutputArray f_error, OutputArray f_jacobian) const override
    {
        Mat param = f_param.getMat();
        if( f_error.empty() ) f_error.create(2, 1, CV_64F);              // dim(error) = 2
        Mat error = f_error.getMat();
        vector<double> x{param.at<double>(0,0), param.at<double>(1,0)};  // dim(param) = 2
        error.at<double>(0, 0) = x[0];
        error.at<double>(1, 0) = x[1];
        if( ! f_jacobian.needed() ) return true;
        else if( f_jacobian.empty() ) f_jacobian.create(2, 2, CV_64F);
        Mat jacobian = f_jacobian.getMat();
        jacobian.at<double>(0, 0) = 1;
        jacobian.at<double>(0, 1) = 0;
        jacobian.at<double>(1, 0) = 0;
        jacobian.at<double>(1, 1) = 1;
        return true;
    }
};

Related

What should you store or append a batch of tensors to in C++ when using LibTorch?

In C++, when using LibTorch (the C++ version of PyTorch), what should you store a batch of tensors in? I'm running into the problem of not being able to reset the batch on the next step, because C++ doesn't allow storing a new variable over an existing variable.
In my attempt my batch of tensors is one single 385x385 tensor. The batch size is 385. In a for loop I use torch::cat to concatenate 385 smaller 1D tensors, which are 385 numbers long. (Maybe 'stack' or 'append' are better terms for what I'm doing, since they are stacked together picket-fence style rather than 'concatenated', but that's what I'm using.) Anyway, there is no problem with this shape. It seems to work fine for one forward and backward pass, but then the tensor becomes 770x385 on the next pass instead of a 385x385 tensor made of the next 385 arrays of length 385. I hope I am painting a picture and not being too verbose.
The code.
Near the bottom I have the line all_step_obs = torch::tensor({}); to try to wipe out the contents of the tensor, i.e. the batch, but this gives me a Segmentation fault (core dumped), I guess from trying to access the tensor outside of the loop(?)
If I don't have this line I get a 770x385 tensor after the next step.
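(For illustration only, not the asker's code: one common way to hold a batch that has to be rebuilt every step is a std::vector<torch::Tensor> that is stacked into a batch and then cleared; the shapes below are assumed from the question.)
#include <torch/torch.h>
#include <vector>

// Sketch: accumulate per-step observations in a std::vector, build the batch
// with torch::stack, then clear() the vector so the next step starts empty.
torch::Tensor build_batch()
{
    std::vector<torch::Tensor> step_obs;
    for (int i = 0; i < 385; ++i)
        step_obs.push_back(torch::rand({385}));    // one observation of length 385
    torch::Tensor batch = torch::stack(step_obs);  // shape {385, 385}
    step_obs.clear();                              // ready for the next batch
    return batch;
}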
The model
#include "mujoco/mujoco.h"
struct Net : torch::nn::Module {
torch::Tensor action_high, action_low;
public:
Net(torch::Tensor action_high, torch::Tensor action_low) : action_high(action_high), action_low(action_low){
// Construct and register two Linear submodules.
fc1 = torch::nn::Linear(385, 385);
fc2 = torch::nn::Linear(385, 385);
fc3 = torch::nn::Linear(385, 42);
// cholesky_layer = torch::nn::Linear(385, (42 * (42 + 1)) / 2);
cholesky_layer = torch::nn::Linear(385, 385);
}
// Implement the Net's algorithm.
torch::Tensor forward(torch::Tensor x) {
// Use one of many tensor manipulation functions.
x = torch::relu(fc1->forward(x));
x = torch::dropout(x, /*p=*/0.2, /*train=*/is_training());
x = torch::relu(fc2->forward(x));
auto mean_layer = fc3->forward(x);
auto mean = action_low + (action_high - action_low) * mean_layer;
auto chol_l = cholesky_layer->forward(x);
// auto chol = torch::rand({385, 385});
auto chol = torch::matmul(chol_l, chol_l.transpose(0, 1));
chol = torch::nan_to_num(chol, 0, 2.0);
chol = chol.add(torch::eye(385));
auto cholesky = torch::linalg::cholesky(chol);
// return torch::cat({mean, cholesky}, 0);
return mean_layer;
}
// Use one of many "standard library" modules.
torch::nn::Linear fc1{nullptr}, fc2{nullptr}, fc3{nullptr}, cholesky_layer{nullptr};
};
The training
auto high = torch::ones({385, 42}) * 0.4;
auto low = torch::ones({385, 42}) * -0.4;
auto actor = Net(low, high);
int max_steps = 385;
int steps = 2000;
auto l1_loss = torch::smooth_l1_loss;
auto optimizer = torch::optim::Adam(actor.parameters(), 3e-4);
torch::Tensor train() {
torch::Tensor all_step_obs;
for (int i = 0; i<steps; ++i)
{
for (int i = 0; i<max_steps; ++i)
{
all_step_obs = torch::cat({torch::rand({385}).unsqueeze(0), all_step_obs});
}
auto mean = actor.forward(all_step_obs);
auto loss = l1_loss(mean, torch::rand({385, 42}), 1, 0);
optimizer.zero_grad();
loss.backward();
optimizer.step();
all_step_obs = torch::tensor({});
if (steps == 1999) {
return loss;
}
}
};
int main (int argc, const char** argv) {
std::cout << train();
}

OpenCV undistortPoints not giving the exact inverse of distortion model

I was doing some tests using the distortion model of OpenCV. Basically what I did is implement the distortion equations and see if the cv::undistortPoints function gives me the inverse of these equations. I realized that cv::undistortPoints does not exactly give you the inverse of the distortion equations. When I saw this, I went to the implementation of cv::undistortPoints and realized that, in the end condition of the iterative process of computing the inverse of the distortion model, OpenCV always does 5 iterations (if no distortion coefficients are provided to the function it actually does 0 iterations) and does not use any error metric on the undistorted point to check whether it is precisely undistorted. Having this in mind, I copied and modified the termination condition of the iterative process to take an error metric into account. This gave me the exact inverse of the distortion model. The code showing this is attached at the end of this post. My question is:
Does this happen because OpenCV prefers performance (spending a bit less time) over accuracy (spending a bit more time) or is this just a "bug"? (it is obvious that with the termination condition that I propose the function will take more time to undistort each point)
Thank you very much!
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>
using namespace cv;
// This is a copy of the opencv implementation
void cvUndistortPoints_copy( const CvMat* _src, CvMat* _dst, const CvMat* _cameraMatrix,
const CvMat* _distCoeffs,
const CvMat* matR, const CvMat* matP )
{
double A[3][3], RR[3][3], k[8]={0,0,0,0,0,0,0,0}, fx, fy, ifx, ify, cx, cy;
CvMat matA=cvMat(3, 3, CV_64F, A), _Dk;
CvMat _RR=cvMat(3, 3, CV_64F, RR);
const CvPoint2D32f* srcf;
const CvPoint2D64f* srcd;
CvPoint2D32f* dstf;
CvPoint2D64f* dstd;
int stype, dtype;
int sstep, dstep;
int i, j, n, iters = 1;
CV_Assert( CV_IS_MAT(_src) && CV_IS_MAT(_dst) &&
(_src->rows == 1 || _src->cols == 1) &&
(_dst->rows == 1 || _dst->cols == 1) &&
_src->cols + _src->rows - 1 == _dst->rows + _dst->cols - 1 &&
(CV_MAT_TYPE(_src->type) == CV_32FC2 || CV_MAT_TYPE(_src->type) == CV_64FC2) &&
(CV_MAT_TYPE(_dst->type) == CV_32FC2 || CV_MAT_TYPE(_dst->type) == CV_64FC2));
CV_Assert( CV_IS_MAT(_cameraMatrix) &&
_cameraMatrix->rows == 3 && _cameraMatrix->cols == 3 );
cvConvert( _cameraMatrix, &matA );
if( _distCoeffs )
{
CV_Assert( CV_IS_MAT(_distCoeffs) &&
(_distCoeffs->rows == 1 || _distCoeffs->cols == 1) &&
(_distCoeffs->rows*_distCoeffs->cols == 4 ||
_distCoeffs->rows*_distCoeffs->cols == 5 ||
_distCoeffs->rows*_distCoeffs->cols == 8));
_Dk = cvMat( _distCoeffs->rows, _distCoeffs->cols,
CV_MAKETYPE(CV_64F,CV_MAT_CN(_distCoeffs->type)), k);
cvConvert( _distCoeffs, &_Dk );
iters = 5;
}
if( matR )
{
CV_Assert( CV_IS_MAT(matR) && matR->rows == 3 && matR->cols == 3 );
cvConvert( matR, &_RR );
}
else
cvSetIdentity(&_RR);
if( matP )
{
double PP[3][3];
CvMat _P3x3, _PP=cvMat(3, 3, CV_64F, PP);
CV_Assert( CV_IS_MAT(matP) && matP->rows == 3 && (matP->cols == 3 || matP->cols == 4));
cvConvert( cvGetCols(matP, &_P3x3, 0, 3), &_PP );
cvMatMul( &_PP, &_RR, &_RR );
}
srcf = (const CvPoint2D32f*)_src->data.ptr;
srcd = (const CvPoint2D64f*)_src->data.ptr;
dstf = (CvPoint2D32f*)_dst->data.ptr;
dstd = (CvPoint2D64f*)_dst->data.ptr;
stype = CV_MAT_TYPE(_src->type);
dtype = CV_MAT_TYPE(_dst->type);
sstep = _src->rows == 1 ? 1 : _src->step/CV_ELEM_SIZE(stype);
dstep = _dst->rows == 1 ? 1 : _dst->step/CV_ELEM_SIZE(dtype);
n = _src->rows + _src->cols - 1;
fx = A[0][0];
fy = A[1][1];
ifx = 1./fx;
ify = 1./fy;
cx = A[0][2];
cy = A[1][2];
for( i = 0; i < n; i++ )
{
double x, y, x0, y0;
if( stype == CV_32FC2 )
{
x = srcf[i*sstep].x;
y = srcf[i*sstep].y;
}
else
{
x = srcd[i*sstep].x;
y = srcd[i*sstep].y;
}
x0 = x = (x - cx)*ifx;
y0 = y = (y - cy)*ify;
// compensate distortion iteratively
int max_iters(500);
double e(1);
for( j = 0; j < max_iters && e>0; j++ )
{
double r2 = x*x + y*y;
double icdist = (1 + ((k[7]*r2 + k[6])*r2 + k[5])*r2)/(1 + ((k[4]*r2 + k[1])*r2 + k[0])*r2);
double deltaX = 2*k[2]*x*y + k[3]*(r2 + 2*x*x);
double deltaY = k[2]*(r2 + 2*y*y) + 2*k[3]*x*y;
double xant = x;
double yant = y;
x = (x0 - deltaX)*icdist;
y = (y0 - deltaY)*icdist;
e = pow(xant - x,2)+ pow(yant - y,2);
}
double xx = RR[0][0]*x + RR[0][1]*y + RR[0][2];
double yy = RR[1][0]*x + RR[1][1]*y + RR[1][2];
double ww = 1./(RR[2][0]*x + RR[2][1]*y + RR[2][2]);
x = xx*ww;
y = yy*ww;
if( dtype == CV_32FC2 )
{
dstf[i*dstep].x = (float)x;
dstf[i*dstep].y = (float)y;
}
else
{
dstd[i*dstep].x = x;
dstd[i*dstep].y = y;
}
}
}
void undistortPoints_copy( InputArray _src, OutputArray _dst,
InputArray _cameraMatrix,
InputArray _distCoeffs,
InputArray _Rmat=noArray(),
InputArray _Pmat=noArray() )
{
Mat src = _src.getMat(), cameraMatrix = _cameraMatrix.getMat();
Mat distCoeffs = _distCoeffs.getMat(), R = _Rmat.getMat(), P = _Pmat.getMat();
CV_Assert( src.isContinuous() && (src.depth() == CV_32F || src.depth() == CV_64F) &&
((src.rows == 1 && src.channels() == 2) || src.cols*src.channels() == 2));
_dst.create(src.size(), src.type(), -1, true);
Mat dst = _dst.getMat();
CvMat _csrc = src, _cdst = dst, _ccameraMatrix = cameraMatrix;
CvMat matR, matP, _cdistCoeffs, *pR=0, *pP=0, *pD=0;
if( R.data )
pR = &(matR = R);
if( P.data )
pP = &(matP = P);
if( distCoeffs.data )
pD = &(_cdistCoeffs = distCoeffs);
cvUndistortPoints_copy(&_csrc, &_cdst, &_ccameraMatrix, pD, pR, pP);
}
// Distortion model implementation
cv::Point2d distortPoint(cv::Point2d undistorted_point,cv::Mat camera_matrix, std::vector<double> distort_coefficients){
// Check that camera matrix is double
if (!(camera_matrix.type() == CV_64F || camera_matrix.type() == CV_64FC1)){
std::ostringstream oss;
oss<<"distortPoint(): Camera matrix type is wrong. It has to be a double matrix (CV_64)";
throw std::runtime_error(oss.str());
}
// Create distorted point
cv::Point2d distortedPoint;
distortedPoint.x = (undistorted_point.x - camera_matrix.at<double>(0,2))/camera_matrix.at<double>(0,0);
distortedPoint.y = (undistorted_point.y - camera_matrix.at<double>(1,2))/camera_matrix.at<double>(1,1);
// Get model
if (distort_coefficients.size() < 4 || distort_coefficients.size() > 8 ){
throw std::runtime_error("distortPoint(): Invalid numbrer of distortion coefficitnes.");
}
double k1(distort_coefficients[0]);
double k2(distort_coefficients[1]);
double p1(distort_coefficients[2]);// tangential distortion, first coefficient
double p2(distort_coefficients[3]);// tangential distortion, second coefficient
double k3(0);
double k4(0);
double k5(0);
double k6(0);
if (distort_coefficients.size() > 4)
k3 = distort_coefficients[4];
if (distort_coefficients.size() > 5)
k4 = distort_coefficients[5];
if (distort_coefficients.size() > 6)
k5 = distort_coefficients[6];
if (distort_coefficients.size() > 7)
k6 = distort_coefficients[7];
// Distort
double xcx = distortedPoint.x;
double ycy = distortedPoint.y;
double r2 = pow(xcx, 2) + pow(ycy, 2);
double r4 = pow(r2,2);
double r6 = pow(r2,3);
double k = (1+k1*r2+k2*r4+k3*r6)/(1+k4*r2+k5*r4+k6*r6);
distortedPoint.x = xcx*k + 2*p1*xcx*ycy + p2*(r2+2*pow(xcx,2));
distortedPoint.y = ycy*k + p1*(r2+2*pow(ycy,2)) + 2*p2*xcx*ycy;
distortedPoint.x = distortedPoint.x*camera_matrix.at<double>(0,0)+camera_matrix.at<double>(0,2);
distortedPoint.y = distortedPoint.y*camera_matrix.at<double>(1,1)+camera_matrix.at<double>(1,2);
// Exit
return distortedPoint;
}
int main(int argc, char** argv){
// Camera matrix
double cam_mat_da[] = {1486.58092,0,1046.72507,0,1489.8659,545.374244,0,0,1};
cv::Mat cam_mat(3,3,CV_64FC1,cam_mat_da);
// Distortion coefficients
double dist_coefs_da[] ={-0.13827409,0.29240721,-0.00088197,-0.00090189,0};
std::vector<double> dist_coefs(dist_coefs_da,dist_coefs_da+5);
// Distorted Point
cv::Point2d p0(0,0);
std::vector<cv::Point2d> p0_v;
p0_v.push_back(p0);
// Undistort Point
std::vector<cv::Point2d> ud_p_v;
cv::undistortPoints(p0_v,ud_p_v,cam_mat,dist_coefs);
cv::Point2d ud_p = ud_p_v[0];
ud_p.x = ud_p.x*cam_mat.at<double>(0,0)+cam_mat.at<double>(0,2);
ud_p.y = ud_p.y*cam_mat.at<double>(1,1)+cam_mat.at<double>(1,2);
// Redistort Point
cv::Point2d p = distortPoint(ud_p, cam_mat,dist_coefs);
// Undistort Point using own termination of iterative process
std::vector<cv::Point2d> ud_p_v_local;
undistortPoints_copy(p0_v,ud_p_v_local,cam_mat,dist_coefs);
cv::Point2d ud_p_local = ud_p_v_local[0];
ud_p_local.x = ud_p_local.x*cam_mat.at<double>(0,0)+cam_mat.at<double>(0,2);
ud_p_local.y = ud_p_local.y*cam_mat.at<double>(1,1)+cam_mat.at<double>(1,2);
// Redistort Point
cv::Point2d p_local = distortPoint(ud_p_local, cam_mat,dist_coefs);
// Display results
std::cout<<"Distorted original point: "<<p0<<std::endl;
std::cout<<"Undistorted point (CV): "<<ud_p<<std::endl;
std::cout<<"Distorted point (CV): "<<p<<std::endl;
std::cout<<"Erorr in the distorted point (CV): "<<sqrt(pow(p.x-p0.x,2)+pow(p.y-p0.y,2))<<std::endl;
std::cout<<"Undistorted point (Local): "<<ud_p_local<<std::endl;
std::cout<<"Distorted point (Local): "<<p_local<<std::endl;
std::cout<<"Erorr in the distorted point (Local): "<<sqrt(pow(p_local.x-p0.x,2)+pow(p_local.y-p0.y,2))<<std::endl;
// Exit
return 0;
}
As suggested, you could get actual motivation from the OpenCV forums. Note however that historically OpenCV has been developed with real-time or near-real-time applications in mind (for example, the Darpa Grand Challenge), hence you'll find easily code that optimizes for speed over accuracy.
In most cases 5 iterations are good enough. What is "enough" can be argued about, but for cases such as finding the optimal camera matrix one can argue that 0.1 pixel does not change much for many applications.
An important thing to note is that in some cases the function does not converge in 5 iterations. I don't know if there can be a case where it will not converge at all. This happens, for example, when the distortion parameters do not fit the distortion well, and therefore there is no exact solution for some coordinates.
See Jensenb's answer here for a discussion.

clustering image segments in opencv

I am working on motion detection with a non-static camera using OpenCV.
I am using a pretty basic background subtraction and thresholding approach to get a broad sense of all that's moving in a sample video. After thresholding, I list all separable "patches" of white pixels, store them as independent components and color them randomly with red, green or blue. The image below shows this for a football video where all such components are visible.
I create rectangles over these detected components and I get this image:
So I can see the challenge here. I want to cluster all the "similar" and close-by components into a single entity so that the rectangles in the output image show a player moving as a whole (and not his independent limbs). I tried doing K-means clustering but since ideally I would not know the number of moving entities, I could not make any progress.
Please guide me on how I can do this. Thanks
This problem can be almost perfectly solved by the DBSCAN clustering algorithm. Below, I provide the implementation and a result image. A gray blob means outlier or noise according to DBSCAN. I simply used the boxes as input data. Initially, the box centers were used for the distance function; however, for boxes that is insufficient to correctly characterize distance, so the current distance function uses the minimum distance between all 8 corners of the two boxes.
#include "opencv2/opencv.hpp"
using namespace cv;
#include <map>
#include <sstream>
template <class T>
inline std::string to_string (const T& t)
{
std::stringstream ss;
ss << t;
return ss.str();
}
class DbScan
{
public:
std::map<int, int> labels;
vector<Rect>& data;
int C;
double eps;
int mnpts;
double* dp;
//memoization table in case of complex dist functions
#define DP(i,j) dp[(data.size()*i)+j]
DbScan(vector<Rect>& _data,double _eps,int _mnpts):data(_data)
{
C=-1;
for(int i=0;i<data.size();i++)
{
labels[i]=-99;
}
eps=_eps;
mnpts=_mnpts;
}
void run()
{
dp = new double[data.size()*data.size()];
for(int i=0;i<data.size();i++)
{
for(int j=0;j<data.size();j++)
{
if(i==j)
DP(i,j)=0;
else
DP(i,j)=-1;
}
}
for(int i=0;i<data.size();i++)
{
if(!isVisited(i))
{
vector<int> neighbours = regionQuery(i);
if(neighbours.size()<mnpts)
{
labels[i]=-1;//noise
}else
{
C++;
expandCluster(i,neighbours);
}
}
}
delete [] dp;
}
void expandCluster(int p,vector<int> neighbours)
{
labels[p]=C;
for(int i=0;i<neighbours.size();i++)
{
if(!isVisited(neighbours[i]))
{
labels[neighbours[i]]=C;
vector<int> neighbours_p = regionQuery(neighbours[i]);
if (neighbours_p.size() >= mnpts)
{
expandCluster(neighbours[i],neighbours_p);
}
}
}
}
bool isVisited(int i)
{
return labels[i]!=-99;
}
vector<int> regionQuery(int p)
{
vector<int> res;
for(int i=0;i<data.size();i++)
{
if(distanceFunc(p,i)<=eps)
{
res.push_back(i);
}
}
return res;
}
double dist2d(Point2d a,Point2d b)
{
return sqrt(pow(a.x-b.x,2) + pow(a.y-b.y,2));
}
double distanceFunc(int ai,int bi)
{
if(DP(ai,bi)!=-1)
return DP(ai,bi);
Rect a = data[ai];
Rect b = data[bi];
/*
Point2d cena= Point2d(a.x+a.width/2,
a.y+a.height/2);
Point2d cenb = Point2d(b.x+b.width/2,
b.y+b.height/2);
double dist = sqrt(pow(cena.x-cenb.x,2) + pow(cena.y-cenb.y,2));
DP(ai,bi)=dist;
DP(bi,ai)=dist;*/
Point2d tla =Point2d(a.x,a.y);
Point2d tra =Point2d(a.x+a.width,a.y);
Point2d bla =Point2d(a.x,a.y+a.height);
Point2d bra =Point2d(a.x+a.width,a.y+a.height);
Point2d tlb =Point2d(b.x,b.y);
Point2d trb =Point2d(b.x+b.width,b.y);
Point2d blb =Point2d(b.x,b.y+b.height);
Point2d brb =Point2d(b.x+b.width,b.y+b.height);
double minDist = 9999999;
minDist = min(minDist,dist2d(tla,tlb));
minDist = min(minDist,dist2d(tla,trb));
minDist = min(minDist,dist2d(tla,blb));
minDist = min(minDist,dist2d(tla,brb));
minDist = min(minDist,dist2d(tra,tlb));
minDist = min(minDist,dist2d(tra,trb));
minDist = min(minDist,dist2d(tra,blb));
minDist = min(minDist,dist2d(tra,brb));
minDist = min(minDist,dist2d(bla,tlb));
minDist = min(minDist,dist2d(bla,trb));
minDist = min(minDist,dist2d(bla,blb));
minDist = min(minDist,dist2d(bla,brb));
minDist = min(minDist,dist2d(bra,tlb));
minDist = min(minDist,dist2d(bra,trb));
minDist = min(minDist,dist2d(bra,blb));
minDist = min(minDist,dist2d(bra,brb));
DP(ai,bi)=minDist;
DP(bi,ai)=minDist;
return DP(ai,bi);
}
vector<vector<Rect> > getGroups()
{
vector<vector<Rect> > ret;
for(int i=0;i<=C;i++)
{
ret.push_back(vector<Rect>());
for(int j=0;j<data.size();j++)
{
if(labels[j]==i)
{
ret[ret.size()-1].push_back(data[j]);
}
}
}
return ret;
}
};
cv::Scalar HSVtoRGBcvScalar(int H, int S, int V) {
int bH = H; // H component
int bS = S; // S component
int bV = V; // V component
double fH, fS, fV;
double fR, fG, fB;
const double double_TO_BYTE = 255.0f;
const double BYTE_TO_double = 1.0f / double_TO_BYTE;
// Convert from 8-bit integers to doubles
fH = (double)bH * BYTE_TO_double;
fS = (double)bS * BYTE_TO_double;
fV = (double)bV * BYTE_TO_double;
// Convert from HSV to RGB, using double ranges 0.0 to 1.0
int iI;
double fI, fF, p, q, t;
if( bS == 0 ) {
// achromatic (grey)
fR = fG = fB = fV;
}
else {
// If Hue == 1.0, then wrap it around the circle to 0.0
if (fH>= 1.0f)
fH = 0.0f;
fH *= 6.0; // sector 0 to 5
fI = floor( fH ); // integer part of h (0,1,2,3,4,5 or 6)
iI = (int) fH; // " " " "
fF = fH - fI; // factorial part of h (0 to 1)
p = fV * ( 1.0f - fS );
q = fV * ( 1.0f - fS * fF );
t = fV * ( 1.0f - fS * ( 1.0f - fF ) );
switch( iI ) {
case 0:
fR = fV;
fG = t;
fB = p;
break;
case 1:
fR = q;
fG = fV;
fB = p;
break;
case 2:
fR = p;
fG = fV;
fB = t;
break;
case 3:
fR = p;
fG = q;
fB = fV;
break;
case 4:
fR = t;
fG = p;
fB = fV;
break;
default: // case 5 (or 6):
fR = fV;
fG = p;
fB = q;
break;
}
}
// Convert from doubles to 8-bit integers
int bR = (int)(fR * double_TO_BYTE);
int bG = (int)(fG * double_TO_BYTE);
int bB = (int)(fB * double_TO_BYTE);
// Clip the values to make sure it fits within the 8bits.
if (bR > 255)
bR = 255;
if (bR < 0)
bR = 0;
if (bG >255)
bG = 255;
if (bG < 0)
bG = 0;
if (bB > 255)
bB = 255;
if (bB < 0)
bB = 0;
// Return the cv::Scalar in B, G, R order; you can use these values as you want.
return cv::Scalar(bB,bG,bR);
}
int main(int argc,char** argv )
{
Mat im = imread("c:/data/football.png",0);
std::vector<std::vector<cv::Point> > contours;
std::vector<cv::Vec4i> hierarchy;
findContours(im.clone(), contours, hierarchy, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
vector<Rect> boxes;
for(size_t i = 0; i < contours.size(); i++)
{
Rect r = boundingRect(contours[i]);
boxes.push_back(r);
}
DbScan dbscan(boxes,20,2);
dbscan.run();
//done, perform display
Mat grouped = Mat::zeros(im.size(),CV_8UC3);
vector<Scalar> colors;
RNG rng(3);
for(int i=0;i<=dbscan.C;i++)
{
colors.push_back(HSVtoRGBcvScalar(rng(255),255,255));
}
for(int i=0;i<dbscan.data.size();i++)
{
Scalar color;
if(dbscan.labels[i]==-1)
{
color=Scalar(128,128,128);
}else
{
int label=dbscan.labels[i];
color=colors[label];
}
putText(grouped,to_string(dbscan.labels[i]),dbscan.data[i].tl(), FONT_HERSHEY_COMPLEX,.5,color,1);
drawContours(grouped,contours,i,color,-1);
}
imshow("grouped",grouped);
imwrite("c:/data/grouped.jpg",grouped);
waitKey(0);
}
I agree with Sebastian Schmitz: you probably shouldn't be looking for clustering.
Don't expect an uninformed method such as k-means to work magic for you. In particular one that is as crude a heuristic as k-means, and which lives in an idealized mathematical world, not in messy, real data.
You have a good understanding of what you want. Try to put this intuition into code. In your case, you seem to be looking for connected components.
Consider downsampling your image to a lower resolution, then rerunning the same process! Or running it on the lower resolution right away (to reduce compression artifacts, and improve performance). Or adding filters, such as blurring.
I'd expect best and fastest results by looking at connected components in the downsampled/filtered image.
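A rough sketch of that idea (my own illustration, assuming OpenCV 3+ for connectedComponentsWithStats and a hypothetical input file name): downsample and filter the thresholded mask, then take the connected components and their bounding boxes directly.
#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    Mat mask = imread("threshold_mask.png", IMREAD_GRAYSCALE); // hypothetical input mask
    Mat small, merged;
    resize(mask, small, Size(), 0.5, 0.5, INTER_AREA);          // downsample
    GaussianBlur(small, small, Size(5, 5), 0);                  // filter
    threshold(small, merged, 127, 255, THRESH_BINARY);          // back to a binary mask
    Mat labels, stats, centroids;
    int n = connectedComponentsWithStats(merged, labels, stats, centroids);
    Mat vis;
    cvtColor(merged, vis, COLOR_GRAY2BGR);
    for (int i = 1; i < n; i++)                                 // label 0 is the background
    {
        Rect box(stats.at<int>(i, CC_STAT_LEFT),  stats.at<int>(i, CC_STAT_TOP),
                 stats.at<int>(i, CC_STAT_WIDTH), stats.at<int>(i, CC_STAT_HEIGHT));
        rectangle(vis, box, Scalar(0, 255, 0), 1);
    }
    imshow("components", vis);
    waitKey(0);
    return 0;
}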
I am not entirely sure if you are really looking for clustering (in the Data Mining sense).
Clustering is used to group similar objects according to a distance function. In your case the distance function would only use the spatial qualities. Besides, in k-means clustering you have to specify a k that you probably don't know beforehand.
It seems to me you just want to merge all rectangles whose borders are closer together than some predetermined threshold. So, as a first idea, try to merge all rectangles that are touching or that are closer together than half a player's height.
You probably want to include a size check to minimize the risk of merging two players into one.
Edit: If you really want to use a clustering algorithm use one that estimates the number of clusters for you.
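A naive sketch of that merge-by-proximity idea (illustration only; the margin value and the repeated pairwise pass are my assumptions, not the answerer's code):
#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

// Merge rectangles whose borders come closer than `margin` pixels by growing each
// box by the margin and unioning any pair whose grown versions intersect.
std::vector<Rect> mergeClose(std::vector<Rect> boxes, int margin)
{
    bool merged = true;
    while (merged)
    {
        merged = false;
        for (size_t i = 0; i < boxes.size() && !merged; i++)
            for (size_t j = i + 1; j < boxes.size() && !merged; j++)
            {
                Rect grown(boxes[i].x - margin, boxes[i].y - margin,
                           boxes[i].width + 2*margin, boxes[i].height + 2*margin);
                if ((grown & boxes[j]).area() > 0)
                {
                    boxes[i] |= boxes[j];              // smallest rect containing both
                    boxes.erase(boxes.begin() + j);
                    merged = true;
                }
            }
    }
    return boxes;
}
It could be called, for example, as mergeClose(boxes, playerHeight/2).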
I guess you can improve your original attempt by using morphological transformations. Take a look at http://docs.opencv.org/master/d9/d61/tutorial_py_morphological_ops.html#gsc.tab=0. Probably you can deal with a closed set for each entity after that, specially with separate players as you got in your original image.
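For instance, a small sketch of that suggestion (illustration only; the kernel size is a guess you would tune): a morphological closing joins nearby white patches so that a player's limbs end up in one blob before the bounding boxes are computed.
#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    Mat mask = imread("threshold_mask.png", IMREAD_GRAYSCALE);     // hypothetical input mask
    Mat closed;
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(15, 15));
    morphologyEx(mask, closed, MORPH_CLOSE, kernel);               // dilate then erode
    imwrite("closed.png", closed);
    return 0;
}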

Creating vignette filter in opencv?

How can we make a vignette filter in OpenCV? Do we need to implement an algorithm for it, or only play with the values of BGR? How can we make this type of filter? I saw its implementation here but I didn't understand it clearly. Anyone with complete algorithm and implementation guidance is highly appreciated.
After Abid Rahman K's answer I tried this in C++:
int main()
{
Mat v;
Mat img = imread ("D:\\2.jpg");
img.convertTo(v, CV_32F);
Mat a,b,c,d,e;
c.create(img.rows,img.cols,CV_32F);
d.create(img.rows,img.cols,CV_32F);
e.create(img.rows,img.cols,CV_32F);
a = getGaussianKernel(img.cols,300,CV_32F);
b = getGaussianKernel(img.rows,300,CV_32F);
c = b*a.t();
double minVal;
double maxVal;
cv::minMaxLoc(c, &minVal, &maxVal);
d = c/maxVal;
e = v*d ; // This line causing error
imshow ("venyiet" , e);
cvWaitKey();
}
d is displaying right but e=v*d line is causing runtime error of
OpenCV Error: Assertion failed (type == B.type() && (type == CV_32FC1 || type ==
CV_64FC1 || type == CV_32FC2 || type == CV_64FC2)) in unknown function, file ..
\..\..\src\opencv\modules\core\src\matmul.cpp, line 711
First of all, Abid Rahman K describes the easiest way to go about this filter. You should seriously study his answer with time and attention. Wikipedia's take on vignetting is also quite clarifying for those who have never heard of this filter.
Browny's implementation of this filter is considerably more complex. However, I ported his code to the C++ API and simplified it so you can follow the instructions yourself.
#include <math.h>
#include <vector>
#include <cv.hpp>
#include <highgui/highgui.hpp>
// Helper function to calculate the distance between 2 points.
double dist(CvPoint a, CvPoint b)
{
return sqrt(pow((double) (a.x - b.x), 2) + pow((double) (a.y - b.y), 2));
}
// Helper function that computes the longest distance from the edge to the center point.
double getMaxDisFromCorners(const cv::Size& imgSize, const cv::Point& center)
{
// given a rect and a line
// get which corner of rect is farthest from the line
std::vector<cv::Point> corners(4);
corners[0] = cv::Point(0, 0);
corners[1] = cv::Point(imgSize.width, 0);
corners[2] = cv::Point(0, imgSize.height);
corners[3] = cv::Point(imgSize.width, imgSize.height);
double maxDis = 0;
for (int i = 0; i < 4; ++i)
{
double dis = dist(corners[i], center);
if (maxDis < dis)
maxDis = dis;
}
return maxDis;
}
// Helper function that creates a gradient image.
// firstPt, radius and power, are variables that control the artistic effect of the filter.
void generateGradient(cv::Mat& mask)
{
cv::Point firstPt = cv::Point(mask.size().width/2, mask.size().height/2);
double radius = 1.0;
double power = 0.8;
double maxImageRad = radius * getMaxDisFromCorners(mask.size(), firstPt);
mask.setTo(cv::Scalar(1));
for (int i = 0; i < mask.rows; i++)
{
for (int j = 0; j < mask.cols; j++)
{
double temp = dist(firstPt, cv::Point(j, i)) / maxImageRad;
temp = temp * power;
double temp_s = pow(cos(temp), 4);
mask.at<double>(i, j) = temp_s;
}
}
}
// This is where the fun starts!
int main()
{
cv::Mat img = cv::imread("stack-exchange-chefs.jpg");
if (img.empty())
{
std::cout << "!!! Failed imread\n";
return -1;
}
/*
cv::namedWindow("Original", cv::WINDOW_NORMAL);
cv::resizeWindow("Original", img.size().width/2, img.size().height/2);
cv::imshow("Original", img);
*/
What img looks like:
cv::Mat maskImg(img.size(), CV_64F);
generateGradient(maskImg);
/*
cv::Mat gradient;
cv::normalize(maskImg, gradient, 0, 255, CV_MINMAX);
cv::imwrite("gradient.png", gradient);
*/
What maskImg looks like:
cv::Mat labImg(img.size(), CV_8UC3);
cv::cvtColor(img, labImg, CV_BGR2Lab);
for (int row = 0; row < labImg.size().height; row++)
{
for (int col = 0; col < labImg.size().width; col++)
{
cv::Vec3b value = labImg.at<cv::Vec3b>(row, col);
value.val[0] *= maskImg.at<double>(row, col);
labImg.at<cv::Vec3b>(row, col) = value;
}
}
cv::Mat output;
cv::cvtColor(labImg, output, CV_Lab2BGR);
//cv::imwrite("vignette.png", output);
cv::namedWindow("Vignette", cv::WINDOW_NORMAL);
cv::resizeWindow("Vignette", output.size().width/2, output.size().height/2);
cv::imshow("Vignette", output);
cv::waitKey();
return 0;
}
What output looks like:
As stated in the code above, by changing the values of firstPt, radius and power you can achieve stronger/weaker artistic effects.
Good luck!
You can do a simple implementation using the Gaussian kernels available in OpenCV.
Load the image and get its number of rows and columns.
Create two Gaussian kernels of size rows and columns, say A, B. Their variance depends upon your needs.
C = transpose(A)*B, i.e. multiply a column vector with a row vector such that the resulting array is the same size as the image.
D = C/C.max()
E = img*D
See the implementation below (for a grayscale image):
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread('temp.jpg',0)
row,cols = img.shape
a = cv2.getGaussianKernel(cols,300)
b = cv2.getGaussianKernel(rows,300)
c = b*a.T
d = c/c.max()
e = img*d
cv2.imwrite('vig2.png',e)
Below is my result:
Similarly for Color image:
NOTE : Of course, it is centered. You will need to make additional modifications to move focus to other places.
Similar one close to Abid's Answer. But the code is for the colored image
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread('turtle.jpg',1)
rows,cols = img.shape[:2]
zeros = np.copy(img)
zeros[:,:,:] = 0
a = cv2.getGaussianKernel(cols,900)
b = cv2.getGaussianKernel(rows,900)
c = b*a.T
d = c/c.max()
zeros[:,:,0] = img[:,:,0]*d
zeros[:,:,1] = img[:,:,1]*d
zeros[:,:,2] = img[:,:,2]*d
cv2.imwrite('vig2.png',zeros)
Original Image (Taken from Pexels under CC0 Licence)
After applying the vignette with a sigma of 900 (i.e. cv2.getGaussianKernel(cols,900)):
After applying the vignette with a sigma of 300 (i.e. cv2.getGaussianKernel(cols,300)):
Additionally, you can focus the vignette effect on the coordinates of your choice by simply shifting the mean of the Gaussian to your focus point, as follows.
import cv2
import numpy as np
img = cv2.imread('turtle.jpg',1)
fx,fy = 1465,180 # Add your focus coordinates here
# fx,fy = 145,1000 # alternative focus point used in the second example below
sigma = 300 # Standard Deviation of the Gaussian
rows,cols = img.shape[:2]
fxn = fx - cols//2 # Normalised temporary vars
fyn = fy - rows//2
zeros = np.copy(img)
zeros[:,:,:] = 0
a = cv2.getGaussianKernel(2*cols ,sigma)[cols-fx:2*cols-fx]
b = cv2.getGaussianKernel(2*rows ,sigma)[rows-fy:2*rows-fy]
c = b*a.T
d = c/c.max()
zeros[:,:,0] = img[:,:,0]*d
zeros[:,:,1] = img[:,:,1]*d
zeros[:,:,2] = img[:,:,2]*d
# zeros = add_alpha(zeros)  # add_alpha() is a separate helper not defined in this snippet
cv2.imwrite('vig4.png',zeros)
The size of the turtle image is 1980x1200 (WxH). The following is an example focusing at the coordinate (1465, 180), i.e. fx,fy = 1465,180. (Note that I have reduced the variance to exemplify the change in focus.)
The following is an example focusing at the coordinate (145, 1000), i.e. fx,fy = 145,1000.
Here is my C++ implementation of a vignette filter on a colored image using OpenCV. It is faster than the accepted answer.
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
using namespace cv;
using namespace std;
double fastCos(double x){
x += 1.57079632;
if (x > 3.14159265)
x -= 6.28318531;
if (x < 0)
return 1.27323954 * x + 0.405284735 * x * x;
else
return 1.27323954 * x - 0.405284735 * x * x;
}
double dist(double ax, double ay,double bx, double by){
return sqrt((ax - bx)*(ax - bx) + (ay - by)*(ay - by));
}
int main(int argv, char** argc){
Mat src = cv::imread("filename_of_your_image.jpg");
Mat dst = Mat::zeros(src.size(), src.type());
double radius; //value greater than 0,
//greater the value lesser the visible vignette
//for a medium vignette use a value in range(0.5-1.5)
cin >> radius;
double cx = (double)src.cols/2, cy = (double)src.rows/2;
double maxDis = radius * dist(0,0,cx,cy);
double temp;
for (int y = 0; y < src.rows; y++) {
for (int x = 0; x < src.cols; x++) {
temp = fastCos(dist(cx, cy, x, y) / maxDis);
temp *= temp;
dst.at<Vec3b>(y, x)[0] =
saturate_cast<uchar>((src.at<Vec3b>(y, x)[0]) * temp);
dst.at<Vec3b>(y, x)[1] =
saturate_cast<uchar>((src.at<Vec3b>(y, x)[1]) * temp );
dst.at<Vec3b>(y, x)[2] =
saturate_cast<uchar>((src.at<Vec3b>(y, x)[2]) * temp);
}
}
imshow ("Vignetted Image", dst);
waitKey(0);
}
Here is a C++ implementation of Vignetting for Grayscale Image
#include "opencv2/opencv.hpp"
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
using namespace cv;
using namespace std;
int main(int argv, char** argc)
{
Mat test = imread("test.jpg", IMREAD_GRAYSCALE);
Mat kernel_X = getGaussianKernel(test.cols, 100);
Mat kernel_Y = getGaussianKernel(test.rows, 100);
Mat kernel_X_transpose;
transpose(kernel_X, kernel_X_transpose);
Mat kernel = kernel_Y * kernel_X_transpose;
Mat mask_v, proc_img;
normalize(kernel, mask_v, 0, 1, NORM_MINMAX);
test.convertTo(proc_img, CV_64F);
multiply(mask_v, proc_img, proc_img);
convertScaleAbs(proc_img, proc_img);
imshow ("Vignette", proc_img);
waitKey(0);
return 0;
}

Video Stabilization

I'm researching the field of video stabilization. I implemented an application using OpenCV.
My pipeline is:
Surf points extraction
Matching
estimateRigidTransform
warpAffine
But the resulting video is not stable. Can anyone help me with this problem or provide some source code links to improve it?
Sample video: Hippo video
Here is my code [EDIT]
#include "stdafx.h"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
#include <opencv2/nonfree/features2d.hpp>
#include <opencv2/opencv.hpp>
const double smooth_level = 0.7;
using namespace cv;
using namespace std;
struct TransformParam
{
TransformParam() {}
TransformParam(double _dx, double _dy, double _da) {
dx = _dx;
dy = _dy;
da = _da;
}
double dx; // translation x
double dy; // translation y
double da; // angle
};
int main( int argc, char** argv )
{
VideoCapture cap ("test12.avi");
Mat cur, cur_grey;
Mat prev, prev_grey;
cap >> prev;
cvtColor(prev, prev_grey, COLOR_BGR2GRAY);
// Step 1 - Get previous to current frame transformation (dx, dy, da) for all frames
vector <TransformParam> prev_to_cur_transform; // previous to current
int k=1;
int max_frames = cap.get(CV_CAP_PROP_FRAME_COUNT);
VideoWriter writeVideo ("stable.avi",0,30,cvSize(prev.cols,prev.rows),true);
Mat last_T;
double avg_dx = 0, avg_dy = 0, avg_da = 0;
Mat smooth_T(2,3,CV_64F);
while(true) {
cap >> cur;
if(cur.data == NULL) {
break;
}
cvtColor(cur, cur_grey, COLOR_BGR2GRAY);
// vector from prev to cur
vector <Point2f> prev_corner, cur_corner;
vector <Point2f> prev_corner2, cur_corner2;
vector <uchar> status;
vector <float> err;
goodFeaturesToTrack(prev_grey, prev_corner, 200, 0.01, 30);
calcOpticalFlowPyrLK(prev_grey, cur_grey, prev_corner, cur_corner, status, err);
// weed out bad matches
for(size_t i=0; i < status.size(); i++) {
if(status[i]) {
prev_corner2.push_back(prev_corner[i]);
cur_corner2.push_back(cur_corner[i]);
}
}
// translation + rotation only
Mat T = estimateRigidTransform(prev_corner2, cur_corner2, false);
// in rare cases no transform is found. We'll just use the last known good transform.
if(T.data == NULL) {
last_T.copyTo(T);
}
T.copyTo(last_T);
// decompose T
double dx = T.at<double>(0,2);
double dy = T.at<double>(1,2);
double da = atan2(T.at<double>(1,0), T.at<double>(0,0));
prev_to_cur_transform.push_back(TransformParam(dx, dy, da));
avg_dx = (avg_dx * smooth_level) + (dx * (1- smooth_level));
avg_dy = (avg_dy * smooth_level) + (dy * (1- smooth_level));
avg_da = (avg_da * smooth_level) + (da * (1- smooth_level));
smooth_T.at<double>(0,0) = cos(avg_da);
smooth_T.at<double>(0,1) = -sin(avg_da);
smooth_T.at<double>(1,0) = sin(avg_da);
smooth_T.at<double>(1,1) = cos(avg_da);
smooth_T.at<double>(0,2) = avg_dx;
smooth_T.at<double>(1,2) = avg_dy;
Mat stable;
warpAffine(prev,stable,smooth_T,prev.size());
Mat canvas = Mat::zeros(cur.rows, cur.cols*2+10, cur.type());
prev.copyTo(canvas(Range::all(), Range(0, prev.cols)));
stable.copyTo(canvas(Range::all(), Range(prev.cols+10, prev.cols*2+10)));
imshow("before and after", canvas);
waitKey(20);
writeVideo.write(stable);
cur.copyTo(prev);
cur_grey.copyTo(prev_grey);
k++;
}
}
First, you can just blur your image; it will help a bit. Second, you can easily smooth your transform matrix with the simplest implementation of exponential smoothing, A_smooth(t+1) = a*A_smooth(t) + (1-a)*A(t+1), and play with the a-value in the [0;1] range. Third, you can turn off some types of transformations like rotation, shift, etc.
Here is code example:
t = estimateRigidTransform(cur, prev, 0); // false = partial affine (rotation, uniform scale, translation), not the full 6-parameter affine
if(!t.empty()){
    // t(Range(0,2), Range(0,2)) = Mat::eye(2, 2, CV_64FC1); // turning off rotation
    // t.at<double>(0,2) = 0; t.at<double>(1,2) = 0; // turning off shift dx and dy
    tAvrg = tAvrg*a + t*(1-a); // a - smooth level in [0;1] range, play with it
    warpAffine(cur, stable, tAvrg, Size(cur.cols, cur.rows));
}
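One detail the snippet above leaves implicit is that tAvrg must exist before the first frame; a reasonable assumption (mine, not the answerer's) is to start it as the identity 2x3 affine matrix:
Mat tAvrg = (Mat_<double>(2,3) << 1, 0, 0,
                                  0, 1, 0); // identity affine: no rotation, no shift
double a = 0.9;                             // smoothing level, tune in [0;1]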