OpenCV Kalman filter - C++

I have three gyroscope values: pitch, roll and yaw. I would like to add a Kalman filter to get more accurate values. I found the OpenCV library, which implements a Kalman filter, but I can't understand how it really works.
Could you give me some help? I didn't find any related topics on the internet.
I tried to make it work for one axis:
const float A[] = { 1, 1, 0, 1 };   // 2x2 transition matrix (constant-velocity model)
CvKalman* kalman;
CvMat* state = NULL;
CvMat* measurement;

void kalman_filter(float FoE_x, float prev_x)
{
    // predict the next state, then correct it with the new measurement
    const CvMat* prediction = cvKalmanPredict( kalman, 0 );
    printf("KALMAN: %f %f %f\n", prev_x, prediction->data.fl[0], prediction->data.fl[1]);
    measurement->data.fl[0] = FoE_x;
    cvKalmanCorrect( kalman, measurement );
}
In main:
kalman = cvCreateKalman( 2, 1, 0 );                                 // 2 state params, 1 measurement
state = cvCreateMat( 2, 1, CV_32FC1 );
measurement = cvCreateMat( 1, 1, CV_32FC1 );
cvSetIdentity( kalman->measurement_matrix, cvRealScalar(1) );
memcpy( kalman->transition_matrix->data.fl, A, sizeof(A) );
cvSetIdentity( kalman->process_noise_cov, cvRealScalar(2.0) );      // Q
cvSetIdentity( kalman->measurement_noise_cov, cvRealScalar(3.0) );  // R
cvSetIdentity( kalman->error_cov_post, cvRealScalar(1222) );        // initial estimate error covariance
kalman->state_post->data.fl[0] = 0;
And I call this every time I receive data from the gyro:
kalman_filter(prevr, mpe->getGyrosDegrees().roll);
I thought that in kalman_filter the first parameter is the previous value and the second is the current value, but I'm not sure, and this code doesn't work... I know I still have a lot of work to do on it, but I don't know how to continue or what to change...

It seems like you are giving values that are too high to the covariance matrices.
kalman->process_noise_cov is the 'process noise covariance matrix' and it is often referred to in the Kalman literature as Q. The result will be smoother with lower values.
kalman->measurement_noise_cov is the 'measurement noise covariance matrix' and it is often referred to in the Kalman literature as R. The result will be smoother with higher values.
The relation between those two matrices defines the amount and shape of filtering you are performing.
If the value of Q is high, it will mean that the signal you are measuring varies quickly and you need the filter to be adaptable. If it is small, then big variations will be attributed to noise in the measure.
If the value of R is high (compared to Q), it will indicate that the measuring is noisy so it will be filtered more.
Try lower values like q = 1e-5 and r = 1e-1 instead of q = 2.0 and r = 3.0.
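For example, with the C API used above, those values could be plugged in by just changing the two cvSetIdentity calls (a minimal sketch; the rest of the setup stays exactly as in the question):
cvSetIdentity( kalman->process_noise_cov, cvRealScalar(1e-5) );     // Q: trust the model more
cvSetIdentity( kalman->measurement_noise_cov, cvRealScalar(1e-1) ); // R: expect noisier measurements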


Fast, good quality pixel interpolation for extreme image downscaling

In my program, I am downscaling an image of 500px or larger to an extreme level of approx 16px-32px. The source image is user-specified so I do not have control over its size. As you can imagine, few pixel interpolations hold up and inevitably the result is heavily aliased.
I've tried bilinear, bicubic and square average sampling. The square average sampling actually provides the most decent results but the smaller it gets, the larger the sampling radius has to be. As a result, it gets quite slow - slower than the other interpolation methods.
I have also tried an adaptive square average sampling so that the smaller it gets the greater the sampling radius, while the closer it is to its original size, the smaller the sampling radius. However, it produces problems and I am not convinced this is the best approach.
So the question is: What is the recommended type of pixel interpolation that is fast and works well on such extreme levels of downscaling?
I do not wish to use a library so I will need something that I can code by hand and isn't too complex. I am working in C++ with VS 2012.
Here's some example code I've tried, as requested (hopefully without errors from my pseudo-code cut and paste). It performs a 7x7 average downscale, and although the result is better than bilinear or bicubic interpolation, it also takes quite a performance hit:
// Sizing control
ctl(0): "Resize",Range=(0,800),Val=100
// Variables
float fracx, fracy;
int Xnew, Ynew, p, q, Calc;
int x, y, z, p1, q1, i, j;
// New image dimensions
Xnew = image->width*ctl(0)/100;
Ynew = image->height*ctl(0)/100;
for (y=0; y<Ynew; y++){                    // rows of the output image
    for (x=0; x<Xnew; x++){                // columns of the output image
        // corresponding sample position in the source image
        p1 = (int)(x*image->width/Xnew);
        q1 = (int)(y*image->height/Ynew);
        for (z=0; z<3; z++){               // channels
            Calc = 0;                      // reset the accumulator per pixel and channel
            for (i=-3; i<=3; i++) {
                for (j=-3; j<=3; j++) {
                    Calc += (int)(src(p1-i, q1-j, z));
                } //j
            } //i
            Calc /= 49;                    // 7x7 box average
            pset(x, y, z, Calc);
        } // channels
    } // columns
} // rows
Thanks!
The first point is to use pointers to your data. Never use indexes at every pixel. When you write src(p1-i, q1-j, z) or pset(x, y, z, Calc), how much computation is being done behind the scenes? Use pointers to the data and manipulate those instead.
Second: your algorithm is wrong. You don't want an average filter; you want to lay a grid over your source image, compute the average of every grid cell, and put it in the corresponding pixel of the output image.
The specific solution should be tailored to your data representation, but it could be something like this:
std::vector<uint32_t> accum(Xnew);   // per-output-column running sums for the current output row
std::vector<uint32_t> count(Xnew);   // number of source pixels accumulated per output column
uint32_t *paccum, *pcount;
uint8_t* pin = /*pointer to input data*/;
uint8_t* pout = /*pointer to output data*/;
for (int dr = 0, sr = 0, w = image->width, h = image->height; sr < h; ++dr) {
    memset(paccum = accum.data(), 0, Xnew * sizeof(uint32_t));
    memset(pcount = count.data(), 0, Xnew * sizeof(uint32_t));
    // accumulate every source row that maps onto output row dr
    while (sr * Ynew / h == dr) {
        paccum = accum.data();
        pcount = count.data();
        for (int dc = 0, sc = 0; sc < w; ++sc) {
            *paccum += *pin;            // add this source pixel to the current column sum
            *pcount += 1;
            ++pin;
            if (sc * Xnew / w > dc) {   // crossed into the next output column
                ++dc;
                ++paccum;
                ++pcount;
            }
        }
        sr++;
    }
    // divide each sum by its pixel count and write one output row
    std::transform(begin(accum), end(accum), begin(count), pout, std::divides<uint32_t>());
    pout += Xnew;
}
This was written using my own library (still in development) and it seems to work, but I later changed the variable names to make it simpler here, so I don't guarantee anything!
The idea is to keep a local buffer of 32-bit ints that holds the partial sums of all source pixels whose rows fall into one row of the output image. Then you divide by the cell count and save the result to the final image.
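For clarity, here is a straightforward (unoptimized) sketch of the same grid-average idea, written against a plain single-channel 8-bit buffer; the names in, out, w, h, Xnew and Ynew are assumptions, not part of the original code:

#include <cstdint>
#include <vector>

// Accumulate every source pixel into the output cell it maps to, then normalize.
void gridAverageDownscale(const uint8_t* in, int w, int h,
                          uint8_t* out, int Xnew, int Ynew)
{
    std::vector<uint32_t> sum(Xnew * Ynew, 0);
    std::vector<uint32_t> cnt(Xnew * Ynew, 0);
    for (int sy = 0; sy < h; ++sy) {
        int dy = sy * Ynew / h;                    // output row this source row maps to
        for (int sx = 0; sx < w; ++sx) {
            int dx = sx * Xnew / w;                // output column this source column maps to
            sum[dy * Xnew + dx] += in[sy * w + sx];
            cnt[dy * Xnew + dx] += 1;
        }
    }
    for (int i = 0; i < Xnew * Ynew; ++i)
        out[i] = static_cast<uint8_t>(sum[i] / cnt[i]);
}

Each output pixel ends up as the mean of all source pixels in its grid cell, which is the same result the pointer-based version above computes row by row.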
The first thing you should do is set up a performance evaluation system, so you can measure how much each change actually impacts performance.
As said previously, you should use pointers rather than indexes, for a (probably) substantial speed-up, and you should not simply average, since a basic averaging of pixels is essentially a blur filter.
I would highly advise you to rework your code to use "kernels", i.e. a matrix giving the weight of each pixel that contributes to an output sample. That way, you will be able to test different strategies and optimize quality.
Example of kernels:
https://en.wikipedia.org/wiki/Kernel_(image_processing)
Upsampling/downsampling kernel:
http://www.johncostella.com/magic/
Note: from the code it seems you actually apply a 7x7 box kernel (i and j run from -3 to 3 and the sum is divided by 49), i.e. a 7x7 matrix of ones multiplied by 1/49. The 3x3 analogue of that kernel would be:
[1 1 1]
[1 1 1] * 1/9
[1 1 1]
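As an illustration of the kernel idea (my own sketch, not from the original answer), applying an arbitrary normalized (2r+1)x(2r+1) kernel at one source position could look like this; the buffer name in and the dimensions w and h are assumptions:

#include <algorithm>
#include <cstdint>

// Weighted sum of a (2r+1)x(2r+1) neighbourhood centred at (cx, cy).
// The kernel weights are expected to sum to 1; borders are clamped.
float sampleWithKernel(const uint8_t* in, int w, int h,
                       int cx, int cy, const float* kernel, int r)
{
    float acc = 0.0f;
    for (int j = -r; j <= r; ++j) {
        for (int i = -r; i <= r; ++i) {
            int x = std::min(std::max(cx + i, 0), w - 1);   // clamp to the image border
            int y = std::min(std::max(cy + j, 0), h - 1);
            acc += kernel[(j + r) * (2 * r + 1) + (i + r)] * in[y * w + x];
        }
    }
    return acc;
}

Swapping in a different kernel (box, tent, the magic kernel above) then only changes the weight table, not the sampling loop.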

Using Opencv and C++ to find areas of very small objects, possibly self-intersecting

How do I calculate the areas of very small objects, sometimes only 2 pixels in area? MATLAB's regionprops() seems to do this well, and will return a value of 1 even for a single point. I've read widely on this issue and everyone seems to caution against self-intersecting contours, but offers no alternative way around them. Here is a sample of my code:
void cMcDetect::shapeFeats(vector<Point> contours, const cv::Mat &img)
{
    // % Area, Compactness, Orientation, Eccentricity, Solidity
    double A, C, O = NAN, E, S;
    vector<Point> ch;
    convexHull(contours, ch);
    Moments mu = moments(contours, 0);
    double CHA = contourArea(ch);
    A = contourArea(contours);                       // Object Area
    S = A/CHA;                                       // Solidity
    E = (mu.m20+mu.m02 + sqrt( pow(mu.m20-mu.m02,2)+4*pow(mu.m11,2) ))/  // Eccentricity
        (mu.m20+mu.m02 - sqrt( pow(mu.m20-mu.m02,2)+4*pow(mu.m11,2) ));
    Rect boundrect = boundingRect(contours);         // Compactness
    C = A/(boundrect.width*boundrect.height);
    if(contours.size()>=5){
        RotatedRect rotrect = fitEllipse(contours);  // Orientation
        O = rotrect.angle;
    }
    printf("%f %f %f %f %f\n", A, C, O, E, S);
}
I get very strange area values for objects, such as 0 or 1.5. I don't expect decimal areas, since I expect the function to return the raw pixel count, so that a single point has an area of 1. Have there been any developments on this issue? It also seems to affect other derived values such as eccentricity. I can summarize the question as: how do I get the raw pixel count of connected components and make OpenCV use it when calculating Hu moments and other values that depend on area? If that is not possible, can you suggest some design/approach adjustments to circumvent the issue? I would have liked OpenCV to do it so I can take advantage of its other functions, such as Hu moments and ellipse fitting.
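As a purely illustrative sketch (an assumption, not a confirmed fix): one way to get a raw pixel count is to rasterize the contour into a binary mask, count the non-zero pixels, and compute moments on the mask instead of the point vector; imgSize is an assumed variable holding the size of the source image:

// Raw pixel count of a contour by rasterizing it into a binary mask.
Mat mask = Mat::zeros(imgSize, CV_8U);
vector<vector<Point> > wrap(1, contours);             // drawContours expects a list of contours
drawContours(mask, wrap, 0, Scalar(255), CV_FILLED);  // fill the region, border pixels included
int pixelArea = countNonZero(mask);                   // 1 even for a single point
Moments mu2 = moments(mask, true);                    // moments of the filled region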

How to retrieve outliers from ceres solver result?

I am trying to compare images using a method similar to Features2D + Homography to find a known object, but replacing findHomography() with a self-written findAffine() function.
I use Ceres Solver to obtain the optimal affine matrix, taking outliers into account.
double translation[] = {0, 0};
double angle = 0;
double scaleFactor = 1;

ceres::Problem problem;
for (size_t i = 0; i < points1.size(); ++i) {
    problem.AddResidualBlock(
        new ceres::AutoDiffCostFunction<AffineResidual, 1, 2, 1, 1>(
            new AffineResidual(Eigen::Vector2d(points1[i].x, points1[i].y),
                               Eigen::Vector2d(points2[i].x, points2[i].y))),
        new ceres::HuberLoss(1.0),
        translation,
        &angle,
        &scaleFactor);
}

ceres::Solver::Options options;
options.linear_solver_type = ceres::DENSE_QR;
options.minimizer_progress_to_stdout = true;
ceres::Solver::Summary summary;
Solve(options, &problem, &summary);
Ceres Solver provides a LossFunction:
Loss functions reduce the influence of residual blocks with high residuals, usually the ones corresponding to outliers.
Of course, I can transform the keypoint coordinates from the first image with the obtained matrix, compare them with the second image, and get the deviations. But Ceres Solver has already done that internally during the solve.
How can I retrieve it? I did not find it in the documentation.
I had a similar problem. After looking into the Ceres library sources (particularly the ResidualBlock::Evaluate() method), I came to the conclusion that there is no explicit "outlier" status for a residual block. It seems that the loss function just affects the resulting cost value for a block (which is exactly what the phrase you quoted from the documentation describes: "Loss functions reduce the influence of residual blocks with high residuals"). So the answer is that you cannot retrieve outliers from Ceres; there is no such feature.
A workaround might be to calculate the residuals for your data with the solved result and apply the loss function to them yourself. The comment on LossFunction::Evaluate() might help:
// For a residual vector with squared 2-norm 'sq_norm', this method
// is required to fill in the value and derivatives of the loss
// function (rho in this example):
//
// out[0] = rho(sq_norm),
// out[1] = rho'(sq_norm),
// out[2] = rho''(sq_norm),
//
// Here the convention is that the contribution of a term to the
// cost function is given by 1/2 rho(s), where
//
// s = ||residuals||^2.
//
// Calling the method with a negative value of 's' is an error and
// the implementations are not required to handle that case.
//
// Most sane choices of rho() satisfy:
//
// rho(0) = 0,
// rho'(0) = 1,
// rho'(s) < 1 in outlier region,
// rho''(s) < 0 in outlier region,
//
// so that they mimic the least squares cost for small residuals.
virtual void Evaluate(double sq_norm, double out[3]) const = 0;
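For what it's worth, a minimal sketch of that workaround (my own, assuming the problem setup from the question and that every residual block has a single residual) could look like this:

// After Solve(): get the raw residuals back, then apply the same HuberLoss by hand.
ceres::Problem::EvaluateOptions eval_options;
eval_options.apply_loss_function = false;           // we want the unscaled residuals
double total_cost = 0.0;
std::vector<double> residuals;
problem.Evaluate(eval_options, &total_cost, &residuals, NULL, NULL);

ceres::HuberLoss loss(1.0);                         // same delta as in AddResidualBlock
for (size_t i = 0; i < residuals.size(); ++i) {
    const double s = residuals[i] * residuals[i];   // squared 2-norm of the block
    double rho[3];
    loss.Evaluate(s, rho);
    // Per the comment above, rho'(s) < 1 only in the outlier region.
    if (rho[1] < 1.0)
        printf("residual block %zu looks like an outlier (residual = %f)\n", i, residuals[i]);
}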

Kinect Scaling in C++

I am experimenting with the Kinect; however, I am having some problems with scaling. The code below is from kinect-kcb, and although the face tracking works for the 'mesh', I am having problems returning the scaling value for my own classes. The code returns a correct rotation and translation, which work perfectly, but the scale only ever returns 1 for a long period (despite the mesh clearly changing size) and then slowly gets smaller (0.98... etc.), which is clearly not a correct scaling value.
float scale;
float rotation[ 3 ];
float translation[ 3 ];

hr = mResult->Get3DPose( &scale, rotation, translation );
if ( SUCCEEDED( hr ) ) {
    Vec3f r( rotation[ 0 ], rotation[ 1 ], rotation[ 2 ] );
    Vec3f t( translation[ 0 ], translation[ 1 ], translation[ 2 ] );

    face.mPoseMatrix.translate( t );
    face.mPoseMatrix.rotate( r );
    face.mPoseMatrix.translate( -t );
    face.mPoseMatrix.translate( t );
    face.mPoseMatrix.scale( Vec3f::one() * scale );
}
This scale value is used repeatedly throughout the code, but does not seem to change often enough (example functions - not in order):
hr = mModel->Get3DShape( shapeUnits, numShapeUnits, animationUnits, numAnimationUnits, scale, rotation, translation, pts, numVertices );
hr = mModel->GetProjectedShape( &mConfigColor, mSensorData.ZoomFactor, viewOffset, shapeUnits, numShapeUnits, animationUnits,
numAnimationUnits, scale, rotation, translation, pts, numVertices );
The Kinect has a function FaceModel.Scale(); however, this only returns a constant value, which I assume is the initial scaling value for the 3D model. I then assumed the above scaling value would change as the user moved closer to and further from the camera.
The method IFTResult::Get3DPose, among other things, gives you the face scale value. If it is equal to 1.0, then the face scale is equal to that of the loaded 3D model (so nothing to do?).
If, when reloading the 3D model, the scale value is not equal to 1.0, then you need to do work on the model.
Have you tried outputting some debug info on what IFTResult::Get3DPose assigns to pScale?
It's also possible that the system is failing to track; you can check this with IFTResult::GetStatus.
It may be that what you are after is the magnitude of the face rectangle. This would scale with the proximity of the image subject.
Here's a relevant Code Project link.
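As a rough sketch of that debug suggestion (using the same mResult object as in the question; treat it as illustrative):

// Check the tracking status first, then dump what Get3DPose reports for the scale.
HRESULT trackStatus = mResult->GetStatus();
if ( FAILED( trackStatus ) ) {
    printf( "face tracking lost (status 0x%08lx)\n", trackStatus );
} else {
    float scale, rotation[ 3 ], translation[ 3 ];
    HRESULT hr = mResult->Get3DPose( &scale, rotation, translation );
    if ( SUCCEEDED( hr ) )
        printf( "scale=%f rot=(%f %f %f) trans=(%f %f %f)\n",
                scale, rotation[ 0 ], rotation[ 1 ], rotation[ 2 ],
                translation[ 0 ], translation[ 1 ], translation[ 2 ] );
}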

OpenCV: Output of the predict function of Expectation Maximization

Background:
I have 2 sets of color pixels from an image, one corresponding to the background and another corresponding to the foreground. Next, I train 2 Gaussian Mixture Models (one per set) using EM from OpenCV. My aim is to find the probability that a random pixel belongs to the foreground and to the background, so I use the "predict" function of each EM on my pixel.
Question:
I don't understand the values returned by this function. In the documentation of OpenCV, it is written:
The method returns a two-element double vector. Zero element is a likelihood logarithm value for the sample. First element is an index of the most probable mixture component for the given sample.
http://docs.opencv.org/modules/ml/doc/expectation_maximization.html?highlight=predict#Vec2d%20EM::predict%28InputArray%20sample,%20OutputArray%20probs%29%20const
I don't understand what "likelihood logarithm" means. In my results, I sometimes have negative values and values > 1. Has anyone who used the same function seen this kind of result, or results between 0 and 1? What can I conclude from my results?
How can I get the probability that a pixel belongs to the whole GMM (not the probability of belonging to each individual cluster of the GMM)?
Here is my code:
Mat mask = imread("mask.tif", 0);
Mat formerImage = imread("ImageFormer.tif");
Mat currentImage = imread("ImageCurrent.tif");

// number of clusters in the GMM
int nClusters = 5;

int countB = 0, countF = 0;
Vec3b color;
Vec2d probFg, probBg; // probabilities to belong to the foreground or background from the GMMs

// count the number of pixels for each training set
for(int c=0; c<=40; c++) {
    for(int l=0; l<=40; l++) {
        if(mask.at<BYTE>(l, c)==255) {
            countF++;
        } else if(mask.at<BYTE>(l, c)==0) {
            countB++;
        }
    }
}
printf("countB %d countF %d \n", countB, countF);

Mat samplesForeground = Mat(countF, 3, CV_64F);
Mat samplesBackground = Mat(countB, 3, CV_64F);

// Expectation-Maximisation able to resolve the GMM and to predict the probability for a pixel to belong to the GMM.
EM em_foreground = EM(nClusters);
EM em_background = EM(nClusters);

countB = 0;
countF = 0;

// fill the training data from the former image depending on the mask
for(int c=0; c<=40; c++) {
    for(int l=0; l<=40; l++) {
        if(mask.at<BYTE>(l, c)==255) {
            color = formerImage.at<Vec3b>(l, c);
            samplesForeground.at<double>(countF,0)=color[0];
            samplesForeground.at<double>(countF,1)=color[1];
            samplesForeground.at<double>(countF,2)=color[2];
            countF++;
        } else if(mask.at<BYTE>(l, c)==0) {
            color = formerImage.at<Vec3b>(l, c);
            samplesBackground.at<double>(countB, 0)=color[0];
            samplesBackground.at<double>(countB, 1)=color[1];
            samplesBackground.at<double>(countB, 2)=color[2];
            countB++;
        }
    }
}
printf("countB %d countF %d \n", countB, countF);

em_foreground.train(samplesForeground);
em_background.train(samplesBackground);

Mat sample(1, 3, CV_64F);

// try every pixel of the current image and get the log likelihood
for(int c=0; c<=40; c++) {
    for(int l=0; l<=40; l++) {
        color = currentImage.at<Vec3b>(l, c);
        sample.at<double>(0)=color[0];
        sample.at<double>(1)=color[1];
        sample.at<double>(2)=color[2];
        probFg = em_foreground.predict(sample);
        probBg = em_background.predict(sample);
        if(probFg[0]>0 || probBg[0]>0)
            printf("probFg[0] %f probBg[0] %f \n", probFg[0], probBg[0]);
    }
}
EDIT
After #BrianL's explanation, I now understand the log likelihood.
My problem is that the log probability returned by the predict function is sometimes > 0, but it should be <= 0. Has anyone run into this problem before?
I have edited the code above to show the problem. I have tried the program with the images below:
The first image is ImageCurrent.tif, the second is ImageFormer.tif and the last one is mask.tif.
Can this be considered a bug in OpenCV? Should I open a ticket on the OpenCV bug tracker?
The "likelihood logarithm" means the log of the probability. Since for a probability p we expect 0 ≤ p ≤ 1, I would expect the values to be negative: log(p) ≤ 0. Larger negative numbers imply smaller probabilities.
This form is helpful when you are dealing with products of very small probabilities: if you multiplied the normal way, you could easily get underflow and lose precision because the probability becomes very small. But in log space the multiplication turns into an addition, which improves the accuracy and also potentially the speed of the calculation.
The predict function is for classifying a data point. If you want to give a point a score for how likely it is to belong to any component in the model, you can use the model parameters (see the get documentation) to calculate it yourself.
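As a rough sketch of that do-it-yourself calculation (my assumption of how the trained model's parameters can be read back through Algorithm::get in the OpenCV 2.4 API, so treat it as illustrative rather than definitive), the mixture density p(x) = sum_j w_j * N(x; mu_j, Sigma_j) could be evaluated like this:

// x is a 1 x d row vector of type CV_64F (same layout as the samples used for training).
double gmmDensity(const cv::EM& em, const cv::Mat& x)
{
    cv::Mat weights = em.get<cv::Mat>("weights");               // 1 x nClusters
    cv::Mat means   = em.get<cv::Mat>("means");                 // nClusters x d
    std::vector<cv::Mat> covs = em.get<std::vector<cv::Mat> >("covs");
    const int d = x.cols;
    double p = 0.0;
    for (int j = 0; j < weights.cols; ++j) {
        cv::Mat diff = x - means.row(j);
        cv::Mat maha = diff * covs[j].inv() * diff.t();         // squared Mahalanobis distance
        double norm = std::sqrt(std::pow(2 * CV_PI, d) * cv::determinant(covs[j]));
        p += weights.at<double>(j) * std::exp(-0.5 * maha.at<double>(0, 0)) / norm;
    }
    return p;   // density under the whole mixture; log(p) should correspond to predict()[0]
}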
As I understand it, you have two separate GMMs for the foreground and background parts of the image. The total probability of a sample pixel 'x' in the test image, when evaluated against the foreground GMM, is
P_fg(x) = sum_{j=1..k} ( Wj_fg * Pj_fg(x) )
where
k = number of clusters in the foreground GMM
x = test sample
Pj_fg(x) = probability that sample x is in the j-th cluster according to the foreground GMM
Wj_fg = weight of the j-th cluster in the foreground GMM
Also, the sum of all weights should be 1 for each GMM.
We can do a similar calculation for the background GMM.
From looking at the EM code in OpenCV, it looks like the first of the two values that EM returns is the log likelihood. For the foreground GMM this is
log(P_fg(x_i))
I implemented your algorithm, and for each pixel in the test image I compared the log-likelihoods returned by the two GMMs and classified the pixel according to whichever GMM gave the higher value. I got decent results.
In that respect, yes, this value is an indication of the pixel belonging to the entire GMM.
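A minimal sketch of that comparison, reusing the variable names from the code in the question (so purely illustrative):

// Label each pixel with whichever GMM gives the higher log-likelihood.
Mat label(41, 41, CV_8U);
Mat sample(1, 3, CV_64F);
for(int c=0; c<=40; c++) {
    for(int l=0; l<=40; l++) {
        Vec3b color = currentImage.at<Vec3b>(l, c);
        sample.at<double>(0) = color[0];
        sample.at<double>(1) = color[1];
        sample.at<double>(2) = color[2];
        Vec2d fg = em_foreground.predict(sample);   // fg[0] is the log-likelihood
        Vec2d bg = em_background.predict(sample);
        label.at<uchar>(l, c) = (fg[0] > bg[0]) ? 255 : 0;   // 255 = foreground
    }
}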
2) In my implementation of your problem, I always got the log likelihoods of all GMMs of all test-sample pixels under 0.
I notice that you are doing graph-cut-based image segmentation.
You might want to take a look at the following blog post, which uses OpenCV and its GMM class in a very similar way to what you are doing in order to perform graph-cut-based image segmentation. Code is given in C++ with detailed explanations. Here is the link: link
Basically, I can only say that the log probability, whether it is correct or not, is not what you are looking for. Check out the above link for details.