I'm going to implement the equalization proposed in a paper.
The method consists of substituting each value of each channel according to the formula on the 16th slide of this presentation.
First of all, I implemented this equalization function in Matlab in two ways: in the first, I compute the histograms (counts) of each channel in order to know the number of values less than or equal to a specific value in the range [0, 255]. In the second, I use matrix operations instead (R <= value, G <= value, B <= value).
Initially I thought the second method would be the faster one, but it seems it is not, which was my first surprise.
Then I implemented the same function in OpenCV, and now I am surprised again, because the Matlab execution is faster than the C++ one! With Matlab I get these timings:
Matlab, method 1: 1.36 seconds
Matlab, method 2: 1.74 seconds
In C++ using OpenCV I get these values:
OpenCV, method 1: 2380 milliseconds
OpenCV, method 2: 4651 milliseconds
I obtain the same results, so the function is correct, but I think something is wrong, or at least could be improved in terms of computation time, probably due to my inexperience with OpenCV, because I would expect a compiled C++ function to be faster than Matlab. So my question is: how can I optimize the C++ code? Below is the C++ code for both methods.
//I have an RGB image in the Mat 'image'
Mat channel[3];
// Splitting method 1
split(image, channel);
Mat Red, Green, Blue;
Blue = channel[0];
Green = channel[1];
Red = channel[2];
//Splitting method 2
// Separate the image in 3 places ( B, G and R )
// vector<Mat> bgr_planes;
// split(image, bgr_planes);
double maxB, maxG, maxR, Npx;
double min;
double coeffB, coeffG, coeffR;
Mat newB, newG, newR;
Mat mapB, mapG, mapR;
int P_Bi, P_Gi, P_Ri;
Mat rangeValues;
double intpart;
double TIME;
int histSize = 256;
/// Set the ranges (for B, G, R)
float range[] = { 0, 256 };
const float* histRange = { range };
bool uniform = true; bool accumulate = false;
Mat countB, countG, countR;
//Start the timer for the method 1
TIME = (double)getTickCount();
// Compute the histograms
calcHist(&Blue, 1, 0, Mat(), countB, 1, &histSize, &histRange, uniform, accumulate);
calcHist(&Green, 1, 0, Mat(), countG, 1, &histSize, &histRange, uniform, accumulate);
calcHist(&Red, 1, 0, Mat(), countR, 1, &histSize, &histRange, uniform, accumulate);
// Get the max from each channel
minMaxLoc(Blue, &min, &maxB);
minMaxLoc(Green, &min, &maxG);
minMaxLoc(Red, &min, &maxR);
//Number of pixels
Npx = Blue.rows * Blue.cols;
// Compute the coefficient of the formula
coeffB = maxB / Npx;
coeffG = maxG / Npx;
coeffR = maxR / Npx;
//Initialize the new channels
newB = Mat(Blue.rows, Blue.cols, Blue.type(), Scalar::all(0));
newG = Mat(Green.rows, Green.cols, Green.type(), Scalar::all(0));
newR = Mat(Red.rows, Red.cols, Red.type(), Scalar::all(0));
//For each value of the range [0, 255]
for (int value = 0; value < 256; value++)
{
    //Binary maps (0/1) of the pixels equal to 'value'
    mapB = (Blue == value) / 255;
    mapG = (Green == value) / 255;
    mapR = (Red == value) / 255;
    //Number of pixels less than or equal to 'value' (cumulative histogram)
    rangeValues = countB(Range(0, value + 1), Range(0, 1));
    P_Bi = cv::sum(rangeValues)[0];
    rangeValues = countG(Range(0, value + 1), Range(0, 1));
    P_Gi = cv::sum(rangeValues)[0];
    rangeValues = countR(Range(0, value + 1), Range(0, 1));
    P_Ri = cv::sum(rangeValues)[0];
    //Substitution of the truncated mapped value in the new channel planes
    modf((coeffB * P_Bi), &intpart);
    newB = newB + mapB * intpart;
    modf((coeffG * P_Gi), &intpart);
    newG = newG + mapG * intpart;
    modf((coeffR * P_Ri), &intpart);
    newR = newR + mapR * intpart;
}
TIME = 1000 * ((double)getTickCount() - TIME) / getTickFrequency();
cout << "Method 1 - elapsed time: " << TIME << "milliseconds." << endl;
//Here it takes 2380 milliseconds
//....
//....
//....
//Start timer of method 2
TIME = 0;
TIME = (double)getTickCount();
//Get the max
minMaxLoc(Blue, &min, &maxB);
minMaxLoc(Green, &min, &maxG);
minMaxLoc(Red, &min, &maxR);
Npx = Blue.rows * Blue.cols;
coeffB = maxB / Npx;
coeffG = maxG / Npx;
coeffR = maxR / Npx;
newB = Mat(Blue.rows, Blue.cols, Blue.type(), Scalar::all(0));
newG = Mat(Green.rows, Green.cols, Green.type(), Scalar::all(0));
newR = Mat(Red.rows, Red.cols, Red.type(), Scalar::all(0));
Mat mask; //reused for the per-value comparisons below
for (int value = 0; value < 256; value++)
{
    mapB = (Blue == value) / 255;
    mapG = (Green == value) / 255;
    mapR = (Red == value) / 255;
    //Here, instead, matrix comparisons are used to count the pixels <= 'value'
    mask = (Blue <= value) / 255;
    P_Bi = cv::sum(mask)[0];
    mask = (Green <= value) / 255;
    P_Gi = cv::sum(mask)[0];
    mask = (Red <= value) / 255;
    P_Ri = cv::sum(mask)[0];
    modf((coeffB * P_Bi), &intpart);
    newB = newB + mapB * intpart;
    modf((coeffG * P_Gi), &intpart);
    newG = newG + mapG * intpart;
    modf((coeffR * P_Ri), &intpart);
    newR = newR + mapR * intpart;
}
//End of the timer
TIME = 1000 * ((double)getTickCount() - TIME) / getTickFrequency();
cout << "Method 2 - elapsed time: " << TIME << "milliseconds." << endl;
//Here it takes 4651 milliseconds
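For reference, here is one way the C++ version could be sped up considerably. It is only a sketch, under the assumption that the formula is new_value = floor(max(channel) / Npx * #pixels <= value), as in the code above; equalizeChannel is just an illustrative helper name. The mapping only depends on the cumulative histogram, so it can be precomputed once into a 256-entry lookup table and applied with cv::LUT in a single pass per channel, instead of scanning the whole image once per intensity value.
#include <cmath>
#include <opencv2/opencv.hpp>
using namespace cv;

// Sketch: equalize one 8-bit channel with a precomputed lookup table.
Mat equalizeChannel(const Mat& plane)
{
    CV_Assert(plane.type() == CV_8UC1);

    // Histogram of the channel (256 bins over [0, 256)).
    int histSize = 256;
    float range[] = { 0, 256 };
    const float* histRange = { range };
    Mat hist;
    calcHist(&plane, 1, 0, Mat(), hist, 1, &histSize, &histRange);

    // Coefficient of the formula: max value of the channel / number of pixels.
    double minVal, maxVal;
    minMaxLoc(plane, &minVal, &maxVal);
    const double coeff = maxVal / (double)(plane.rows * plane.cols);

    // Build the lookup table from the cumulative histogram, once per intensity.
    Mat lut(1, 256, CV_8U);
    double cumulative = 0.0;
    for (int v = 0; v < 256; ++v) {
        cumulative += hist.at<float>(v);                       // # pixels <= v
        lut.at<uchar>(v) = saturate_cast<uchar>(std::floor(coeff * cumulative));
    }

    // Single pass over the image instead of one full-image pass per value.
    Mat result;
    LUT(plane, lut, result);
    return result;
}
Called as, e.g., newB = equalizeChannel(Blue), this avoids the 256 full-image comparisons ((Blue == value), (Blue <= value)) and cv::sum passes that dominate the timings above.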
I am trying to implement the snake algorithm for active contours using C++ and OpenCV 3. I am working with the version that uses gradient descent. As a base test, I am trying to draw the contour of a lip. This is the base image.
This is the evolution of the contour without external forces (alpha = 0.001, beta = 3, step-size=0.3).
When I add the external force, this is the result.
As external force I have used just the edge detection with Sobel derivative.
This is the code I use for points update.
array<Mat, 2> edges = edgeMatrices(croppedImage);
const float ALPHA = 0.001, BETA = 3, GAMMA = 0.3, // Gamma is step size.
a = GAMMA * ALPHA, b = GAMMA * BETA;
const uint16_t CYCLES = 1000;
const float p = b, q = -a - 4 * b, r = 1 + 2 * a + 6 * b;
Mat pMatrix = pentadiagonalMatrix(POINTS_NUM, p, q, r).inv();
for (uint16_t i = 0; i < CYCLES; ++i) {
// Extract the x and y derivatives for current points.
auto externalForces = external(edges, x, y);
x = pMatrix * (x + GAMMA * externalForces[0]);
y = pMatrix * (y + GAMMA * externalForces[1]);
// Draw the points.
if (i % 200 == 0 && i > 0)
drawPoints(croppedImage, x, y, { 0.2f * i, 0.2f * i, 0 });
}
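pentadiagonalMatrix() is referenced above but not shown in the post; for completeness, here is a sketch of what it presumably looks like for a closed contour: a circulant banded matrix with r on the main diagonal, q on the first off-diagonals and p on the second ones, i.e. the implicit internal-energy term of the snake. This is an assumption about the missing helper, not its actual code.
// Sketch: circulant pentadiagonal matrix for a closed snake with n points.
Mat pentadiagonalMatrix(int n, float p, float q, float r) {
    Mat A = Mat::zeros(n, n, CV_32FC1);
    for (int i = 0; i < n; ++i) {
        A.at<float>(i, i) = r;                      // main diagonal
        A.at<float>(i, (i + 1) % n) = q;            // first off-diagonals
        A.at<float>(i, (i + n - 1) % n) = q;
        A.at<float>(i, (i + 2) % n) = p;            // second off-diagonals
        A.at<float>(i, (i + n - 2) % n) = p;
    }
    return A;
}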
This is the code for computing the derivatives.
array<Mat, 2> edgeMatrices(Mat &img) {
// Convert image.
Mat gray;
cvtColor(img, gray, COLOR_BGR2GRAY);
// Apply Sobel filter and smooth the gradients.
Mat grad_x, grad_y, blurred_x, blurred_y;
int scale = 1;
int delta = 0;
int ddepth = CV_16S;
int kernSize = 3;
Sobel(gray, grad_x, ddepth, 1, 0, kernSize, scale, delta, BORDER_DEFAULT);
Sobel(gray, grad_y, ddepth, 0, 1, kernSize, scale, delta, BORDER_DEFAULT);
GaussianBlur(grad_x, blurred_x, Size(5, 5), 30);
GaussianBlur(grad_y, blurred_y, Size(5, 5), 30);
return { blurred_x, blurred_y };
}
array<Mat, 2> external(array<Mat, 2> &edgeMat, Mat &x, Mat &y) {
array<Mat, 2> ext;
ext[0] = { Size{ 1, POINTS_NUM }, CV_32FC1 };
ext[1] = { Size{ 1, POINTS_NUM }, CV_32FC1 };
for (size_t i = 0; i < POINTS_NUM; ++i) {
ext[0].at<float>(0, i) = - edgeMat[0].at<short>(y.at<float>(0, i), x.at<float>(0, i));
ext[1].at<float>(0, i) = - edgeMat[1].at<short>(y.at<float>(0, i), x.at<float>(0, i));
}
return ext;
}
As you can see, the contour points converge in a very strange way and not towards the edge of the lip (which is the result I would expect).
I cannot tell whether this is an implementation error, a parameter-tuning problem, or simply normal behaviour and I have misunderstood something about the algorithm.
I also have some doubts about the derivative matrices: I think they should be regularized in some way, but I am not sure which way is right. Can someone help me?
The only implementations I have found are of the greedy method.
I'm using a modified version of a Gauss-Newton method to refine a pose estimate using OpenCV. The unmodified code can be found here: http://people.rennes.inria.fr/Eric.Marchand/pose-estimation/tutorial-pose-gauss-newton-opencv.html
The details of this approach are outlined in the corresponding paper:
Marchand, Eric, Hideaki Uchiyama, and Fabien Spindler. "Pose estimation for augmented reality: a hands-on survey." IEEE Transactions on Visualization and Computer Graphics 22.12 (2016): 2633-2651.
A PDF can be found here: https://hal.inria.fr/hal-01246370/document
The relevant part (pages 4 and 5) is screen-captured below:
Here is what I have done. First, I have (hopefully) corrected some errors:
(a) dt and dR can be passed by reference to exponential_map() (even though cv::Mat is essentially a pointer);
(b) the last entry of each 2x6 Jacobian matrix, J.at<double>(i*2+1,5), was -x[i].y but should be -x[i].x;
(c) I have also tried using a different formula for the projection, specifically one that includes the focal length and principal point:
xq.at<double>(i*2,0) = cx + fx * cX.at<double>(0,0) / cX.at<double>(2,0);
xq.at<double>(i*2+1,0) = cy + fy * cX.at<double>(1,0) / cX.at<double>(2,0);
Here is the relevant code I am using, in its entirety (control starts at optimizePose3()):
void exponential_map(const cv::Mat &v, cv::Mat &dt, cv::Mat &dR)
{
double vx = v.at<double>(0,0);
double vy = v.at<double>(1,0);
double vz = v.at<double>(2,0);
double vtux = v.at<double>(3,0);
double vtuy = v.at<double>(4,0);
double vtuz = v.at<double>(5,0);
cv::Mat tu = (cv::Mat_<double>(3,1) << vtux, vtuy, vtuz); // theta u
cv::Rodrigues(tu, dR);
double theta = sqrt(tu.dot(tu));
double sinc = (fabs(theta) < 1.0e-8) ? 1.0 : sin(theta) / theta;
double mcosc = (fabs(theta) < 2.5e-4) ? 0.5 : (1.-cos(theta)) / theta / theta;
double msinc = (fabs(theta) < 2.5e-4) ? (1./6.) : (1.-sin(theta)/theta) / theta / theta;
dt.at<double>(0,0) = vx*(sinc + vtux*vtux*msinc)
+ vy*(vtux*vtuy*msinc - vtuz*mcosc)
+ vz*(vtux*vtuz*msinc + vtuy*mcosc);
dt.at<double>(1,0) = vx*(vtux*vtuy*msinc + vtuz*mcosc)
+ vy*(sinc + vtuy*vtuy*msinc)
+ vz*(vtuy*vtuz*msinc - vtux*mcosc);
dt.at<double>(2,0) = vx*(vtux*vtuz*msinc - vtuy*mcosc)
+ vy*(vtuy*vtuz*msinc + vtux*mcosc)
+ vz*(sinc + vtuz*vtuz*msinc);
}
void optimizePose3(const PoseEstimation &pose,
std::vector<FeatureMatch> &feature_matches,
PoseEstimation &optimized_pose) {
//Set camera parameters
double fx = camera_matrix.at<double>(0, 0); //Focal length
double fy = camera_matrix.at<double>(1, 1);
double cx = camera_matrix.at<double>(0, 2); //Principal point
double cy = camera_matrix.at<double>(1, 2);
auto inlier_matches = getInliers(pose, feature_matches);
std::vector<cv::Point3d> wX;
std::vector<cv::Point2d> x;
const unsigned int npoints = inlier_matches.size();
cv::Mat J(2*npoints, 6, CV_64F);
double lambda = 0.25;
cv::Mat xq(npoints*2, 1, CV_64F);
cv::Mat xn(npoints*2, 1, CV_64F);
double residual=0, residual_prev;
cv::Mat Jp;
for(auto i = 0u; i < npoints; i++) {
//Model points
const cv::Point2d &M = inlier_matches[i].model_point();
wX.emplace_back(M.x, M.y, 0.0);
//Imaged points
const cv::Point2d &I = inlier_matches[i].image_point();
xn.at<double>(i*2,0) = I.x; // x
xn.at<double>(i*2+1,0) = I.y; // y
x.push_back(I);
}
//Initial estimation
cv::Mat cRw = pose.rotation_matrix;
cv::Mat ctw = pose.translation_vector;
int nIters = 0;
// Iterative Gauss-Newton minimization loop
do {
for (auto i = 0u; i < npoints; i++) {
cv::Mat cX = cRw * cv::Mat(wX[i]) + ctw; // Update cX, cY, cZ
// Update x(q)
//xq.at<double>(i*2,0) = cX.at<double>(0,0) / cX.at<double>(2,0); // x(q) = cX/cZ
//xq.at<double>(i*2+1,0) = cX.at<double>(1,0) / cX.at<double>(2,0); // y(q) = cY/cZ
xq.at<double>(i*2,0) = cx + fx * cX.at<double>(0,0) / cX.at<double>(2,0);
xq.at<double>(i*2+1,0) = cy + fy * cX.at<double>(1,0) / cX.at<double>(2,0);
// Update J using equation (11)
J.at<double>(i*2,0) = -1 / cX.at<double>(2,0); // -1/cZ
J.at<double>(i*2,1) = 0;
J.at<double>(i*2,2) = x[i].x / cX.at<double>(2,0); // x/cZ
J.at<double>(i*2,3) = x[i].x * x[i].y; // xy
J.at<double>(i*2,4) = -(1 + x[i].x * x[i].x); // -(1+x^2)
J.at<double>(i*2,5) = x[i].y; // y
J.at<double>(i*2+1,0) = 0;
J.at<double>(i*2+1,1) = -1 / cX.at<double>(2,0); // -1/cZ
J.at<double>(i*2+1,2) = x[i].y / cX.at<double>(2,0); // y/cZ
J.at<double>(i*2+1,3) = 1 + x[i].y * x[i].y; // 1+y^2
J.at<double>(i*2+1,4) = -x[i].x * x[i].y; // -xy
J.at<double>(i*2+1,5) = -x[i].x; // -x
}
cv::Mat e_q = xq - xn; // Equation (7)
cv::Mat Jp = J.inv(cv::DECOMP_SVD); // Compute pseudo inverse of the Jacobian
cv::Mat dq = -lambda * Jp * e_q; // Equation (10)
cv::Mat dctw(3, 1, CV_64F), dcRw(3, 3, CV_64F);
exponential_map(dq, dctw, dcRw);
cRw = dcRw.t() * cRw; // Update the pose
ctw = dcRw.t() * (ctw - dctw);
residual_prev = residual; // Memorize previous residual
residual = e_q.dot(e_q); // Compute the actual residual
std::cout << "residual_prev: " << residual_prev << std::endl;
std::cout << "residual: " << residual << std::endl << std::endl;
nIters++;
} while (fabs(residual - residual_prev) > 0);
//} while (nIters < 30);
optimized_pose.rotation_matrix = cRw;
optimized_pose.translation_vector = ctw;
cv::Rodrigues(optimized_pose.rotation_matrix, optimized_pose.rotation_vector);
}
Even when I use the functions as given, they do not produce correct results. My initial pose estimate is very close to optimal, but when I run the program, the method takes a very long time to converge, and when it does, the results are very wrong. I'm not sure what could be wrong and I'm out of ideas. I'm confident my inliers are actually inliers (they were chosen using an M-estimator). I've compared the results of the exponential map with those from other implementations, and they seem to agree.
So, where is the error in this Gauss-Newton implementation for pose optimization? I've tried to make things as easy as possible for anyone willing to lend a hand. Let me know if there is any more information I can provide. Any help would be greatly appreciated. Thanks.
Edit: 2019/05/13
There is now a solvePnPRefineVVS function in OpenCV.
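A minimal sketch of how it could be used (assuming OpenCV >= 4.1; refinePoseVVS is just an illustrative wrapper, and the arguments stand in for the data already present in optimizePose3()):
#include <opencv2/calib3d.hpp>
#include <vector>

// Illustrative helper: refine an initial pose in place with OpenCV's built-in
// virtual visual servoing routine (Gauss-Newton based), instead of the
// hand-rolled loop above. objectPoints = wX, imagePoints = the observed points
// in pixels, rvec obtained from cRw via cv::Rodrigues.
void refinePoseVVS(const std::vector<cv::Point3d>& objectPoints,
                   const std::vector<cv::Point2d>& imagePoints,
                   const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                   cv::Mat& rvec, cv::Mat& tvec)
{
    cv::solvePnPRefineVVS(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
}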
Also, you should use the x and y calculated from the current estimated pose instead (i.e. the projected coordinates, not the observed image points).
In the cited paper, they expressed the measurements x in the normalized camera frame (at z=1).
When working with real data, you have:
(u,v): 2D image coordinates (e.g. keypoints, corner locations, etc.)
K: the intrinsic parameters (obtained after calibrating the camera)
D: the distortion coefficients (obtained after calibrating the camera)
To compute the 2D image coordinates in the normalized camera frame, you can use in OpenCV the function cv::undistortPoints() (link to my answer about cv::projectPoints() and cv::undistortPoints()).
When there is no distortion, the computation (also called "reverse perspective transformation") is:
x = (u - cx) / fx
y = (v - cy) / fy
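For example, a small sketch of that conversion (toNormalizedFrame is just an illustrative helper name; K and D are the calibrated intrinsics and distortion coefficients):
#include <opencv2/opencv.hpp>
#include <vector>

// Convert pixel measurements to normalized camera coordinates (z = 1).
std::vector<cv::Point2d> toNormalizedFrame(const std::vector<cv::Point2d>& pixels,
                                           const cv::Mat& K, const cv::Mat& D)
{
    std::vector<cv::Point2d> normalized;
    // With no rotation/projection arguments, undistortPoints undistorts and then
    // applies the reverse perspective transformation: x = (u - cx)/fx, y = (v - cy)/fy.
    cv::undistortPoints(pixels, normalized, K, D);
    return normalized;
}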
I have a problem with the labeling for the MLP. At first I thought it would be the same as the SVM labeling, but after trying the code below:
Mat labels(numSamples, 3 , CV_32FC1, Scalar(3,0));
labels.rowRange(0, numcar - 1) = Scalar (1.0);
labels.rowRange(numcar, numcar + numbus - 1) = Scalar (2.0);
labels.rowRange(numcar + numbus, numSamples) = Scalar (3.0);
the prediction returns the same value even if I replace the image with a different one. After searching, it turns out there is a difference: the labels must be vectors, but I do not know how to build them as vectors because I am a newbie in this area (a sketch of such vector labels is given after the prediction code below).
Below is the code for training:
Mat layers = Mat(4, 1 ,CV_32SC1);
int sz = data.cols;
layers.row(0) = Scalar(sz);
layers.row(1) = Scalar(10);
layers.row(2) = Scalar(10);
layers.row(3) = Scalar(3);
CvANN_MLP mlp;
CvANN_MLP_TrainParams params;
CvTermCriteria criteria;
criteria.max_iter = 1000;
criteria.epsilon = 0.0001;
criteria.type = CV_TERMCRIT_ITER | CV_TERMCRIT_EPS;
params.train_method = CvANN_MLP_TrainParams::BACKPROP;
params.bp_dw_scale = 0.5f;
params.bp_moment_scale = 0.5f;
params.term_crit = criteria;
mlp.create(layers, CvANN_MLP::SIGMOID_SYM);
mlp.train(data , labels ,Mat(),Mat(),params);
and this is the prediction:
Mat response(1, 3, CV_32FC1);
mlp.predict (sample, response);
cout << response << endl;
Here I want to label cars, buses, and trucks.
Please help me solve this problem; thanks for your attention.
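For what it is worth, here is a minimal sketch of what "labels as vectors" could look like for three classes: one row per sample, one column per class, with a 1 in the column of the true class. The sample counts below are assumptions for illustration, and the rows must follow the same order as the rows of the training matrix 'data'.
// Sketch: one-hot ("vector") labels for a 3-class MLP (car, bus, truck).
int numcar = 100, numbus = 80, numtruck = 60;                    // assumed sample counts
int numSamples = numcar + numbus + numtruck;
Mat labels = Mat::zeros(numSamples, 3, CV_32FC1);
labels.col(0).rowRange(0, numcar).setTo(1.0f);                   // cars   -> [1, 0, 0]
labels.col(1).rowRange(numcar, numcar + numbus).setTo(1.0f);     // buses  -> [0, 1, 0]
labels.col(2).rowRange(numcar + numbus, numSamples).setTo(1.0f); // trucks -> [0, 0, 1]

// At prediction time the class is the index of the largest response, e.g.:
// Mat response(1, 3, CV_32FC1);
// mlp.predict(sample, response);
// Point maxLoc;
// minMaxLoc(response, 0, 0, 0, &maxLoc);
// int predictedClass = maxLoc.x;   // 0 = car, 1 = bus, 2 = truck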
I am currently working on an object tracking project and have used C++ and OpenCV. I have successfully used Farneback dense optical flow to implement segmentation methods such as k-means (using the displacement in each frame). Now I want to do the same thing with the Lucas-Kanade sparse method. But the output of this function is:
nextPts – output vector of 2D points (with single-precision floating-point coordinates) containing the calculated new positions of input features in the second image; when OPTFLOW_USE_INITIAL_FLOW flag is passed, the vector must have the same size as in the input.
(as stated in the official site)
My question is how I am going to get the result into a Mat flow, for example. So far I have tried:
// Implement Lucas Kanade algorithm
cvCalcOpticalFlowPyrLK(frame1_1C, frame2_1C, pyramid1, pyramid2,
frame1_features, frame2_features, number_of_features,
optical_flow_window, 5, optical_flow_found_feature,
optical_flow_feature_error, optical_flow_termination_criteria,
0);
// Calculate each feature point's coordinates in every frame
CvPoint p,q;
p.x = (int) frame1_features[i].x;
p.y = (int) frame1_features[i].y;
q.x = (int) frame2_features[i].x;
q.y = (int) frame2_features[i].y;
// Creating the arrows for imshow
angle = atan2((double) p.y - q.y, (double) p.x - q.x);
hypotenuse = sqrt(square(p.y - q.y) + square(p.x - q.x));
/* Here we lengthen the arrow by a factor of three. */
q.x = (int) (p.x - 3 * hypotenuse * cos(angle));
q.y = (int) (p.y - 3 * hypotenuse * sin(angle));
cvLine(frame1, p, q, line_color, line_thickness, CV_AA, 0);
p.x = (int) (q.x + 9 * cos(angle + pi / 4));
p.y = (int) (q.y + 9 * sin(angle + pi / 4));
cvLine(frame1, p, q, line_color, line_thickness, CV_AA, 0);
p.x = (int) (q.x + 9 * cos(angle - pi / 4));
p.y = (int) (q.y + 9 * sin(angle - pi / 4));
cvLine(frame1, p, q, line_color, line_thickness, CV_AA, 0);
allocateOnDemand(&framenew, frame_size, IPL_DEPTH_8U, 3);
cvConvertImage(frame1, framenew, CV_CVTIMG_FLIP);
cvShowImage("Optical Flow", framenew);
This is the optical flow presentation. Any ideas how I should get a Mat flow similar to the result of Farneback optical flow ?
(http://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html#calcopticalflowfarneback )
UPDATE: Very good answer. But now I have problems with showing the k-means image. With Farneback I used:
cv::kmeans(m, K, bestLabels,
TermCriteria( CV_TERMCRIT_EPS + CV_TERMCRIT_ITER, 10, 1.0),
3, KMEANS_PP_CENTERS, centers);
int colors[K];
for (int i = 0; i < K; i++) {
colors[i] = 255 / (i + 1);
}
namedWindow("Kmeans", WINDOW_NORMAL);
Mat clustered = Mat(flow.rows, flow.cols, CV_32F);
for (int i = 0; i < flow.cols * flow.rows; i++) {
clustered.at<float>(i / flow.cols, i % flow.cols) =
(float) (colors[bestLabels.at<int>(0, i)]);
}
clustered.convertTo(clustered, CV_8U);
imshow("Kmeans", clustered);
Any ideas?
To get an image like the one produced by the Farneback algorithm, you must first understand what the output is.
In OpenCV docs you have:
prev(y,x) ~ next(y + flow(y,x)[1], x + flow(y,x)[0])
So, it is a matrix with the displacements between image 1 and image 2. Assuming that the points you are not calculating have no movement (0,0), you can simulate this; you only have to set, for each point (x,y) whose new position is (x',y'):
cv::Mat LKFlowMatrix(img.rows, img.cols, CV_32FC2, cv::Scalar(0,0));
LKFlowMatrix.at<cv::Vec2f>(y,x) = cv::Vec2f(x-x', y-y') ;
Also, don't forget to filter out the "not found" points, i.e. those with status = 0.
By the way, your functions are not the OpenCV C++ versions:
cvCalcOpticalFlowPyrLK should be cv::calcOpticalFlowPyrLK in C++
cvShowImage should be cv::imshow in C++
and so on.
UPDATE
Since what you need is an input for kmeans (I suppose the OpenCV version), and you want to use only the sparse points, you can do something like this:
cv::Mat prevImg, nextImg;
// load your images
std::vector<cv::Point2f> initial_points, new_points;
// fill the initial points vector
std::vector<uchar> status;
std::vector<float> error;
cv::calcOpticalFlowPyrLK(prevImg, nextImg, initial_points, new_points, status, error);
std::vector<cv::Vec2f> vectorForKMeans;
for (size_t t = 0; t < status.size(); t++) {
    // keep only the points that were actually tracked
    if (status[t] != 0)
        vectorForKMeans.push_back(cv::Vec2f(initial_points[t] - new_points[t]));
}
// Do kmeans on vectorForKMeans
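As a possible continuation of the sketch above (K below is an assumed number of clusters): pack the displacement vectors into a single-channel float matrix with one row per tracked point, which is the layout cv::kmeans expects.
cv::Mat samples((int)vectorForKMeans.size(), 2, CV_32F);
for (int i = 0; i < samples.rows; i++) {
    samples.at<float>(i, 0) = vectorForKMeans[i][0];   // dx
    samples.at<float>(i, 1) = vectorForKMeans[i][1];   // dy
}

int K = 3;   // assumed number of clusters
cv::Mat bestLabels, centers;
cv::kmeans(samples, K, bestLabels,
           cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 1.0),
           3, cv::KMEANS_PP_CENTERS, centers);
// bestLabels holds one cluster index per tracked point; to visualize, draw each
// tracked point in the image with a color chosen by its label.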
I'm trying to write a method that will find the proper threshold values in HSV space for an object placed at the center of the screen. These values are used for an object tracking algorithm. I've tested that piece of code with hand-coded threshold values and it works well. The idea behind the method is that it should calculate the histograms for each of the channels and then return the 5th and 95th percentile for each, to be used as the threshold values (credit: How to find RGB/HSV color parameters for color tracking?). The image being passed is a picture of the object to be tracked (which is set by the user before the whole process begins). Here is the code:
std::vector<cv::Scalar> HSV_Threshold_Determiner::Get_Threshold_Values(const cv::Mat& image)
{
cv::Mat inputImage;
cv::cvtColor(image, inputImage, CV_BGR2HSV);
std::vector<cv::Mat> bgrPlanes;
cv::split(inputImage, bgrPlanes);
cv::Mat hHist, sHist, vHist;
int hMax = 180, svMax = 256;
float hRanges[] = { 0, (float)hMax };
const float* hRange = { hRanges };
float svRanges[] = { 0, (float)svMax };
const float* svRange = { svRanges };
//float sRanges[] = { 0, 256 };
cv::calcHist(&bgrPlanes[0], 1, 0, cv::Mat(), hHist, 1, &hMax, &hRange);
cv::calcHist(&bgrPlanes[1], 1, 0, cv::Mat(), sHist, 1, &svMax, &svRange);
cv::calcHist(&bgrPlanes[2], 1, 0, cv::Mat(), vHist, 1, &svMax, &svRange);
int totalEntries = image.cols * image.rows;
int fiveCutoff = (int)(totalEntries * .05);
int ninetyFiveCutoff = (int)(totalEntries * .95);
float hTotal = 0, sTotal = 0, vTotal = 0;
bool hMinFound = false, hMaxFound = false, sMinFound = false, sMaxFound = false,
vMinFound = false, vMaxFound = false;
cv::Scalar hThresholds;
cv::Scalar sThresholds;
cv::Scalar vThresholds;
for(int i = 0; i < vHist.rows; ++i)
{
if(i < hHist.rows)
{
hTotal += hHist.at<float>(i, 0);
if(hTotal >= fiveCutoff && !hMinFound)
{
hThresholds.val[0] = i;
hMinFound = true;
}
else if(hTotal>= ninetyFiveCutoff && !hMaxFound)
{
hThresholds.val[1] = i;
hMaxFound = true;
}
}
sTotal += sHist.at<float>(i, 0);
vTotal += vHist.at<float>(i, 0);
if(sTotal >= fiveCutoff && !sMinFound)
{
sThresholds.val[0] = i;
sMinFound = true;
}
else if(sTotal >= ninetyFiveCutoff && !sMaxFound)
{
sThresholds.val[1] = i;
sMaxFound = true;
}
if(vTotal >= fiveCutoff && !vMinFound)
{
vThresholds.val[0] = i;
vMinFound = true;
}
else if(vTotal >= ninetyFiveCutoff && !vMaxFound)
{
vThresholds.val[1] = i;
vMaxFound = true;
}
if(vMaxFound && sMaxFound && hMaxFound)
{
break;
}
}
std::vector<cv::Scalar> returnVect;
returnVect.push_back(hThresholds);
returnVect.push_back(sThresholds);
returnVect.push_back(vThresholds);
return returnVect;
}
What I am trying to do is sum up the number of entries in each bucket until I get to a number that is greater than or equal to five percent and ninety-five percent of the total. Unfortunately the numbers I get are never close to the ones I get if I do the thresholding by hand.
Mat img = ... // from camera or some other source
// STEP 1: learning phase
Mat hsv, imgThreshed, processed, denoised;
cv::GaussianBlur(img, denoised, cv::Size(5,5), 2, 2); // remove noise
cv::cvtColor(denoised, hsv, CV_BGR2HSV);
// let's say we manually picked a region of 100x100 px containing the color/object of interest, using the mouse
cv::Mat roi = hsv(cv::Range(mousey - 50, mousey + 50), cv::Range(mousex - 50, mousex + 50)); // rows = y, cols = x
// must split all channels to get Hue only
std::vector<cv::Mat> hsvPlanes;
cv::split(roi, hsvPlanes);
// compute statistics for Hue value
cv::Scalar mean, stddev;
cv::meanStdDev(hsvPlanes[0], mean, stddev);
// cover nearly all valid Hue samples (3-sigma rule, ~99.7% of a normal distribution)
float minHue = mean[0] - stddev[0]*3;
float maxHue = mean[0] + stddev[0]*3;
// STEP 2: detection phase
cv::inRange(hsvPlanes[0], cv::Scalar(minHue), cv::Scalar(maxHue), imgThreshed);
imshow("thresholded", imgThreshed);
cv::erode(imgThreshed, processed, cv::Mat(), cv::Point(-1, -1), 5);   // minimizes noise
cv::dilate(processed, processed, cv::Mat(), cv::Point(-1, -1), 20);   // maximizes the remaining regions
imshow("final", processed);
//STEP 3: do some blob/contour detection on processed image & find maximum blob/region, etc ...
A much simpler solution: just calculate the mean and standard deviation of the Hue values for a region of interest containing the object.
Since Hue is the most stable component in the image, the other components (saturation and value) should be discarded, as they vary too much. However, you can still compute the mean for them if needed.