OpenCV, C++: Distance between two points

For a group project, we are attempting to make a game where functions are executed whenever a player forms a set of specific hand gestures in front of a camera. To process the images, we are using OpenCV 2.3.
During the image processing we are trying to find the distance between two points.
We already know this can be done very easily with the Pythagorean theorem, but it is also known that the square root it requires is relatively expensive, and we wish to do this with as few resources as possible.
We would like to know whether there is any built-in function in OpenCV, or in the C++ standard library, that can compute the distance between two points cheaply.
We have the coordinates of the points, which are in pixel values (of course).
Extra info:
Previous experience has taught us that OpenCV and other libraries are heavily optimized. As an example, we attempted to change the RGB values of the live camera feed with a for loop going through each pixel; this gave a low frame-rate output. When we switched to an OpenCV built-in function instead, we got a high frame-rate output.

You should try this
cv::Point a(1, 3);
cv::Point b(5, 6);
double res = cv::norm(a - b); // Euclidean distance

As you correctly pointed out, there's an OpenCV function that does some of your work :)
(Also check the other way, further down.)
It is called magnitude() and it calculates the distance for you. And if you have a vector of more than 4 points to process, it will use SSE (I think) to make it faster.
Now, the problem is that it only computes the square root of the sum of squares, so you have to compute the differences yourself (check the documentation). But if you do them with OpenCV functions as well, it should be fast.
Mat pts1(nPts, 1, CV_32FC2), pts2(nPts, 1, CV_32FC2); // float points, so the differences can go negative
// populate them
Mat diffPts = pts1 - pts2;
// split the differences into x and y vectors (or keep them separate from the start)
Mat xy[2];
split(diffPts, xy);
Mat dist;
magnitude(xy[0], xy[1], dist); // voila!
The other way is to use a very fast sqrt:
// 15 times faster than the classical float sqrt.
// Reasonably accurate up to root(32500)
// Source: http://supp.iar.com/FilesPublic/SUPPORT/000419/AN-G-002.pdf
unsigned int root(unsigned int x){
    unsigned int a,b;
    b = x;
    a = x = 0x3f;
    x = b/x;
    a = x = (x+a)>>1;
    x = b/x;
    a = x = (x+a)>>1;
    x = b/x;
    x = (x+a)>>1;
    return(x);
}
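For example, combined with integer coordinate differences, it could be used roughly like this (a small sketch; a and b are assumed to be the cv::Point values from the first answer):
int dx = a.x - b.x, dy = a.y - b.y;
unsigned int dist = root((unsigned int)(dx*dx + dy*dy)); // approximate integer distance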

This ought to be a comment, but I haven't enough rep (50?) |-( so I post it as an answer.
What the guys are trying to tell you in the comments of your question is that if it's only about comparing distances, then you can simply use
d^2 = dx*dx + dy*dy = (x1-x2)*(x1-x2) + (y1-y2)*(y1-y2)
thus avoiding the square root. You can't, of course, skip the squaring.
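For example, a nearness test can then be written without any square root (a small sketch; a, b and the limit maxDist are assumed to be defined by the caller):
int dx = a.x - b.x;
int dy = a.y - b.y;
if (dx*dx + dy*dy <= maxDist*maxDist)
{
    // the points are within maxDist of each other; no sqrt needed
}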

Pythagoras is the fastest way, and it really isn't as expensive as you think. It used to be, because of the square-root. But modern processors can usually do this within a few cycles.
If you really need speed, use OpenCL on the graphics card for image processing.

Related

OpenCV 3.0: Calibration not fitting as expected

I'm getting results I don't expect when I use OpenCV 3.0 calibrateCamera. Here is my algorithm:
Load in 30 image points
Load in 30 corresponding world points (coplanar in this case)
Use points to calibrate the camera, just for un-distorting
Un-distort the image points, but don't use the intrinsics (coplanar world points, so intrinsics are dodgy)
Use the undistorted points to find a homography, transforming to world points (can do this because they are all coplanar)
Use the homography and perspective transform to map the undistorted points to the world space
Compare the original world points to the mapped points
The points I have are noisy and only a small section of the image. There are 30 coplanar points from a single view so I can't get camera intrinsics, but should be able to get distortion coefficients and a homography to create a fronto-parallel view.
As expected, the error varies depending on the calibration flags. However, it varies opposite to what I expected. If I allow all variables to adjust, I would expect error to come down. I am not saying I expect a better model; I actually expect over-fitting, but that should still reduce error. What I see though is that the fewer variables I use, the lower my error. The best result is with a straight homography.
I have two suspected causes, but they seem unlikely and I'd like to hear an unadulterated answer before I air them. I have pulled out the code to just do what I'm talking about. It's a bit long, but it includes loading the points.
The code doesn't appear to have bugs; I've used "better" points and it works perfectly. I want to emphasize that the solution here can't be to use better points or perform a better calibration; the whole point of the exercise is to see how the various calibration models respond to different qualities of calibration data.
Any ideas?
Added
To be clear, I know the results will be bad and I expect that. I also understand that I may learn bad distortion parameters which lead to worse results when testing points that have not been used to train the model. What I don't understand is how the distortion model has more error when using the training set as the test set. That is, cv::calibrateCamera is supposed to choose parameters that reduce error over the training set of points provided, yet it produces more error than if it had just selected 0 for K1, K2, ..., K6, P1, P2. Bad data or not, it should at least do better on the training set. Before I can say the data is not appropriate for this model, I have to be sure I'm doing the best I can with the data available, and I can't say that at this stage.
Here is an example image
The points with the green pins are marked. This is obviously just a test image.
Here is more example stuff
In the following the image is cropped from the big one above. The centre has not changed. This is what happens when I undistort with just the points marked manually from the green pins and allowing K1 (only K1) to vary from 0:
Before
After
I would put it down to a bug, but when I use a larger set of points that covers more of the screen, even from a single plane, it works reasonably well. This looks terrible. However, the error is not nearly as bad as you might think from looking at the picture.
// Load image points
std::vector<cv::Point2f> im_points;
im_points.push_back(cv::Point2f(1206, 1454));
im_points.push_back(cv::Point2f(1245, 1443));
im_points.push_back(cv::Point2f(1284, 1429));
im_points.push_back(cv::Point2f(1315, 1456));
im_points.push_back(cv::Point2f(1352, 1443));
im_points.push_back(cv::Point2f(1383, 1431));
im_points.push_back(cv::Point2f(1431, 1458));
im_points.push_back(cv::Point2f(1463, 1445));
im_points.push_back(cv::Point2f(1489, 1432));
im_points.push_back(cv::Point2f(1550, 1461));
im_points.push_back(cv::Point2f(1574, 1447));
im_points.push_back(cv::Point2f(1597, 1434));
im_points.push_back(cv::Point2f(1673, 1463));
im_points.push_back(cv::Point2f(1691, 1449));
im_points.push_back(cv::Point2f(1708, 1436));
im_points.push_back(cv::Point2f(1798, 1464));
im_points.push_back(cv::Point2f(1809, 1451));
im_points.push_back(cv::Point2f(1819, 1438));
im_points.push_back(cv::Point2f(1925, 1467));
im_points.push_back(cv::Point2f(1929, 1454));
im_points.push_back(cv::Point2f(1935, 1440));
im_points.push_back(cv::Point2f(2054, 1470));
im_points.push_back(cv::Point2f(2052, 1456));
im_points.push_back(cv::Point2f(2051, 1443));
im_points.push_back(cv::Point2f(2182, 1474));
im_points.push_back(cv::Point2f(2171, 1459));
im_points.push_back(cv::Point2f(2164, 1446));
im_points.push_back(cv::Point2f(2306, 1474));
im_points.push_back(cv::Point2f(2292, 1462));
im_points.push_back(cv::Point2f(2278, 1449));
// Create corresponding world / object points
std::vector<cv::Point3f> world_points;
for (int i = 0; i < 30; i++) {
    world_points.push_back(cv::Point3f(5 * (i / 3), 4 * (i % 3), 0.0f));
}
// Perform calibration
// Flags are set out so they can be commented out and "freed" easily
int calibration_flags = 0
    | cv::CALIB_FIX_K1
    | cv::CALIB_FIX_K2
    | cv::CALIB_FIX_K3
    | cv::CALIB_FIX_K4
    | cv::CALIB_FIX_K5
    | cv::CALIB_FIX_K6
    | cv::CALIB_ZERO_TANGENT_DIST
    | 0;
// Initialise matrix (CV_64F, so double accessors must be used, not float)
cv::Mat intrinsic_matrix = cv::Mat::eye(3, 3, CV_64F);
intrinsic_matrix.at<double>(0, 0) = 1;
intrinsic_matrix.at<double>(1, 1) = 1;
cv::Mat distortion_coeffs = cv::Mat::zeros(5, 1, CV_64F);
// Rotation and translation vectors
std::vector<cv::Mat> undistort_rvecs;
std::vector<cv::Mat> undistort_tvecs;
// Wrap in an outer vector for calibration
std::vector<std::vector<cv::Point2f>>im_points_v(1, im_points);
std::vector<std::vector<cv::Point3f>>w_points_v(1, world_points);
// Calibrate; only 1 plane, so intrinsics can't be trusted
cv::Size image_size(4000, 3000);
calibrateCamera(w_points_v, im_points_v,
image_size, intrinsic_matrix, distortion_coeffs,
undistort_rvecs, undistort_tvecs, calibration_flags);
// Undistort im_points
std::vector<cv::Point2f> ud_points;
cv::undistortPoints(im_points, ud_points, intrinsic_matrix, distortion_coeffs);
// ud_points have been "unintrinsiced", but we don't know the intrinsics, so reverse that
double fx = intrinsic_matrix.at<double>(0, 0);
double fy = intrinsic_matrix.at<double>(1, 1);
double cx = intrinsic_matrix.at<double>(0, 2);
double cy = intrinsic_matrix.at<double>(1, 2);
for (std::vector<cv::Point2f>::iterator iter = ud_points.begin(); iter != ud_points.end(); iter++) {
    iter->x = iter->x * fx + cx;
    iter->y = iter->y * fy + cy;
}
// Find a homography mapping the undistorted points to the known world points, ground plane
cv::Mat homography = cv::findHomography(ud_points, world_points);
// Transform the undistorted image points to the world points (2d only, but z is constant)
std::vector<cv::Point2f> estimated_world_points;
std::cout << "homography" << homography << std::endl;
cv::perspectiveTransform(ud_points, estimated_world_points, homography);
// Work out error
double sum_sq_error = 0;
for (int i = 0; i < 30; i++) {
    double err_x = estimated_world_points.at(i).x - world_points.at(i).x;
    double err_y = estimated_world_points.at(i).y - world_points.at(i).y;
    sum_sq_error += err_x*err_x + err_y*err_y;
}
std::cout << "Sum squared error is: " << sum_sq_error << std::endl;
I would take random samples of the 30 input points, compute the homography in each case, and look at the errors under the estimated homographies, essentially a RANSAC scheme. Then verify the consensus between error levels and homography parameters; this can serve simply as a verification of the global optimisation process. I know that might seem unnecessary, but it is just a sanity check for how sensitive the procedure is to the input (noise levels, location).
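A rough sketch of that subset check, reusing ud_points and world_points from the question's code (the subset size, trial count and RNG seed are arbitrary choices, and <algorithm>, <numeric> and <random> are needed in addition to the OpenCV headers):
std::vector<int> idx(ud_points.size());
std::iota(idx.begin(), idx.end(), 0);
std::mt19937 rng(42);
for (int trial = 0; trial < 100; trial++)
{
    // Fit a homography to a random half of the correspondences...
    std::shuffle(idx.begin(), idx.end(), rng);
    std::vector<cv::Point2f> src, dst;
    for (int k = 0; k < 15; k++)
    {
        src.push_back(ud_points[idx[k]]);
        dst.push_back(cv::Point2f(world_points[idx[k]].x, world_points[idx[k]].y));
    }
    cv::Mat H = cv::findHomography(src, dst);
    if (H.empty()) continue;
    // ...then measure its error over all 30 points and compare H across trials.
    std::vector<cv::Point2f> mapped;
    cv::perspectiveTransform(ud_points, mapped, H);
    double err = 0;
    for (size_t i = 0; i < mapped.size(); i++)
    {
        double ex = mapped[i].x - world_points[i].x;
        double ey = mapped[i].y - world_points[i].y;
        err += ex*ex + ey*ey;
    }
    std::cout << "trial " << trial << ": error " << err << ", H = " << H << std::endl;
}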
Also, it seems logical that fixing most of the variables gives you the lowest errors, as there are fewer degrees of freedom in the minimization process. I would try fixing different ones to establish another consensus. At least this would let you know which variables are the most sensitive to the noise levels of the input.
Hopefully, such a small section of the image will be close to the image centre, as that incurs the least amount of lens distortion. Is using a different distortion model possible in your case? A more viable way would be to adapt the number of distortion parameters to the position of the pattern with respect to the image centre.
Without knowing the constraints of the algorithm I might have misunderstood the question; that is also a possibility, in which case I can roll back.
I would have liked to post this as a comment, but I do not have enough points.
OpenCV runs the Levenberg-Marquardt algorithm inside calibrateCamera.
https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm/
This algorithm works fine for problems with a single minimum. In the case of a single image, points located close to each other, and a many-dimensional problem (n = number of coefficients), the algorithm may be unstable (especially with a wrong initial guess of the camera matrix). The convergence of the algorithm is well described here:
https://na.math.kit.edu/download/papers/levenberg.pdf/
As you wrote, the error depends on the calibration flags - the number of flags changes the dimension of the problem to be optimized.
Camera calibration also calculates the pose of the camera, which will be bad in models with a wrong calibration matrix.
As a solution I suggest changing the approach. You don't need to calculate the camera matrix and pose in this step. Since you know that the points are located on a plane, you can use the 3D-to-2D plane projection equation to determine the distribution type of the points. By distribution I mean that all points will be located equally on some kind of trapezoid.
Then you can use cv::undistort with different distCoeffs on your test image and calculate the image point distribution and the distribution error.
The last step would be to use these steps as a target function for some optimization algorithm, with the distortion coefficients being optimized (a rough sketch follows below).
This is not the easiest solution, but I hope it will help you.
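As an illustration only (not the author's code), such a target function could be shaped like this: undistort the image points with a candidate set of coefficients, fit a plane homography, and return the reprojection error; an outer optimiser would then call it repeatedly. The fixed camera matrix K is an assumption.
// Requires <limits>; K is a fixed, assumed camera matrix (e.g. a rough guess at the focal length).
double scoreDistortion(const std::vector<cv::Point2f>& im_points,
                       const std::vector<cv::Point3f>& world_points,
                       const cv::Mat& K, const cv::Mat& distCoeffs)
{
    std::vector<cv::Point2f> ud;
    cv::undistortPoints(im_points, ud, K, distCoeffs, cv::noArray(), K); // P = K keeps pixel units
    std::vector<cv::Point2f> world2d;
    for (size_t i = 0; i < world_points.size(); i++)
        world2d.push_back(cv::Point2f(world_points[i].x, world_points[i].y));
    cv::Mat H = cv::findHomography(ud, world2d);
    if (H.empty()) return std::numeric_limits<double>::max();
    std::vector<cv::Point2f> mapped;
    cv::perspectiveTransform(ud, mapped, H);
    double err = 0;
    for (size_t i = 0; i < mapped.size(); i++)
    {
        double ex = mapped[i].x - world2d[i].x;
        double ey = mapped[i].y - world2d[i].y;
        err += ex*ex + ey*ey;
    }
    return err;
}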

Matlab griddata equivalent in C++

I am looking for a C++ equivalent to Matlab's griddata function, or any 2D global interpolation method.
I have C++ code that uses Eigen 3. I will have an Eigen vector that will contain x, y, and z values, and two Eigen matrices equivalent to those produced by meshgrid in Matlab. I would like to interpolate the z values from the vectors onto the grid points defined by the meshgrid equivalents (which will extend a bit past the outside of the original points, so minor extrapolation is required).
I'm not too bothered by accuracy--it doesn't need to be perfect. However, I cannot accept NaN as a solution--the interpolation must be computed everywhere on the mesh regardless of data gaps. In other words, staying inside the convex hull is not an option.
I would prefer not to write an interpolation from scratch, but if someone wants to point me to a pretty good (and explicit) recipe I'll give it a shot. It's not the most hateful thing to write (at least in an algorithmic sense), but I don't want to reinvent the wheel.
Effectively what I have is scattered terrain locations, and I wish to define a rectilinear mesh that nominally follows some distance beneath the topography for use later. Once I have the node points, I will be good.
My research so far:
The question asked here: MATLAB functions in C++ produced a close answer, but unfortunately the suggestion was not free (SciMath).
I have tried understanding the interpolation function used in Generic Mapping Tools, and was rewarded with a headache.
I briefly looked into the Grid Algorithms library (GrAL). If anyone has commentary I would appreciate it.
Eigen has an unsupported interpolation package, but it seems to just be for curves (not surfaces).
Edit: VTK has a matplotlib functionality. Presumably there must be an interpolation used somewhere in that for display purposes. Does anyone know if that's accessible and usable?
Thank you.
This is probably a little late, but hopefully it helps someone.
Method 1.) Octave: If you're coming from Matlab, one way is to embed the GNU Matlab clone Octave directly into the C++ program. I don't have much experience with it, but you can call the Octave library functions directly from a cpp file.
See here, for instance. http://www.gnu.org/software/octave/doc/interpreter/Standalone-Programs.html#Standalone-Programs
griddata is included in octave's geometry package.
Method 2.) PCL: The way I do it is to use the Point Cloud Library (http://www.pointclouds.org) and VoxelGrid. You can set the x and y bin sizes as you please, then set a really large z bin size, which gets you one z value for each x,y bin. The catch is that the x, y, and z values are the centroid of the points averaged into the bin, not the bin centers (which is also why it works for this). So you need to massage the x, y values when you're done:
Ex:
// requires <cstdio> plus the PCL point type and VoxelGrid filter headers
//read in a list of comma separated values (x,y,z)
FILE * fp;
fp = fopen("points.xyz", "r");
//store them in PCL's point cloud format
pcl::PointCloud<pcl::PointXYZ>::Ptr basic_cloud_ptr (new pcl::PointCloud<pcl::PointXYZ>);
double x, y, z;
while (fscanf(fp, "%lg, %lg, %lg", &x, &y, &z) != EOF)
{
    pcl::PointXYZ basic_point;
    basic_point.x = x; basic_point.y = y; basic_point.z = z;
    basic_cloud_ptr->points.push_back(basic_point);
}
fclose(fp);
basic_cloud_ptr->width = (int) basic_cloud_ptr->points.size ();
basic_cloud_ptr->height = 1;
// create object for result
pcl::PointCloud<pcl::PointXYZ>::Ptr cloud_filtered(new pcl::PointCloud<pcl::PointXYZ>());
// create filtering object and process
pcl::VoxelGrid<pcl::PointXYZ> sor;
sor.setInputCloud (basic_cloud_ptr);
//set the bin sizes here. (dx,dy,dz). for 2d results, make one of the bins larger
//than the data set span in that axis
sor.setLeafSize (0.1, 0.1, 1000);
sor.filter (*cloud_filtered);
So cloud_filtered is now a point cloud that contains one point for each bin. Then I just make a 2-D matrix and go through the point cloud assigning points to their x, y bins if I want an image, etc., as would be produced by griddata. It works pretty well, and it's much faster than Matlab's griddata for large datasets.
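That last binning step might look something like this (only a sketch; nx, ny, xmin and ymin are assumptions, the 0.1 bin size matches the setLeafSize call above, and Eigen is used only because the question already depends on it):
// Assign each filtered point to its (x, y) bin, giving one z value per grid cell.
Eigen::MatrixXd grid = Eigen::MatrixXd::Constant(ny, nx, std::numeric_limits<double>::quiet_NaN());
for (size_t k = 0; k < cloud_filtered->points.size(); k++)
{
    const pcl::PointXYZ& p = cloud_filtered->points[k];
    int ix = (int)((p.x - xmin) / 0.1);
    int iy = (int)((p.y - ymin) / 0.1);
    if (ix >= 0 && ix < nx && iy >= 0 && iy < ny)
        grid(iy, ix) = p.z;
}
// Cells that received no point stay NaN and would still need filling/extrapolation.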

Drawing big circles from scratch [closed]

I am new to C++, but the language seems alright to me. As a learning project I have decided to make a minor 2D graphic-engine. It might seem like a hard project, but I have a good idea how to move on.
I haven't really started yet, but I am forming things in my head at the moment, and I came across this problem:
At some point I will have to make a function to draw circles on the screen. My approach to that right now would be something like this:
in a square with sides from (x-r) to (x+r), loop through x and y;
if at each point the distance sqrt(x^2 + y^2) from the centre is less than or equal to r,
then draw a pixel at that point.
This would work; if not, don't bother telling me, I'll figure it out. I would of course only draw this circle if x+r and y+r are on the screen.
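For reference, a direct transcription of that idea might look like this (a sketch; putPixel stands in for whatever plotting call the engine ends up using):
// Brute-force filled circle: test every pixel in the bounding square.
void fill_circle(int cx, int cy, int r)
{
    for (int py = cy - r; py <= cy + r; py++)
        for (int px = cx - r; px <= cx + r; px++)
        {
            int dx = px - cx, dy = py - cy;
            if (dx*dx + dy*dy <= r*r)   // no sqrt needed for the comparison
                putPixel(px, py);       // hypothetical plotting call
        }
}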
The problem lies in that I will need to draw really big circles sometimes. If for example I need to draw a circle with radius 5000, the pixel loop would need to run a total of 10000^2 times. So with a processor at 2 GHz, this single circle could only be rendered 2 GHz/(10000^2), which is ~20 times/s, while taking up the whole core (assuming it only takes one calculation per pixel, which is nowhere near the truth).
Which approach is the correct one, then? I guess it has something to do with using the GPU for these simple calculations. If so, can I use OpenGL with C++ for this? I'd like to learn that as well :)
My first C/C++ projects were in fact graphics libraries as well. I did not have OpenGL or DirectX and was using DOS at the time. I learned quite a lot from it, as I constantly found new and better (and faster) ways to draw to the screen.
The problem with modern operating systems is that they don't really allow you to do what I did back then. You cannot just start using the hardware directly. And frankly, these days you don't want to anymore.
You can still draw everything yourself. Have a look at SDL if you want to put your own pixels. This is a C library that you will be able to wrap into your own C++ objects. It works on different platforms (including Linux, Windows, Mac,...) and internally it will make use of things like DirectX or OpenGL.
For real-world graphics, one doesn't just go about drawing one's own pixels. That is not efficient. Or at least not on devices where you cannot use the hardware directly...
But for your purposes, I think SDL is definitely the way to go! Good luck with that.
You don't do graphics by manually drawing pixels to screen, that way madness lies.
What you want to use is either DirectX or OpenGL. I suggest you crack open google and go read, there's a lot to read out there.
Once you've downloaded the libs there's lots of sample projects to take a look at, they'll get you started.
There are two approaches at this point: there's the mathematical way of calculating the vectors that describe a shape with a very large number of sides (i.e. it'll look like a circle). Or there's the 'cheating' method of just drawing a texture (i.e. a picture) of a circle to the screen with an alpha channel to make the rest of the texture transparent. (The cheating method is easier to code, faster to execute, and produces a better result, although it is less flexible.)
If you want to do it mathematically then both of these libraries will allow you to draw lines to screen, so you need to begin your approach from the view of a start point and end point of each line, not the individual pixels. I.e. you want vector graphics.
I can't do the heavy maths right now, but the vector approach might look a little like this (pseudo-code):
in-params: num_of_sides, length_of_side;
float angle = 360 / num_of_sides;
float start_x = 0;
float start_y = 0;
float x = start_x;
float y = start_y;
for(int i(0); i < num_of_sides; ++i)
{
    float endX, endY;
    rotateOffsetByAngle(x, y, x + length_of_side, y, angle * i, endX, endY);
    drawline(x, y, endX, endY);
    x = endX;
    y = endY;
}
drawline(float startX, startY, endX, endY)
{
    //does code that draws a line between the start and end coordinates;
}
rotateOffsetByAngle(float startX, startY, endX, endY, angle, &outX, &outY)
{
    //the in-parameters startX, startY and endX, endY describe a line
    //we treat this line as the offset from the starting point
    //do code that rotates this line around the point startX, startY, by the angle.
    //after this rotation is done endX and endY are now at the same
    //distance from startX and startY that they were, but rotated.
    outX = endX;
    outY = endY; //pass these new coordinates back out by reference;
}
In the above code we move around the outside of the circle drawing each individual line around the outside one by one. For each line we have a start point and an offset; we then rotate the offset by an angle (this angle increases as we move around the circle). Then we draw the line from the start point to the offset point. Before we begin the next iteration we move the start point to the offset point so the next line starts from the end of the last.
I hope that's understandable.
That is one way to draw a filled circle. It will perform appallingly slowly, as you can see.
Modern graphics is based on abstracting away the lower-level stuff so that it can be optimised; the developer writes drawCircle(x,y,r) and the graphics library + drivers can pass that all the way down to the chip, which can fill in the appropriate pixels.
Although you are writing in C++, you are not manipulating data closest to the core unless you use the graphics drivers. There are layers of subroutine calls between even your setPixelColour level methods and an actual binary value being passed over the wire; at almost every layer there are checks and additional calculations and routines run. The secret to faster graphics, then, is to reduce the number of these calls you make. If you can get the command drawCircle all the way to the graphics chip, do that. Don't waste a call on a single pixel, when it's as mundane as drawing a regular shape.
In a modern OS, there are layers of graphics processing taking the requests of individual applications like yours and combining them with the windowing, compositing and any other effects. So your command to 'draw to screen' is intermediated by several layers already. What you want to provide to the CPU is the minimum information necessary to offload the calculations to the graphics subsystem.
I would say if you want to learn to draw stuff on the screen, play with canvas and js, as the development cycle is easy and comparatively painless. If you want to learn C++, try project Euler, or draw stuff using existing graphics libraries. If you want to write a 2d graphics library, learn the underlying graphics technologies like DirectX and OpenGL, because they are the way that graphics is done in reality. But they seem so complex, you say? Then you need to learn more C++ first. They are the way they are for some very good reasons, however complex the result is.
As the first answer says, you shouldn't do this yourself for serious work. But if you just want to do this as an example, then you could do something like this: First define a function for drawing line segments on the screen:
void draw_line(int x1, int y1, int x2, int y2);
This should be relatively straightforward to implement: just find the direction that is changing fastest, and iterate over that direction while using integer logic to find out how much the other dimension should change. I.e., if x is changing faster, then y = y1 + (x-x1)*(y2-y1)/(x2-x1).
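A minimal sketch of such a line routine (putPixel is a hypothetical plotting call; only the case where x changes fastest and x1 <= x2 is handled, and a full version would swap endpoints and iterate over y for steep lines):
void draw_line(int x1, int y1, int x2, int y2)
{
    // Assumes x1 < x2 and a shallow slope; a real routine handles the other cases too.
    for (int x = x1; x <= x2; x++)
    {
        int y = y1 + (x - x1) * (y2 - y1) / (x2 - x1);
        putPixel(x, y);  // hypothetical plotting call
    }
}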
Then use this function to implement a circle as piecewise line elements:
void draw_circle(int x, int y, int r)
{
    double dtheta = 2*M_PI/8/r;
    int x1 = x+r, x2, y1 = y, y2;
    int n = 2*M_PI/dtheta;
    for(int i = 1; i <= n; i++)  // <= n so the final segment closes the circle
    {
        double theta = i*dtheta;
        x2 = int(x+r*cos(theta)); y2 = int(y+r*sin(theta));
        draw_line(x1, y1, x2, y2);
        x1 = x2; y1 = y2;
    }
}
This uses floating point logic and trigonometric functions to figure out which line elements best approximate a circle. It is a somewhat crude implementation, but I think any implementation that wants to be efficient for very large circles has to do something like this.
If you are only allowed to use integer logic, one approach could be to first draw a low-resolution integer circle, and then subdivide each selected pixel into smaller pixels, and choose the sub-pixels you want there, and so on. This would scale as N log N, so still slower than the approach above. But you would be able to avoid sin and cos.

Robustly find N circles with the same diameter: alternative to bruteforcing Hough transform threshold

I am developing application to track small animals in Petri dishes (or other circular containers).
Before any tracking takes place, the first few frames are used to define areas.
Each dish will match a circular, independent, static area (i.e. it will not be updated during tracking).
The user can request the program to try to find dishes from the original image and use them as areas.
Here are examples:
In order to perform this task, I am using Hough Circle Transform.
But in practice, different users will have very different settings and images and I do not want to ask the user to manually define the parameters.
I cannot just guess all the parameters either.
However, I have got additional information that I would like to use:
I know the exact number of circles to be detected.
All the circles have the almost same dimensions.
The circles cannot overlap.
I have a rough idea of the minimal and maximal size of the circles.
The circles must be entirely in the picture.
I can therefore narrow down the number of parameters to define to one: the threshold.
Using this information, and considering that I have got N circles to find, my current solution is to
test many threshold values and keep the set of circles whose radii have the smallest standard deviation (since all the circles should have a similar size):
//at this point, minRad and maxRad were calculated from the size of the image and the number of circles to find,
//assuming the circles altogether fill more than 1/3 of the image but cannot altogether be larger than the image.
//N is the integer number of circles to find.
//img is the picture of the scene (filtered).
//the vectors containing the detected circles and the --so far-- best circles found.
std::vector<cv::Vec3f> circles, bestCircles;
//the score of the --so far-- best set of circles
double bestSsem = 0;
for(int t=5; t<400; t=t+2){
    //Apply Hough Circles with the threshold t
    cv::HoughCircles(img, circles, CV_HOUGH_GRADIENT, 3, minRad*2, t, 3, minRad, maxRad);
    if(circles.size() >= N){
        //call a routine to give a score to this set of circles according to the similarity of their radii
        double ssem = scoreSetOfCircles(circles, N);
        //if no circles are recorded yet, or if the score of this set of circles is higher than the former best
        if(bestCircles.size() < N || ssem > bestSsem){
            //this set becomes the temporary best set of circles
            bestCircles = circles;
            bestSsem = ssem;
        }
    }
}
With:
//the method to assess how good a set of circles is (the more similar the circles are, the higher ssem is)
double scoreSetOfCircles(std::vector<cv::Vec3f> circles, int N){
    double ssem = 0, sum = 0;
    double mean;
    for(int j = 0; j < N; j++){
        sum = sum + circles[j][2];
    }
    mean = sum/N;
    for(int j = 0; j < N; j++){
        double em = mean - circles[j][2];
        ssem += em*em;        // accumulate the squared deviations of the radii
    }
    return 1/(1 + ssem);      // higher score for more similar radii
}
I have reached a higher accuracy by performing a second pass in which I repeated this algorithm narrowing the [minRad:maxRad] interval using the result of the first pass.
For instance minRad2 = 0.95 * average radius of best circles and maxRad2 = 1.05 * average radius of best circles.
I had fairly good results using this method so far. However, it is slow and rather dirty.
My questions are:
Can you think of any alternative algorithm to solve this problem in a cleaner/faster manner?
Or what would you suggest to improve this algorithm?
Do you think I should investigate the generalised Hough transform?
Thank you for your answers and suggestions.
The following approach should work pretty well for your case:
Binarize your image (you might need to do this on several levels of threshold to make algorithm independent of the lighting conditions)
Find contours
For each contour calculate the moments
Filter them by area to remove too small contours
Filter contours by circularity:
double area = moms.m00;
double perimeter = arcLength(Mat(contours[contourIdx]), true);
double ratio = 4 * CV_PI * area / (perimeter * perimeter);
ratio close to 1 will give you circles.
Calculate radius and center of each circle
center = Point2d(moms.m10 / moms.m00, moms.m01 / moms.m00);
And you can add more filters to improve the robustness.
Actually you can find an implementation of the whole procedure in OpenCV. Look how the SimpleBlobDetector class and findCirclesGrid function are implemented.
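Put together, the procedure described above might look roughly like this (only a sketch; the threshold level, minimum area and circularity bound are assumptions to tune, and img is the filtered grayscale image from the question):
// Detect near-circular blobs via contours, circularity and moments (OpenCV 2.x style constants).
cv::Mat bin;
cv::threshold(img, bin, 128, 255, cv::THRESH_BINARY);   // or repeat for several threshold levels
std::vector<std::vector<cv::Point> > contours;
cv::findContours(bin, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE); // note: modifies bin
double minArea = 100;                                    // assumed lower bound, tune to the dish size
std::vector<cv::Vec3f> candidates;
for (size_t i = 0; i < contours.size(); i++)
{
    cv::Moments moms = cv::moments(contours[i]);
    double area = moms.m00;
    if (area < minArea) continue;                        // too small
    double perimeter = cv::arcLength(cv::Mat(contours[i]), true);
    double ratio = 4 * CV_PI * area / (perimeter * perimeter);
    if (ratio < 0.8) continue;                           // not circular enough
    cv::Point2d center(moms.m10 / moms.m00, moms.m01 / moms.m00);
    double radius = std::sqrt(area / CV_PI);             // radius estimated from the blob area
    candidates.push_back(cv::Vec3f((float)center.x, (float)center.y, (float)radius));
}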
Within the current algorithm, the biggest thing that sticks out is the for(int t=5; t<400; t=t+2) loop. Try recording score values for some test images. Graph score(t) versus t. With any luck, it will either suggest a smaller range for t or be a smoothish curve with a single maximum. In the latter case you can change your loop over all t values into a smarter search using hill-climbing methods.
Even if it's fairly noisy, you can first loop over multiples of, say, 30, and for the best 1 or 2 of those loop over nearby multiples of 2.
Also, in your score function, you should disqualify any results with overlapping circles and maybe penalize overly spaced out circles.
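For example, the coarse-then-fine scan could replace the single loop roughly like this (a sketch reusing the question's variables and scoreSetOfCircles; the step sizes 30 and 2 are just the ones suggested above, and <algorithm> is needed for std::max):
// Pass 1: coarse scan of the threshold in steps of 30.
int bestT = 5;
double bestScore = 0;
for (int t = 5; t < 400; t += 30)
{
    cv::HoughCircles(img, circles, CV_HOUGH_GRADIENT, 3, minRad*2, t, 3, minRad, maxRad);
    if (circles.size() < (size_t)N) continue;
    double s = scoreSetOfCircles(circles, N);
    if (s > bestScore) { bestScore = s; bestT = t; bestCircles = circles; }
}
// Pass 2: fine scan in steps of 2 around the best coarse threshold.
for (int t = std::max(5, bestT - 30); t < bestT + 30; t += 2)
{
    cv::HoughCircles(img, circles, CV_HOUGH_GRADIENT, 3, minRad*2, t, 3, minRad, maxRad);
    if (circles.size() < (size_t)N) continue;
    double s = scoreSetOfCircles(circles, N);
    if (s > bestScore) { bestScore = s; bestT = t; bestCircles = circles; }
}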
You don't explain why you are using a black background. Unless you are using a telecentric lens (which seems unlikely, given the apparent field of view), and ignoring radial distortion for the moment, the images of the dishes will be ellipses, so estimating them as circles may lead to significant errors.
All in all, it doesn't seem to me that you are following a good approach. If the goal is simply to remove the background so you can track the bugs inside the dishes, then your goal should be just that: find which pixels are background and mark them. The easiest way to do that is to take a picture of the background without dishes, under the same illumination and camera, and directly detect differences between it and the picture with the dishes. A colored background would be preferable for that, with a color unlikely to appear in the dishes (e.g. green or blue velvet). So you'd have reduced the problem to bluescreening (or chroma keying), a classic technique in machine vision as applied to visual effects. Do a Google search for "matte petro vlahos assumption" to find classic algorithms for solving this problem.
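If you go the background-subtraction route, a bare-bones version in OpenCV might look like this (a sketch; background.png, frame.png and the threshold value 30 are assumptions):
// Mark as foreground every pixel that differs noticeably from the empty-scene shot.
cv::Mat background = cv::imread("background.png"); // scene without dishes, same camera and lighting
cv::Mat frame = cv::imread("frame.png");           // scene with dishes
cv::Mat diff, gray, mask;
cv::absdiff(frame, background, diff);
cv::cvtColor(diff, gray, CV_BGR2GRAY);
cv::threshold(gray, mask, 30, 255, cv::THRESH_BINARY); // 30 is an assumed noise threshold
// mask is now non-zero where the dishes (and animals) are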

Fast/Efficent Pixel Access in Magick++

As an educational exercise for myself I'm writing an application that can average a bunch of images. This is often used in astrophotography to reduce noise.
The library I'm using is Magick++ and I've succeeded in actually writing the application. But, unfortunately, it's slow. This is the code I'm using:
for(row=0;row<rows;row++)
{
    for(column=0;column<columns;column++)
    {
        red.clear(); blue.clear(); green.clear();
        for(i=1;i<10;i++)
        {
            ColorRGB rgb(image[i].pixelColor(column,row));
            red.push_back(rgb.red());
            green.push_back(rgb.green());
            blue.push_back(rgb.blue());
        }
        redVal = avg(red);
        greenVal = avg(green);
        blueVal = avg(blue);
        redVal = redVal*MaxRGB; greenVal = greenVal*MaxRGB; blueVal = blueVal*MaxRGB;
        Color newRGB(redVal,greenVal,blueVal);
        stackedImage.pixelColor(column,row,newRGB);
    }
}
The code averages 10 images by going through each pixel and adding each channel's pixel intensity to a double vector. The function avg then takes the vector as a parameter and averages the result. This average is then used at the corresponding pixel in stackedImage, which is the resultant image. It works just fine, but as I mentioned, I'm not happy with the speed. It takes 2 minutes and 30 seconds on a Core i5 machine. The images are 8 megapixel, 16-bit TIFFs. I understand that it's a lot of data, but I have seen it done faster in other applications.
Is it my loop that's slow, or is pixelColor(x,y) a slow way to access pixels in an image? Is there a faster way?
Why use vectors/arrays at all?
Why not
double red=0.0, blue=0.0, green=0.0;
for(i=1;i<10;i++)
{
    ColorRGB rgb(image[i].pixelColor(column,row));
    red+=rgb.red();
    blue+=rgb.blue();
    green+=rgb.green();
}
red/=10;
blue/=10;
green/=10;
This avoids 36 function calls on vector objects per pixel.
And you may get even better performance by using a PixelCache of the whole image instead of the original Image objects. See the "Low-Level Image Pixel Access" section of the online Magick++ documentation for Image
Then the inner loop becomes
PixelPacket* pix = cache[i]+row*columns+column;
red+= pix->red;
blue+= pix->blue;
green+= pix->green;
Now you have also removed 10 calls to pixelColor, 10 ColorRGB constructors, and 30 accessor functions per pixel.
Note: this is all theory; I haven't tested any of it.
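Fleshed out, the pixel-cache version might look roughly like this (still untested and only a sketch, reusing rows, columns, image[] and stackedImage from the question; it assumes the ImageMagick 6 era Magick++ API with PixelPacket and ten source images image[0]..image[9]):
// Cache read-only pixel views of the source images and a writable view of the result.
const Magick::PixelPacket* cache[10];
for (int i = 0; i < 10; i++)
    cache[i] = image[i].getConstPixels(0, 0, columns, rows);
stackedImage.modifyImage();
Magick::PixelPacket* out = stackedImage.getPixels(0, 0, columns, rows);
for (int row = 0; row < rows; row++)
{
    for (int column = 0; column < columns; column++)
    {
        double red = 0, green = 0, blue = 0;
        for (int i = 0; i < 10; i++)
        {
            const Magick::PixelPacket* pix = cache[i] + row*columns + column;
            red   += pix->red;
            green += pix->green;
            blue  += pix->blue;
        }
        Magick::PixelPacket* o = out + row*columns + column;
        o->red   = (Magick::Quantum)(red   / 10);
        o->green = (Magick::Quantum)(green / 10);
        o->blue  = (Magick::Quantum)(blue  / 10);
    }
}
stackedImage.syncPixels(); // write the cached result pixels back to the image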
Comments:
Why do you use vectors for red, blue and green? Using push_back can cause reallocations and bottleneck the processing. You could instead allocate three arrays of 10 colors just once.
Couldn't you declare rgb outside of the loops in order to relieve the stack of unnecessary constructions and destructions?
Doesn't Magick++ have a way to average images?
Just in case anyone else wants to average images to reduce noise, and doesn't feel like too much "educational exercise" ;-)
ImageMagick can do averaging of a sequence of images like this:
convert image1.tif image2.tif ... image32.tif -evaluate-sequence mean result.tif
You can also do median filtering and others by changing the word mean in the above command to whatever you want, e.g.:
convert image1.tif image2.tif ... image32.tif -evaluate-sequence median result.tif
You can get a list of the available operations with:
identify -list evaluate
Output
Abs
Add
AddModulus
And
Cos
Cosine
Divide
Exp
Exponential
GaussianNoise
ImpulseNoise
LaplacianNoise
LeftShift
Log
Max
Mean
Median
Min
MultiplicativeNoise
Multiply
Or
PoissonNoise
Pow
RightShift
RMS
RootMeanSquare
Set
Sin
Sine
Subtract
Sum
Threshold
ThresholdBlack
ThresholdWhite
UniformNoise
Xor