Explain numbers from OpenCV matchShapes() - c++

I am developing an app where I compare two images using OpenCV's matchShapes().
I implemented the method in Objective-C; the code is below:
- (void) someMethod:(UIImage *)image :(UIImage *)temp {
RNG rng(12345);
cv::Mat src_base, hsv_base;
cv::Mat src_test1, hsv_test1;
src_base = [self cvMatWithImage:image];
src_test1 = [self cvMatWithImage:temp];
int thresh=150;
double ans=0, result=0;
Mat imageresult1, imageresult2;
cv::cvtColor(src_base, hsv_base, cv::COLOR_BGR2HSV);
cv::cvtColor(src_test1, hsv_test1, cv::COLOR_BGR2HSV);
std::vector<std::vector<cv::Point>>contours1, contours2;
std::vector<Vec4i>hierarchy1, hierarchy2;
Canny(hsv_base, imageresult1, thresh, thresh*2);
Canny(hsv_test1, imageresult2, thresh, thresh*2);
findContours(imageresult1,contours1,hierarchy1,CV_RETR_TREE,CV_CHAIN_APPROX_SIMPLE,cvPoint(0,0));
for(int i=0;i<contours1.size();i++)
{
Scalar color=Scalar(rng.uniform(0, 255), rng.uniform(0,255), rng.uniform(0,255));
drawContours(imageresult1,contours1,i,color,1,8,hierarchy1,0,cv::Point());
}
findContours(imageresult2,contours2,hierarchy2,CV_RETR_TREE,CV_CHAIN_APPROX_SIMPLE,cvPoint(0,0));
for(int i=0;i<contours2.size();i++)
{
Scalar color=Scalar(rng.uniform(0, 255), rng.uniform(0,255), rng.uniform(0,255));
drawContours(imageresult2,contours2,i,color,1,8,hierarchy2,0,cv::Point());
}
for(int i=0;i<contours1.size();i++)
{
ans = matchShapes(contours1[i],contours2[i],CV_CONTOURS_MATCH_I1,0);
std::cout<<ans<<" ";
getchar();
}
}
I got these results, but I do not know what exactly the numbers mean: 0 0 0.81946 0.816337 0.622353 0.634221 0

I think this blog post should give a lot more insight into how matchShapes works.
You obviously already know what the input parameters are, but for anyone finding this who doesn't:
double matchShapes(InputArray contour1, InputArray contour2, int method, double parameter)
The output is a metric where:
The lower the result, the better the match. It is calculated from the Hu moment values. The different measurement methods are explained in the docs.
The findings in the blog post mentioned above are as follows (the post describes the range as max = 1, min = 0, although the metric is a sum of differences and is not strictly capped at 1):
I got following results:
Matching Image A with itself = 0.0
Matching Image A with Image B = 0.001946
Matching Image A with Image C = 0.326911
See, even image rotation doesn’t affect much on this comparison.
This basically shows that for your results:
The first two are great: you got a complete match at 0.
The second two (0.81946 0.816337) are quite an incompatible match.
The third two (0.622353 0.634221) are OK, at around 0.62 (roughly 62% incompatible on that scale).
The last one is a complete match.
If my computer vision learnings have taught me anything, it is to always be sceptical of a complete match unless you are 100% sure you are using the same images.
Edit1: I think it might also be rotationally invariant, so in your case you might have three very similar drawn lines that have been rotated the same way (i.e. horizontal) and compared.
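To see where such numbers come from, here is a minimal sketch of the I1 formula as the OpenCV documentation describes it: each of the seven Hu moments h is log-scaled to m = sign(h) * log(|h|), and the score is the sum of |1/mA - 1/mB| over the moments. The function names, the use of the base-10 logarithm and the small-value cutoff are assumptions made for illustration; in a real program the Hu moments would come from cv::HuMoments.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Log-scale one Hu moment: m = sign(h) * log10(|h|).
double logScaledMoment(double h) {
    const double sign = (h > 0.0) ? 1.0 : -1.0;
    return sign * std::log10(std::fabs(h));
}

// Sketch of the I1 distance between two sets of Hu moments.
// Moments with a tiny magnitude are skipped (the cutoff value is an assumption).
double matchI1(const std::vector<double>& huA, const std::vector<double>& huB) {
    assert(huA.size() == huB.size());
    const double eps = 1e-5;
    double sum = 0.0;
    for (std::size_t i = 0; i < huA.size(); ++i) {
        if (std::fabs(huA[i]) > eps && std::fabs(huB[i]) > eps) {
            sum += std::fabs(1.0 / logScaledMoment(huA[i]) -
                             1.0 / logScaledMoment(huB[i]));
        }
    }
    return sum;
}
```

Identical moments give exactly 0, and the sum grows as the moments diverge, so the value is a distance rather than a percentage.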

Related

Why "findChessboardCorners" function is returning false

I am using Opencv "findChessboardCorners" function to find corners of chess board, but I am getting false as a returned value from "findChessboardCorners" function.
Following is my code:
int main(int argc, char* argv[])
{
vector<vector<Point2f>> imagePoints;
Mat view;
bool found;
vector<Point2f> pointBuf;
Size boardSize; // The size of the board -> Number of items by width and height
boardSize.width = 75;
boardSize.height = 49;
view = cv::imread("FraunhoferChessBoard.jpeg");
namedWindow("Original Image", WINDOW_NORMAL);// Create a window for display.
imshow("Original Image", view);
found = findChessboardCorners(view, boardSize, pointBuf,
CV_CALIB_CB_ADAPTIVE_THRESH | CV_CALIB_CB_FAST_CHECK | CV_CALIB_CB_NORMALIZE_IMAGE);
if (found)
{
cout << "Corners of chess board detected";
}
else
{
cout << "Corners of chess board not detected";
}
waitKey(0);
return 0;
}
I expect return value from "findChessboardCorners" function to be true whereas I am getting false.
Please explain where I have made a mistake.
Many thanks :)
The function didn't find the pattern in your image and this is why it returns false. Maybe the exact same code works with a different image.
I cannot say directly why this function did not find the pattern in your image, but I would recommend a few approaches that are less sensitive to noise, so that the algorithm can detect your corners properly:
- Use findChessboardCornersSB instead of findChessboardCorners. According to the documentation, it is more robust to noise and works faster for large images like yours. That's probably what you are looking for. I tried it with Python and it works properly with the image you posted. See the result below.
- Change the pattern shapes as shown in the doc for findChessboardCornersSB.
- Use fewer and bigger squares in your pattern. It does not help to have so many squares.
For the next step you will need to use a non-symmetrical pattern. If your top-left square is white then the bottom right has to be black.
If you have additional problems with the square pattern, you could also change your approach using corners and switch to the circle pattern. All functions are available in opencv. In my case it worked better. See findCirclesGrid. If you use this method, you can run the "BlobDetector" to check how each circle is detected and configure some parameters to improve the accuracy.
Hope this helps!
EDIT:
Here is the python code to make it work from the downloaded image.
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('img.jpg')
# patternSize is (inner corners per row, inner corners per column)
found, corners = cv2.findChessboardCornersSB(img, (75, 49), flags=0)
if not found:
    # corners is None when the pattern is not detected
    raise SystemExit('Chessboard pattern not found')
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.scatter(corners[:, 0, 0], corners[:, 0, 1])
plt.show()

Comparing and combining two images in c++ using open cv

I have two pictures taken from two cameras placed next to each other. The pictures are almost the same, but one of them has some rain drops in it. I need to compare the two pictures and combine them into a single image in the end (eliminating the rain drops, of course).
I have tried a lot of methods that I found online, but I did not get the result I need. First, I tried to find all the corresponding points, and I found a lot of them. Then I tried to eliminate all the "bad" matches so that only "good" points remain at the end. Here is the tutorial I used.
Second, I used the affine transformation method provided by the OpenCV library to find the transformation that maps these points from one image to the other, and unfortunately I got a very wrong result. (I think that, since I have the points and their coordinates, I now need to find the matrix M itself.)
Here is the code I tried. Sorry if it is not very well written; I am a newbie and this is my first time doing this kind of work.
vector<KeyPoint> keypoints1;
vector<KeyPoint> keypoints2;
Mat descriptors1, descriptors2;
cv::Ptr<cv::AKAZE> akaze = cv::AKAZE::create();
akaze->detectAndCompute(Src1, cv::Mat(), keypoints1, descriptors1);
akaze->detectAndCompute(Src2, cv::Mat(), keypoints2, descriptors2);
vector<vector<cv::DMatch>> knnmatch_points;
cv::BFMatcher match(cv::NORM_HAMMING);
match.knnMatch(descriptors1, descriptors2, knnmatch_points, 2);
const double match_par = 0.4;
vector<cv::DMatch> goodMatch;
vector<cv::Point2f> match_point1;
vector<cv::Point2f> match_point2;
for (size_t i = 0; i < knnmatch_points.size(); ++i) {
double distance1 = knnmatch_points[i][0].distance;
double distance2 = knnmatch_points[i][1].distance;
if (distance1 <= distance2 * match_par) {
goodMatch.push_back(knnmatch_points[i][0]);
match_point1.push_back(keypoints1[knnmatch_points[i][0].queryIdx].pt);
match_point2.push_back(keypoints2[knnmatch_points[i][0].trainIdx].pt);
...
srcTri[i] = Point2f(keypoints1[knnmatch_points[i][0].queryIdx].pt);
dstTri[i] = Point2f(keypoints2[knnmatch_points[i][0].queryIdx].pt);
}
}
cv::Mat masks;
cv::Mat H = cv::findHomography(match_point1, match_point2, masks, cv::RANSAC, 3);
vector<cv::DMatch> inlinerMatch;
for (size_t i = 0; i < masks.rows; ++i) {
uchar *inliner = masks.ptr<uchar>(i);
if (inliner[0] == 1) {
inlinerMatch.push_back(goodMatch[i]);
}
}
warp_mat = getAffineTransform(srcTri, dstTri);
warpAffine(Src2, warp_dst, warp_mat, warp_dst.size());
namedWindow(warp_window, WINDOW_AUTOSIZE);
imshow(warp_window, warp_dst);
I think I could use other methods to get my result, such as binary search to find the error or gradient descent to minimize the system of equations, but I am not really familiar with any of them. Is there any help available on this subject?
Thanks in advance; any help will be much appreciated.
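The filtering step in the code above (keep a match only when its best distance is at most match_par times the second-best distance) can be isolated as a small sketch without OpenCV. MatchPair and ratioTest are hypothetical names used only for illustration; in the real code the two distances come from knnMatch with k = 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the two best matches of one query descriptor.
struct MatchPair {
    float best;    // distance to the nearest neighbour
    float second;  // distance to the second nearest neighbour
};

// Return the indices of the pairs that pass the ratio test.
std::vector<std::size_t> ratioTest(const std::vector<MatchPair>& pairs, float ratio) {
    assert(ratio > 0.0f && ratio <= 1.0f);
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < pairs.size(); ++i) {
        if (pairs[i].best <= pairs[i].second * ratio) {
            kept.push_back(i);
        }
    }
    return kept;
}
```

The surviving matches are usually far more than three, whereas cv::getAffineTransform works from exactly three point pairs, so the filtered points fit more naturally into a robust estimator such as the cv::findHomography call already present in the code.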

Template Matching for Coins with OpenCV

I am undertaking a project that will automatically count values of coins from an input image. So far I have segmented the coins using some pre-processing with edge detection and using the Hough-Transform.
My question is how do I proceed from here? I need to do some template matching on the segmented images based on some previously stored features. How can I go about doing this.
I have also read about something called K-Nearest Neighbours and I feel it is something I should be using. But I am not too sure how to go about using it.
Research articles I have followed:
Coin Detector
Coin Recognition
One way of doing pattern matching is using cv::matchTemplate.
This takes an input image and a smaller image which acts as the template. It compares the template against overlapping image regions, computing the similarity of the template with each region. Several methods for computing the comparison are available.
This method does not directly support scale or orientation invariance. But it is possible to overcome that by scaling candidates to a reference size and by testing against several rotated templates.
A detailed example of this technique is shown below, detecting the presence and location of 50c coins. The same procedure can be applied to the other coins.
Two programs will be built: one to create templates from the big template image of the 50c coin, and another which takes those templates and the image with coins as input and outputs an image where the 50c coin(s) are labelled.
Template Maker
#define TEMPLATE_IMG "50c.jpg"
#define ANGLE_STEP 30
int main()
{
cv::Mat image = loadImage(TEMPLATE_IMG);
cv::Mat mask = createMask( image );
cv::Mat loc = locate( mask );
cv::Mat imageCS;
cv::Mat maskCS;
centerAndScale( image, mask, loc, imageCS, maskCS);
saveRotatedTemplates( imageCS, maskCS, ANGLE_STEP );
return 0;
}
Here we load the image which will be used to construct our templates.
Segment it to create a mask.
Locate the center of mass of said mask.
Then we rescale and copy the mask and the coin so that they occupy a square of fixed size, where the edges of the square touch the circumference of the mask and coin. That is, the side of the square has the same length in pixels as the diameter of the scaled mask or coin image.
Finally we save that scaled and centered image of the coin, and we save further copies of it rotated in fixed angle increments.
cv::Mat loadImage(const char* name)
{
cv::Mat image;
image = cv::imread(name);
if ( image.data==NULL || image.channels()!=3 )
{
std::cout << name << " could not be read or is not correct." << std::endl;
exit(1);
}
return image;
}
loadImage uses cv::imread to read the image, verifies that data has been read and that the image has three channels, and returns the image.
#define THRESHOLD_BLUE 130
#define THRESHOLD_TYPE_BLUE cv::THRESH_BINARY_INV
#define THRESHOLD_GREEN 230
#define THRESHOLD_TYPE_GREEN cv::THRESH_BINARY_INV
#define THRESHOLD_RED 140
#define THRESHOLD_TYPE_RED cv::THRESH_BINARY
#define CLOSE_ITERATIONS 5
cv::Mat createMask(const cv::Mat& image)
{
cv::Mat channels[3];
cv::split( image, channels);
cv::Mat mask[3];
cv::threshold( channels[0], mask[0], THRESHOLD_BLUE , 255, THRESHOLD_TYPE_BLUE );
cv::threshold( channels[1], mask[1], THRESHOLD_GREEN, 255, THRESHOLD_TYPE_GREEN );
cv::threshold( channels[2], mask[2], THRESHOLD_RED , 255, THRESHOLD_TYPE_RED );
cv::Mat compositeMask;
cv::bitwise_and( mask[0], mask[1], compositeMask);
cv::bitwise_and( compositeMask, mask[2], compositeMask);
cv::morphologyEx(compositeMask, compositeMask, cv::MORPH_CLOSE,
cv::Mat(), cv::Point(-1, -1), CLOSE_ITERATIONS );
/// Next three lines only for debugging, may be removed
cv::Mat filtered;
image.copyTo( filtered, compositeMask );
cv::imwrite( "filtered.jpg", filtered);
return compositeMask;
}
createMask does the segmentation of the template. It binarizes each of the BGR channels, ANDs the three binarized images and performs a morphological CLOSE operation to produce the mask.
The three debug lines copy the original image into a black one, using the computed mask as the mask for the copy operation. This helped in choosing the proper values for the thresholds.
Here we can see the 50c image filtered by the mask created in createMask
cv::Mat locate( const cv::Mat& mask )
{
// Compute center and radius.
cv::Moments moments = cv::moments( mask, true);
float area = moments.m00;
float radius = sqrt( area/M_PI );
float xCentroid = moments.m10/moments.m00;
float yCentroid = moments.m01/moments.m00;
float m[1][3] = {{ xCentroid, yCentroid, radius}};
// clone() copies the values; without it the returned Mat would point at the local array m
return cv::Mat(1, 3, CV_32F, m).clone();
}
locate computes the center of mass of the mask and its radius, returning those 3 values in a single-row Mat in the form { x, y, radius }.
It uses cv::moments, which calculates all of the moments up to the third order of a polygon or rasterized shape (a rasterized shape in our case). We are not interested in all of those moments, but three of them are useful here: m00 is the area of the mask, and the centroid can be calculated from m00, m10 and m01.
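As a rough illustration of what those three moments are, the sketch below computes them by hand for a binary mask stored as nested vectors. The Disc struct and locateInMask are hypothetical names, and this is not how cv::moments is implemented; it only mirrors the arithmetic used in locate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Centroid and equivalent radius of a blob (hypothetical helper type).
struct Disc {
    double x;
    double y;
    double radius;
};

// Accumulate m00 (area), m10 (sum of x) and m01 (sum of y) over the
// foreground pixels, then derive the centroid and the radius of a
// circle with the same area.
Disc locateInMask(const std::vector<std::vector<unsigned char>>& mask) {
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;
    for (std::size_t y = 0; y < mask.size(); ++y) {
        for (std::size_t x = 0; x < mask[y].size(); ++x) {
            if (mask[y][x] != 0) {
                m00 += 1.0;
                m10 += static_cast<double>(x);
                m01 += static_cast<double>(y);
            }
        }
    }
    assert(m00 > 0.0);  // an empty mask has no centroid
    const double pi = std::acos(-1.0);
    return Disc{m10 / m00, m01 / m00, std::sqrt(m00 / pi)};
}
```

For a roughly circular coin the equivalent radius sqrt(area / pi) is close to the true radius, which is why the area alone is enough here.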
void centerAndScale(const cv::Mat& image, const cv::Mat& mask,
const cv::Mat& characteristics,
cv::Mat& imageCS, cv::Mat& maskCS)
{
float radius = characteristics.at<float>(0,2);
float xCenter = characteristics.at<float>(0,0);
float yCenter = characteristics.at<float>(0,1);
int diameter = round(radius*2);
int xOrg = round(xCenter-radius);
int yOrg = round(yCenter-radius);
cv::Rect roiOrg = cv::Rect( xOrg, yOrg, diameter, diameter );
cv::Mat roiImg = image(roiOrg);
cv::Mat roiMask = mask(roiOrg);
cv::Mat centered = cv::Mat::zeros( diameter, diameter, CV_8UC3);
roiImg.copyTo( centered, roiMask);
cv::imwrite( "centered.bmp", centered); // debug
imageCS.create( TEMPLATE_SIZE, TEMPLATE_SIZE, CV_8UC3);
cv::resize( centered, imageCS, cv::Size(TEMPLATE_SIZE,TEMPLATE_SIZE), 0, 0 );
cv::imwrite( "scaled.bmp", imageCS); // debug
roiMask.copyTo(centered);
cv::resize( centered, maskCS, cv::Size(TEMPLATE_SIZE,TEMPLATE_SIZE), 0, 0 );
}
centerAndScale uses the centroid and radius computed by locate to get a region of interest of the input image and a region of interest of the mask, such that the center of each region is also the center of the coin and mask, and the side length of each region equals the diameter of the coin/mask.
These regions are later scaled to a fixed TEMPLATE_SIZE. This scaled region will be our reference template. Later, in the matching program, when we want to check whether a detected candidate coin is this coin, we will also take a region of the candidate coin and center and scale it in the same way before performing template matching. This way we achieve scale invariance.
void saveRotatedTemplates( const cv::Mat& image, const cv::Mat& mask, int stepAngle )
{
char name[1000];
cv::Mat rotated( TEMPLATE_SIZE, TEMPLATE_SIZE, CV_8UC3 );
for ( int angle=0; angle<360; angle+=stepAngle )
{
cv::Point2f center( TEMPLATE_SIZE/2, TEMPLATE_SIZE/2);
cv::Mat r = cv::getRotationMatrix2D(center, angle, 1.0);
cv::warpAffine(image, rotated, r, cv::Size(TEMPLATE_SIZE, TEMPLATE_SIZE));
sprintf( name, "template-%03d.bmp", angle);
cv::imwrite( name, rotated );
cv::warpAffine(mask, rotated, r, cv::Size(TEMPLATE_SIZE, TEMPLATE_SIZE));
sprintf( name, "templateMask-%03d.bmp", angle);
cv::imwrite( name, rotated );
}
}
saveRotatedTemplates saves the previously computed template.
But it saves several copies of it, each one rotated by a multiple of the angle defined in ANGLE_STEP. The goal of this is to provide orientation invariance. The smaller the stepAngle, the better the orientation invariance we get, but it also implies a higher computational cost.
You may download the whole template maker program here.
When run with ANGLE_STEP as 30 I get the following 12 templates :
Template Matching.
#define INPUT_IMAGE "coins.jpg"
#define LABELED_IMAGE "coins_with50cLabeled.bmp"
#define LABEL "50c"
#define MATCH_THRESHOLD 0.065
#define ANGLE_STEP 30
int main()
{
vector<cv::Mat> templates;
loadTemplates( templates, ANGLE_STEP );
cv::Mat image = loadImage( INPUT_IMAGE );
cv::Mat mask = createMask( image );
vector<Candidate> candidates;
getCandidates( image, mask, candidates );
saveCandidates( candidates ); // debug
matchCandidates( templates, candidates );
for (int n = 0; n < candidates.size( ); ++n)
std::cout << candidates[n].score << std::endl;
cv::Mat labeledImg = labelCoins( image, candidates, MATCH_THRESHOLD, false, LABEL );
cv::imwrite( LABELED_IMAGE, labeledImg );
return 0;
}
The goal here is to read the templates and the image to be examined and determine the location of coins which match our template.
First we read into a vector of images all the template images we produced in the previous program.
Then we read the image to be examined.
Then we binarize the image to be examined using exactly the same function as in the template maker.
getCandidates locates the groups of points which together form a polygon. Each of these polygons is a candidate coin. All of them are rescaled and centered in a square of the same size as our templates, so that we can perform matching in a scale-invariant way.
We save the candidate images obtained for debugging and tuning purposes.
matchCandidates matches each candidate with all the templates, storing for each candidate the result of the best match. Since we have templates for several orientations, this provides invariance to orientation.
The scores of each candidate are printed so we can decide on a threshold to separate 50c coins from non-50c coins.
labelCoins copies the original image and draws a label over the candidates which have a score greater than (or less than, for some methods) the threshold defined in MATCH_THRESHOLD.
And finally we save the result in a .BMP file.
void loadTemplates(vector<cv::Mat>& templates, int angleStep)
{
templates.clear( );
for (int angle = 0; angle < 360; angle += angleStep)
{
char name[1000];
sprintf( name, "template-%03d.bmp", angle );
cv::Mat templateImg = cv::imread( name );
if (templateImg.data == NULL)
{
std::cout << "Could not read " << name << std::endl;
exit( 1 );
}
templates.push_back( templateImg );
}
}
loadTemplates is similar to loadImage. But it loads several images instead of just one and stores them in a std::vector.
loadImage is exactly the same as in the template maker.
createMask is also exactly the same as in the template maker. This time we apply it to the image with several coins. It should be noted that the binarization thresholds were chosen to binarize the 50c coin, and they will not work properly for all the coins in the image. But that is of no consequence, since the program's objective is only to identify 50c coins. As long as those are properly segmented we are fine. It actually works in our favour if some coins are lost in this segmentation, since we save the time of evaluating them (as long as we only lose coins which are not 50c).
typedef struct Candidate
{
cv::Mat image;
float x;
float y;
float radius;
float score;
} Candidate;
void getCandidates(const cv::Mat& image, const cv::Mat& mask,
vector<Candidate>& candidates)
{
vector<vector<cv::Point> > contours;
vector<cv::Vec4i> hierarchy;
/// Find contours
cv::Mat maskCopy;
mask.copyTo( maskCopy );
cv::findContours( maskCopy, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, cv::Point( 0, 0 ) );
cv::Mat maskCS;
cv::Mat imageCS;
cv::Scalar white = cv::Scalar( 255 );
for (int nContour = 0; nContour < contours.size( ); ++nContour)
{
/// Draw contour
cv::Mat drawing = cv::Mat::zeros( mask.size( ), CV_8UC1 );
cv::drawContours( drawing, contours, nContour, white, -1, 8, hierarchy, 0, cv::Point( ) );
// Compute center and radius and area.
// Discard small areas.
cv::Moments moments = cv::moments( drawing, true );
float area = moments.m00;
if (area < CANDIDATES_MIN_AREA)
continue;
Candidate candidate;
candidate.radius = sqrt( area / M_PI );
candidate.x = moments.m10 / moments.m00;
candidate.y = moments.m01 / moments.m00;
float m[1][3] = {
{ candidate.x, candidate.y, candidate.radius}
};
cv::Mat characteristics( 1, 3, CV_32F, m );
centerAndScale( image, drawing, characteristics, imageCS, maskCS );
imageCS.copyTo( candidate.image );
candidates.push_back( candidate );
}
}
The heart of getCandidates is cv::findContours, which finds the contours of the areas present in its input image, here the mask computed previously.
findContours returns a vector of contours, each contour itself being a vector of points which form the outline of the detected polygon.
Each polygon delimits the region of one candidate coin.
For each contour we use cv::drawContours to draw the filled polygon over a black image.
With this drawn image we use the same procedure explained earlier to compute the centroid and radius of the polygon.
And we use centerAndScale, the same function used in the template maker, to center and scale the image contained in that polygon into an image of the same size as our templates. This way we will later be able to perform a proper matching even for coins from photos at different scales.
Each of these candidate coins is copied in a Candidate structure which contains :
Candidate image
x and y for centroid
radius
score
getCandidates computes all these values except for score.
After composing the candidate it is put in a vector of candidates which is the result we get from getCandidates.
These are the 4 candidates obtained :
void saveCandidates(const vector<Candidate>& candidates)
{
for (int n = 0; n < candidates.size( ); ++n)
{
char name[1000];
sprintf( name, "Candidate-%03d.bmp", n );
cv::imwrite( name, candidates[n].image );
}
}
saveCandidates saves the computed candidates for debugging purposes, and also so that I may post those images here.
void matchCandidates(const vector<cv::Mat>& templates,
vector<Candidate>& candidates)
{
for (auto it = candidates.begin( ); it != candidates.end( ); ++it)
matchCandidate( templates, *it );
}
matchCandidates just calls matchCandidate for each candidate. After completion we will have the score for all candidates computed.
void matchCandidate(const vector<cv::Mat>& templates, Candidate& candidate)
{
/// For SQDIFF and SQDIFF_NORMED, the best matches are lower values. For all the other methods, the higher the better
if (MATCH_METHOD == CV_TM_SQDIFF || MATCH_METHOD == CV_TM_SQDIFF_NORMED)
candidate.score = FLT_MAX;
else
candidate.score = 0;
for (auto it = templates.begin( ); it != templates.end( ); ++it)
{
float score = singleTemplateMatch( *it, candidate.image );
if (MATCH_METHOD == CV_TM_SQDIFF || MATCH_METHOD == CV_TM_SQDIFF_NORMED)
{
if (score < candidate.score)
candidate.score = score;
}
else
{
if (score > candidate.score)
candidate.score = score;
}
}
}
matchCandidate takes a single candidate and all the templates as input. Its goal is to match each template against the candidate. That work is delegated to singleTemplateMatch.
We store the best score obtained, which for CV_TM_SQDIFF and CV_TM_SQDIFF_NORMED is the smallest one and for the other matching methods is the largest one.
float singleTemplateMatch(const cv::Mat& templateImg, const cv::Mat& candidateImg)
{
cv::Mat result; // matchTemplate allocates it as 32-bit float; 1x1 when both inputs have the same size
cv::matchTemplate( candidateImg, templateImg, result, MATCH_METHOD );
return result.at<float>( 0, 0 );
}
singleTemplateMatch performs the matching.
cv::matchTemplate takes two input images, the second smaller than or equal in size to the first.
The common use case is for a small template (2nd parameter) to be matched against a larger image (1st parameter); the result is a two-dimensional Mat of floats holding the match score of the template at each position in the image. By locating the maximum (or minimum, depending on the method) of this Mat of floats, we get the best candidate position for our template in the image of the 1st parameter.
But we are not interested in locating our template in the image; we already have the coordinates of our candidates.
What we want is a measure of similarity between our candidate and the template, which is why we use cv::matchTemplate in a less usual way: with a 1st parameter image of the same size as the 2nd parameter template. In this situation the result is a Mat of size 1x1, and the single value in that Mat is our score of similarity (or dissimilarity).
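To make that single value concrete, here is a minimal sketch of the normalised squared difference (the formula documented for SQDIFF_NORMED) evaluated at one position, i.e. for two images of equal size. The flat float vectors and the name sqdiffNormed are assumptions for illustration, and the handling of all-black input is a simplification rather than OpenCV's exact behaviour.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Normalised squared difference between two equal-sized images stored as
// flat pixel arrays: sum((T - I)^2) / sqrt(sum(T^2) * sum(I^2)).
double sqdiffNormed(const std::vector<float>& candidate,
                    const std::vector<float>& templ) {
    assert(candidate.size() == templ.size());
    double diff = 0.0, energyC = 0.0, energyT = 0.0;
    for (std::size_t i = 0; i < candidate.size(); ++i) {
        const double c = candidate[i];
        const double t = templ[i];
        diff += (t - c) * (t - c);
        energyC += c * c;
        energyT += t * t;
    }
    const double denom = std::sqrt(energyC * energyT);
    if (denom == 0.0) {
        // Degenerate all-black input; this choice is an assumption.
        return diff == 0.0 ? 0.0 : 1.0;
    }
    return diff / denom;
}
```

Identical images score 0, and the score rises as the pixels diverge, which is why for the SQDIFF methods the smallest score is the best match.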
for (int n = 0; n < candidates.size( ); ++n)
std::cout << candidates[n].score << std::endl;
We print the scores obtained for each of our candidates.
In this table we can see the scores for each of the methods available for cv::matchTemplate. The best score is in green.
CCORR and CCOEFF give a wrong result, so those two are discarded. Of the remaining 4 methods, the two SQDIFF methods are the ones with the highest relative difference between the best match (which is a 50c) and the 2nd best (which is not a 50c), which is why I have chosen them.
I have chosen SQDIFF_NORMED, but there is no strong reason for that. To really choose a method we should test with a larger number of samples, not just one.
For this method a working threshold could be 0.065. Selecting a proper threshold also requires many samples.
bool selected(const Candidate& candidate, float threshold)
{
/// For SQDIFF and SQDIFF_NORMED, the best matches are lower values. For all the other methods, the higher the better
if (MATCH_METHOD == CV_TM_SQDIFF || MATCH_METHOD == CV_TM_SQDIFF_NORMED)
return candidate.score <= threshold;
else
return candidate.score>threshold;
}
void drawLabel(const Candidate& candidate, const char* label, cv::Mat image)
{
int x = candidate.x - candidate.radius;
int y = candidate.y;
cv::Point point( x, y );
cv::Scalar blue( 255, 128, 128 );
cv::putText( image, label, point, CV_FONT_HERSHEY_SIMPLEX, 1.5f, blue, 2 );
}
cv::Mat labelCoins(const cv::Mat& image, const vector<Candidate>& candidates,
float threshold, bool inverseThreshold, const char* label)
{
cv::Mat imageLabeled;
image.copyTo( imageLabeled );
for (auto it = candidates.begin( ); it != candidates.end( ); ++it)
{
if (selected( *it, threshold ))
drawLabel( *it, label, imageLabeled );
}
return imageLabeled;
}
labelCoins draws a label string at the location of candidates with a score greater than (or less than, depending on the method) the threshold.
And finally the result of labelCoins is saved with
cv::imwrite( LABELED_IMAGE, labeledImg );
The result being :
The whole code for the coin matcher can be downloaded here.
Is this a good method?
That is hard to tell.
The method is consistent. It correctly detects the 50c coin for the sample and input image provided.
But we have no idea if the method is robust because it has not been tested with a proper sample size. And even more important is to test it against samples which were not available when the program was being coded, that is the true measure of robustness when done with a large enough sample size.
I am rather confident in the method not having false positives from silver coins. But I am not so sure about other copper coins like the 20c. As we can see from the scores obtained the 20c coin gets a score very similar to the 50c.
It is also quite possible that false negatives will happen under varying lighting conditions. Which is something that can and should be avoided if we have control over lighting conditions such as when we are designing a machine to take photos of coins and count them.
If the method works the same method can be repeated for each type of coin leading to full detection of all coins.
Code in this answer is also available under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
If you detect all coins correctly, it is better to use size (radius) and RGB features to recognize their value. It is not a good idea to concatenate these features, because their counts are not equal (size is a single number, while there are many more RGB features). I recommend using two classifiers for this purpose: one for size and another for RGB features.
You have to classify all coins into, for example, 3 size classes (it depends on the types of your coins). You can do this with a simple 1NN classifier (just calculate the radius of the test coin and assign it to the nearest predefined radius).
Then you should have some templates for each size and use template matching to recognize the value. (All templates and detected coins should be resized to a particular size, e.g. (100,100).) For template matching you can use the matchTemplate function. I think that the CV_TM_CCOEFF method may be the best one, but you can test all methods to get a good result. (Note that you don't need to search the image for the coin, because you have already detected it, as you mentioned in your question. You just need to use this function to get one number as the similarity/difference between two images, and assign the test coin to the class for which the similarity is maximized or the difference is minimized.)
EDIT1: You should have all rotations in your templates in each class to compensate for the rotation of the test coin.
EDIT2: If all coins are of different sizes, the first step is enough. Otherwise you should group the similar sizes into one class and classify the test coin using the second step (RGB features).
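The first step (a 1NN classifier on the radius) could look roughly like the sketch below. SizeClass and nearestSizeClass are hypothetical names, and the reference radii would have to be measured from your own coins at a known scale.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// One predefined size class (hypothetical): a label and its reference radius.
struct SizeClass {
    std::string label;
    double radius;
};

// 1NN on a single feature: return the index of the class whose reference
// radius is closest to the measured radius.
std::size_t nearestSizeClass(const std::vector<SizeClass>& classes, double radius) {
    assert(!classes.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < classes.size(); ++i) {
        if (std::fabs(classes[i].radius - radius) <
            std::fabs(classes[best].radius - radius)) {
            best = i;
        }
    }
    return best;
}
```

Template matching would then only need to run against the templates stored for the selected size class.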
(1) Find the coins' edges using the Hough Transform algorithm.
(2) Determine the origin point of the coins. I don't know how you'll do this.
(3) You can use k from the KNN algorithm for comparing the diameter of the coins. Don't forget to set the bias value.
You could try to set up a training set of coin images and generate SIFT/SURF etc. descriptors from it. (EDIT: OpenCV feature detectors)
Using these data you could set up a kNN classifier, using the coins' values as training labels.
Once you perform kNN classification on your segmented coin images, your classification result would yield the coin's value.

OpenCV Sobel Filters resulting in almost completely black images

I am having some issues with my sobel_y filter (and sobel_x, but I figure they share the same problem): it keeps giving me an image that is basically only black and white. I have to rewrite this function for a class, so no, I cannot use the built-in one. I had it working, apart from some minor tweaks, because the output image looked a little strange: it was still black and white even though it was supposed to be converted back. I figured out how to fix that, but in the process I changed something and broke it, and I cannot get it back to working, even with the black-and-white-only output. I keep getting a black image with some white lines here and there near the top. I have tried changing the Mat grayscale type (third parameter) to all different values, as my professor mentioned in class that we are using 32-bit floating point images, but that did not help either.
Even though the issue occurs after running Studentfilter2D, I think it is a problem with the grayscaling of the image, although whenever I debug, it seems to work just fine. I think so because I have 2 other filtering functions I had to write that use Studentfilter2D, and they both give me the expected results. My sobel_y function is shown below:
// Convert the image in bgr to grayscale OK to use the OpenCV function.
// Find the coefficients used by the OpenCV function, and give a link where you found it.
// Note: This student function expects the matrix gray to be preallocated with the same width and
// height, but with 1 channel.
void BGR2Gray(Mat& bgr, Mat& gray)
{
// Y = .299 * R + .587 * G + .114 * B, from http://docs.opencv.org/modules/imgproc/doc/miscellaneous_transformations.html#cvtcolor
// Some extra assistance, for the third parameter for the InputArray, from http://docs.opencv.org/trunk/modules/core/doc/basic_structures.html#inputarray
// Not sure about the fourth parameter, but was just trying it to see if that may be the issue as well
cvtColor(bgr, gray, CV_BGR2GRAY, 1);
return;
}
// Convolve image with kernel - this routine will be called from the other
// subroutines! (gaussian, sobel_x and sobel_y)
// image is single channel. Do not use the OpenCV filter2D!!
// Implementation can be with the .at or similar to the
// basic method found in the Chapter 2 of the OpenCV tutorial in CANVAS,
// or online at the OpenCV documentation here:
// http://docs.opencv.org/doc/tutorials/core/mat-mask-operations/mat-mask-operations.html
// In our code the image and the kernel are both floats (so the sample code will need to change)
void Studentfilter2D (Mat& image, Mat& kernel)
{
int kCenterX = kernel.cols / 2;
int kCenterY = kernel.rows / 2;
// Algorithm help from http://www.songho.ca/dsp/convolution/convolution.html
for (int iRows = 0; iRows < image.rows; iRows++)
{
for (int iCols = 0; iCols < image.cols; iCols++)
{
float result = 0.0;
for (int kRows = 0; kRows < kernel.rows; kRows++)
{
// Flip the rows for the convolution
int kRowsFlipped = kernel.rows - 1 - kRows;
for (int kCols = 0; kCols < kernel.cols; kCols++)
{
// Flip the columns for the convolution
int kColsFlipped = kernel.cols - 1 - kCols;
// Indices of shifting around the convolution
int iRowsIndex = iRows + kRows - kCenterY;
int iColsIndex = iCols + kCols - kCenterX;
// Check bounds using the indices
if (iRowsIndex >= 0 && iRowsIndex < image.rows && iColsIndex >= 0 && iColsIndex < image.cols)
{
result += image.at<float>(iRowsIndex, iColsIndex) * kernel.at<float>(kRowsFlipped, kColsFlipped);
}
}
}
image.at<float>(iRows, iCols) = result;
}
}
return;
}
void sobel_y (Mat& image, int)
{
// Note, the filter parameter int is unused.
Mat mask = (Mat_<float>(3, 3) << 1, 2, 1,
0, 0, 0,
-1, -2, -1) / 3;
//Mat grayscale(image.rows, image.cols, CV_32FC1);
BGR2Gray(image, image);
Studentfilter2D(image, mask);
// Here is the documentation on normalize http://docs.opencv.org/modules/core/doc/operations_on_arrays.html#normalize
normalize(image, image, 0, 1, NORM_MINMAX);
cvtColor(image, image, CV_GRAY2BGR);
return;
}
Like I said, I had this working before, just looking for some fresh eyes to look at it and see what I may be missing. I have been looking at this same code so much for the past 4 days that I think I am just missing things. In case anyone is wondering, I have also tried changing the mask values of the filter, but to no avail.
There are two things that are worth mentioning.
The first is that you are not taking proper care of the type of your matrices/images.
The input to Studentfilter2D in sobel_y is an 8-bit grayscale image of type CV_8UC1, meaning that the data is an array of unsigned char.
Your Studentfilter2D function, however, indexes this input image as though it were of type float, so it reads the wrong pixels. Convert the image to a floating-point type such as CV_32FC1 (for example with Mat::convertTo) before filtering.
If the above does not immediately solve your problem, you should consider the range of your final derivative image. Since it is a derivative, it will no longer be in the range [0, 255] and may even contain negative numbers. When you try to visualize this, you will run into problems unless you first normalize your image.
OpenCV has built-in functions for this, such as normalize with NORM_MINMAX.
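To make the two points above concrete, here is a minimal sketch of the same pipeline in plain C++, written without OpenCV so it stands alone. FloatImage, toFloat, convolve and normalizeMinMax are hypothetical names for this example only: FloatImage stands in for a CV_32FC1 Mat, and toFloat plays the role that Mat::convertTo with CV_32F would play in the real code. The sketch also writes the filter result to a separate output image. That is an extra precaution not raised in the answer, on the assumption that overwriting pixels which later neighbourhoods still need to read is undesirable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major, single-channel float image (stand-in for a CV_32FC1 Mat).
struct FloatImage {
    int rows;
    int cols;
    std::vector<float> data;
    float& at(int r, int c) { return data[static_cast<std::size_t>(r) * cols + c]; }
    float at(int r, int c) const { return data[static_cast<std::size_t>(r) * cols + c]; }
};

// Step 1: widen the 8-bit pixels to float before any float indexing happens.
FloatImage toFloat(const std::vector<std::uint8_t>& gray, int rows, int cols) {
    assert(gray.size() == static_cast<std::size_t>(rows) * cols);
    return FloatImage{rows, cols, std::vector<float>(gray.begin(), gray.end())};
}

// Step 2: convolve with a square, odd-sized kernel (flipped, borders treated
// as zero). The result goes to a new image, so the source is never overwritten.
FloatImage convolve(const FloatImage& src, const std::vector<float>& kernel, int ksize) {
    assert(ksize % 2 == 1);
    assert(kernel.size() == static_cast<std::size_t>(ksize) * ksize);
    FloatImage dst{src.rows, src.cols, std::vector<float>(src.data.size(), 0.0f)};
    const int half = ksize / 2;
    for (int r = 0; r < src.rows; ++r) {
        for (int c = 0; c < src.cols; ++c) {
            float sum = 0.0f;
            for (int kr = 0; kr < ksize; ++kr) {
                for (int kc = 0; kc < ksize; ++kc) {
                    const int rr = r + kr - half;
                    const int cc = c + kc - half;
                    if (rr < 0 || rr >= src.rows || cc < 0 || cc >= src.cols) continue;
                    // Flip the kernel in both directions for a true convolution.
                    const float k = kernel[(ksize - 1 - kr) * ksize + (ksize - 1 - kc)];
                    sum += src.at(rr, cc) * k;
                }
            }
            dst.at(r, c) = sum;
        }
    }
    return dst;
}

// Step 3: min-max scale to [0, 1] so negative derivative values can be displayed.
void normalizeMinMax(FloatImage& img) {
    const auto mm = std::minmax_element(img.data.begin(), img.data.end());
    const float lo = *mm.first;
    const float range = *mm.second - *mm.first;
    for (float& v : img.data) v = range > 0.0f ? (v - lo) / range : 0.0f;
}
```

The order matters: convert to float first, filter second, normalize last. Skipping the first step is what makes the filter read the wrong pixels, and skipping the last one leaves values outside the displayable range.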

How can I use OpenCV to find an arbitrarily transformed rectangle in a depth image?

I'm attempting to work with a depth sensor to add positional tracking to the Oculus Rift dev kit. However, I'm having trouble with the sequence of operations producing a usable result.
I'm starting with a 16 bit depth image, where the values sort of (but not really) correspond to millimeters. Undefined values in the image have already been set to 0.
First I'm eliminating everything outside a certain near and far distance by updating a mask image to exclude them.
cv::Mat result = cv::Mat::zeros(depthImage.size(), CV_8UC3);
cv::Mat depthMask;
depthImage.convertTo(depthMask, CV_8U);
for_each_pixel<DepthImagePixel, uint8_t>(depthImage, depthMask,
[&](DepthImagePixel & depthPixel, uint8_t & maskPixel){
if (!maskPixel) {
return;
}
static const uint16_t depthMax = 1200;
static const uint16_t depthMin = 200;
if (depthPixel < depthMin || depthPixel > depthMax) {
maskPixel = 0;
}
});
Next, since the feature I want is likely to be closer to the camera than the overall scene average, I update the mask again to exclude anything that isn't within a certain range of the mean value:
const float depthAverage = cv::mean(depthImage, depthMask)[0];
const uint16_t depthMax = depthAverage * 1.0;
const uint16_t depthMin = depthAverage * 0.75;
for_each_pixel<DepthImagePixel, uint8_t>(depthImage, depthMask,
[&](DepthImagePixel & depthPixel, uint8_t & maskPixel){
if (!maskPixel) {
return;
}
if (depthPixel < depthMin || depthPixel > depthMax) {
maskPixel = 0;
}
});
Finally, I zero out everything that's not in the mask, and scale the remaining values to between 10 and 255 before converting the image format to 8-bit:
cv::Mat outsideMask;
cv::bitwise_not(depthMask, outsideMask);
// Zero out outside the mask
cv::subtract(depthImage, depthImage, depthImage, outsideMask);
// Within the mask, normalize to the range + X
cv::subtract(depthImage, depthMin, depthImage, depthMask);
double minVal, maxVal;
minMaxLoc(depthImage, &minVal, &maxVal);
float range = depthMax - depthMin;
float scale = (((float)(UINT8_MAX - 10) / range));
depthImage *= scale;
cv::add(depthImage, 10, depthImage, depthMask);
depthImage.convertTo(depthImage, CV_8U);
The result looks like this:
I'm pretty happy with this section of the code, since it produces pretty clear visual features.
I'm then applying a couple of smoothing operations to get rid of the ridiculous amount of noise from the depth camera:
cv::medianBlur(depthImage, depthImage, 9);
cv::Mat blurred;
cv::bilateralFilter(depthImage, blurred, 5, 250, 250);
depthImage = blurred;
cv::Mat result = cv::Mat::zeros(depthImage.size(), CV_8UC3);
cv::insertChannel(depthImage, result, 0);
Again, the features look pretty clear visually, but I wonder if they couldn't be sharpened somehow:
Next I'm using Canny for edge detection:
cv::Mat canny_output;
{
cv::Canny(depthImage, canny_output, 20, 80, 3, true);
cv::insertChannel(canny_output, result, 1);
}
The lines I'm looking for are there, but not well represented towards the corners:
Finally I'm using probabilistic Hough to identify lines:
std::vector<cv::Vec4i> lines;
cv::HoughLinesP(canny_output, lines, pixelRes, degreeRes * CV_PI / 180, hughThreshold, hughMinLength, hughMaxGap);
for (size_t i = 0; i < lines.size(); i++)
{
cv::Vec4i l = lines[i];
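// Pass both coordinates; the doubled parentheses invoked the comma operator
// and built each vector from the second value only.
glm::vec2 a(l[0], l[1]);
glm::vec2 b(l[2], l[3]);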
float length = glm::length(a - b);
cv::line(result, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]), cv::Scalar(0, 0, 255), 3, CV_AA);
}
This results in this image:
At this point I feel like I've gone off the rails, because I can't find a good set of parameters for Hough to produce a reasonable number of candidate lines in which to search for my shape, and I'm not sure if I should be fiddling with Hough or looking at improving the outputs of the prior steps.
Is there a good way of objectively validating my results at each stage, as opposed to just fiddling with the input values until I think it 'looks good'? Is there a better approach to finding the rectangle given the starting image (and given that it won't necessarily be oriented in a particular direction)?
Very cool project!
Though, I feel like your approach does not use all the info that you could get from the depth map (e.g. 3D points, normals, etc.), which would help a lot.
The Point Cloud Library (PCL), which is a C++ library dedicated to the processing of RGB-D data, has a tutorial on plane segmentation using RANSAC which could inspire you. You might not want to use PCL in your program, due to the numerous dependencies; however, as it is open source, you can find the algorithm implementation on GitHub (PCL SAC segmentation). Be aware that RANSAC might be slow and produce unwanted results depending on the scene.
You could also try the approach presented in "Real-Time Plane Segmentation using RGB-D Cameras" by Holz, Holzer, Rusu and Behnke, 2011 (PDF), which suggests fast normal estimation using integral images followed by plane detection using clustering of normals.
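To give a feel for what RANSAC plane segmentation does, here is a bare-bones sketch in plain C++. It is not PCL's implementation and uses no PCL or OpenCV types; Point3, Plane, planeFromPoints, distanceToPlane and ransacPlane are hypothetical names for this example. It assumes the depth pixels have already been back-projected into 3D points, which needs the camera intrinsics and is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point3 { float x, y, z; };

// Plane in Hessian normal form: n . p + d = 0, with |n| = 1.
struct Plane { float nx, ny, nz, d; };

// Fit a plane through three points; returns false if they are (nearly) collinear.
bool planeFromPoints(const Point3& a, const Point3& b, const Point3& c, Plane& out) {
    const float ux = b.x - a.x, uy = b.y - a.y, uz = b.z - a.z;
    const float vx = c.x - a.x, vy = c.y - a.y, vz = c.z - a.z;
    // The normal is the cross product of the two edge vectors.
    float nx = uy * vz - uz * vy;
    float ny = uz * vx - ux * vz;
    float nz = ux * vy - uy * vx;
    const float len = std::sqrt(nx * nx + ny * ny + nz * nz);
    if (len < 1e-6f) return false;
    nx /= len; ny /= len; nz /= len;
    out = Plane{nx, ny, nz, -(nx * a.x + ny * a.y + nz * a.z)};
    return true;
}

// Unsigned distance from a point to the plane.
float distanceToPlane(const Plane& p, const Point3& q) {
    return std::fabs(p.nx * q.x + p.ny * q.y + p.nz * q.z + p.d);
}

// Basic RANSAC: repeatedly fit a plane to three random points and keep the
// candidate with the most points within `threshold`. Returns the inlier count
// and writes the winning plane to `best`.
std::size_t ransacPlane(const std::vector<Point3>& cloud, float threshold,
                        int iterations, unsigned seed, Plane& best) {
    assert(cloud.size() >= 3);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, cloud.size() - 1);
    std::size_t bestCount = 0;
    for (int it = 0; it < iterations; ++it) {
        const Point3& a = cloud[pick(rng)];
        const Point3& b = cloud[pick(rng)];
        const Point3& c = cloud[pick(rng)];
        Plane candidate;
        if (!planeFromPoints(a, b, c, candidate)) continue;  // degenerate sample
        std::size_t count = 0;
        for (const Point3& q : cloud)
            if (distanceToPlane(candidate, q) <= threshold) ++count;
        if (count > bestCount) { bestCount = count; best = candidate; }
    }
    return bestCount;
}
```

Once the dominant plane is found, its inliers could be projected back into the image to look for the rectangle's outline there. The threshold and iteration count are scene-dependent and would need tuning against the sensor's noise level.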