OpenCV Detecting Multiple, Rotated, Scaled objects - c++

I've only been using OpenCV for 12 hours or so and haven't been able to solve this problem. The end goal is to take an image and store each character as an entry inside of 6 separate vector2 arrays (5 chars + bubbles)
Additionally, I need to know whether a character is "enlarged" or not.
Link to resources: https://imgur.com/a/lT5HA
As you can tell at any given moment there's a ton of stuff going on, making this a somewhat difficult task. I know that it's possible, though - Robotmon identifies each character with almost 100% accuracy - the only downfall is that the "enlarged" characters get identified 3 times (the distance when discarding duplicates just doesn't work on the big ones due to them being large enough to register multiple times).
All characters are tagged with a single color and characters from the same color group won't appear in the same match.
I'm sure I'm making a ton of errors - I'm not finding much useful information on OpenCV for this usecase. A decent amount is trial and error + looking inside of the files.
For instance, I'm sure that if I were to add all of the characters appearing in a screenshot, searched for all of them, and then compared the "scores" I'd be able to rule out a few false identifications (because the character would be accurately claimed).
To restate my question:
How do I identify every character within the images with no false positives (including small characters being "sent" to the score or transparent characters fading away), all characters identified accurately, and with the enlarged characters identified separately? (And using OpenCL perhaps?)
#include "stdafx.h"
#include <opencv2/opencv.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/xfeatures2d.hpp>
#include <iostream>
#include <stdio.h>
#include <string>
using namespace cv;
using namespace std;
using namespace cv::xfeatures2d;
int MatchFunction();
int main()
{
MatchFunction();
waitKey(0);
return 0;
}
int MatchFunction()
{
Mat Image_Scene = imread("Bubbles.jpg");
Mat image_Object = imread("block_peterpan_s.png");
// Check for invalid input
if (Image_Scene.empty() || image_Object.empty())
{
cout << "Could not open or find the image" << endl;
return 0;
}
// Initiate ORB detector
Ptr<ORB> detector = ORB::create();
//detector->setMaxFeatures(50);
detector->setScaleFactor(1.1);
detector->setNLevels(12);
//detector->setPatchSize(16);
std::vector<KeyPoint> keypoints_object, keypoints_scene;
Mat descriptors_object, descriptors_scene;
// find the keypoints and descriptors with ORB
detector->detect(image_Object, keypoints_object);
detector->detect(Image_Scene, keypoints_scene);
detector->compute(image_Object, keypoints_object, descriptors_object);
detector->compute(Image_Scene, keypoints_scene, descriptors_scene);
//-- Step 3: Matching descriptor vectors with a brute force matcher
//BFMatcher matcher(NORM_HAMMING, true); //BFMatcher matcher(NORM_L2);
//Ptr<BFMatcher> matcher = BFMatcher::create(); //Ptr<ORB> detector = ORB::create();
Ptr<BFMatcher> matcher = BFMatcher::create(NORM_HAMMING, true);
vector<DMatch> matches;
matcher->match(descriptors_object, descriptors_scene, matches);
//matcher.match(descriptors_object, descriptors_scene, matches);
vector<DMatch> good_matches;
//vector<Point2f> featurePoints1;
//vector<Point2f> featurePoints2;
//Sort the matches by adding them 1 by 1 to good_matches
//for (int i = 0; i<int(matches.size()); i++) { //Size is basically length
// good_matches.push_back(matches[i]);
//}
string k = to_string((matches.size()));
cout << k << endl;
//cout << " Usage: ./SURF_FlannMatcher <img1> <img2>" << std::endl;
double max_dist = 0; double min_dist = 100;
//-- Quick calculation of max and min distances between keypoints
for (int i = 0; i < int(matches.size()); i++)
{
//cout << to_string(i) << endl;
double dist = matches[i].distance;
if (dist < min_dist) min_dist = dist;
if (dist > max_dist) max_dist = dist;
}
//printf("-- Max dist : %f \n", max_dist);
//printf("-- Min dist : %f \n", min_dist);
//-- Draw only "good" matches (i.e. whose distance is less than 2*min_dist,
//-- or a small arbitary value ( 0.02 ) in the event that min_dist is very
//-- small)
//-- PS.- radiusMatch can also be used here.
//std::vector< DMatch > good_matches;
for (int i = 0; i < int(matches.size()); i++)
{
if (matches[i].distance <= max(4 * min_dist, 0.02))
{
good_matches.push_back(matches[i]);
}
}
Mat img_matches;
drawMatches(image_Object, keypoints_object, Image_Scene, keypoints_scene,
good_matches, img_matches, Scalar::all(-1), Scalar::all(-1),
vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS);
imshow("Good Matches", img_matches);
}

Template matching would be the most naive approach, but you'd have to brute-force the same template at different scales and rotations for each object. If you have the information about the possible number of such scales/rotations in the game, this will narrow down the number of iterations drastically.
The image seems free from noise, distortion or occlusions, so machine learning approach is not really necessary in this case. If you want something more efficient and are familiar with scientific language and implementing algorithms, take a look at this study which uses radial and circular filters to narrow down the number of combinations:
http://pdfs.semanticscholar.org/c965/0f78bf9d18eba3850281841dc7ddd20e8d5e.pdf
The algorithm or more specifically the filters can be parallelized with OpenCL or any other library if needed. On modern machines this shouldn't be necessary as serial implementation works quite fast.
I successfully implemented it some time ago and it works well and is fast enough to solve your problem with near-realtime performance. RGB won't be neccessary. To my best knowledge it is not implemented anywhere as open-source code, but you can also try looking up for scale and rotation invariant template matching and see what comes up.

Related

opencv program crash after matching descriptors

I'm using SIFT descriptor and FLANN matcher to get the minimum distance of the matches between two pictures. The first picture is a query picture and the second picture is from the dataset. I want to load the second picture one by one using a loop, but right after the first iteration, the program crashes at runtime after displaying the result of the first iteration and fails to enter the second iteration. I'm using opencv 2.4.13 and vs2017. Below is my code:
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/nonfree/nonfree.hpp"
#include "opencv2/nonfree/features2d.hpp"
#include <opencv2/legacy/legacy.hpp>
#include <iostream>
#include <stdio.h>
#include <algorithm>
#include <vector>
using namespace std;
using namespace cv;
#define IMAGE_folder "D:\\Project\\dataset1" // change to your folder location
int main()
{
initModule_nonfree();
const int filename_len = 900;
char tempname1[filename_len];
sprintf_s(tempname1, filename_len, "%s\\%s", IMAGE_folder, "995.jpg");
Mat i1 = imread(tempname1);
Mat img1, img2;
cvtColor(i1, img1, COLOR_BGR2GRAY);
if (!i1.data)
{
printf("Cannot find the input image!\n");
system("pause");
return -1;
}
std::vector<KeyPoint> key_points1, key_points2;
Mat descriptors1, descriptors2;
SIFT sift;
sift(img1, Mat(), key_points1, descriptors1);
for (int i = 990; i < 1000; i++)
{
initModule_nonfree();
cout << i << endl;
char tempname2[filename_len];
sprintf_s(tempname2, filename_len, "%s\\%d.jpg", IMAGE_folder, i);
Mat i2 = imread(tempname2);
if (!i2.data)
{
printf("Cannot find the input image!\n");
system("pause");
return -1;
}
cvtColor(i2, img2, COLOR_BGR2GRAY);
sift(img2, Mat(), key_points2, descriptors2);
FlannBasedMatcher matcher;
vector<DMatch> matches;
matches.clear();
matcher.match(descriptors1, descriptors2, matches); //if I comment this line and the code below to show the distance, it can work
double max_dist = 0;
double min_dist = 100;
for (int j = 0; j < descriptors1.rows; j++)
{
double dist = matches[j].distance;
if (dist < min_dist)
min_dist = dist;
if (dist > max_dist)
max_dist = dist;
}
//matcher.clear(); //I've tried these four but still cannot help
//matches.clear();
//key_points1.clear();
//key_points2.clear();
printf("-- Max dist : %f \n", max_dist);
printf("-- Min dist : %f \n", min_dist);
cout << "Done!" << endl;
}
//waitKey(0);
return 0;
}
I've tried a lot and here are my problems and observation:
It is not because I use iterations. If I only run the matcher.match(descriptors1, descriptors2, matches); , it will also crash after execution.
It is also not working for SURF descriptor or BruteForceMatcher, both have the same problem with the code above. I've used different pieces of code from opencv tutorial using SURF, and it still crashes after displaying the result. Example code from opencv tutorial see here
I've also tried initModule_nonfree(); as some answers said, but still didn't help.
The program crashes after displaying "Done!" and not entering the next iteration.
If I delete the matcher.match(descriptors1, descriptors2, matches); and the relevant code below, it can work properly. So the problem must be with the "match" function.
Thanks a lot in advance!
-----------------------------updated-----------------------------
My includes and linked libraries are shown below:
C:\Program Files (x86)\opencv\build\include
C:\Program Files (x86)\opencv\build\include\opencv
C:\Program Files (x86)\opencv\build\include\opencv2
C:\Program Files (x86)\opencv\build\x64\vc12\staticlib
C:\Program Files (x86)\opencv\build\x64\vc12\lib
opencv_objdetect2413.lib
opencv_ts2413.lib
opencv_video2413.lib
opencv_nonfree2413.lib
opencv_ocl2413.lib
opencv_photo2413.lib
opencv_stitching2413.lib
opencv_superres2413.lib
opencv_videostab2413.lib
opencv_calib3d2413.lib
opencv_contrib2413.lib
opencv_core2413.lib
opencv_features2d2413.lib
opencv_flann2413.lib
opencv_gpu2413.lib
opencv_highgui2413.lib
opencv_imgproc2413.lib
opencv_legacy2413.lib
opencv_ml2413.lib
I think the configuration may not have problems... I use it in release mode under x86, thanks!
I've found out the problem. The code works for opencv2.4.13 and vs2012 instead of vs2017. Maybe it's just because opencv2.4.13 is not compatible with vs2017.

OpenCV Image Stitching with Image Resolutions greater than 1080 * 1080

I've been working with OpenCV to stitch two images together on a Raspberry Pi and on a Windows OS based PC.
#include <stdio.h>
#include <iostream>
#include "opencv2/opencv.hpp"
#include "opencv2/core/core.hpp"
#include "opencv2/features2d/features2d.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/nonfree/nonfree.hpp"
#include "opencv2/calib3d/calib3d.hpp"
#include "opencv2/imgproc/imgproc.hpp"
using namespace cv;
int main (int argc, char** argv) {
Mat image_1 = imread (argv[1]);
Mat image_2 = imread (argv[2]);
Mat gray_image_1;
Mat gray_image_2;
cvtColor (image_1, gray_image_1, CV_RGB2GRAY);
cvtColor (image_2, gray_image_2, CV_RGB2GRAY);
// Check if image files can be read
if (!gray_image_1.data) {
std::cout << "Error Reading Image 1" << std::endl;
return 0;
}
if (!gray_image_2.data) {
std::cout << "Error Reading Image 2" << std::endl;
return 0;
}
// Detect the keypoints using SURF Detector
// Based from Anna Huaman's 'Features2D + Homography to find a known object' Tutorial
int minHessian = 50;
SurfFeatureDetector detector (minHessian);
std::vector <KeyPoint> keypoints_object, keypoints_scene;
detector.detect (gray_image_2, keypoints_object);
detector.detect (gray_image_1, keypoints_scene);
// Calculate Feature Vectors (descriptors)
// Based from Anna Huaman's 'Features2D + Homography to find a known object' Tutorial
SurfDescriptorExtractor extractor;
Mat descriptors_object, descriptors_scene;
extractor.compute (gray_image_2, keypoints_object, descriptors_object);
extractor.compute (gray_image_1, keypoints_scene, descriptors_scene);
// Matching descriptor vectors using FLANN matcher
// Based from Anna Huaman's 'Features2D + Homography to find a known object' Tutorial
FlannBasedMatcher matcher;
std::vector <DMatch> matches;
matcher.match (descriptors_object, descriptors_scene, matches);
double max_dist = 0;
double min_dist = 100;
// Quick calculation of max and min distances between keypoints
// Based from Anna Huaman's 'Features2D + Homography to find a known object' Tutorial
for (int i = 0; i < descriptors_object.rows; i++) {
double dist = matches[i].distance;
if (dist < min_dist) {
min_dist = dist;
}
}
// Use matches that have a distance that is less than 3 * min_dist
std::vector <DMatch> good_matches;
for (int i = 0; i < descriptors_object.rows; i++){
if (matches[i].distance < 3 * min_dist) {
good_matches.push_back (matches[i]);
}
}
std::vector <Point2f> obj;
std::vector <Point2f> scene;
for (int i = 0; i < good_matches.size(); i++) {
// Get the keypoints from the good matches
obj.push_back (keypoints_object[good_matches[i].queryIdx].pt);
scene.push_back (keypoints_scene[good_matches[i].trainIdx].pt);
}
// Find the Homography Matrix
Mat H = findHomography (obj, scene, CV_RANSAC);
// Use the Homography Matrix to warp the images
cv::Mat result;
warpPerspective (image_2, result, H, cv::Size (image_2.cols + image_1.cols, image_2.rows));
cv::Mat half (result, cv::Rect (0, 0, image_1.cols, image_1.rows));
image_1.copyTo (half);
// Write image
imwrite("Update.jpg", result);
waitKey (0);
return 0;
}
The two images I use as inputs result in success. But, only when those two images have resolutions of <= 1080 * 1080 pixels.
For 1440 * 1440 and 1944 * 1944 resolutions I found that the findHomography couldn't function because I was no longer getting more than 3 good matches. findHomography needs at least 4 good matches.
I have tried...
cv::resize(the input images) - results in no resolution size images producing enough good matches for the findHomography.
min Hessian increased or decreased - no change
minimum distance increased or decreased - no change
Note: Both images overlap and have the same dimensions.
Does anyone have a solution to this problem? I have spent a few hours researching this issue and only being lead to the conclusion that OpenCV Image Stitching cannot process high resolution images.
Below I'll include two high resolution images for anyone wishing to help.
colour_1_1440
colour_2_1440
I was using OpenCV 2.4.13 and not the new OpenCV 3.1.0.
Based from Martin Matilla's comment:
"are you sure you are not discarding good matches in the distance filter section? if (matches[i].distance < 3 * min_dist)" – Martin Matilla 53 mins ago
The solution did lie at 3 * min_dist. I changed the value '3' to '4' to allow for high resolution images to be processed.
Note: Originally I changed '3' to '30' and found that the 2nd input image was distorted as expected. <- Just to let anyone know :)

SIFT matching gives very poor results

I'm working on a project where I will use homography as features in a classifier. My problem is in automatically calculating homographies, i'm using SIFT descriptors to find the points between the two images on which to calculate homography but SIFT are giving me very poor results, hence i can't use them in my work.
I'm using OpenCV 2.4.3.
At first I was using SURF, but I had similar results and I decided to use SIFT which are slower but more precise. My first guess was that the image resolution in my dataset was too low but i ran my algorithm on a state-of-the-art dataset (Pointing 04) and I obtained pretty much the same results, so the problem lies in what I do and not in my dataset.
The match between the SIFT keypoints found in each image is done with the FlannBased matcher, i tried the BruteForce one but the results were again pretty much the same.
This is an example of the match I found (image from Pointing 04 dataset)
The above image shows how poor is the match found with my program. Only 1 point is a correct match. I need (at least) 4 correct matches for what I have to do.
Here is the code i use:
This is the function that extracts SIFT descriptors from each image
void extract_sift(const Mat &img, vector<KeyPoint> &keypoints, Mat &descriptors, Rect* face_rec) {
// Create masks for ROI on the original image
Mat mask1 = Mat::zeros(img.size(), CV_8U); // type of mask is CV_8U
Mat roi1(mask1, *face_rec);
roi1 = Scalar(255, 255, 255);
// Extracts keypoints in ROIs only
Ptr<DescriptorExtractor> featExtractor;
Ptr<FeatureDetector> featDetector;
Ptr<DescriptorMatcher> featMatcher;
featExtractor = new SIFT();
featDetector = FeatureDetector::create("SIFT");
featDetector->detect(img,keypoints,mask1);
featExtractor->compute(img,keypoints,descriptors);
}
This is the function that matches two images' descriptors
void match_sift(const Mat &img1, const Mat &img2, const vector<KeyPoint> &kp1,
const vector<KeyPoint> &kp2, const Mat &descriptors1, const Mat &descriptors2,
vector<Point2f> &p_im1, vector<Point2f> &p_im2) {
// Matching descriptor vectors using FLANN matcher
Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("FlannBased");
std::vector< DMatch > matches;
matcher->match( descriptors1, descriptors2, matches );
double max_dist = 0; double min_dist = 100;
// Quick calculation of max and min distances between keypoints
for( int i = 0; i < descriptors1.rows; ++i ){
double dist = matches[i].distance;
if( dist < min_dist ) min_dist = dist;
if( dist > max_dist ) max_dist = dist;
}
// Draw only the 4 best matches
std::vector< DMatch > good_matches;
// XXX: DMatch has no sort method, maybe a more efficent min extraction algorithm can be used here?
double min=matches[0].distance;
int min_i = 0;
for( int i = 0; i < (matches.size()>4?4:matches.size()); ++i ) {
for(int j=0;j<matches.size();++j)
if(matches[j].distance < min) {
min = matches[j].distance;
min_i = j;
}
good_matches.push_back( matches[min_i]);
matches.erase(matches.begin() + min_i);
min=matches[0].distance;
min_i = 0;
}
Mat img_matches;
drawMatches( img1, kp1, img2, kp2,
good_matches, img_matches, Scalar::all(-1), Scalar::all(-1),
vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );
imwrite("imgMatch.jpeg",img_matches);
imshow("",img_matches);
waitKey();
for( int i = 0; i < good_matches.size(); i++ )
{
// Get the points from the best matches
p_im1.push_back( kp1[ good_matches[i].queryIdx ].pt );
p_im2.push_back( kp2[ good_matches[i].trainIdx ].pt );
}
}
And these functions are called here:
extract_sift(dataset[i].img,dataset[i].keypoints,dataset[i].descriptors,face_rec);
[...]
// Extract keypoints from i+1 image and calculate homography
extract_sift(dataset[i+1].img,dataset[i+1].keypoints,dataset[i+1].descriptors,face_rec);
dataset[front].points_r.clear(); // XXX: dunno if clearing the points every time is the best way to do it..
match_sift(dataset[front].img,dataset[i+1].img,dataset[front].keypoints,dataset[i+1].keypoints,
dataset[front].descriptors,dataset[i+1].descriptors,dataset[front].points_r,dataset[i+1].points_r);
dataset[i+1].H = findHomography(dataset[front].points_r,dataset[i+1].points_r, RANSAC);
Any help on how to improve the matching performance would be really appreciated, thanks.
You apparently use the "best four points" in your code w.r.t. the distance of the matches. In other words, you consider that a match is valid if both descriptors are really similar. I believe this is wrong. Did you try to draw all of the matches? Many of them should be wrong, but many should be good as well.
The distance of a match just tells how similar both points are. This doesn't tell if the match is coherent geometrically. Selecting the best matches should definitely consider the geometry.
Here is how I would do:
Detect the corners (you already do this)
Find the matches (you already do this)
Try to find a homography transform between both images by using the matches (don't filter them before!) using findHomography(...)
findHomography(...) will tell you which are the inliers. Those are your good_matches.

Object Detection with Hamming distance

I am using FAST and FREAK to get the descriptors of a couple of images and then I apply knnMatch with a BruteForceMatcher matcher and next I am using a loop to separate the good matches:
float nndrRatio = 0.7f;
std::vector<KeyPoint> keypointsA, keypointsB;
Mat descriptorsA, descriptorsB;
std::vector< vector< DMatch > > matches;
int threshold=9;
// detect keypoints:
FAST(objectMat,keypointsA,threshold,true);
FAST(sceneMat,keypointsB,threshold,true);
FREAK extractor;
// extract descriptors:
extractor.compute( objectMat, keypointsA, descriptorsA );
extractor.compute( sceneMat, keypointsB, descriptorsB );
BruteForceMatcher<Hamming> matcher;
// match
matcher.knnMatch(descriptorsA, descriptorsB, matches, 2);
// good matches search:
vector< DMatch > good_matches;
for (size_t i = 0; i < matches.size(); ++i)
{
if (matches[i].size() < 2)
continue;
const DMatch &m1 = matches[i][0];
const DMatch &m2 = matches[i][1];
if(m1.distance <= nndrRatio * m2.distance)
good_matches.push_back(m1);
}
//If there are at least 7 good matches, then object has been found
if ( (good_matches.size() >=7))
{
cout << "OBJECT FOUND!" << endl;
}
I think the problem could be the good matches search method, because using it with the FlannBasedMatcher works fine but with the BruteForceMatcher very weirdly. I'm suspecting that I may be doing a nonsense with this method because the Hamming distance uses binary descriptors, but I can't think of a way to adapt it!
Any links, snippets, ideas,... please?
Your code is not bad, but I don't think it is what you want to do. Why did you choose this method?
If you want to detect an object in an image using OpenCV, you should maybe try the Cascade Classification. This link will explain how to train a classifier.
EDIT: If you think it is too complicated and if the object you want to detect is planar, you can try this tutorial (it basically computes the inliers by trying to find a homography transform between the object and the image). But the cascade classification is more general for object detection.

Using opencv to match an image from a group of images for purpose of identification in C++

EDIT: I've acquired enough reputation through this post to be able to edit it with more links, which will help me get my point across better
People playing binding of isaac often come across important items on little pedestals.
The goal is to have a user confused about what an item is be able to press a button which will then instruct him to "box" the item(think windows desktop boxing). The box gives us the region of interest(the actual item plus some background environment) to compare to what will be an entire grid of items.
Theoretical user boxed item
Theoretical grid of items(there's not many more, I just ripped this out of the binding of isaac wiki)
The location in the grid of items identified as the item the user boxed would represent a certain area on the image that correlates to a proper link to the binding of isaac wiki giving information on the item.
In the grid the item is 1st column 3rd from the bottom row. I use these two images in all of the things I tried below
My goal is creating a program that can take a manual crop of an item from the game "The Binding of Isaac", identify the cropped item by finding comparing the image to an image of a table of items in the game, then display the proper wiki page.
This would be my first "real project" in the sense that it requires a huge amount of library learning to get what I want done. It's been a bit overwhelming.
I've messed with a few options just from googling around. (you can quickly find the tutorials I used by searching the name of the method and opencv. my account is heavily restricted with link posting for some reason)
using bruteforcematcher:
http://docs.opencv.org/doc/tutorials/features2d/feature_description/feature_description.html
#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include <opencv2/legacy/legacy.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include "opencv2/highgui/highgui.hpp"
using namespace cv;
void readme();
/** #function main */
int main( int argc, char** argv )
{
if( argc != 3 )
{ return -1; }
Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );
if( !img_1.data || !img_2.data )
{ return -1; }
//-- Step 1: Detect the keypoints using SURF Detector
int minHessian = 400;
SurfFeatureDetector detector( minHessian );
std::vector<KeyPoint> keypoints_1, keypoints_2;
detector.detect( img_1, keypoints_1 );
detector.detect( img_2, keypoints_2 );
//-- Step 2: Calculate descriptors (feature vectors)
SurfDescriptorExtractor extractor;
Mat descriptors_1, descriptors_2;
extractor.compute( img_1, keypoints_1, descriptors_1 );
extractor.compute( img_2, keypoints_2, descriptors_2 );
//-- Step 3: Matching descriptor vectors with a brute force matcher
BruteForceMatcher< L2<float> > matcher;
std::vector< DMatch > matches;
matcher.match( descriptors_1, descriptors_2, matches );
//-- Draw matches
Mat img_matches;
drawMatches( img_1, keypoints_1, img_2, keypoints_2, matches, img_matches );
//-- Show detected matches
imshow("Matches", img_matches );
waitKey(0);
return 0;
}
/** #function readme */
void readme()
{ std::cout << " Usage: ./SURF_descriptor <img1> <img2>" << std::endl; }
results in not so useful looking stuff. Cleaner but equally unreliable results using flann.
http://docs.opencv.org/doc/tutorials/features2d/feature_flann_matcher/feature_flann_matcher.html
#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include <opencv2/legacy/legacy.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include "opencv2/highgui/highgui.hpp"
using namespace cv;
void readme();
/** #function main */
int main( int argc, char** argv )
{
if( argc != 3 )
{ readme(); return -1; }
Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );
if( !img_1.data || !img_2.data )
{ std::cout<< " --(!) Error reading images " << std::endl; return -1; }
//-- Step 1: Detect the keypoints using SURF Detector
int minHessian = 400;
SurfFeatureDetector detector( minHessian );
std::vector<KeyPoint> keypoints_1, keypoints_2;
detector.detect( img_1, keypoints_1 );
detector.detect( img_2, keypoints_2 );
//-- Step 2: Calculate descriptors (feature vectors)
SurfDescriptorExtractor extractor;
Mat descriptors_1, descriptors_2;
extractor.compute( img_1, keypoints_1, descriptors_1 );
extractor.compute( img_2, keypoints_2, descriptors_2 );
//-- Step 3: Matching descriptor vectors using FLANN matcher
FlannBasedMatcher matcher;
std::vector< DMatch > matches;
matcher.match( descriptors_1, descriptors_2, matches );
double max_dist = 0; double min_dist = 100;
//-- Quick calculation of max and min distances between keypoints
for( int i = 0; i < descriptors_1.rows; i++ )
{ double dist = matches[i].distance;
if( dist < min_dist ) min_dist = dist;
if( dist > max_dist ) max_dist = dist;
}
printf("-- Max dist : %f \n", max_dist );
printf("-- Min dist : %f \n", min_dist );
//-- Draw only "good" matches (i.e. whose distance is less than 2*min_dist )
//-- PS.- radiusMatch can also be used here.
std::vector< DMatch > good_matches;
for( int i = 0; i < descriptors_1.rows; i++ )
{ if( matches[i].distance < 2*min_dist )
{ good_matches.push_back( matches[i]); }
}
//-- Draw only "good" matches
Mat img_matches;
drawMatches( img_1, keypoints_1, img_2, keypoints_2,
good_matches, img_matches, Scalar::all(-1), Scalar::all(-1),
vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );
//-- Show detected matches
imshow( "Good Matches", img_matches );
for( int i = 0; i < good_matches.size(); i++ )
{ printf( "-- Good Match [%d] Keypoint 1: %d -- Keypoint 2: %d \n", i, good_matches[i].queryIdx, good_matches[i].trainIdx ); }
waitKey(0);
return 0;
}
/** #function readme */
void readme()
{ std::cout << " Usage: ./SURF_FlannMatcher <img1> <img2>" << std::endl; }
templatematching has been my best method so far. of the 6 methods it ranges from getting only 0-4 correct identifications though.
http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;
/// Global Variables
Mat img; Mat templ; Mat result;
char* image_window = "Source Image";
char* result_window = "Result window";
int match_method;
int max_Trackbar = 5;
/// Function Headers
void MatchingMethod( int, void* );
/** #function main */
int main( int argc, char** argv )
{
/// Load image and template
img = imread( argv[1], 1 );
templ = imread( argv[2], 1 );
/// Create windows
namedWindow( image_window, CV_WINDOW_AUTOSIZE );
namedWindow( result_window, CV_WINDOW_AUTOSIZE );
/// Create Trackbar
char* trackbar_label = "Method: \n 0: SQDIFF \n 1: SQDIFF NORMED \n 2: TM CCORR \n 3: TM CCORR NORMED \n 4: TM COEFF \n 5: TM COEFF NORMED";
createTrackbar( trackbar_label, image_window, &match_method, max_Trackbar, MatchingMethod );
MatchingMethod( 0, 0 );
waitKey(0);
return 0;
}
/**
* #function MatchingMethod
* #brief Trackbar callback
*/
void MatchingMethod( int, void* )
{
/// Source image to display
Mat img_display;
img.copyTo( img_display );
/// Create the result matrix
int result_cols = img.cols - templ.cols + 1;
int result_rows = img.rows - templ.rows + 1;
result.create( result_cols, result_rows, CV_32FC1 );
/// Do the Matching and Normalize
matchTemplate( img, templ, result, match_method );
normalize( result, result, 0, 1, NORM_MINMAX, -1, Mat() );
/// Localizing the best match with minMaxLoc
double minVal; double maxVal; Point minLoc; Point maxLoc;
Point matchLoc;
minMaxLoc( result, &minVal, &maxVal, &minLoc, &maxLoc, Mat() );
/// For SQDIFF and SQDIFF_NORMED, the best matches are lower values. For all the other methods, the higher the better
if( match_method == CV_TM_SQDIFF || match_method == CV_TM_SQDIFF_NORMED )
{ matchLoc = minLoc; }
else
{ matchLoc = maxLoc; }
/// Show me what you got
rectangle( img_display, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );
rectangle( result, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );
imshow( image_window, img_display );
imshow( result_window, result );
return;
}
http://imgur.com/pIRBPQM,h0wkqer,1JG0QY0,haLJzRF,CmrlTeL,DZuW73V#3
of the 6
fail,pass,fail,pass,pass,pass
This was sort of a best case result though. The next item I tried was
and resulted in fail,fail,fail,fail,fail,fail
From item to item all of these methods have some that work well and some that do terribly
So I'll ask: is templatematching my best bet or is there a method I'm not considering that will be my holy grail?
How can I get a USER to create the crop manually? Opencv's documentation on this is really bad and the examples I find online are extremely old cpp or straight C.
Thanks for any help. This venture has been an interesting experience so far. I had to strip all of the links which would better portray how everything's been working out, but the site is saying I'm posting more than 10 links even when I'm not.
some more examples of items throughout the game:
the rock is a rare item and one of the few that can be "anywhere" on the screen. items like the rock are the reason why cropping of the item by user is the best way about isolating the item, otherwise their positions are only in a couple of specific places.
An item after a boss fight, lots of stuff everywhere and transparency in the middle. I would imagine this being one of the harder ones to work correctly
Rare room. simple background. no item transparency.
here are the two tables all of the items in the game are.. I'll make them one image eventually but for now they were directly taken from the isaac wiki.
One important detail here is that you have pure image of every item in your table. You know color of background and can detach item from the rest of the picture. For example, in addition to matrix, representing image itself, you may store matrix of 1-s and 0-s of the same size, where ones correspond to image area and zeros - to background. Let's call this matrix "mask" and pure image of the item - "pattern".
There are 2 ways to compare images: match image with the pattern and match pattern with the image. What you have described is matching image with the pattern - you have some cropped image and want to find similar pattern. Instead, think about searching pattern on image.
Let's first define function match() that takes pattern, mask and image of the same size and checks if area on pattern under the mask is exactly the same as in image (pseudocode):
def match(pattern, mask, image):
for x = 0 to pattern.width:
for y = 0 to pattern.height:
if mask[x, y] == 1 and # if in pattern this pixel is not part of background
pattern[x, y] != image[x, y]: # and pixels on pattern and image differ
return False
return True
But sizes of pattern and cropped image may differ. Standard solution for this (used, for example, in cascade classifier) is to use sliding window - just move pattern "window" across image and check if pattern matches selected region. This is pretty much how image detection works in OpenCV.
Of course, this solution is not very robust - cropping, resizing or any other image transformations may change some pixels, and in this case method match() will always return false. To overcome this, instead of boolean answer you can use distance between image and pattern. In this case function match() should return some value of similarity, say, between 0 and 1, where 1 stands for "exactly the same", while 0 for "completely different". Then you either set threshold for similarity (e.g. image should be at least 85% similar to the pattern), or just select pattern with highest value of similarity.
Since items in the game are artificial images and variation in them is very small, this approach should be enough. However, for more complicated cases you will need other features than simply pixels under the mask. As I already suggested in my comment, methods like Eigenfaces, cascade classifier using Haar-like features or even Active Appearance Models may be more efficient for these tasks. As for SURF, as far as I know it's better suited for tasks with varying angle and size of object, but not for different backgrounds and all such things.
I came upon your question while trying to figure out my own template-matching issue, and now I'm back to share what I think might be your best bet based on my own experience. You've probably long-since abandoned this, but hey someone else might be in similar shoes one day.
None of the items that you shared are a solid rectangle, and since template matching in opencv cannot work with a mask you'll always be comparing your reference image against what I must assume is at least several different backgrounds (not to mention the items that are found in varied locations on different backgrounds, making the template match even worse).
It will always be comparing the background pixels and confounding your match unless you can collect a crop of every single situation where the reference image can be found. If decals of blood/etc introduce yet more variability into the backgrounds around the items too then template matching probably won't get great results.
So the two things I would try if I were you are depending on some details:
If possible, crop a reference template of every situation where the item is found (this will not be a good time), then compare the user-specified area against every template of every item. Take the best result from these comparisons and you will, if lucky, have a correct match.
The example screen shots you shared don't have any dark/black lines on the background,so the outlines of all of the items stands out. If this is consistent throughout the game, you can find edges within the user-specified area and detect the exterior contours. Ahead of time you would have processed the exterior contours of each reference item and stored those contours. Then you can compare your contour(s) in the user's crop against each contour in your database, taking the best match as the answer.
I'm confident either of those could work for you, depending on whether the game is well-represented by your screenshots.
Note: The contour matching will be much, much faster than the template matching. Fast enough to run in realtime and negate the need for the user to crop anything, perhaps.