This is my problem:
A Kinect is mounted on the ceiling, looking straight down, and I take a depth image of the people below it.
So what I get is a top-down view of the people.
I then want to extract the heads in order to count the number of people.
The way I see it, this requires identifying the LOCAL minimum regions of the image, but I couldn't figure out a way to do that.
Can someone suggest a way to achieve this?
Is there an OpenCV function to find local minimum regions?
Thank you.
You can try the watershed transform to find local minima. A quick search brought up this sample code which you may want to try with OpenCV.
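For reference, here is a rough marker-based watershed sketch (assuming OpenCV 3.x, an 8-bit depth image where smaller values mean closer to the ceiling camera, and made-up threshold values to tune for your room):
#include <opencv2/opencv.hpp>
// Count head candidates in a top-down 8-bit depth image via marker-based watershed.
int countHeads(const cv::Mat& depth8u)
{
    // 1. Keep only pixels close enough to the camera to be a person (assumed cut-off).
    cv::Mat people;
    cv::threshold(depth8u, people, 150, 255, cv::THRESH_BINARY_INV);
    // 2. Distance transform + threshold gives one compact blob per head-like peak.
    cv::Mat dist, peaks;
    cv::distanceTransform(people, dist, cv::DIST_L2, 3);
    cv::normalize(dist, dist, 0, 1.0, cv::NORM_MINMAX);
    cv::threshold(dist, peaks, 0.5, 1.0, cv::THRESH_BINARY);
    peaks.convertTo(peaks, CV_8U, 255);
    // 3. Each blob becomes a watershed marker; the depth map acts as the topography.
    cv::Mat markers;
    int nLabels = cv::connectedComponents(peaks, markers);
    cv::Mat depth3;
    cv::cvtColor(depth8u, depth3, cv::COLOR_GRAY2BGR);
    cv::watershed(depth3, markers);
    return nLabels - 1; // markers minus the background label = head candidates
}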
I would do a foreground-background segmentation that separates the static background from the dynamic "foreground" (the people).
Then, once you have the point clouds/depth maps of the people, you can segment them, for example with a region-growing (flood-fill) method. This way you get the separated people, which you can count, or whose minimum depth point you can find if you are looking specifically for the heads.
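A rough sketch of that idea (assuming OpenCV 3.x; MOG2 stands in for whatever background model you prefer, and connectedComponentsWithStats plays the role of the region-growing step; the blob-size threshold is made up):
#include <opencv2/opencv.hpp>
// Count people in one frame, given an already-created background model.
int countPeople(const cv::Mat& frame8u, cv::Ptr<cv::BackgroundSubtractorMOG2> bg)
{
    // 1. Foreground mask: whatever differs from the learned static background.
    cv::Mat fgMask;
    bg->apply(frame8u, fgMask);
    cv::threshold(fgMask, fgMask, 200, 255, cv::THRESH_BINARY); // drop the shadow label (127)
    // 2. Clean up the mask a bit.
    cv::morphologyEx(fgMask, fgMask, cv::MORPH_OPEN, cv::Mat(), cv::Point(-1, -1), 2);
    // 3. Each connected component is (ideally) one person; ignore tiny blobs.
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(fgMask, labels, stats, centroids);
    int people = 0;
    for (int i = 1; i < n; ++i)                        // label 0 is the background
        if (stats.at<int>(i, cv::CC_STAT_AREA) > 500)  // assumed minimum person size
            ++people;
    return people;
}
For the heads specifically, a minMaxLoc on the depth values restricted to each component's mask would give the closest point of each person.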
I would go with something as simple as thresholding for near and far depths, ANDing the two results together, and finding the contours in the resulting image.
It's not super flexible, since you're essentially hard-coding a depth range (a minimum expected human height), but it's easy to set up and tweak and shouldn't be that costly computationally. Optionally you can use a bit of blur and erode/dilate to help refine the contours.
Although it does a bit more than what I explained, you can see a demo here
And here's a basic example using OpenCV with OpenNI:
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
using namespace cv;
using namespace std;
int threshNear = 60;
int threshFar = 100;
int dilateAmt = 1;
int erodeAmt = 1;
int blurAmt = 1;
int blurPre = 1;
void on_trackbar(int, void*){}
int main( )
{
VideoCapture capture;
capture.open(CV_CAP_OPENNI);
if( !capture.isOpened() )
{
cout << "Can not open a capture object." << endl;
return -1;
}
cout << "ready" << endl;
vector<vector<Point> > contours;
namedWindow("depth map");
createTrackbar( "amount dilate", "depth map", &dilateAmt,16, on_trackbar );
createTrackbar( "amount erode", "depth map", &erodeAmt,16, on_trackbar );
createTrackbar( "amount blur", "depth map", &blurAmt,16, on_trackbar );
createTrackbar( "blur pre", "depth map", &blurPre,1, on_trackbar );
createTrackbar( "threshold near", "depth map", &threshNear,255, on_trackbar );
createTrackbar( "threshold far", "depth map", &threshFar,255, on_trackbar );
for(;;)
{
Mat depthMap;
if( !capture.grab() )
{
cout << "Can not grab images." << endl;
return -1;
}
else
{
if( capture.retrieve( depthMap, CV_CAP_OPENNI_DEPTH_MAP ) )
{
const float scaleFactor = 0.05f;
Mat show; depthMap.convertTo( show, CV_8UC1, scaleFactor );
//threshold
Mat tnear,tfar;
show.copyTo(tnear);
show.copyTo(tfar);
threshold(tnear,tnear,threshNear,255,CV_THRESH_TOZERO);
threshold(tfar,tfar,threshFar,255,CV_THRESH_TOZERO_INV);
show = tnear & tfar;//or cvAnd(tnear,tfar,show,NULL); to join the two thresholded images
//filter
if(blurPre == 1) blur(show,show,Size(blurAmt+1,blurAmt+1));
Mat cntr; show.copyTo(cntr);
erode(cntr,cntr,Mat(),Point(-1,-1),erodeAmt);
if(blurPre == 0) blur(cntr,cntr,Size(blurAmt+1,blurAmt+1));
dilate(cntr,cntr,Mat(),Point(-1,-1),dilateAmt);
//compute and draw contours
findContours(cntr,contours,0,1);
drawContours(cntr,contours,-1,Scalar(192,0,0),2,3);
//optionally compute bounding box and circle to exclude small blobs(non human) or do further filtering,etc.
int numContours = contours.size();
vector<vector<Point> > contours_poly( numContours );
vector<Rect> boundRect( numContours );
vector<Point2f> centers( numContours );
vector<float> radii(numContours);
for(int i = 0; i < numContours; i++ ){
approxPolyDP( Mat(contours[i]), contours_poly[i], 3, true );
boundRect[i] = boundingRect( Mat(contours_poly[i]) );
minEnclosingCircle(contours_poly[i],centers[i],radii[i]);
rectangle( cntr, boundRect[i].tl(), boundRect[i].br(), Scalar(64), 2, 8, 0 );
circle(cntr,centers[i],radii[i],Scalar(192));
}
imshow( "depth map", show );
imshow( "contours", cntr );
}
}
if( waitKey( 30 ) == 27 ) break;//exit on esc
}
}
Even if you're not using OpenNI to grab the depth stream, you can still plug the depth image into OpenCV. Also, you can compute the bounding box and enclosing circle, which might help filter things a bit further. Say your setup is in an office space: you might want to ignore a column, a tall plant, shelves, etc., so you can check the bounding circle's radius or the bounding box's width/height ratio.
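For example, a hypothetical filter for the contour loop above (the radius and ratio limits are made-up values to tune for your room):
for (int i = 0; i < numContours; i++) {
    float ratio = (float)boundRect[i].width / (float)boundRect[i].height;
    bool humanSized    = radii[i] > 20 && radii[i] < 120; // head/shoulder scale in pixels
    bool roughlySquare = ratio > 0.5f && ratio < 2.0f;    // shelves/columns tend to be elongated
    if (humanSized && roughlySquare) {
        // count/draw this contour as a person candidate
    }
}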
Related
I'm trying to extract information about the water droplets on a water-sensitive card: the area they occupy, the number of drops, and which drops are the largest and smallest.
Example Image :
What I've done so far is the detection of the area that is wet, but I have difficulty detecting the drops and measuring their size and quantity.
My code so far is below.
If anyone can help, I appreciate it!
src = cv::imread("/Users/gustavovisentini/Documents/Developer/Desktop/OpenCV-Teste3.3.1/binary_image.png");
cout << "Loading Image...\n\n";
cvtColor( src, src_gray, COLOR_BGR2GRAY );
blur( src_gray, src_gray, Size(3,3) );
Mat canny_output;
Canny( src_gray, canny_output, thresh, thresh*2 );
vector<vector<Point>> contours;
findContours( canny_output, contours, RETR_TREE, CHAIN_APPROX_SIMPLE ); // use the Canny edges computed above
vector<vector<Point> > contours_poly( contours.size() );
vector<Rect> boundRect( contours.size() );
vector<Point2f>centers( contours.size() );
vector<float>radius( contours.size() );
for( size_t i = 0; i < contours.size(); i++ )
{
approxPolyDP( contours[i], contours_poly[i], 3, true );
boundRect[i] = boundingRect( contours_poly[i] );
minEnclosingCircle( contours_poly[i], centers[i], radius[i] );
}
Mat drawing = src.clone();
for( size_t i = 0; i< contours.size(); i++ )
{
Scalar color = Scalar( rng.uniform(0, 256), rng.uniform(0,256), rng.uniform(0,256) );
drawContours( drawing, contours_poly, (int)i, color );
rectangle( drawing, boundRect[i].tl(), boundRect[i].br(), color, 2 );
circle( drawing, centers[i], (int)radius[i], color, 2 );
}
stringstream temp;
temp << "Total: " << contours.size() << " - " << thresh << " - " << contours[1][1];
cv::putText(drawing, temp.str(), cv::Point(10,40), FONT_HERSHEY_PLAIN, 0.7, CV_RGB(255, 0, 0));
imshow( "Contours", drawing );
Here's an approach using thresholding + contour filtering. Using this screenshotted input image:
We first convert the image to grayscale, then apply Otsu's threshold to get a binary image.
Next we find contours on the binary image, iterate through each contour, and filter using contour area. To determine the total area of the water droplets, we keep a total_area variable and sum the area of each contour. The number of droplets is simply the number of contours. To determine the smallest or largest drop, we sort the contours by ascending contour area: the first contour is the smallest drop and the last contour is the largest.
Here's the detected droplets, the number of drops, and the total area
Drops: 257
Total area: 31448.0
I implemented this approach in Python but you can easily convert it to C++
import cv2
import numpy as np
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
mask = np.zeros(image.shape, dtype=np.uint8)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray,0,255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Find contours and filter using contour area
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
total_area = 0
drops = len(cnts)
smallest = sorted(cnts, key=cv2.contourArea)[0]
largest = sorted(cnts, key=cv2.contourArea)[-1]
for c in cnts:
area = cv2.contourArea(c)
total_area += area
# Draw largest and smallest drop onto a mask
cv2.drawContours(mask, [largest], -1, (255,255,255), -1)
cv2.drawContours(mask, [smallest], -1, (255,255,255), -1)
# Visualize result better
result = cv2.bitwise_and(image, image, mask=thresh)
result[thresh==0] = (255,255,255)
print('Drops: {}'.format(drops))
print('Total area: {}'.format(total_area))
cv2.imshow('thresh', thresh)
cv2.imshow('mask', mask)
cv2.imshow('result', result)
cv2.waitKey()
Thanks for the help so far! Another problem I have is how to find water droplets that are not round, so that I can distinguish actual drops from water that drained across the card. Is it possible to detect this shape difference with findContours? Below is an image containing one of these irregular droplets.
I am working on a project in OpenCV and C++ to estimate the speed of moving vehicles from captured video, where the camera is stationary. I have estimated the speed of a single object using its centroid and the Euclidean distance between frames. The problem is that I cannot figure out how to do the same for multiple objects.
Here, I need to calculate the Euclidean distance of each object between two subsequent frames.
I would be grateful for any help.
I have created the class-
class centroids
{
public:
vector<Point2f> ce;
vector<float> area;
};
centroids c[100];
And this is the code I've written. I would be grateful if anyone helped me with the code:
findContours( fgMaskMOG2,
contours,
hierarchy,
CV_RETR_CCOMP,
CV_CHAIN_APPROX_SIMPLE );
int morph_size = 6;
Mat element = getStructuringElement( MORPH_RECT,
Size( 2*morph_size+1, 2*morph_size+1 ),
Point( morph_size, morph_size ) );
Scalar color( 255, 255, 255 ); // color used to draw the contours
//Draw the contour and rectangle
for( int i = 0; i < contours.size(); i++ )
{
drawContours( fgMaskMOG2,
contours,
i,
color,
CV_FILLED,
8,
hierarchy );
}
//imshow("morpho window",dst);
vector<Moments> mu( contours.size() );
vector<Point2f> mc( contours.size() );
vector<Point2f> m ;
vector<double> time;
vector<Point2f> centroid( mc.size() );
//vector< vector<Point> >::iterator itc = contours.begin();
// iterate through each contour.
double time1[1000];
for( int i = 0; i < contours.size(); i++ )
{
// Find the area of contour
double a = contourArea( contours[i], false );
if( a > 500 )
{
mu[i] = moments( contours[i], false );
mc[i] = Point2f( (mu[i].m10 / mu[i].m00), (mu[i].m01 / mu[i].m00) );
m.push_back( mc[i] );
Point2f diff;
double euclidian = 0;
for( int f = 0; f < m.size(); f++ )
{
if( k == 1 )
{
c[f].ce.push_back( m[f] );
cout << "cen" << c[f].ce << endl;
euclidian = 0;
}
else
{
c[f+1].ce.push_back( m[f] );
cout << "cent" << c[f+1].ce << endl;
diff = c[f].ce[f] - c[f-1].ce[f-1];
euclidian = abs( sqrt( (diff.x*diff.x) + (diff.y*diff.y) ) );
cout << "euclidian" << euclidian << endl;
}
}
cout << "\n centroid" << m << endl;
circle( fgMaskMOG2,
mc[i],
5,
Scalar( 0, 0, 255 ),
1,
8,
0 );
}
}
Thanks in advance :)
You can estimate the speed of a moving vehicle from video frames only if the approximate distance between the vehicle and the camera stays constant throughout the calculation, i.e. the vehicle moves in a straight line perpendicular to the camera's line of sight. If the camera looks at the scene from the side, the vehicles are all at different distances, the calculation becomes highly inaccurate for multiple vehicles, and the vehicles will overlap, making segmentation difficult.
There are two scenarios in which your calculation may work -
First, when the camera captures from the top, looking vertically down on the vehicles. In this case there will be a stark difference between the vehicle color and the road color. You can use several ways to segment out the individual vehicles, tag them based on their features, and identify those vehicles in the next frame using the same features. This way you get the position of each individual vehicle and can then estimate the speed with your algorithm. The following links should help with segmenting the vehicles:
How to define the markers for Watershed in OpenCV?
http://www.codeproject.com/Articles/751744/Image-Segmentation-using-Unsupervised-Watershed-Al
http://www.bogotobogo.com/python/OpenCV_Python/python_opencv3_Image_Watershed_Algorithm_Marker_Based_Segmentation.php
Second, when vehicles are moving in a single line behind one another. In this case, you can use a combination of color and contour based segmentation depending on the background of your vehicles. After segmentation you can again use object features to identify the position of objects in the next frame. Then run your algorithm for both cases.
If you have the complete video sequence of the vehicles, you can segment out the different vehicles in the first frame automatically (or identify them manually) and then apply motion tracking to those identified objects. You can use OpenCV's motion analysis and object tracking functions to do so. That gives you the position of every tracked vehicle in each frame, so you can easily run and test your speed-calculation algorithm.
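As a rough illustration of the per-frame matching step (the function name and the maximum-jump value are assumptions): pair each centroid from the previous frame with its nearest centroid in the current frame and use that Euclidean displacement per vehicle.
#include <opencv2/core/core.hpp>
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>
// displacement[i] = pixels moved by previous-frame vehicle i, or -1 if no plausible match.
std::vector<float> matchCentroids(const std::vector<cv::Point2f>& prev,
                                  const std::vector<cv::Point2f>& curr,
                                  float maxJump = 50.f) // assumed max motion per frame
{
    std::vector<float> displacement(prev.size(), -1.f);
    for (size_t i = 0; i < prev.size(); ++i) {
        float best = std::numeric_limits<float>::max();
        for (size_t j = 0; j < curr.size(); ++j) {
            float dx = prev[i].x - curr[j].x;
            float dy = prev[i].y - curr[j].y;
            best = std::min(best, std::sqrt(dx * dx + dy * dy));
        }
        if (best < maxJump)
            displacement[i] = best; // speed = best * fps * metres-per-pixel
    }
    return displacement;
}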
I need to test a contour for self-intersection, but I don't know how to implement it. Alternatively, how can I detect only the contours without self-intersections in a cv::Mat?
For example, the left contour should be accepted and the right contour rejected:
Here is one solution (see the sketch below):
Skeleton + pruning => reduce the contours to a single pixel width.
For each pixel, compute the number of neighbors.
If a pixel has more than 2 neighbors, it is in the middle of an intersection.
(Optional) Connected-component labeling in order to separate the different shapes.
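A minimal sketch of steps 2-3 of this list, assuming you already have a one-pixel-wide skeleton as a 0/255 CV_8UC1 image:
#include <opencv2/opencv.hpp>
#include <vector>
std::vector<cv::Point> findJunctions(const cv::Mat& skeleton)
{
    cv::Mat bin = skeleton / 255;                  // 0/1 version of the skeleton
    cv::Mat kernel = cv::Mat::ones(3, 3, CV_8U);
    kernel.at<uchar>(1, 1) = 0;                    // do not count the pixel itself
    cv::Mat neighbours;
    cv::filter2D(bin, neighbours, -1, kernel);     // 8-connected neighbour count per pixel
    std::vector<cv::Point> junctions;
    for (int y = 0; y < bin.rows; ++y)
        for (int x = 0; x < bin.cols; ++x)
            if (bin.at<uchar>(y, x) && neighbours.at<uchar>(y, x) > 2)
                junctions.push_back(cv::Point(x, y)); // middle of an intersection
    return junctions;
}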
You can also use a Hough transform.
If the lines are represented by a polygon (you know the corner points), you can draw the lines into an accumulation matrix.
Declare a new blank cv::Mat of type CV_8UC1 and initialize it with zeros. For every pixel lying on a line segment, increment the corresponding cell of the matrix by 1.
I am not sure whether using the cv::line method is the best way to accomplish this (you could create a new image for every line and sum up all the images as a final step); the cleanest way I can think of is to increment the points directly using the equation of the line.
When you draw lines that intersect, the accumulation matrix will contain values of 2. If you find such values, you know the contour has self-intersections, and you also know where they are.
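A rough sketch of this accumulation idea for a polygon given by its corner points (only a heuristic: consecutive segments share an endpoint, so the vertices are masked out before the check):
#include <opencv2/opencv.hpp>
#include <vector>
bool hasSelfIntersection(const std::vector<cv::Point>& poly, cv::Size imgSize)
{
    cv::Mat acc = cv::Mat::zeros(imgSize, CV_8UC1);
    for (size_t i = 0; i < poly.size(); ++i) {
        cv::Mat seg = cv::Mat::zeros(imgSize, CV_8UC1);
        cv::line(seg, poly[i], poly[(i + 1) % poly.size()], cv::Scalar(1), 1);
        acc += seg;                                     // accumulate this segment
    }
    for (size_t i = 0; i < poly.size(); ++i)
        cv::circle(acc, poly[i], 1, cv::Scalar(0), -1); // ignore shared endpoints
    return cv::countNonZero(acc >= 2) > 0;              // some pixel was drawn twice
}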
If you have the image as an input, then the previously mentioned solution might work.
Best regards!
I tried my best to implement it but couldn't, as I lacked the logic to code it. The logic I tried is this: you have the set of contour points, so count how many times each point occurs; if a point appears more than once, it indicates an intersection point.
Let me know if I'm wrong.
The code I tried isn't working for this logic; maybe someone can help you with it.
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
#include "opencv2/imgproc/imgproc.hpp"
using namespace cv;
using namespace std;
RNG rng(12345);
int main( )
{
Mat image;
image = imread("0.png", CV_LOAD_IMAGE_COLOR); // Read the file
if(! image.data ) // Check for invalid input
{
cout << "Could not open or find the image" << std::endl ;
return -1;
}
cvtColor( image, image, CV_BGR2GRAY );
namedWindow( "Display window12", WINDOW_AUTOSIZE );// Create a window for display.
imshow( "Display window12", image );
Mat drawing;
vector<vector<Point> > contours;
vector<Vec4i> hierarchy;
findContours( image, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0, 0) );
int m = 1;
vector<Point> contours1;
for(int i= 0; i < contours.size(); i++)
{
for(int j= 0; j < contours[i].size();j++) // run until j < contours[i].size();
{
contours1.push_back(Point (contours[i][j]));
// cout << contours[i][j] << "contours1"<<contours1<<endl; //do whatever
}
}
cout<<contours.size();
// Finding the occrence of each point it has appeared
//for(int i=0;i<contours.size();i++)
//{
// for(int j=0; j<contours[i].size();j++) // run until j < contours[i].size();
// {
// //contours1.push_back(Point (contours[i][j]));
// //if (contours[i][j] == contours[i][j])
// if( contours[i] ==contours1.at(i).x)
// // if( posX ==points.at(p).x)
// cout<<"hi";
// // cout << contours[i][j] << "contours1"<<contours1<<endl; //do whatever
// }
//}
namedWindow( "Display window", WINDOW_AUTOSIZE );// Create a window for display.
imshow( "Display window", image );
waitKey(0); // Wait for a keystroke in the window
return 0;
}
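For what it's worth, here is a sketch of the occurrence-counting idea itself (it needs findContours with CV_CHAIN_APPROX_NONE so every boundary pixel is kept, and it is only a heuristic, not a guaranteed self-intersection test):
#include <opencv2/core/core.hpp>
#include <map>
#include <utility>
#include <vector>
// Points that appear more than once in a single contour.
std::vector<cv::Point> repeatedPoints(const std::vector<cv::Point>& contour)
{
    std::map<std::pair<int, int>, int> count;
    for (size_t i = 0; i < contour.size(); ++i)
        count[std::make_pair(contour[i].x, contour[i].y)]++;
    std::vector<cv::Point> repeats;
    for (std::map<std::pair<int, int>, int>::const_iterator it = count.begin(); it != count.end(); ++it)
        if (it->second > 1)
            repeats.push_back(cv::Point(it->first.first, it->first.second));
    return repeats;
}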
EDIT: I've acquired enough reputation through this post to be able to edit it with more links, which will help me get my point across better
People playing The Binding of Isaac often come across important items on little pedestals.
The goal is that a user who is confused about what an item is can press a button, which then instructs them to "box" the item (think of dragging a selection box on the Windows desktop). The box gives us the region of interest (the actual item plus some background environment) to compare against what will be an entire grid of items.
Theoretical user boxed item
Theoretical grid of items(there's not many more, I just ripped this out of the binding of isaac wiki)
The location in the grid identified as the item the user boxed corresponds to a certain area of the image, which in turn maps to the proper Binding of Isaac wiki link with information on the item.
In the grid, the item is in the 1st column, 3rd row from the bottom. I use these two images in everything I tried below.
My goal is to create a program that can take a manual crop of an item from the game "The Binding of Isaac", identify the cropped item by comparing it to an image of the table of items in the game, and then display the proper wiki page.
This would be my first "real project" in the sense that it requires a huge amount of library learning to get what I want done. It's been a bit overwhelming.
I've messed with a few options just from googling around. (you can quickly find the tutorials I used by searching the name of the method and opencv. my account is heavily restricted with link posting for some reason)
using bruteforcematcher:
http://docs.opencv.org/doc/tutorials/features2d/feature_description/feature_description.html
#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include <opencv2/legacy/legacy.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include "opencv2/highgui/highgui.hpp"
using namespace cv;
void readme();
/** #function main */
int main( int argc, char** argv )
{
if( argc != 3 )
{ return -1; }
Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );
if( !img_1.data || !img_2.data )
{ return -1; }
//-- Step 1: Detect the keypoints using SURF Detector
int minHessian = 400;
SurfFeatureDetector detector( minHessian );
std::vector<KeyPoint> keypoints_1, keypoints_2;
detector.detect( img_1, keypoints_1 );
detector.detect( img_2, keypoints_2 );
//-- Step 2: Calculate descriptors (feature vectors)
SurfDescriptorExtractor extractor;
Mat descriptors_1, descriptors_2;
extractor.compute( img_1, keypoints_1, descriptors_1 );
extractor.compute( img_2, keypoints_2, descriptors_2 );
//-- Step 3: Matching descriptor vectors with a brute force matcher
BruteForceMatcher< L2<float> > matcher;
std::vector< DMatch > matches;
matcher.match( descriptors_1, descriptors_2, matches );
//-- Draw matches
Mat img_matches;
drawMatches( img_1, keypoints_1, img_2, keypoints_2, matches, img_matches );
//-- Show detected matches
imshow("Matches", img_matches );
waitKey(0);
return 0;
}
/** #function readme */
void readme()
{ std::cout << " Usage: ./SURF_descriptor <img1> <img2>" << std::endl; }
This results in not-so-useful-looking matches. Using FLANN gives cleaner but equally unreliable results.
http://docs.opencv.org/doc/tutorials/features2d/feature_flann_matcher/feature_flann_matcher.html
#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include <opencv2/legacy/legacy.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include "opencv2/highgui/highgui.hpp"
using namespace cv;
void readme();
/** #function main */
int main( int argc, char** argv )
{
if( argc != 3 )
{ readme(); return -1; }
Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );
if( !img_1.data || !img_2.data )
{ std::cout<< " --(!) Error reading images " << std::endl; return -1; }
//-- Step 1: Detect the keypoints using SURF Detector
int minHessian = 400;
SurfFeatureDetector detector( minHessian );
std::vector<KeyPoint> keypoints_1, keypoints_2;
detector.detect( img_1, keypoints_1 );
detector.detect( img_2, keypoints_2 );
//-- Step 2: Calculate descriptors (feature vectors)
SurfDescriptorExtractor extractor;
Mat descriptors_1, descriptors_2;
extractor.compute( img_1, keypoints_1, descriptors_1 );
extractor.compute( img_2, keypoints_2, descriptors_2 );
//-- Step 3: Matching descriptor vectors using FLANN matcher
FlannBasedMatcher matcher;
std::vector< DMatch > matches;
matcher.match( descriptors_1, descriptors_2, matches );
double max_dist = 0; double min_dist = 100;
//-- Quick calculation of max and min distances between keypoints
for( int i = 0; i < descriptors_1.rows; i++ )
{ double dist = matches[i].distance;
if( dist < min_dist ) min_dist = dist;
if( dist > max_dist ) max_dist = dist;
}
printf("-- Max dist : %f \n", max_dist );
printf("-- Min dist : %f \n", min_dist );
//-- Draw only "good" matches (i.e. whose distance is less than 2*min_dist )
//-- PS.- radiusMatch can also be used here.
std::vector< DMatch > good_matches;
for( int i = 0; i < descriptors_1.rows; i++ )
{ if( matches[i].distance < 2*min_dist )
{ good_matches.push_back( matches[i]); }
}
//-- Draw only "good" matches
Mat img_matches;
drawMatches( img_1, keypoints_1, img_2, keypoints_2,
good_matches, img_matches, Scalar::all(-1), Scalar::all(-1),
vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );
//-- Show detected matches
imshow( "Good Matches", img_matches );
for( int i = 0; i < good_matches.size(); i++ )
{ printf( "-- Good Match [%d] Keypoint 1: %d -- Keypoint 2: %d \n", i, good_matches[i].queryIdx, good_matches[i].trainIdx ); }
waitKey(0);
return 0;
}
/** #function readme */
void readme()
{ std::cout << " Usage: ./SURF_FlannMatcher <img1> <img2>" << std::endl; }
Template matching has been my best method so far. Across the 6 test items, though, it only gets 0-4 correct identifications.
http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;
/// Global Variables
Mat img; Mat templ; Mat result;
char* image_window = "Source Image";
char* result_window = "Result window";
int match_method;
int max_Trackbar = 5;
/// Function Headers
void MatchingMethod( int, void* );
/** #function main */
int main( int argc, char** argv )
{
/// Load image and template
img = imread( argv[1], 1 );
templ = imread( argv[2], 1 );
/// Create windows
namedWindow( image_window, CV_WINDOW_AUTOSIZE );
namedWindow( result_window, CV_WINDOW_AUTOSIZE );
/// Create Trackbar
char* trackbar_label = "Method: \n 0: SQDIFF \n 1: SQDIFF NORMED \n 2: TM CCORR \n 3: TM CCORR NORMED \n 4: TM COEFF \n 5: TM COEFF NORMED";
createTrackbar( trackbar_label, image_window, &match_method, max_Trackbar, MatchingMethod );
MatchingMethod( 0, 0 );
waitKey(0);
return 0;
}
/**
* #function MatchingMethod
* #brief Trackbar callback
*/
void MatchingMethod( int, void* )
{
/// Source image to display
Mat img_display;
img.copyTo( img_display );
/// Create the result matrix
int result_cols = img.cols - templ.cols + 1;
int result_rows = img.rows - templ.rows + 1;
result.create( result_cols, result_rows, CV_32FC1 );
/// Do the Matching and Normalize
matchTemplate( img, templ, result, match_method );
normalize( result, result, 0, 1, NORM_MINMAX, -1, Mat() );
/// Localizing the best match with minMaxLoc
double minVal; double maxVal; Point minLoc; Point maxLoc;
Point matchLoc;
minMaxLoc( result, &minVal, &maxVal, &minLoc, &maxLoc, Mat() );
/// For SQDIFF and SQDIFF_NORMED, the best matches are lower values. For all the other methods, the higher the better
if( match_method == CV_TM_SQDIFF || match_method == CV_TM_SQDIFF_NORMED )
{ matchLoc = minLoc; }
else
{ matchLoc = maxLoc; }
/// Show me what you got
rectangle( img_display, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );
rectangle( result, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );
imshow( image_window, img_display );
imshow( result_window, result );
return;
}
http://imgur.com/pIRBPQM,h0wkqer,1JG0QY0,haLJzRF,CmrlTeL,DZuW73V#3
of the 6
fail,pass,fail,pass,pass,pass
This was sort of a best case result though. The next item I tried was
and resulted in fail,fail,fail,fail,fail,fail
From item to item, all of these methods have cases that work well and cases that do terribly.
So I'll ask: is template matching my best bet, or is there a method I'm not considering that will be my holy grail?
How can I get a USER to create the crop manually? OpenCV's documentation on this is really bad, and the examples I find online are extremely old C++ or straight C.
Thanks for any help. This venture has been an interesting experience so far. I had to strip all of the links which would better portray how everything's been working out, but the site is saying I'm posting more than 10 links even when I'm not.
some more examples of items throughout the game:
The rock is a rare item and one of the few that can be "anywhere" on the screen. Items like the rock are the reason a user-made crop is the best way to isolate the item; otherwise items only appear in a couple of specific places.
An item after a boss fight: lots of stuff everywhere and transparency in the middle. I would imagine this being one of the harder ones to get working correctly.
A rare room. Simple background. No item transparency.
Here are the two tables containing all of the items in the game. I'll eventually merge them into one image, but for now they were taken directly from the Isaac wiki.
One important detail here is that you have a pure image of every item in your table. You know the color of the background and can detach the item from the rest of the picture. For example, in addition to the matrix representing the image itself, you may store a matrix of 1s and 0s of the same size, where ones correspond to the item area and zeros to the background. Let's call this matrix the "mask" and the pure image of the item the "pattern".
There are 2 ways to compare images: match the image with the pattern, or match the pattern with the image. What you have described is matching the image with the pattern - you have some cropped image and want to find a similar pattern. Instead, think about searching for the pattern on the image.
Let's first define a function match() that takes a pattern, a mask and an image of the same size and checks whether the area of the pattern under the mask is exactly the same as in the image (pseudocode):
def match(pattern, mask, image):
for x = 0 to pattern.width:
for y = 0 to pattern.height:
if mask[x, y] == 1 and # if in pattern this pixel is not part of background
pattern[x, y] != image[x, y]: # and pixels on pattern and image differ
return False
return True
But sizes of pattern and cropped image may differ. Standard solution for this (used, for example, in cascade classifier) is to use sliding window - just move pattern "window" across image and check if pattern matches selected region. This is pretty much how image detection works in OpenCV.
Of course, this solution is not very robust - cropping, resizing or any other image transformations may change some pixels, and in this case method match() will always return false. To overcome this, instead of boolean answer you can use distance between image and pattern. In this case function match() should return some value of similarity, say, between 0 and 1, where 1 stands for "exactly the same", while 0 for "completely different". Then you either set threshold for similarity (e.g. image should be at least 85% similar to the pattern), or just select pattern with highest value of similarity.
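A minimal C++ sketch of that masked similarity (all names and the grey-level tolerance are assumptions, and the brute-force sliding window is deliberately naive):
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cstdlib>
// Fraction of masked pixels whose grey values are close; 1 = identical, 0 = completely different.
double maskedSimilarity(const cv::Mat& pattern, const cv::Mat& mask,
                        const cv::Mat& window, int tol = 20) // all CV_8UC1, same size
{
    int same = 0, total = 0;
    for (int y = 0; y < pattern.rows; ++y)
        for (int x = 0; x < pattern.cols; ++x)
            if (mask.at<uchar>(y, x)) {               // skip background pixels of the pattern
                ++total;
                if (std::abs(pattern.at<uchar>(y, x) - window.at<uchar>(y, x)) <= tol)
                    ++same;
            }
    return total ? (double)same / total : 0.0;
}
// Best similarity of the pattern anywhere in the image (sliding window).
double bestMatch(const cv::Mat& image, const cv::Mat& pattern, const cv::Mat& mask)
{
    double best = 0.0;
    for (int y = 0; y + pattern.rows <= image.rows; ++y)
        for (int x = 0; x + pattern.cols <= image.cols; ++x) {
            cv::Rect roi(x, y, pattern.cols, pattern.rows);
            best = std::max(best, maskedSimilarity(pattern, mask, image(roi)));
        }
    return best;
}
You would run bestMatch for every pattern in the item table and pick the pattern with the highest score, or require it to exceed some threshold (e.g. 0.85).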
Since items in the game are artificial images and variation in them is very small, this approach should be enough. However, for more complicated cases you will need other features than simply pixels under the mask. As I already suggested in my comment, methods like Eigenfaces, cascade classifier using Haar-like features or even Active Appearance Models may be more efficient for these tasks. As for SURF, as far as I know it's better suited for tasks with varying angle and size of object, but not for different backgrounds and all such things.
I came upon your question while trying to figure out my own template-matching issue, and now I'm back to share what I think might be your best bet based on my own experience. You've probably long-since abandoned this, but hey someone else might be in similar shoes one day.
None of the items that you shared are a solid rectangle, and since the template matching you are using cannot work with a mask, you'll always be comparing your reference image against what I must assume is at least several different backgrounds (not to mention the items that are found in varied locations on different backgrounds, making the template match even worse).
It will always be comparing the background pixels and confounding your match unless you can collect a crop of every single situation where the reference image can be found. If decals of blood/etc introduce yet more variability into the backgrounds around the items too then template matching probably won't get great results.
So here are the two things I would try, depending on some details:
If possible, crop a reference template of every situation where the item is found (this will not be a good time), then compare the user-specified area against every template of every item. Take the best result from these comparisons and you will, if lucky, have a correct match.
The example screenshots you shared don't have any dark/black lines in the background, so the outlines of all of the items stand out. If this is consistent throughout the game, you can find edges within the user-specified area and detect the exterior contours. Ahead of time you would have processed the exterior contours of each reference item and stored those contours. Then you can compare your contour(s) in the user's crop against each contour in your database, taking the best match as the answer (see the sketch at the end of this answer).
I'm confident either of those could work for you, depending on whether the game is well-represented by your screenshots.
Note: The contour matching will be much, much faster than the template matching. Fast enough to run in realtime and negate the need for the user to crop anything, perhaps.
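A minimal sketch of the contour-matching suggestion (assuming OpenCV 3.x naming; the Canny thresholds are made up, and referenceContours would hold one pre-computed exterior contour per item):
#include <opencv2/opencv.hpp>
#include <vector>
// Largest exterior contour of a grayscale crop.
std::vector<cv::Point> largestExternalContour(const cv::Mat& gray)
{
    cv::Mat edges;
    cv::Canny(gray, edges, 50, 150);
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(edges, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty())
        return std::vector<cv::Point>();
    size_t best = 0;
    for (size_t i = 1; i < contours.size(); ++i)
        if (cv::contourArea(contours[i]) > cv::contourArea(contours[best]))
            best = i;
    return contours[best];
}
// Index of the reference item whose stored contour best matches the crop, or -1.
int identifyItem(const cv::Mat& cropGray,
                 const std::vector<std::vector<cv::Point> >& referenceContours)
{
    std::vector<cv::Point> c = largestExternalContour(cropGray);
    if (c.empty())
        return -1;
    int bestIdx = -1;
    double bestScore = 1e9;
    for (size_t i = 0; i < referenceContours.size(); ++i) {
        double s = cv::matchShapes(c, referenceContours[i], cv::CONTOURS_MATCH_I1, 0);
        if (s < bestScore) { bestScore = s; bestIdx = (int)i; }
    }
    return bestIdx;
}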
I am working on a motion recognition project involving OpenCV and C++, recognizing walking. I have reached the stage in the algorithm where I need to find the area of the human blob. I have loaded the video, converted it to grayscale and thresholded it to obtain a binary image with white regions showing the human walking, in addition to other white regions. I need to find the area of each white region to determine the area of the human blob, since this region will have a larger area than the other white regions.
Please look through my code and explain the output to me, because I am getting an area of 40872 and I do not know what this means. This is my code. I want to upload the video I used but I do not know how; if someone can tell me how to upload it, please do, because this is the only way I will be able to get help with this particular video. I really hope someone can help me.
#include "cv.h"
#include "highgui.h"
#include "iostream"
using namespace std;
int main( int argc, char** argv ) {
CvCapture *capture = NULL;
capture = cvCaptureFromAVI("C:\\walking\\lady walking.avi");
if(!capture){
return -1;
}
IplImage* color_frame = NULL;
IplImage* gray_frame = NULL ;
int thresh_frame = 70;
CvMoments moments;
int frameCount=0;//Counts every 5 frames
cvNamedWindow( "walking", CV_WINDOW_AUTOSIZE );
while(1) {
color_frame = cvQueryFrame( capture );//Grabs the frame from a file
if( !color_frame ) break;
gray_frame = cvCreateImage(cvSize(color_frame->width, color_frame->height), color_frame->depth, 1);
if( !color_frame ) break;// If the frame does not exist, quit the loop
frameCount++;
if(frameCount==5)
{
cvCvtColor(color_frame, gray_frame, CV_BGR2GRAY);
cvThreshold(gray_frame, gray_frame, thresh_frame, 255, CV_THRESH_BINARY);
cvErode(gray_frame, gray_frame, NULL, 1);
cvDilate(gray_frame, gray_frame, NULL, 1);
cvMoments(gray_frame, &moments, 1);
double m00;
m00 = cvGetSpatialMoment(&moments, 0,0);
cvShowImage("walking", gray_frame);
frameCount=0;
}
char c = cvWaitKey(33);
if( c == 27 ) break;
}
double m00 = (double)cvGetSpatialMoment(&moments, 0,0);
cout << "Area - : " << m00 << endl;
cvReleaseImage(&color_frame);
cvReleaseImage(&gray_frame);
cvReleaseCapture( &capture );
cvDestroyWindow( "walking" );
return 0;
}
cout << "Area - : " << m00 << endl;
The function cvGetSpatialMoment retrieves the spatial moment, which in case of image moments is defined as:
M_ji = sum over x,y of ( I(x,y) * x^j * y^i )
where I(x,y) is the intensity of the pixel (x, y).
The spatial moment m00 is like the mass of an object; it contains no x, y information. I(x,y) plays the role of the density function, except here it is the intensity of the pixel. If you don't want your result to change with the lighting, you want to work on a binary image, where a pixel is either part of the object or not; feeding in a greyscale image essentially uses the grey level as the density in the formula above.
Since you call cvMoments(gray_frame, &moments, 1) with the binary flag set to 1, every nonzero pixel is treated as 1, so
Area = m00 = number of white pixels in the frame
That is what your 40872 means: it is the total white area of the last processed frame, i.e. the human blob plus every other white region, not the human alone. The first-order moments give you the centroid instead:
average(x) = m10 / m00, average(y) = m01 / m00
To get the area of the human blob only, compute the moments (or contourArea) for each connected white region separately and take the largest one.
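If it helps, here is a small sketch using the C++ API (not the C API from the question) that computes the area of each white region and keeps the largest, which should be the walking person:
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>
// Area in pixels of the largest white region of a binary (0/255) frame.
double largestBlobArea(const cv::Mat& binaryFrame)
{
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(binaryFrame.clone(), contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    double best = 0.0;
    for (size_t i = 0; i < contours.size(); ++i)
        best = std::max(best, cv::contourArea(contours[i]));
    return best;
}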
You can use MEI (Motion Energy Image) and MHI (Motion History Image) templates to recognize motion. Update the MHI as new frames arrive with cvUpdateMotionHistory, extract the motion segments with cvSegmentMotion, and then compare the resulting motion descriptors against your training data, for example with the Mahalanobis distance. (Apologies for my English.)