I trained a cascade with opencv_traincascade for a whole day to detect 2€ coins, using more than 6000 positive images similar to the following:
Now, I have just tried to run a simple OpenCV program to see the results and to check the file cascade.xml. The final result is very disappointing:
There are many points on the coin, but there are also many other points on the background. Could the problem be the positive images I used for training? Or am I calling detectMultiScale() with the wrong parameters?
Here's my code:
#include "opencv2/opencv.hpp"
using namespace cv;
int main(int, char**) {
Mat src = imread("2c.jpg", CV_LOAD_IMAGE_COLOR);
Mat src_gray;
std::vector<cv::Rect> money;
CascadeClassifier euro2_cascade;
cvtColor(src, src_gray, CV_BGR2GRAY );
//equalizeHist(src_gray, src_gray);
if ( !euro2_cascade.load( "/Users/lory/Desktop/cascade.xml" ) ) {
printf("--(!)Error loading\n");
return -1;
}
euro2_cascade.detectMultiScale( src_gray, money, 1.1, 0, CV_HAAR_FIND_BIGGEST_OBJECT|CV_HAAR_SCALE_IMAGE, cv::Size(10, 10),cv::Size(2000, 2000) );
for( size_t i = 0; i < money.size(); i++ ) {
cv::Point center( money[i].x + money[i].width*0.5, money[i].y + money[i].height*0.5 );
ellipse( src, center, cv::Size( money[i].width*0.5, money[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
}
//namedWindow( "Display window", WINDOW_AUTOSIZE );
imwrite("result.jpg",src);
}
I have also tried to reduce the number of neighbours, but the effect is the same, just with far fewer points... Could the problem be that the positive images have those 4 corners of background around the coin? I generated the PNG images with Gimp from a video of the coin, so I don't know why opencv_createsamples adds those 4 corners.
Those positive images are just plain wrong.
The more "noise" (variation) you include in your training data, the more robust the detector will be, but the longer it will take to train. This is where your negative samples come into play: the more negative samples you have, covering as wide a range of scenes as possible, the more robust the detector will be. Make sure your positive images contain only your coins and your negative images contain everything but coins.
I've seen a couple of your questions up to now & I think you want to detect three different types of euro coins. You would be best training three classifiers all on those different coins and then running all three on your images.
I also think you are missing a key piece of knowledge about how Haar (or LBP, or whatever) cascades work. Training effectively builds a set of "features" from your positive images, and the classifier then looks for those features in the images you run it over. It builds these features by working out what differs between your positive images and your negative images, so your positive images should not contain anything other than the thing you are trying to detect.
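To make the idea of a "feature" more concrete, here is a minimal sketch of a single two-rectangle Haar-like feature computed from an integral image. It is plain C++ with no OpenCV dependency; the `Gray` struct and the function names are made up for this illustration, and a real cascade combines many such features chosen during training.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grayscale image stored row-major; a hypothetical stand-in for cv::Mat.
struct Gray {
    int rows, cols;
    std::vector<int> px;
    int at(int r, int c) const { return px[static_cast<std::size_t>(r) * cols + c]; }
};

// Integral image with one extra row and column of zeros:
// ii[r][c] holds the sum of all pixels strictly above and to the left of (r, c).
std::vector<std::vector<long>> integralImage(const Gray& img) {
    assert(img.px.size() == static_cast<std::size_t>(img.rows) * img.cols);
    std::vector<std::vector<long>> ii(img.rows + 1, std::vector<long>(img.cols + 1, 0));
    for (int r = 0; r < img.rows; ++r)
        for (int c = 0; c < img.cols; ++c)
            ii[r + 1][c + 1] = img.at(r, c) + ii[r][c + 1] + ii[r + 1][c] - ii[r][c];
    return ii;
}

// Sum of the rectangle with top-left corner (r, c), height h and width w,
// obtained from four lookups regardless of the rectangle size.
long rectSum(const std::vector<std::vector<long>>& ii, int r, int c, int h, int w) {
    return ii[r + h][c + w] - ii[r][c + w] - ii[r + h][c] + ii[r][c];
}

// Two-rectangle feature: sum of the left half minus sum of the right half.
// A large magnitude means a strong vertical edge inside the window.
long haarLeftRight(const std::vector<std::vector<long>>& ii, int r, int c, int h, int w) {
    const int half = w / 2;
    return rectSum(ii, r, c, h, half) - rectSum(ii, r, c + half, h, half);
}
```

On a window whose left half is bright and whose right half is dark, `haarLeftRight` returns a large positive value; on flat background it returns something near zero. Background that sneaks into the positive images produces feature responses like these too, which is why it confuses training.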
Edit 1 - An Example
Imagine creating a classifier for a road stop sign, which is a similar detection problem to coins. It's big, it's red & it's octagonal. Creating a classifier for this is relatively easy - as long as you don't confuse the training stage with erroneous data.
Edit 2 - Scaling of images:
You also have to remember that the detection stage takes your classifier's window, starts small and then scales up. Large, obvious features get detected quicker - in my previous example, big red blobs & octagonal shapes. It would then move on to small features, i.e. text or numbers.
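As a rough illustration of the scaling, the sketch below lists the window sizes a detector would step through for a given scale factor and size limits. It is a simplified model, not OpenCV's implementation: the function name is invented, and the trained window size passed in the usage note is an assumed value.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Enumerates square search-window sizes: start at the trained window size,
// multiply by scaleFactor each step, keep the sizes within [minSize, maxSize].
std::vector<int> windowSizes(int trainedSize, int minSize, int maxSize, double scaleFactor) {
    assert(trainedSize > 0 && scaleFactor > 1.0);
    std::vector<int> sizes;
    for (double factor = 1.0;; factor *= scaleFactor) {
        const int w = static_cast<int>(std::lround(trainedSize * factor));
        if (w > maxSize) break;           // window no longer fits the upper limit
        if (w >= minSize) sizes.push_back(w);
    }
    return sizes;
}
```

For example, `windowSizes(20, 10, 2000, 1.1)` steps through dozens of sizes between 20 and 2000 pixels, which is why a small scale factor combined with a wide size range makes detection slow.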
Edit 3 - a much better example
This example shows you really well how training a cascade object detector works. In fact it even has the same example as with a stop sign!
To detect a euro coin in an image you can use several methods:
1) Train an OpenCV cascade (HAAR or LBP). Don't forget to use a large number of negative (non-coin) images. Also extend the coin image by adding a border.
2) Compute an image of the absolute gradients of the original image, then use the Hough Transform to detect circles (a coin has a circular shape).
I am totally new to OpenCV and I have started to dive into it. But I'd need a little bit of help.
So I want to combine these 2 images:
I would like the 2 images to match along their edges (ignoring the very right part of the image for now)
Can anyone please point me in the right direction? I have tried using the findTransformECC function. Here's my implementation:
cv::Mat im1 = [imageArray[1] CVMat3];
cv::Mat im2 = [imageArray[0] CVMat3];
// Convert images to gray scale;
cv::Mat im1_gray, im2_gray;
cvtColor(im1, im1_gray, CV_BGR2GRAY);
cvtColor(im2, im2_gray, CV_BGR2GRAY);
// Define the motion model
const int warp_mode = cv::MOTION_AFFINE;
// Set a 2x3 or 3x3 warp matrix depending on the motion model.
cv::Mat warp_matrix;
// Initialize the matrix to identity
if ( warp_mode == cv::MOTION_HOMOGRAPHY )
warp_matrix = cv::Mat::eye(3, 3, CV_32F);
else
warp_matrix = cv::Mat::eye(2, 3, CV_32F);
// Specify the number of iterations.
int number_of_iterations = 50;
// Specify the threshold of the increment
// in the correlation coefficient between two iterations
double termination_eps = 1e-10;
// Define termination criteria
cv::TermCriteria criteria (cv::TermCriteria::COUNT+cv::TermCriteria::EPS, number_of_iterations, termination_eps);
// Run the ECC algorithm. The results are stored in warp_matrix.
findTransformECC(
im1_gray,
im2_gray,
warp_matrix,
warp_mode,
criteria
);
// Storage for warped image.
cv::Mat im2_aligned;
if (warp_mode != cv::MOTION_HOMOGRAPHY)
// Use warpAffine for Translation, Euclidean and Affine
warpAffine(im2, im2_aligned, warp_matrix, im1.size(), cv::INTER_LINEAR + cv::WARP_INVERSE_MAP);
else
// Use warpPerspective for Homography
warpPerspective (im2, im2_aligned, warp_matrix, im1.size(),cv::INTER_LINEAR + cv::WARP_INVERSE_MAP);
UIImage* result = [UIImage imageWithCVMat:im2_aligned];
return result;
I have tried playing around with the termination_eps and number_of_iterations and increased/decreased those values, but they didn't really make a big difference.
So here's the result:
What can I do to improve my result?
EDIT: I have marked the problematic edges with red circles. The goal is to warp the bottom image and make it match with the lines from the image above:
I did a little bit of research and I'm afraid the findTransformECC function won't give me the result I'd like to have :-(
Something important to add:
I actually have an array of those image "stripes", 8 in this case, they all look similar to the images shown here and they all need to be processed to match the line. I have tried experimenting with the stitch function of OpenCV, but the results were horrible.
EDIT:
Here are the 3 source images:
The result should be something like this:
I transformed every image along the lines that should match. Lines that are too far away from each other can be ignored (the shadow and the piece of road on the right portion of the image)
Judging by your images, they seem to overlap. Since you said the stitch function didn't give you the desired results, you could implement your own stitching. I'm trying to do something similar too. Here is a tutorial on how to implement it in C++: https://ramsrigoutham.com/2012/11/22/panorama-image-stitching-in-opencv/
You can run the Hough line transform with a high threshold on the two images and then compare the vertical lines in both of them - most of them should be shifted a bit but keep the same angle.
This is what I've got from running this algorithm on one of the pictures:
Filtering out the horizontal lines should be easy (they are represented as Vec4i), and then you can align the remaining lines with each other.
Here is the example of using it in OpenCV's documentation.
UPDATE: another thought. Aligning the lines can be done with an idea similar to how a cross-correlation function (CCF) works. It doesn't matter if picture 1 has 10 lines and picture 2 has 100: the shift at which the most lines align (which is, mostly, the maximum of the CCF) should be pretty close to the answer, though this might require some tweaking - for example, weighting every line by its length, angle, etc. Computer vision never has a direct way, huh :)
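A minimal sketch of that voting idea, assuming each detected line has already been reduced to a single x-position (for instance where it crosses the shared border). The function name, the tolerance and the shift range are all illustrative, and no weighting by length or angle is applied.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Returns the shift s in [-maxShift, maxShift] that maximises the number of
// lines of `b` which, moved by s, land within `tol` pixels of some line of `a`.
// Ties are resolved in favour of the smallest shift tried first.
int bestLineShift(const std::vector<int>& a, const std::vector<int>& b,
                  int maxShift, int tol) {
    assert(maxShift >= 0 && tol >= 0);
    int bestShift = 0;
    int bestScore = -1;
    for (int s = -maxShift; s <= maxShift; ++s) {
        int score = 0;
        for (int xb : b) {
            for (int xa : a) {
                if (std::abs(xb + s - xa) <= tol) { ++score; break; }  // one vote per line of b
            }
        }
        if (score > bestScore) { bestScore = score; bestShift = s; }
    }
    return bestShift;
}
```

Extra lines in either picture only add votes at scattered shifts, so the true shift still tends to collect the most votes even when the two pictures have very different line counts.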
UPDATE 2: I actually wonder whether taking the bottom pixel row of the top image as array 1 and the top pixel row of the bottom image as array 2, running a general CCF over them and using its maximum as the shift could work too... But I think it would be a known method if it worked well.
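The row-against-row idea can be tried with a few lines of plain C++. This is only a sketch under the assumption that the misalignment is a pure horizontal shift; the function names are invented, and it uses a zero-mean normalised correlation so that brightness differences between the two strips matter less.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Zero-mean normalised correlation between top[i] and bottom[i - shift],
// evaluated only over the indices where both rows have a pixel.
double correlationAtShift(const std::vector<double>& top,
                          const std::vector<double>& bottom, int shift) {
    const int n = static_cast<int>(top.size());
    const int m = static_cast<int>(bottom.size());
    double sumT = 0, sumB = 0;
    int count = 0;
    for (int i = 0; i < n; ++i) {
        const int j = i - shift;
        if (j < 0 || j >= m) continue;
        sumT += top[i]; sumB += bottom[j]; ++count;
    }
    if (count < 2) return 0.0;  // too little overlap to say anything
    const double meanT = sumT / count, meanB = sumB / count;
    double num = 0, varT = 0, varB = 0;
    for (int i = 0; i < n; ++i) {
        const int j = i - shift;
        if (j < 0 || j >= m) continue;
        const double t = top[i] - meanT, b = bottom[j] - meanB;
        num += t * b; varT += t * t; varB += b * b;
    }
    const double den = std::sqrt(varT * varB);
    return den > 0 ? num / den : 0.0;
}

// Shift (in pixels, positive = move the bottom row to the right) with the
// highest correlation inside [-maxShift, maxShift].
int bestRowShift(const std::vector<double>& top, const std::vector<double>& bottom,
                 int maxShift) {
    assert(maxShift >= 0);
    int best = 0;
    double bestScore = -2.0;  // below the minimum possible correlation of -1
    for (int s = -maxShift; s <= maxShift; ++s) {
        const double score = correlationAtShift(top, bottom, s);
        if (score > bestScore) { bestScore = score; best = s; }
    }
    return best;
}
```

Keep `maxShift` well below the row width: when the overlap becomes short, a high correlation can occur by chance.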
I'm trying to implement static hand gesture recognition for only 6 gestures. Because the hand could turn a bit, I tried SURF + FLANN, as these are invariant.
The images are binary, and when I compare them I get bad results that I don't even understand. For example, for identical images I get 1 or 2 good keypoints, and for different images I get 5 or 6 good keypoints.
Do you have any suggestions for implementing gesture recognition in this case?
Result of a train and query image:
double max_dist = 0, min_dist = 100;
matcher.match(gestoDescriptors,t1descriptors,matches);
for (int i=0; i<gestoDescriptors.rows; i++){
double dist = matches[i].distance;
if (dist<min_dist)
min_dist = dist;
if (dist>max_dist)
max_dist=dist;
}
vector<DMatch>t1_good_matches;
for (int i=0; i<gestoDescriptors.rows; i++){
if (matches[i].distance<=max(2*min_dist,0.02)){
t1_good_matches.push_back(matches[i]);
}
}
//-- Draw only "good" matches
Mat img_matches;
drawMatches(gestoImage,gestoKeypoints,train1,t1keypoints,t1_good_matches,
img_matches, Scalar::all(-1), Scalar::all(-1),
vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );
I haven't used SURF for feature extraction extensively, but from what I can gather, it's important that the image be reasonably "expressive" in the sense that there are various interest points with unique properties. I think the main issue you have is that images of hands (and particularly images that have been thresholded to yield binary images) don't have many unique points of interest. Especially due to the scale and rotation-invariant nature of SURF features, there is very little in your image that would qualify as unique to a specific gesture.
You might want to try using your binary image as a mask just to remove the background, and then try the SURF and FLANN logic on the full-color (or maybe grayscale) hand gestures.
In the long run, I think you'd be better off researching Convolutional Neural Networks if you're serious about building a high-quality gesture recognition system. CNNs require far less pre-processing of the image and are particularly well-suited for this kind of application.
I have been using OpenCV's SVM and RF for a multi-class face recognition problem with 11 classes and only 5 images per class. I used two kinds of features - initially a toy intensity image feature (just each image resized to 32x32 grayscale) and then the second feature was simply another toy feature using Tan Triggs preprocessing(link). Here is the feature code:
void Feature::makeFeature(cv::Mat &image, cv::Mat &result)
{
cv::resize( image, image, cv::Size(32, 32), 0, 0, cv::INTER_CUBIC );
cv::equalizeHist(image, image);
// Images must be aligned - Only pitch executed, yaw and roll assumed negligible
algmt->getAlignedImage( image, image ); // image alignment
// tan triggs
{
tan_triggs_preprocessing(image, result);
result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
}
// if plain intensity
{
// image.copyTo(result);
// result.convertTo(result, CV_32F, 1.0f/255.0f);
// result = result.reshape(0, 1); // make a single row vector, needed for the training samples matrix
}
}
Here the tan_triggs_preprocessing function is the same as the Tan Triggs preprocessing function given in the link. I added one step - I normalized the result to the range between 0 and 1.
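For reference, the added normalization step presumably amounts to a min-max rescale like the sketch below. This is an assumption about what was done, written in plain C++ on a flat float vector rather than on a cv::Mat, and the function name is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Rescales the values linearly so that the minimum maps to 0 and the maximum to 1.
// A constant input (max == min) is mapped to all zeros to avoid dividing by zero.
std::vector<float> minMaxNormalize(const std::vector<float>& v) {
    assert(!v.empty());
    const auto mm = std::minmax_element(v.begin(), v.end());
    const float lo = *mm.first;
    const float hi = *mm.second;
    std::vector<float> out(v.size(), 0.0f);
    if (hi == lo) return out;
    for (std::size_t i = 0; i < v.size(); ++i)
        out[i] = (v[i] - lo) / (hi - lo);
    return out;
}
```

Note that this rescales each image by its own minimum and maximum, so two images with different value ranges are stretched differently.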
The results on test for both were not very good, as expected, but then I made a silly mistake and discovered something strange: When I accidentally gave the training directory as input for both training and test, I get 100% results on the plain intensity feature, but the Tan Triggs feature gives the following as result:
SVM Training Complete
Total number of correct: 51 and accuracy: 92.7273
RF Training Complete
Total number of correct: 53 and accuracy: 96.3636
I do know that, however much you overfit, the result should be perfect when the training set is used as the test input. Everything else is standard; both SVM and RF are set up as in the OpenCV examples. Besides, I get 100% for the plain intensity feature, so I must be mucking something up when using Tan Triggs. Does anyone have any idea what mistake I am making?
I have used other complex features like LTPs and LQPs without issue, but this preprocessing method is something I want to use. I use the Jain-Learned Miller congealing algorithm for alignment as I assume frontals for face recognition, no pose correction.
For my college project I need to identify the species of a plant from the shape of its leaf by detecting the leaf's edges (I use OpenCV 2.4.9 and C++). However, the source image was taken in the plant's real environment and contains more than one leaf. See the example image below. So I need to extract the edge pattern of just one leaf to process further.
Using Canny Edge Detector I can identify edges of the whole image.
But I don't know how to proceed from here to extract the edge pattern of just one leaf, preferably the clearest and most complete one. I don't even know if this is possible. Can anyone please tell me whether it is possible and, if so, how to extract the edges of one leaf? I just want to know the image processing steps that I need to apply to the image. I don't want any code samples. I'm new to image processing and OpenCV and am learning by experimenting.
Thanks in advance.
Edit
As Luis said, I applied a morphological close to the image after Canny edge detection, but it is still difficult for me to find the largest contour in the image.
Here are the steps I have taken to process the image
Apply Bilateral Filter to reduce noise
bilateralFilter(img_src, img_blur, 31, 31 * 2, 31 / 2);
Convert to grayscale (this step was meant to adjust contrast by histogram equalization, but the code only converts to grayscale)
cvtColor(img_blur,img_equalized,CV_BGR2GRAY);
Apply Canny edge detector
Canny(img_equalized, img_edge_detected, 20, 60, 3);
Threshold binary image to remove some background data
threshold(img_edge_detected, img_threshold, 1, 255,THRESH_BINARY_INV);
Morphological close of the image
morphologyEx(img_threshold, img_closed, MORPH_CLOSE, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));
Following are the resulting images I'm getting.
This result I'm getting for the above original image
Source image and result for second image
Source :
Result :
Is there any way to detect the largest contour and extract it from the image ?
Note that my final target is to create a plant identification system using real environmental image, but here I cannot use template matching or masking kind of things because the user has to take an image and upload it so the system doesn't have any prior idea about the leaf.
Here is the full code
#include <opencv\cv.h>
#include <opencv\highgui.h>
using namespace cv;
int main()
{
Mat img_src, img_blur,img_gray,img_equalized,img_edge_detected,img_threshold,img_closed;
//Load original image
img_src = imread("E:\\IMAG0196.jpg");
//Apply Bilateral Filter to reduce noise
bilateralFilter(img_src, img_blur, 31, 31 * 2, 31 / 2);
//Convert to grayscale (despite the variable name, no histogram equalization is applied)
cvtColor(img_blur,img_equalized,CV_BGR2GRAY);
//Apply Canny edge detector
Canny(img_equalized, img_edge_detected, 20, 60, 3);
//Threshold binary image to remove some background data
threshold(img_edge_detected, img_threshold, 15, 255,THRESH_BINARY_INV);
//Morphological close of the image
morphologyEx(img_threshold, img_closed, MORPH_CLOSE, getStructuringElement(MORPH_ELLIPSE, Size(2, 2)));
imshow("Result", img_closed);
waitKey(0);
return 0;
}
Thank you.
Well there is a similar question that was asked here:
opencv matching edge images
It seems that edge information is not a good descriptor for the image; still, if you want to try it, I would do the following steps:
Load image and convert it to grayscale
Detect edges - try Canny and Sobel and find what suits you best
Binarize the image - set the threshold to a value that eliminates most of the background
Close the image - a morphological close, don't close the window!
Count and identify objects in the image (Blobs, Watershed)
Check each object for shape features (assuming you have previously described the leaf shapes you could find, or use a standard shape like an ellipse), such as:
http://docs.opencv.org/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html
http://www.math.uci.edu/icamp/summer/research_11/park/shape_descriptors_survey.pdf
If a given object has a shape that you described as a leaf, then you have detected the leaf!
I believe that, given the images are taken in the real world, this algorithm will perform poorly, but it's a start. Well, hope it helps :).
-- POST EDIT 06/07
Well since you have no prior information about the leaf, I think the best we could do is the following:
Load image
Bilateral filter
Canny
Extract contours
Assume: that the contour with the largest perimeter is the leaf
Take the convex hull of the 2 or 3 largest contours (the blue line is the resulting convex hull)
Use this convex hull to run a graph cut on the image and segment it
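The contour-selection and convex-hull steps can be sketched without OpenCV as follows. This is not the code from the linked repository; the `Pt` type and function names are invented for the example, the hull uses Andrew's monotone chain algorithm, and the graph cut step is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { long x, y; };
using Contour = std::vector<Pt>;

// Length of the closed polyline through the contour points.
double perimeter(const Contour& c) {
    double len = 0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Pt& a = c[i];
        const Pt& b = c[(i + 1) % c.size()];
        len += std::hypot(static_cast<double>(b.x - a.x), static_cast<double>(b.y - a.y));
    }
    return len;
}

// Index of the contour with the largest perimeter (assumed to be the leaf).
std::size_t longestContour(const std::vector<Contour>& contours) {
    assert(!contours.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < contours.size(); ++i)
        if (perimeter(contours[i]) > perimeter(contours[best])) best = i;
    return best;
}

// Cross product of (a - o) and (b - o); positive for a counter-clockwise turn.
long cross(const Pt& o, const Pt& a, const Pt& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Convex hull by Andrew's monotone chain; returns the hull vertices in order.
Contour convexHull(Contour pts) {
    std::sort(pts.begin(), pts.end(), [](const Pt& a, const Pt& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    if (pts.size() < 3) return pts;
    Contour hull(2 * pts.size());
    std::size_t k = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {             // lower hull
        while (k >= 2 && cross(hull[k - 2], hull[k - 1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    for (std::size_t i = pts.size() - 1, t = k + 1; i > 0; --i) {  // upper hull
        while (k >= t && cross(hull[k - 2], hull[k - 1], pts[i - 1]) <= 0) --k;
        hull[k++] = pts[i - 1];
    }
    hull.resize(k - 1);  // the last point repeats the first one
    return hull;
}
```

To hull the 2 or 3 largest contours together, append their points into one `Contour` before calling `convexHull`. In OpenCV itself the corresponding calls would be arcLength and convexHull.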
If you do those steps, you'll end up with images like these:
I won't post the code here, but you can check it out in my messy github. I hope you don't mind that it was written in Python.
Leaf - Github
Still, I have a couple of things to finish that could improve the result.. Roadmap would be:
Define the mask in the graph cut (as described in the docs)
Applying region growing may give a better convex hull
Removing all edges that touch the image border may help to identify the larger edges
Well, again, I hope it helps
I am currently working on an image processing project. I am using OpenCV 2.3.1 with VC++.
I have written code that filters the input image to keep only the blue color and converts it to a binary image. The binary image contains some small objects that I don't want. To eliminate them, I used OpenCV's cvFindContours() to detect contours in the binary image, but I can't eliminate the small objects in the output image. I tried the cvContourArea() function, but it didn't work properly, and the erode function didn't work properly either.
So could someone please help me with this problem?
The binary image which I obtained :
The result/output image which I want to obtain :
Ok, I believe your problem could be solved with the bounding box demo recently introduced by OpenCV.
As you have probably noticed, the object you are interested in should be inside the largest rectangle drawn in the picture. Luckily, this code is not very complex and I'm sure you can figure it all out by investigating and experimenting with it.
Here is my solution to eliminate small contours.
The basic idea is to check the length/area of each contour, then delete the smaller ones from the vector container.
Normally you will get contours like this:
Mat canny_output; //example from OpenCV Tutorial
vector<vector<Point> > contours;
vector<Vec4i> hierarchy;
Canny(src_img, canny_output, thresh, thresh*2, 3);//with or without, explained later.
findContours(canny_output, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0,0));
With Canny() pre-processing you will get contour segments; however, each segment is stored with its boundary pixels as a closed ring. In this case, you can check the length and delete the small ones like this:
for (vector<vector<Point> >::iterator it = contours.begin(); it!=contours.end(); )
{
if (it->size()<contour_length_threshold)
it=contours.erase(it);
else
++it;
}
Without Canny() preprocessing, you will get contours of objects.
Similarly, you can also use the area to define a threshold for eliminating small objects, as the OpenCV tutorial shows:
vector<Point> contour = contours[i];
double area0 = contourArea(contour);
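Note that contourArea() computes the area enclosed by the contour using Green's formula, so it can differ from the number of non-zero pixels you would get by drawing and filling the contour.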
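Putting the area test and the removal together, a self-contained sketch could look like the following. It uses a hand-written shoelace formula in place of contourArea() so that it compiles without OpenCV; `Pt`, `polygonArea` and `dropSmallContours` are names invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };
using Contour = std::vector<Pt>;

// Area enclosed by a closed, non-self-intersecting contour (shoelace formula).
double polygonArea(const Contour& c) {
    double twice = 0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        const Pt& a = c[i];
        const Pt& b = c[(i + 1) % c.size()];
        twice += a.x * b.y - b.x * a.y;
    }
    return std::abs(twice) / 2.0;
}

// Removes every contour whose enclosed area is below minArea (erase-remove idiom).
void dropSmallContours(std::vector<Contour>& contours, double minArea) {
    assert(minArea >= 0);
    contours.erase(
        std::remove_if(contours.begin(), contours.end(),
                       [minArea](const Contour& c) { return polygonArea(c) < minArea; }),
        contours.end());
}
```

With OpenCV types the same erase-remove pattern works on `vector<vector<Point> >`, with `contourArea(c)` as the test inside the lambda.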
Are you sure filtering by small contour area didn't work? It's always worked for me. Can we see your code?
Also, as sue-ling mentioned, it's a good idea to use both erode and dilate to approximately preserve area. To remove small noisy bits, use erode first, and to fill in holes, use dilate first.
And another aside, you may want to check out the new C++ versions of the cv* functions if you weren't aware of them already (documentation for findContours). They're much easier to use, in my opinion.
Judging by the before and after images, you need to determine the area of every white region or blob and then apply an area threshold. This would eliminate all regions smaller than the threshold and leave only the large white region seen in the 2nd image. After using the cvFindContours function, try using zero-order moments, which give the area of each blob in the image. This link might be helpful in implementing what I've just described.
http://www.aishack.in/2010/07/tracking-colored-objects-in-opencv/
I believe you can use morphological operators like erode and dilate (read more here).
You need to perform erosion with a kernel size close to the radius of the circle on the right (the one you want to eliminate),
followed by dilation using the same kernel to fill the gaps created by the erosion step.
FYI, erosion followed by dilation using the same kernel is called opening.
The code will be something like this:
int erosion_size = 30; // adjust for your application
Mat erode_element = getStructuringElement( MORPH_ELLIPSE,
Size( 2*erosion_size + 1, 2*erosion_size+1 ),
Point( erosion_size, erosion_size ) );
erode( binary_img, binary_img, erode_element );
dilate( binary_img, binary_img, erode_element );
It is not a fast method, but it may be useful in some cases.
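To see what opening does pixel by pixel, here is a small OpenCV-free sketch on a 0/1 grid. It uses a square kernel instead of the elliptical one above and ignores pixels outside the image; `Grid`, `morph` and `opening` are names chosen for this illustration only.

```cpp
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<int>>;  // 0 = background, 1 = foreground

// One pass with a (2r+1) x (2r+1) square kernel; pixels outside the image are ignored.
// erodeOp == true : output is 1 only if every neighbour is 1 (erosion)
// erodeOp == false: output is 1 if any neighbour is 1 (dilation)
Grid morph(const Grid& in, int r, bool erodeOp) {
    assert(!in.empty() && r >= 0);
    const int H = static_cast<int>(in.size());
    const int W = static_cast<int>(in[0].size());
    Grid out(H, std::vector<int>(W, 0));
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            bool all = true, any = false;
            for (int dy = -r; dy <= r; ++dy)
                for (int dx = -r; dx <= r; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= H || xx < 0 || xx >= W) continue;
                    if (in[yy][xx]) any = true; else all = false;
                }
            out[y][x] = erodeOp ? (all ? 1 : 0) : (any ? 1 : 0);
        }
    return out;
}

// Opening: erosion followed by dilation with the same kernel.
Grid opening(const Grid& in, int r) {
    return morph(morph(in, r, true), r, false);
}
```

Blobs smaller than the kernel vanish in the erosion and never come back, while larger blobs shrink and are then grown back to roughly their original outline.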
There is a new function in OpenCV 3.0 - connectedComponentsWithStats. With it we can get the area of each connected component and eliminate the unnecessary ones. So we can easily remove a circle with holes even if it has the same bounding box as a solid circle.
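The same idea can be shown without OpenCV by labelling components with a flood fill and counting their pixels. This sketch is a stand-in for connectedComponentsWithStats, not its implementation: the names are invented and it uses 4-connectivity for brevity.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<int>>;  // 0 = background, non-zero = foreground

// Labels 4-connected foreground components (labels start at 1, 0 = background)
// and returns the pixel count of each label; index 0 is unused.
std::vector<int> labelComponents(const Grid& bin, Grid& labels) {
    assert(!bin.empty());
    const int H = static_cast<int>(bin.size());
    const int W = static_cast<int>(bin[0].size());
    labels.assign(H, std::vector<int>(W, 0));
    std::vector<int> area(1, 0);
    const int dy[4] = {1, -1, 0, 0};
    const int dx[4] = {0, 0, 1, -1};
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            if (!bin[y][x] || labels[y][x]) continue;
            const int id = static_cast<int>(area.size());
            area.push_back(0);
            std::queue<std::pair<int, int>> q;
            q.push({y, x});
            labels[y][x] = id;
            while (!q.empty()) {                      // breadth-first flood fill
                const std::pair<int, int> p = q.front();
                q.pop();
                ++area[id];
                for (int k = 0; k < 4; ++k) {
                    const int ny = p.first + dy[k], nx = p.second + dx[k];
                    if (ny < 0 || ny >= H || nx < 0 || nx >= W) continue;
                    if (!bin[ny][nx] || labels[ny][nx]) continue;
                    labels[ny][nx] = id;
                    q.push({ny, nx});
                }
            }
        }
    return area;
}

// Clears every component whose pixel count is below minArea.
Grid removeSmallComponents(const Grid& bin, int minArea) {
    Grid labels;
    const std::vector<int> area = labelComponents(bin, labels);
    Grid out = bin;
    for (std::size_t y = 0; y < out.size(); ++y)
        for (std::size_t x = 0; x < out[y].size(); ++x)
            if (labels[y][x] && area[labels[y][x]] < minArea) out[y][x] = 0;
    return out;
}
```

Because the count is of actual foreground pixels, a ring or a circle with holes gets a smaller area than a solid circle of the same outer size, which is what makes this test stricter than a bounding-box check.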