How do I read numbers in an image when the characters are rotated relative to the image axes? Do I need to rotate the entire image, or can I give the KNN character recognition an axis to read along?
The included image contains several numbers printed at an angle. If I try to read them with the current code, the results are inaccurate because the objects it tries to match against characters are not upright with respect to the image.
// global variables ///////////////////////////////////////////////////////////////////////////////
const int MIN_CONTOUR_AREA = 60;
const int RESIZED_IMAGE_WIDTH = 20;
const int RESIZED_IMAGE_HEIGHT = 30;
bool Does_image_contain_barcode = 1;
class ContourWithData {
// member variables ///////////////////////////////////////////////////////////////////////////
std::vector<cv::Point> ptContour; // contour
cv::Rect boundingRect; // bounding rect for contour
float fltArea; // area of contour
bool checkIfContourIsValid() { // obviously in a production grade program
if (fltArea < MIN_CONTOUR_AREA) return false; // we would have a much more robust function for
return true; // identifying if a contour is valid !!
static bool sortByBoundingRectXPosition(const ContourWithData& cwdLeft, const ContourWithData& cwdRight) { // this function allows us to sort
return(cwdLeft.boundingRect.x < cwdRight.boundingRect.x); // the contours from left to right
int main() {
std::vector<ContourWithData> allContoursWithData; // declare empty vectors,
std::vector<ContourWithData> validContoursWithData; // we will fill these shortly
// read in training classifications ///////////////////////////////////////////////////
cv::Mat matClassificationInts; // we will read the classification numbers into this variable as though it is a vector
cv::FileStorage fsClassifications("classifications.xml", cv::FileStorage::READ); // open the classifications file
if (fsClassifications.isOpened() == false) { // if the file was not opened successfully
std::cout << "error, unable to open training classifications file, exiting program\n\n"; // show error message
return(0); // and exit program
fsClassifications["classifications"] >> matClassificationInts; // read classifications section into Mat classifications variable
fsClassifications.release(); // close the classifications file
// read in training images ////////////////////////////////////////////////////////////
cv::Mat matTrainingImagesAsFlattenedFloats; // we will read multiple images into this single image variable as though it is a vector
cv::FileStorage fsTrainingImages("images.xml", cv::FileStorage::READ); // open the training images file
if (fsTrainingImages.isOpened() == false) { // if the file was not opened successfully
std::cout << "error, unable to open training images file, exiting program\n\n"; // show error message
return(0); // and exit program
fsTrainingImages["images"] >> matTrainingImagesAsFlattenedFloats; // read images section into Mat training images variable
fsTrainingImages.release(); // close the training images file
// train //////////////////////////////////////////////////////////////////////////////
cv::Ptr<cv::ml::KNearest> kNearest(cv::ml::KNearest::create()); // instantiate the KNN object
// finally we get to the call to train, note that both parameters have to be of type Mat (a single Mat)
// even though in reality they are multiple images / numbers
kNearest->train(matTrainingImagesAsFlattenedFloats, cv::ml::ROW_SAMPLE, matClassificationInts);
cv::Mat matTestingNumbers = cv::imread("bc_sick_12_c.jpg"); // read in the test numbers image
if (matTestingNumbers.empty()) { // if unable to open image
std::cout << "error: image not read from file\n\n"; // show error message on command line
return(0); // and exit program
cv::Mat matGrayscale; //
cv::Mat matBlurred; // declare more image variables
cv::Mat matThresh; //
cv::Mat matThreshCopy; //
cv::cvtColor(matTestingNumbers, matGrayscale, CV_BGR2GRAY); // convert to grayscale
// blur
cv::GaussianBlur(matGrayscale, // input image
matBlurred, // output image
cv::Size(5, 5), // smoothing window width and height in pixels
0); // sigma value, determines how much the image will be blurred, zero makes function choose the sigma value
// filter image from grayscale to black and white
cv::adaptiveThreshold(matBlurred, // input image
matThresh, // output image
255, // make pixels that pass the threshold full white
cv::ADAPTIVE_THRESH_GAUSSIAN_C, // use gaussian rather than mean, seems to give better results
cv::THRESH_BINARY_INV, // invert so foreground will be white, background will be black
11, // size of a pixel neighborhood used to calculate threshold value
4); // constant subtracted from the mean or weighted mean (default 2)
matThreshCopy = matThresh.clone(); // make a copy of the thresh image, this is necessary b/c findContours modifies the image
std::vector<std::vector<cv::Point> > ptContours; // declare a vector for the contours
std::vector<cv::Vec4i> v4iHierarchy; // declare a vector for the hierarchy (we won't use this in this program but this may be helpful for reference)
cv::findContours(matThreshCopy, // input image, make sure to use a copy since the function will modify this image in the course of finding contours
ptContours, // output contours
v4iHierarchy, // output hierarchy
cv::RETR_EXTERNAL, // retrieve the outermost contours only
cv::CHAIN_APPROX_SIMPLE); // compress horizontal, vertical, and diagonal segments and leave only their end points
for (int i = 0; i < ptContours.size(); i++) { // for each contour
ContourWithData contourWithData; // instantiate a contour with data object
contourWithData.ptContour = ptContours[i]; // assign contour to contour with data
contourWithData.boundingRect = cv::boundingRect(contourWithData.ptContour); // get the bounding rect
contourWithData.fltArea = cv::contourArea(contourWithData.ptContour); // calculate the contour area
allContoursWithData.push_back(contourWithData); // add contour with data object to list of all contours with data
for (int i = 0; i < allContoursWithData.size(); i++) { // for all contours
if (allContoursWithData[i].checkIfContourIsValid()) { // check if valid
validContoursWithData.push_back(allContoursWithData[i]); // if so, append to valid contour list
// sort contours from left to right
std::sort(validContoursWithData.begin(), validContoursWithData.end(), ContourWithData::sortByBoundingRectXPosition);
std::string strFinalString; // declare final string, this will have the final number sequence by the end of the program
for (int i = 0; i < validContoursWithData.size(); i++) { // for each contour
// draw a green rect around the current char
cv::rectangle(matTestingNumbers, // draw rectangle on original image
validContoursWithData[i].boundingRect, // rect to draw
cv::Scalar(0, 255, 0), // green
2); // thickness
cv::Mat matROI = matThresh(validContoursWithData[i].boundingRect); // get ROI image of bounding rect
cv::Mat matROIResized;
cv::resize(matROI, matROIResized, cv::Size(RESIZED_IMAGE_WIDTH, RESIZED_IMAGE_HEIGHT)); // resize image, this will be more consistent for recognition and storage
cv::Mat matROIFloat;
matROIResized.convertTo(matROIFloat, CV_32FC1); // convert Mat to float, necessary for call to find_nearest
cv::Mat matROIFlattenedFloat = matROIFloat.reshape(1, 1);
cv::Mat matCurrentChar(0, 0, CV_32F);
kNearest->findNearest(matROIFlattenedFloat, 1, matCurrentChar); // finally we can call find_nearest !!!
float fltCurrentChar = (float)matCurrentChar.at<float>(0, 0);
strFinalString = strFinalString + char(int(fltCurrentChar)); // append current char to full string
std::cout << "\n\n" << "numbers read = " << strFinalString << "\n\n"; // show the full string
cv::imshow("matTestingNumbers", matTestingNumbers); // show input image with green boxes drawn around found digits
//cv::imshow("matTestingNumbers", matThreshCopy);
cv::waitKey(0); // wait for user key press
I have integrated the OpenCV framework into Xcode. All of the OpenCV code is in Objective-C, and I call it from Swift through a bridging header. I am new to OpenCV and am trying to count the vertical lines in an image.
Here is my code:
First, I convert the image to grayscale:
+ (UIImage *)convertToGrayscale:(UIImage *)image {
cv::Mat mat;
UIImageToMat(image, mat);
cv::Mat gray;
cv::cvtColor(mat, gray, CV_RGB2GRAY);
UIImage *grayscale = MatToUIImage(gray);
return grayscale;
Then I detect edges so that I can find the gray lines:
+ (UIImage *)detectEdgesInRGBImage:(UIImage *)image {
cv::Mat mat;
UIImageToMat(image, mat);
//Prepare the image for findContours
cv::threshold(mat, mat, 128, 255, CV_THRESH_BINARY);
//Find the contours. Use the contourOutput Mat so the original image doesn't get overwritten
std::vector<std::vector<cv::Point> > contours;
cv::Mat contourOutput = mat.clone();
cv::findContours( contourOutput, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE );
NSLog(@"Count =>%lu", contours.size());
//For Blue
/*cv::GaussianBlur(mat, gray, cv::Size(11, 11), 0); */
UIImage *grayscale = MatToUIImage(mat);
return grayscale;
Both of these functions are written in Objective-C.
Here I call both functions from Swift:
override func viewDidLoad() {
let img = UIImage(named: "imagenamed")
let img1 = Wrapper.convert(toGrayscale: img)
self.capturedImageView.image = Wrapper.detectEdges(inRGBImage: img1)
I have been working on this for several days and found some useful references:
OpenCV - how to count objects in photo?
How to count number of lines (Hough Trasnform) in OpenCV
OPENCV Documents
As I understand it, we first need to convert the image to black and white, and then we can find the colors or lines using cvtColor, threshold and findContours.
I am attaching the image whose vertical lines I want to count.
Original Image
Output Image that I am getting
The line count I get is 10.
I am not able to get an accurate count here.
Please guide me on this. Thank you!
Since you want to count the vertical lines, I can suggest a very simple approach. You already have a clear output image, and I used that output in my code. Here are the steps:
1. Preprocess the input image so the lines are clearly visible.
2. Scan a row until you reach a pixel whose value is higher than 100 (the threshold I chose).
3. Increase the line counter for that row.
4. Continue along the row until you reach a pixel whose value is lower than 100.
5. Repeat from step 2 until the row ends, and do this for every row of the image.
6. At the end, find the most repeated element in the array that holds the line count of each row. This number is the number of vertical lines.
Note: If the steps are difficult to follow, think of it this way:
"I am checking the first row. I find a pixel that is higher than 100, so a line edge starts here and I increase the counter for this row. I keep searching along the row until I find a pixel smaller than 100, and then I search again for a pixel bigger than 100. When the row is finished, I store its line count in a big array. I do this for the whole image. At the end, since some lines look like two lines at the top and some noise can occur, I take the most repeated element of the big array as the number of lines."
Here is the code part in C++:
#include <vector>
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/highgui/highgui.hpp>
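int main()
{
    cv::Mat img = cv::imread("/ur/img/dir/img.jpg", cv::IMREAD_GRAYSCALE);
    std::vector<int> numberOfVerticalLinesForEachRow;
    cv::Rect r(0, 0, img.cols - 10, 200);
    img = img(r);
    bool blackCheck = 1; // true while the last pixel seen was dark
    for (int i = 0; i < img.rows; i++)
    {
        int numberOfLines = 0;
        for (int j = 0; j < img.cols; j++)
        {
            if ((int)img.at<uchar>(cv::Point(j, i)) > 100 && blackCheck)
            {
                numberOfLines++; // dark-to-bright transition: a new line starts
                blackCheck = 0;
            }
            if ((int)img.at<uchar>(cv::Point(j, i)) < 100)
                blackCheck = 1;
        }
        numberOfVerticalLinesForEachRow.push_back(numberOfLines);
    }
    // In this part you need a simple algorithm to check the most repeated element
    for (int k : numberOfVerticalLinesForEachRow)
        std::cout << k << std::endl;
}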
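The code comment above leaves the "most repeated element" step open. One way to fill it in, as a minimal sketch: the helper name `mostRepeated` is hypothetical, and breaking ties in favour of the smaller count is an assumption, not something the answer specifies.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: returns the value that occurs most often in `counts`.
// Ties go to the smallest value (an assumption); empty input returns -1.
int mostRepeated(const std::vector<int>& counts) {
    std::map<int, int> frequency;  // value -> number of rows with that value
    for (int c : counts) ++frequency[c];

    int best = -1;
    int bestFrequency = 0;
    for (const auto& entry : frequency) {  // std::map iterates in ascending key order
        if (entry.second > bestFrequency) {
            best = entry.first;
            bestFrequency = entry.second;
        }
    }
    return best;
}
```

Calling `mostRepeated(numberOfVerticalLinesForEachRow)` after the row loop would then give the estimated number of lines.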
Here's another possible approach. It relies mainly on the cv::thinning function from the extended image processing module to reduce the lines to a width of 1 pixel. We can then crop a ROI from this image and count the number of transitions from 255 (white) to 0 (black). These are the steps:
Threshold the image using Otsu's method
Apply some morphology to clean up the binary image
Get the skeleton of the image
Crop a ROI from the center of the image
Count the number of jumps from 255 to 0
This is the code, be sure to include the extended image processing module (ximgproc) and also link it before compiling it:
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/ximgproc.hpp> // The extended image processing module
// Read Image:
std::string imagePath = "D://opencvImages//";
cv::Mat inputImage = cv::imread( imagePath+"IN2Xh.png" );
// Convert BGR to Grayscale:
cv::cvtColor( inputImage, inputImage, cv::COLOR_BGR2GRAY );
// Get binary image via Otsu:
cv::threshold( inputImage, inputImage, 0, 255, cv::THRESH_OTSU );
The above snippet produces the following image:
Note that there's a little bit of noise due to the thresholding. Let's try to remove those isolated blobs of white pixels by applying some morphology, for example an opening, which is an erosion followed by a dilation. The structuring elements and iterations, though, are not the same for the two operations; they were found by experimentation. I wanted to remove the majority of the isolated blobs without modifying the original image too much:
// Apply Morphology. Erosion + Dilation:
// Set rectangular structuring element of size 3 x 3:
cv::Mat SE = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(3, 3) );
// Set the iterations:
int morphoIterations = 1;
cv::morphologyEx( inputImage, inputImage, cv::MORPH_ERODE, SE, cv::Point(-1,-1), morphoIterations);
// Set rectangular structuring element of size 5 x 5:
SE = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(5, 5) );
// Set the iterations:
morphoIterations = 2;
cv::morphologyEx( inputImage, inputImage, cv::MORPH_DILATE, SE, cv::Point(-1,-1), morphoIterations);
This combination of structuring elements and iterations yields the following filtered image:
It's looking alright. Now comes the main idea of the algorithm. If we compute the skeleton of this image, we "normalize" all the lines to a width of 1 pixel, which is very handy, because we can then reduce the image to a single-row matrix and count the number of jumps. Since the lines are "normalized", we can also get rid of possible overlaps between lines. Skeletonized images sometimes produce artifacts near the borders of the image, though. These artifacts resemble thickened anchors at the first and last rows of the image. To prevent them, we can extend the borders before computing the skeleton:
// Extend borders to avoid skeleton artifacts, extend 5 pixels in all directions:
cv::copyMakeBorder( inputImage, inputImage, 5, 5, 5, 5, cv::BORDER_CONSTANT, 0 );
// Get the skeleton:
cv::Mat imageSkelton;
cv::ximgproc::thinning( inputImage, imageSkelton );
This is the skeleton obtained:
Nice. Before we count jumps, though, we must observe that the lines are skewed. If we reduced this image directly to one row, some overlap could indeed happen between two lines that are too skewed. To prevent this, I crop a middle section of the skeleton image and count transitions there. Let's crop the image:
// Crop middle ROI:
cv::Rect linesRoi;
linesRoi.x = 0;
linesRoi.y = 0.5 * imageSkelton.rows;
linesRoi.width = imageSkelton.cols;
linesRoi.height = 1;
cv::Mat imageROI = imageSkelton( linesRoi );
This would be the new ROI, which is just the middle row of the skeleton image:
Let me prepare a BGR copy of this just to draw some results:
// BGR version of the Grayscale ROI:
cv::Mat colorROI;
cv::cvtColor( imageROI, colorROI, cv::COLOR_GRAY2BGR );
Ok, let's loop through the image and count the transitions between 255 and 0. A transition happens when the current pixel is 0 and the pixel from the previous iteration is 255. There's more than one way to loop through a cv::Mat in C++. I prefer to use cv::MatIterator_s and pointer arithmetic:
// Set the loop variables:
cv::MatIterator_<uchar> it, end;
uchar pastPixel = 0;
int jumpsCounter = 0;
int i = 0;
// Loop thru image ROI and count 255-0 jumps:
for (it = imageROI.begin<uchar>(), end = imageROI.end<uchar>(); it != end; ++it) {
    // Get current pixel (imageROI is single-channel, so each element is one uchar):
    uchar &currentPixel = *it;
    // Compare it with past pixel:
    if ( (currentPixel == 0) && (pastPixel == 255) ){
        // We have a jump:
        jumpsCounter++;
        // Draw the point on the BGR version of the image:
        cv::line( colorROI, cv::Point(i, 0), cv::Point(i, 0), cv::Scalar(0, 0, 255), 1 );
    }
    // current pixel is now past pixel:
    pastPixel = currentPixel;
    i++;
}
// Show image and print number of jumps found:
cv::namedWindow( "Jumps Found", cv::WINDOW_NORMAL );
cv::imshow( "Jumps Found", colorROI );
cv::waitKey( 0 );
std::cout<<"Jumps Found: "<<jumpsCounter<<std::endl;
The points where the jumps were found are drawn in red, and the number of total jumps printed is:
Jumps Found: 9
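The jump-counting step does not depend on OpenCV, so it can be tried in isolation. The following is a minimal sketch that works on a plain vector standing in for the single-row ROI; the function name `countFallingJumps` is hypothetical, and the pixel before the row is assumed to be black, as in the loop above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts positions where the previous pixel is 255
// and the current pixel is 0 (a white-to-black transition).
int countFallingJumps(const std::vector<std::uint8_t>& row) {
    int jumps = 0;
    std::uint8_t past = 0;  // assumption: the pixel before the row is black
    for (std::uint8_t current : row) {
        if (current == 0 && past == 255) ++jumps;
        past = current;
    }
    return jumps;
}
```

A white pixel in the very last column has no black pixel after it and is therefore not counted, which is one more reason the black border added before thinning is useful.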
I have obtained a labeling with the connectedComponents function of C++ OpenCV, which looks like the picture below:
This is the output of the ccLabels variable, which is a cv::Mat of the same size as the original image.
So what I need to do is:
Count the occurrences of each number, and select the ones that occur more than N times, which are the "big" ones.
Segment the areas of the "big" components, and then count the number of 4's and 0's inside each area.
My ultimate aim is to count the number of holes in the image, so I aim to infer number of holes from (number of 0's / number of 4's). This is probably not the prettiest way but the images are very uniform in terms of size and illumination, so it will meet my needs.
But I'm new to OpenCV and I don't have much idea how to accomplish this task.
Here is what I've done so far:
cv::Mat1b outImg;
cv::threshold(grayImg, outImg, 150, 255, 0); // Thresholded -binary- image
cv::Mat ccLabels;
cv::connectedComponents(outImg, ccLabels); // Each non-zero pixel is labeled with their connectedComponent ID's
// write the labels to file:
std::ofstream myfile;
myfile.open("ccLabels.txt");
cv::Size s = ccLabels.size();
myfile << "Size: " << s.height << " , " << s.width <<"\n";
for (int r1 = 0; r1 < s.height; r1++) {
    for (int c1 = 0; c1 < s.width; c1++) {
        myfile << ccLabels.at<int>(r1,c1);
    }
    myfile << "\n";
}
Since I know how to iterate inside the matrix, counting the numbers should be OK, but first I have to separate(eliminate / ignore) the "background" pixels, which are the 0's outside the connected components. Then counting should be easy.
How can I segment these "big" components? Maybe obtaining a mask, and only consider pixels where mask(x,y) = 1?
Thanks for any help !
This is the thresholded image:
And this is what I get after Canny edge detection :
This is the actual image (thresholded) :
Here is a simple procedure to find the number on each die, starting from your thresholded image:
find the external contours
for each contour:
optionally discard small blobs
draw the filled mask
use AND and XOR to isolate the internal holes
find the contours again
count the contours
Number: 5
Number: 2
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>
using namespace std;
using namespace cv;
int main(void)
// Grayscale image
Mat1b img = imread("path_to_image", IMREAD_GRAYSCALE);
// Minimum area of the contour
double minContourArea = 10;
// Prepare output
Mat3b result;
cvtColor(img, result, COLOR_GRAY2BGR);
// Find contours
vector<vector<Point>> contours;
findContours(img.clone(), contours, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);
for (int i = 0; i < contours.size(); ++i)
// Check area
if (contourArea(contours[i]) < minContourArea) continue;
// Black mask
Mat1b mask(img.rows, img.cols, uchar(0));
// Draw filled contour
drawContours(mask, contours, i, Scalar(255), CV_FILLED);
mask = (mask & img) ^ mask;
vector<vector<Point>> cntrs;
findContours(mask, cntrs, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE);
cout << "Number: " << cntrs.size() << endl;
// Just for showing results
drawContours(result, cntrs, -1, Scalar(0,0,255), CV_FILLED);
imshow("Result", result);
return 0;
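The line `mask = (mask & img) ^ mask;` is the step that isolates the holes, and it may help to see what it does per pixel. The sketch below applies the same expression to plain byte vectors instead of cv::Mat; the function name `isolateHoles` is hypothetical, and both inputs are assumed to have the same length and to contain only 0 or 255.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: applies (mask & img) ^ mask element by element.
// Result is 255 only where the filled contour (mask) is white and the
// thresholded image (img) is black, i.e. at the holes inside the contour.
std::vector<std::uint8_t> isolateHoles(const std::vector<std::uint8_t>& mask,
                                       const std::vector<std::uint8_t>& img) {
    std::vector<std::uint8_t> holes(mask.size());
    for (std::size_t i = 0; i < mask.size(); ++i) {
        holes[i] = static_cast<std::uint8_t>((mask[i] & img[i]) ^ mask[i]);
    }
    return holes;
}
```

Pixels outside the mask always come out as 0, so white regions belonging to other dice do not leak into the result.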
The easier way is the findContours method. Find the inner contours (these will be the holes), calculate their areas, and process this information accordingly.
To solve your first problem, assume you have the set of values in values. Count the occurrences of each number that has appeared:
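int m = 0; // total number of elements counted
for (int n = 0; n < 256; n++)
{
    int c = 0; // occurrences of the value n
    for (int q = 0; q < values.size(); q++)
    {
        if (values[q] == n) { c++; m++; }
    }
    cout << n << "= " << c << endl;
}
cout << "Total number of elements " << m << endl;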
To solve your second problem, find the largest contour in the image using findContours, draw a bounding rectangle around it, and then crop it. Then use the code above again to count the pixel values 4 and 0. You can find the link to it here
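For the first step in the question (selecting the labels that occur more than N times), a single pass over the labels is enough. This is a minimal sketch on a flat vector of labels; the function name `bigComponents` and the parameter `minPixels` are hypothetical, and label 0 is skipped because connectedComponents uses it for the background.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: returns label -> pixel count for every non-zero label
// that covers more than `minPixels` pixels.
std::map<int, int> bigComponents(const std::vector<int>& labels, int minPixels) {
    std::map<int, int> counts;
    for (int label : labels) {
        if (label != 0) ++counts[label];  // 0 is the background label
    }
    // Drop the components that are not "big".
    for (auto it = counts.begin(); it != counts.end(); ) {
        if (it->second > minPixels) ++it;
        else it = counts.erase(it);
    }
    return counts;
}
```

Because the holes inside a component also carry label 0, counting them still needs a per-component region, such as the cropped bounding rectangle suggested above.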
I would like to extract color, shape and texture features of superpixel segments of an image . Then, I would like to visualize those features in order to select the important features.
I am using the code at this link:
I would like to access each cluster of the segmented image like this study:
However, I couldn't find the relevant part of the code.
The method int* SLIC::GetLabel() returns the label for each pixel. You can create a Mat header for the int* for easy access:
Mat1i labelImg(img.rows, img.cols, slic.GetLabel());
Then you can create a mask for each superpixel (label):
Mat1b superpixel_mask = labelImg == label;
and retrieve the superpixel in the original image:
Mat3b superpixel_in_img;
img.copyTo(superpixel_in_img, superpixel_mask);
Then you can compute whatever statistic you need.
Here is the full code for reference:
#include <opencv2/opencv.hpp>
#include "slic.h"
using namespace cv;
int main()
// Load an image
Mat3b img = imread("path_to_image");
// Set the maximum number of superpixels
UINT n_of_superpixels = 200;
SLIC slic;
// Compute the superpixels
slic.GenerateSuperpixels(img, n_of_superpixels);
// Visualize superpixels
//Mat3b res = slic.GetImgWithContours(Scalar(0,0,255));
// Get the labels
Mat1i labelImg(img.rows, img.cols, slic.GetLabel());
// Get the actual number of labels
// may be less than n_of_superpixels
double max_dlabel;
minMaxLoc(labelImg, NULL, &max_dlabel);
int max_label = int(max_dlabel);
// Iterate over each label
for (int label = 0; label <= max_label; ++label)
// Mask for each label
Mat1b superpixel_mask = labelImg == label;
// Superpixel in original image
Mat3b superpixel_in_img;
img.copyTo(superpixel_in_img, superpixel_mask);
// Now you have the binary mask of each superpixel: superpixel_mask
// and the superpixel in the original image: superpixel_in_img
return 0;
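As one example of the "whatever statistic you need" step, the mean color of each superpixel can be accumulated in one pass over the label image. The sketch below uses plain vectors rather than cv::Mat so that it stands alone; the names `Bgr` and `meanColorPerLabel` are hypothetical, and labels are assumed to lie in the range 0 to numLabels - 1. With OpenCV, calling `cv::mean(img, superpixel_mask)` inside the label loop should give the same per-label value.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bgr = std::array<std::uint8_t, 3>;  // one pixel: blue, green, red

// Hypothetical helper: mean B, G, R value for each label.
// Assumes labels.size() == pixels.size() and 0 <= label < numLabels.
std::vector<std::array<double, 3>> meanColorPerLabel(const std::vector<int>& labels,
                                                     const std::vector<Bgr>& pixels,
                                                     int numLabels) {
    std::vector<std::array<double, 3>> sums(numLabels, std::array<double, 3>{{0.0, 0.0, 0.0}});
    std::vector<int> counts(numLabels, 0);
    for (std::size_t i = 0; i < labels.size(); ++i) {
        const int label = labels[i];
        for (int ch = 0; ch < 3; ++ch) sums[label][ch] += pixels[i][ch];
        ++counts[label];
    }
    for (int label = 0; label < numLabels; ++label) {
        if (counts[label] == 0) continue;  // unused label: leave the mean at zero
        for (int ch = 0; ch < 3; ++ch) sums[label][ch] /= counts[label];
    }
    return sums;
}
```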
What I'm trying to do is measure the thickness of the eyeglasses frames. I had the idea to measure the thickness of the frame's contours (may be a better way?). I have so far outlined the frame of the glasses, but there are gaps where the lines don't meet. I thought about using HoughLinesP, but I'm not sure if this is what I need.
So far I have conducted the following steps:
Convert image to grayscale
Create ROI around the eye/glasses area
Blur the image
Dilate the image (have done this to remove any thin framed glasses)
Conduct Canny edge detection
Found contours
These are the results:
This is my code so far:
//convert to grayscale
cv::Mat grayscaleImg;
cv::cvtColor( img, grayscaleImg, CV_BGR2GRAY );
//create ROI
cv::Mat eyeAreaROI(grayscaleImg, centreEyesRect);
cv::imshow("roi", eyeAreaROI);
cv::Mat blurredROI;
cv::blur(eyeAreaROI, blurredROI, Size(3,3));
cv::imshow("blurred", blurredROI);
//dilate thin lines
cv::Mat dilated_dst;
int dilate_elem = 0;
int dilate_size = 1;
int dilate_type = MORPH_RECT;
cv::Mat element = getStructuringElement(dilate_type,
cv::Size(2*dilate_size + 1, 2*dilate_size+1),
cv::Point(dilate_size, dilate_size));
cv::dilate(blurredROI, dilated_dst, element);
cv::imshow("dilate", dilated_dst);
//edge detection
int lowThreshold = 100;
int ratio = 3;
int kernel_size = 3;
cv::Canny(dilated_dst, dilated_dst, lowThreshold, lowThreshold*ratio, kernel_size);
//create matrix of the same type and size as ROI
Mat dst;
dst.create(eyeAreaROI.size(), dilated_dst.type());
dst = Scalar::all(0);
dilated_dst.copyTo(dst, dilated_dst);
cv::imshow("edges", dst);
//join the lines and fill in
vector<Vec4i> hierarchy;
vector<vector<Point>> contours;
cv::findContours(dilated_dst, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE);
cv::imshow("contours", dilated_dst);
I'm not entirely sure what the next steps would be, or as I said above, if I should use HoughLinesP and how to implement it. Any help is very much appreciated!
I think there are 2 main problems.
segment the glasses frame
find the thickness of the segmented frame
I'll now post a way to segment the glasses of your sample image. Maybe this method will work for different images too, but you'll probably have to adjust parameters, or you might be able to use the main ideas.
Main idea is:
First, find the biggest contour in the image, which should be the glasses. Second, find the two biggest contours within the previous found biggest contour, which should be the glasses within the frame!
I use this image as input (which should be your blurred but not dilated image):
// this functions finds the biggest X contours. Probably there are faster ways, but it should work...
std::vector<std::vector<cv::Point>> findBiggestContours(std::vector<std::vector<cv::Point>> contours, int amount)
{
    std::vector<std::vector<cv::Point>> sortedContours;
    if(amount <= 0) amount = contours.size();
    if(amount > contours.size()) amount = contours.size();
    for(int chosen = 0; chosen < amount; )
    {
        double biggestContourArea = 0;
        int biggestContourID = -1;
        for(unsigned int i=0; i<contours.size() && contours.size(); ++i)
        {
            double tmpArea = cv::contourArea(contours[i]);
            if(tmpArea > biggestContourArea)
            {
                biggestContourArea = tmpArea;
                biggestContourID = i;
            }
        }
        if(biggestContourID >= 0)
        {
            //std::cout << "found area: " << biggestContourArea << std::endl;
            // found biggest contour
            // add contour to sorted contours vector:
            sortedContours.push_back(contours[biggestContourID]);
            chosen++;
            // remove biggest contour from original vector:
            contours[biggestContourID] = contours.back();
            contours.pop_back();
        }
        else
        {
            // should never happen except for broken contours with size 0?!?
            return sortedContours;
        }
    }
    return sortedContours;
}
int main()
cv::Mat input = cv::imread("../Data/glass2.png", CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat inputColors = cv::imread("../Data/glass2.png"); // used for displaying later
cv::imshow("input", input);
//edge detection
int lowThreshold = 100;
int ratio = 3;
int kernel_size = 3;
cv::Mat canny;
cv::Canny(input, canny, lowThreshold, lowThreshold*ratio, kernel_size);
cv::imshow("canny", canny);
// close gaps with "close operator"
cv::Mat mask = canny.clone();
cv::imshow("closed mask",mask);
// extract outermost contour
std::vector<cv::Vec4i> hierarchy;
std::vector<std::vector<cv::Point>> contours;
//cv::findContours(mask, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE);
cv::findContours(mask, contours, hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
// find biggest contour which should be the outer contour of the frame
std::vector<std::vector<cv::Point>> biggestContour;
biggestContour = findBiggestContours(contours,1); // find the one biggest contour
if(biggestContour.size() < 1)
std::cout << "Error: no outer frame of glasses found" << std::endl;
return 1;
// draw contour on an empty image
cv::Mat outerFrame = cv::Mat::zeros(mask.rows, mask.cols, CV_8UC1);
cv::imshow("outer frame border", outerFrame);
// now find the glasses which should be the outer contours within the frame. therefore erode the outer border ;)
cv::Mat glassesMask = outerFrame.clone();
cv::erode(glassesMask,glassesMask, cv::Mat());
cv::imshow("eroded outer",glassesMask);
// after erosion if we dilate, it's an Open-Operator which can be used to clean the image.
cv::Mat cleanedOuter;
cv::dilate(glassesMask,cleanedOuter, cv::Mat());
cv::imshow("cleaned outer",cleanedOuter);
// use the outer frame mask as a mask for copying canny edges. The result should be the inner edges inside the frame only
cv::Mat glassesInner;
canny.copyTo(glassesInner, glassesMask);
// there is small gap in the contour which unfortunately cant be closed with a closing operator...
cv::dilate(glassesInner, glassesInner, cv::Mat());
//cv::erode(glassesInner, glassesInner, cv::Mat());
// this part was cheated... in fact we would like to erode directly after dilation to not modify the thickness but just close small gaps.
cv::imshow("innerCanny", glassesInner);
// extract contours from within the frame
std::vector<cv::Vec4i> hierarchyInner;
std::vector<std::vector<cv::Point>> contoursInner;
//cv::findContours(glassesInner, contoursInner, hierarchyInner, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE);
cv::findContours(glassesInner, contoursInner, hierarchyInner, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
// find the two biggest contours which should be the glasses within the frame
std::vector<std::vector<cv::Point>> biggestInnerContours;
biggestInnerContours = findBiggestContours(contoursInner,2); // find the two biggest contours
if(biggestInnerContours.size() < 1)
std::cout << "Error: no inner frames of glasses found" << std::endl;
return 1;
// draw the 2 biggest contours which should be the inner glasses
cv::Mat innerGlasses = cv::Mat::zeros(mask.rows, mask.cols, CV_8UC1);
for(unsigned int i=0; i<biggestInnerContours.size(); ++i)
cv::imshow("inner frame border", innerGlasses);
// since we dilated earlier and didnt erode quite afterwards, we have to erode here... this is a bit of cheating :-(
cv::erode(innerGlasses,innerGlasses,cv::Mat() );
// remove the inner glasses from the frame mask
cv::Mat fullGlassesMask = cleanedOuter - innerGlasses;
cv::imshow("complete glasses mask", fullGlassesMask);
// color code the result to get an impression of segmentation quality
cv::Mat outputColors1 = inputColors.clone();
cv::Mat outputColors2 = inputColors.clone();
for(int y=0; y<fullGlassesMask.rows; ++y)
for(int x=0; x<fullGlassesMask.cols; ++x)
if(!fullGlassesMask.at<unsigned char>(y,x)) outputColors1.at<cv::Vec3b>(y,x)[1] = 255;
else outputColors2.at<cv::Vec3b>(y,x)[1] = 255;
cv::imshow("output", outputColors1);
cv::imwrite("../Data/Output/face_colored.png", outputColors1);
cv::imwrite("../Data/Output/glasses_colored.png", outputColors2);
cv::imwrite("../Data/Output/glasses_fullMask.png", fullGlassesMask);
return 0;
I get this result for segmentation:
the overlay in original image will give you an impression of quality:
and inverse:
There are some tricky parts in the code and it's not tidied up yet. I hope it's understandable.
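The comment on findBiggestContours says there are probably faster ways. One option, sketched here under the assumption that the areas have already been computed once per contour (for example with cv::contourArea), is to partially sort the contour indices by area. The function name `biggestIndices` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: indices of the `amount` largest entries of `areas`,
// ordered from largest to smallest.
std::vector<std::size_t> biggestIndices(const std::vector<double>& areas, std::size_t amount) {
    std::vector<std::size_t> order(areas.size());
    std::iota(order.begin(), order.end(), std::size_t{0});  // 0, 1, 2, ...
    amount = std::min(amount, order.size());
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(amount),
                      order.end(),
                      [&areas](std::size_t a, std::size_t b) { return areas[a] > areas[b]; });
    order.resize(amount);
    return order;
}
```

The returned indices can then be used to pick the matching contours, which avoids recomputing every area on each pass of the outer loop.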
The next step would be to compute the thickness of the segmented frame. My suggestion is to compute the distance transform of the inverted mask. From this you will want to run a ridge detection, or skeletonize the mask, to find the ridge. After that, use the median of the distance values along the ridge.
Anyways I hope this posting can help you a little, although it's not a solution yet.
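The last step of that suggestion, taking the median of the ridge distances, could look roughly like the sketch below. It assumes the distance-transform values at the ridge (skeleton) pixels have already been collected into a vector. The function name `estimateThickness` is hypothetical, and doubling the median is an assumption: a ridge distance measures from the centre line to the nearest edge, so it is about half the frame width.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: estimates frame thickness as twice the median ridge distance.
// Returns 0 for empty input; for an even count the upper median is used.
double estimateThickness(std::vector<float> ridgeDistances) {
    if (ridgeDistances.empty()) return 0.0;
    const std::size_t mid = ridgeDistances.size() / 2;
    std::nth_element(ridgeDistances.begin(), ridgeDistances.begin() + mid, ridgeDistances.end());
    return 2.0 * ridgeDistances[mid];
}
```

Using the median rather than the mean keeps a few outliers, such as the wider hinge area, from dominating the estimate.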
Depending on lighting, frame color etc this may or may not work but how about simple color detection to separate the frame ? Frame color will usually be a lot darker than human skin. You'll end up with a binary image (just black and white) and by calculating the number (area) of black pixels you get the area of the frame.
Another possible way is to get better edge detection, by adjusting/dilating/eroding/both until you get better contours. You will also need to differentiate the contour from the lenses and then apply cvContourArea.
When I start my program from the command line, I get the following error:
OpenCV Error: Image step is wrong (The matrix is not continuous, thus its number of rows can not be changed) in cv::Mat::reshape, file C:\builds\2_4_PackSlave-win64-vc12-shared\opencv\modules\core\src\matrix.cpp, line 802.
Code of the program:
#include "opencv2/core/core.hpp"
#include "opencv2/contrib/contrib.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/objdetect/objdetect.hpp"
#include <iostream>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace std;
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
stringstream liness(line);
getline(liness, path, separator);
getline(liness, classlabel);
if (!path.empty() && !classlabel.empty()) {
images.push_back(imread(path, 0));
labels.push_back(atoi(classlabel.c_str()));
}
}
}
int main(int argc, const char *argv[]) {
// Check for valid command line arguments, print usage
// if no arguments were given.
if (argc != 4) {
cout << "usage: " << argv[0] << " </path/to/haar_cascade> </path/to/csv.ext> <device id>" << endl;
cout << "\t </path/to/haar_cascade> -- Path to the Haar Cascade for face detection." << endl;
cout << "\t </path/to/csv.ext> -- Path to the CSV file with the face database." << endl;
cout << "\t <device id> -- The webcam device id to grab frames from." << endl;
return 1;
}
// Get the path to your CSV:
string fn_haar = string(argv[1]);
string fn_csv = string(argv[2]);
int deviceId = atoi(argv[3]);
// These vectors hold the images and corresponding labels:
vector<Mat> images;
vector<int> labels;
// Read in the data (fails if no valid input filename is given, but you'll get an error message):
try {
read_csv(fn_csv, images, labels);
} catch (cv::Exception& e) {
cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
// nothing more we can do
return 1;
}
// Get the width and height from the first image. We'll need this
// later in code to reshape the images to their original
// size AND we need to reshape incoming faces to this size:
int im_width = images[0].cols;
int im_height = images[0].rows;
// Create a FaceRecognizer and train it on the given images:
Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// That's it for learning the Face Recognition model. You now
// need to create the classifier for the task of Face Detection.
// We are going to use the haar cascade you have specified in the
// command line arguments:
CascadeClassifier haar_cascade;
haar_cascade.load(fn_haar);
// Get a handle to the Video device:
VideoCapture cap(deviceId);
// Check if we can use this device at all:
if (!cap.isOpened()) {
cerr << "Capture Device ID " << deviceId << " cannot be opened." << endl;
return -1;
}
// Holds the current frame from the Video device:
Mat frame;
for (;;) {
cap >> frame;
// Clone the current frame:
Mat original = frame.clone();
// Convert the current frame to grayscale:
Mat gray;
cvtColor(original, gray, CV_BGR2GRAY);
// Find the faces in the frame:
vector< Rect_<int> > faces;
haar_cascade.detectMultiScale(gray, faces);
// At this point you have the position of the faces in
// faces. Now we'll get the faces, make a prediction and
// annotate it in the video. Cool or what?
for (int i = 0; i < faces.size(); i++) {
// Process face by face:
Rect face_i = faces[i];
// Crop the face from the image. So simple with OpenCV C++:
Mat face = gray(face_i);
// Resizing the face is necessary for Eigenfaces and Fisherfaces. You can easily
// verify this, by reading through the face recognition tutorial coming with OpenCV.
// Resizing IS NOT NEEDED for Local Binary Patterns Histograms, so preparing the
// input data really depends on the algorithm used.
// I strongly encourage you to play around with the algorithms. See which work best
// in your scenario, LBPH should always be a contender for robust face recognition.
// Since I am showing the Fisherfaces algorithm here, I also show how to resize the
// face you have just found:
Mat face_resized;
cv::resize(face, face_resized, Size(im_width, im_height), 1.0, 1.0, INTER_CUBIC);
// Now perform the prediction, see how easy that is:
int prediction = model->predict(face_resized);
// And finally write all we've found out to the original image!
// First of all draw a green rectangle around the detected face:
rectangle(original, face_i, CV_RGB(0, 255, 0), 1);
// Create the text we will annotate the box with:
string box_text = format("Prediction = %d", prediction);
// Calculate the position for annotated text (make sure we don't
// put illegal values in there):
int pos_x = std::max(face_i.tl().x - 10, 0);
int pos_y = std::max(face_i.tl().y - 10, 0);
// And now put it into the image:
putText(original, box_text, Point(pos_x, pos_y), FONT_HERSHEY_PLAIN, 1.0, CV_RGB(0, 255, 0), 2.0);
}
// Show the result:
imshow("face_recognizer", original);
// And display it:
char key = (char)waitKey(20);
// Exit this loop on escape:
if (key == 27)
break;
}
return 0;
}
What should I do?
the FisherFaceRecognizer (Eigen, too) tries to 'flatten' the images to a single row (reshape()) for training and testing.
this does not work, if the Mat is non-continuous (because it's either padded or a submat/roi only).
( then again, 'fileNotFound' counts as 'non-continuous', too ;] )
if your images are e.g. .bmp, there's a high chance that some image editor padded your images so that the row size is a multiple of 4.
maybe you can get away with batch converting your imgs to .png or .pgm externally
else resizing your train images after loading them will help (anything, that makes a copy of it)
or, change this line in the loading code:
images.push_back(imread(path, 0));
to:
Mat m = imread(path, 1);
Mat m2;
cvtColor(m, m2, CV_BGR2GRAY);
images.push_back(m2);
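To make the "non-continuous" point concrete, here is a small plain C++ model of a padded 8-bit image and of the copy that fixes it. It deliberately does not use cv::Mat; the GrayView struct and both function names are hypothetical stand-ins. In OpenCV itself the corresponding calls are Mat::isContinuous() and Mat::clone().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an 8-bit single-channel image header.
struct GrayView {
    const unsigned char* data;  // first pixel of the first row
    int rows;
    int cols;
    std::size_t step;           // bytes from the start of one row to the next
};

// Simplified continuity check: true when rows follow each other without gaps.
bool isContinuous(const GrayView& v) {
    return v.step == static_cast<std::size_t>(v.cols);
}

// Copies the pixels row by row into a gap-free buffer, dropping any padding.
std::vector<unsigned char> copyContinuous(const GrayView& v) {
    assert(v.step >= static_cast<std::size_t>(v.cols));
    std::vector<unsigned char> out;
    out.reserve(static_cast<std::size_t>(v.rows) * v.cols);
    for (int r = 0; r < v.rows; ++r) {
        const unsigned char* row = v.data + static_cast<std::size_t>(r) * v.step;
        out.insert(out.end(), row, row + v.cols);
    }
    return out;
}
```

A padded file and a submat/ROI look the same from this point of view: step is larger than the visible row width, so the rows cannot be flattened into one long row without copying first.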
It's a slash/backslash problem in your CSV file.
For example, mine was like this:
Changing it to this:
did the trick.
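If editing the CSV by hand is not practical, the separators could be normalized while reading it. This is only a sketch: the answer does not say which direction fixed the problem, so converting to forward slashes is my assumption, and the helper name toForwardSlashes and the example path are hypothetical.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: replaces every backslash in a path with a forward slash.
std::string toForwardSlashes(std::string path) {
    std::replace(path.begin(), path.end(), '\\', '/');
    return path;
}
```

In read_csv this would be applied to path before it is passed to imread, e.g. a made-up entry like faces\s1\1.pgm would become faces/s1/1.pgm.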
I think that the problem could be in Mat face = gray(face_i).
Read the comments for clarification.
Rect face_i = faces[i];
// This operation makes a new header for the specified sub-array of
// *this, thus it is an O(1) operation, that is, no matrix data is
// copied. So the matrix elements are no longer stored continuously, without
// gaps at the end of each row.
Mat face = gray(face_i);
Mat face_resized;
// Here new memory for face_resized should be allocated, but I'm not sure.
// If not, that is the reason for the error, because in that case
// face_resized will contain non-continuous data (= face).
cv::resize(face, face_resized, Size(im_width, im_height), 1.0, 1.0, INTER_CUBIC);
// here reshape(face_resized) will be called and will throw the error if
// matrix is not continuous
int prediction = model->predict(face_resized);