To begin, I am a complete novice in OpenCV and am beginner/reasonable in c++ code.
But OpenCV is new to me and I try to learn by doing projects and stuff.
Now for my new project I am trying to find the centre of square in a picture.
In my case there is only 1 square in picture.
I would like to build further upon the square.cpp example of OpenCV.
For my project there are 2 things I need some help with,
1: The edge of the window is detected as a square, I do not want this. Example
2: How could I get the centre of 1 square from the squares array?
This is the code from the example "square.cpp"
// The "Square Detector" program.
// It loads several images sequentially and tries to find squares in
// each image
#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
using namespace cv;
using namespace std;
static void help(const char* programName)
{
cout <<
"\nA program using pyramid scaling, Canny, contours and contour simplification\n"
"to find squares in a list of images (pic1-6.png)\n"
"Returns sequence of squares detected on the image.\n"
"Call:\n"
"./" << programName << " [file_name (optional)]\n"
"Using OpenCV version " << CV_VERSION << "\n" << endl;
}
int thresh = 50, N = 11;
const char* wndname = "Square Detection Demo";
// helper function:
// finds a cosine of angle between vectors
// from pt0->pt1 and from pt0->pt2
static double angle(Point pt1, Point pt2, Point pt0)
{
double dx1 = pt1.x - pt0.x;
double dy1 = pt1.y - pt0.y;
double dx2 = pt2.x - pt0.x;
double dy2 = pt2.y - pt0.y;
return (dx1 * dx2 + dy1 * dy2) / sqrt((dx1 * dx1 + dy1 * dy1) * (dx2 * dx2 + dy2 * dy2) + 1e-10);
}
// returns sequence of squares detected on the image.
static void findSquares(const Mat& image, vector<vector<Point> >& squares)
{
squares.clear();
Mat pyr, timg, gray0(image.size(), CV_8U), gray;
// down-scale and upscale the image to filter out the noise
pyrDown(image, pyr, Size(image.cols / 2, image.rows / 2));
pyrUp(pyr, timg, image.size());
vector<vector<Point> > contours;
// find squares in every color plane of the image
for (int c = 0; c < 3; c++)
{
int ch[] = { c, 0 };
mixChannels(&timg, 1, &gray0, 1, ch, 1);
// try several threshold levels
for (int l = 0; l < N; l++)
{
// hack: use Canny instead of zero threshold level.
// Canny helps to catch squares with gradient shading
if (l == 0)
{
// apply Canny. Take the upper threshold from slider
// and set the lower to 0 (which forces edges merging)
Canny(gray0, gray, 0, thresh, 5);
// dilate canny output to remove potential
// holes between edge segments
dilate(gray, gray, Mat(), Point(-1, -1));
}
else
{
// apply threshold if l!=0:
// tgray(x,y) = gray(x,y) < (l+1)*255/N ? 255 : 0
gray = gray0 >= (l + 1) * 255 / N;
}
// find contours and store them all as a list
findContours(gray, contours, RETR_LIST, CHAIN_APPROX_SIMPLE);
vector<Point> approx;
// test each contour
for (size_t i = 0; i < contours.size(); i++)
{
// approximate contour with accuracy proportional
// to the contour perimeter
approxPolyDP(contours[i], approx, arcLength(contours[i], true) * 0.02, true);
// square contours should have 4 vertices after approximation
// relatively large area (to filter out noisy contours)
// and be convex.
// Note: absolute value of an area is used because
// area may be positive or negative - in accordance with the
// contour orientation
if (approx.size() == 4 &&
fabs(contourArea(approx)) > 1000 &&
isContourConvex(approx))
{
double maxCosine = 0;
for (int j = 2; j < 5; j++)
{
// find the maximum cosine of the angle between joint edges
double cosine = fabs(angle(approx[j % 4], approx[j - 2], approx[j - 1]));
maxCosine = MAX(maxCosine, cosine);
}
// if cosines of all angles are small
// (all angles are ~90 degree) then write quandrange
// vertices to resultant sequence
if (maxCosine < 0.3)
squares.push_back(approx);
}
}
}
}
}
int main(int argc, char** argv)
{
static const char* names[] = { "testimg.jpg", 0 };
help(argv[0]);
if (argc > 1)
{
names[0] = argv[1];
names[1] = "0";
}
for (int i = 0; names[i] != 0; i++)
{
string filename = samples::findFile(names[i]);
Mat image = imread(filename, IMREAD_COLOR);
if (image.empty())
{
cout << "Couldn't load " << filename << endl;
continue;
}
vector<vector<Point> > squares;
findSquares(image, squares);
polylines(image, squares, true, Scalar(0, 0, 255), 3, LINE_AA);
imshow(wndname, image);
int c = waitKey();
if (c == 27)
break;
}
return 0;
}
I would like some help to start off.
How could I get some information from 1 of the squares out of the array called "squares" (I am having a difficult time understand what exactly is in the array as well; is it an array of points?)
If something is not clear please let me know and I will try to re-explain.
Thank you in advance
Firstly, you are talking about squares but you are actually detecting rectangles. I provided a shorter code to be able to better answer your questions.
I read the image, apply a Canny filter for binarization and detect all contours. After that I iterate through the contours and find the ones which can be approximated by exactly four points and are convex:
int main(int argc, char** argv)
{
// Reading the images
cv::Mat img = cv::imread("squares_image.jpg", cv::IMREAD_GRAYSCALE);
cv::Mat edge_img;
std::vector <std::vector<cv::Point>> contours;
// Convert the image into a binary image using Canny filter - threshold could be automatically determined using OTSU method
cv::Canny(img, edge_img, 30, 100);
// Find all contours in the Canny image
findContours(edge_img, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);
// Iterate through the contours and test if contours are square
std::vector<std::vector<cv::Point>> all_rectangles;
std::vector<cv::Point> single_rectangle;
for (size_t i = 0; i < contours.size(); i++)
{
// 1. Contours should be approximateable as a polygon
approxPolyDP(contours[i], single_rectangle, arcLength(contours[i], true) * 0.01, true);
// 2. Contours should have exactly 4 vertices and be convex
if (single_rectangle.size() == 4 && cv::isContourConvex(single_rectangle))
{
// 3. Determine if the polygon is really a square/rectangle using its properties (parallelity, angles etc.)
// Not necessary for the provided image
// Push the four points into your vector of squares (could be also std::vector<cv::Rect>)
all_rectangles.push_back(single_rectangle);
}
}
for (size_t num_contour = 0; num_contour < all_rectangles.size(); ++num_contour) {
cv::drawContours(img, all_rectangles, num_contour, cv::Scalar::all(-1));
}
cv::imshow("Detected rectangles", img);
cv::waitKey(0);
return 0;
}
1: The edge of the window is detected as a square, I do not want this.
There are several options depending on your applications. You can filter the outer boundary already using the Canny thresholds, using a different contour retrieval method for finding contours in findContours or by filtering single_rectangle using the area of the found contour (e.g. cv::contourArea(single_rectangle) < 1000).
2: How could I get the centre of 1 square from the squares array?
Since the code is already detecting the four corner points you could e.g. find the intersection of the diagonals. If you know that there are only rectangles in your image you could also try to detect all centroids of the detected contours using the Hu moments.
I am having a difficult time understand what exactly is in the array as well; is it an array of points?
One contour in OpenCV is always represented as a vector of single points. If you are adding multiple contours you are using a vector of vector of points. In the example you provided squares is a vector of a vector of exactly 4 points. You could also use a vector of cv::Rect in this case.
I have created an opencv filter that can detect if a person blinks for Kurento the WebRTC framework. My code works in a standalone opencv app. However, once I converted to the opencv filter for Kurento it started playing up. When the module/filter was compiled without optimisation flags it would briefly detect the face and draw contours around the eyes. However, after compiling the module/filter with optimisation flags, performance improved, but no face was being detected. Here's the code I have in the filter:
void BlinkDetectorOpenCVImpl::process(cv::Mat &mat) {
std::vector <dlib::rectangle> faces;
// Just resize input image if you want
resize(mat, mat, Size(800, 450));
cv_image <rgb_alpha_pixel> cimg(mat);
dlib::array2d<unsigned char> img_gray;
dlib::assign_image(img_gray, cimg);
faces = detector(img_gray);
std::cout << "XXXXXXXXXXXXXXXXXXXXX FACES: " << faces.size() << std::endl;
std::vector <full_object_detection> shapes;
for (unsigned long i = 0; i < faces.size(); ++i) {
full_object_detection shape = pose_model(cimg, faces[i]);
std::vector <Point> left_eye_points = get_points_for_eye(shape, LEFT_EYE_START, LEFT_EYE_END);
std::vector <Point> right_eye_points = get_points_for_eye(shape, RIGHT_EYE_START, RIGHT_EYE_END);
double left_eye_ear = get_eye_aspect_ratio(left_eye_points);
double right_eye_ear = get_eye_aspect_ratio(right_eye_points);
double ear = (left_eye_ear + right_eye_ear) / 2.0;
// Draw left eye
std::vector <std::vector<Point>> contours;
contours.push_back(left_eye_points);
std::vector <std::vector<Point>> hull(1);
convexHull(contours[0], hull[0]);
drawContours(mat, hull, -1, Scalar(0, 255, 0));
// Draw right eye
contours[0] = right_eye_points;
convexHull(contours[0], hull[0]);
drawContours(mat, hull, -1, Scalar(0, 255, 0));
if (ear < EYE_AR_THRESH) {
counter++;
} else {
if (counter >= EYE_AR_CONSEC_FRAMES) {
total++;
/* std::string sJson = "{\"blink\": \"blink\"}";
try
{
onResult event(getSharedFromThis(), onResult::getName(), sJson);
signalonResult(event);
}
catch (std::bad_weak_ptr &e)
{
}*/
}
counter = 0;
}
cv::putText(mat, (boost::format{"Blinks: %d"} % total).str(), cv::Point(10, 30),
cv::FONT_HERSHEY_SIMPLEX,
0.7, Scalar(0, 0, 255), 2);
cv::putText(mat, (boost::format{"EAR: %.2f"} % ear).str(), cv::Point(300, 30),
cv::FONT_HERSHEY_SIMPLEX,
0.7, Scalar(0, 0, 255), 2);
}
}
} /* blinkdetector */
I was able to fix my own problem. I found that instead of resizing the image to an arbitrary resolution you should resize it by half the width and half the height of the actual image resolution. Resizing an image to a lower size makes Dlib face detection fast. So here's what I did to solve the issue:
Mat tmpMat = mat.clone();
resize(tmpMat, tmpMat, Size(tmpMat.size().width / 2, tmpMat.size().height / 2));
I had to clone the image sent by Kurento to my method because for some odd reason the original Mat doesn't show the contours when turned into a Dlib image with cv_image.
I have to write a program that detect 3 types of road signs (speed limit, no parking and warnings). I know how to detect a circle using HoughCircles but I have several images and the parameters for HoughCircles are different for each image. There's a general way to detect circles without changing parameters for each image?
Moreover I need to detect triangle (warning signs) so I'm searching for a general shape detector. Have you any suggestions/code that can help me in this task?
Finally for detect the number on speed limit signs I thought to use SIFT and compare the image with some templates in order to identify the number on the sign. Could it be a good approach?
Thank you for the answer!
I know this is a pretty old question but I had been through the same problem and now I show you how I solved it.
The following images show some of the most accurate results that are displayed by the opencv program.
In the following images the street signs detected are circled with three different colors that distinguish the three kinds of street signs (warning, no parking, speed limit).
Red for warning signs
Blue for no parking signs
Fuchsia for speed limit signs
The speed limit value is written in green above the speed limit signs
[![example][1]][1]
[![example][2]][2]
[![example][3]][3]
[![example][4]][4]
As you can see the program performs quite well, it is able to detect and distinguish the three kinds of sign and to recognize the speed limit value in case of speed limit signs. Everything is done without computing too many false positives when, for instance, in the image there are some signs that do not belong to one of the three categories.
In order to achieve this result the software computes the detection in three main steps.
The first step involves a color based approach where the red objects in the image are detected and their region are extract to be analyzed. This step is particularly useful in order to prevent the detection of false positives, because only a small part of the image is processed.
The second step works with a machine learning algorithm: in particular we use a Cascade Classifier to compute the detection. This operation firstly requires to train the classifiers and on a later stage to use them to detect the signs.
In the last step the speed limit values inside the speed limit signs are read, also in this case through a machine learning algorithm but using the k-nearest neighbor algorithm.
Now we are going to see in detail each step.
COLOR BASED STEP
Since the street signs are always circled by a red frame, we can afford to take out and analyze only the regions where the red objects are detected.
In order to select the red objects, we consider all the ranges of the red color: even if this may produce some false positives, they will be easily discarded in the next steps.
inRange(image, Scalar(0, 70, 50), Scalar(10, 255, 255), mask1);
inRange(image, Scalar(170, 70, 50), Scalar(180, 255, 255), mask2);
In the image below we can see an example of the red objects detected with this method.
After having found the red pixels we can gather them to find the regions using a clustering algorithm, I use the method
partition(<#_ForwardIterator __first#>, _ForwardIterator __last, <#_Predicate __pred#>)
After the execution of this method we can save all the points in the same cluster in a vector (one for each cluster) and extract the bounding boxes which represent the
regions to be analyzed in the next step.
HAAR CASCADE CLASSIFIERS FOR SIGNS DETECTION
This is the real detection step where the street signs are detected. In order to perform a cascade classifier the first step consist in building a dataset of positives and negatives images. Now I explain how I have built my own datasets of images.
The first thing to note is that we need to train three different Haar cascades in order to distinguish between the three kind of signs that we have to detect, hence we must repeat the following steps for each of the three kinds of sign.
We need two datasets: one for the positive samples (which must be a set of images that contains the road signs that we are going to detect) and another one for the negative samples which can be any kind of image without street signs.
After collecting a set of 100 images for the positive samples and a set of 200 images for the negatives in two different folders, we need to write two text files:
Signs.info which contains a list of file names like the one below,
one for each positive sample in the positive folder.
pos/image_name.png 1 0 0 50 45
Here, the numbers after the name represent respectively the number
of street signs in the image, the coordinate of the upper left
corner of the street sign, his height and his width.
Bg.txt which contains a list of file names like the one below, one
for each sign in the negative folder.
neg/street15.png
With the command line below we generate the .vect file which contains all the information that the software retrieves from the positive samples.
opencv_createsamples -info sign.info -num 100 -w 50 -h 50 -vec signs.vec
Afterwards we train the cascade classifier with the following command:
opencv_traincascade -data data -vec signs.vec -bg bg.txt -numPos 60 -numNeg 200 -numStages 15 -w 50 -h 50 -featureType LBP
where the number of stages indicates the number of classifiers that will be generated in order to build the cascade.
At the end of this process we gain a file cascade.xml which will be used from the CascadeClassifier program in order to detect the objects in the image.
Now we have trained our algorithm and we can declare a CascadeClassifier for each kind of street sign, than we detect the signs in the image through
detectMultiScale(<#InputArray image#>, <#std::vector<Rect> &objects#>)
this method creates a Rect around each object that has been detected.
It is important to note that exactly as every machine learning algorithm, in order to perform well, we need a large number of samples in the dataset. The dataset that I have built, is not extremely large, thus in some situations it is not able to detect all the signs. This mostly happens when a small part of the street sign is not visible in the image like in the warning sign below:
I have expanded my dataset up to the point where I have obtained a fairly accurate result without
too many errors.
SPEED LIMIT VALUE DETECTION
Like for the street signs detection also here I used a machine learning algorithm but with a different approach. After some work, I realized that an OCR (tesseract) solution does not perform well, so I decided to build my own ocr software.
For the machine learning algorithm I took the image below as training data which contains some speed limit values:
The amount of training data is small. But, since in speed limit signs all letters have the same font, it is not a huge problem.
To prepare the data for training, I made a small code in OpenCV. It does the following things:
It loads the image on the left;
It selects the digits (obviously by contour finding and applying constraints on area and height of letters to avoid false detections).
It draws the bounding rectangle around one letter and it waits for the key to be manually pressed. This time the user presses the digit key corresponding to the letter in box by himself.
Once the corresponding digit key is pressed, it saves 100 pixel values in an array and the correspondent manually entered digit in another array.
Eventually it saves both the arrays in separate txt files.
Following the manual digit classification all the digits in the train data( train.png) are manually labeled, and the image will look like the one below.
Now we enter into training and testing part.
For training we do as follows:
Load the txt files we already saved earlier
Create an instance of classifier that we are going to use ( KNearest)
Then we use KNearest.train function to train the data
Now the detection:
We load the image with the speed limit sign detected
Process the image as before and extract each digit using contour methods
Draw bounding box for it, then resize to 10x10, and store its pixel values in an array as done earlier.
Then we use KNearest.find_nearest() function to find the nearest item to the one we gave.
And it recognizes the correct digit.
I tested this little OCR on many images, and just with this small dataset I have obtained an accuracy of about 90%.
CODE
Below I post all my openCv c++ code in a single class, following my instruction you should be able to achive my result.
#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
#include <cmath>
#include <stdlib.h>
#include "opencv2/core/core.hpp"
#include "opencv2/highgui.hpp"
#include <string.h>
#include <opencv2/ml/ml.hpp>
using namespace std;
using namespace cv;
std::vector<cv::Rect> getRedObjects(cv::Mat image);
vector<Mat> detectAndDisplaySpeedLimit( Mat frame );
vector<Mat> detectAndDisplayNoParking( Mat frame );
vector<Mat> detectAndDisplayWarning( Mat frame );
void trainDigitClassifier();
string getDigits(Mat image);
vector<Mat> loadAllImage();
int getSpeedLimit(string speed);
//path of the haar cascade files
String no_parking_signs_cascade = "/Users/giuliopettenuzzo/Desktop/cascade_classifiers/no_parking_cascade.xml";
String speed_signs_cascade = "/Users/giuliopettenuzzo/Desktop/cascade_classifiers/speed_limit_cascade.xml";
String warning_signs_cascade = "/Users/giuliopettenuzzo/Desktop/cascade_classifiers/warning_cascade.xml";
CascadeClassifier speed_limit_cascade;
CascadeClassifier no_parking_cascade;
CascadeClassifier warning_cascade;
int main(int argc, char** argv)
{
//train the classifier for digit recognition, this require a manually train, read the report for more details
trainDigitClassifier();
cv::Mat sceneImage;
vector<Mat> allImages = loadAllImage();
for(int i = 0;i<=allImages.size();i++){
sceneImage = allImages[i];
//load the haar cascade files
if( !speed_limit_cascade.load( speed_signs_cascade ) ){ printf("--(!)Error loading\n"); return -1; };
if( !no_parking_cascade.load( no_parking_signs_cascade ) ){ printf("--(!)Error loading\n"); return -1; };
if( !warning_cascade.load( warning_signs_cascade ) ){ printf("--(!)Error loading\n"); return -1; };
Mat scene = sceneImage.clone();
//detect the red objects
std::vector<cv::Rect> allObj = getRedObjects(scene);
//use the three cascade classifier for each object detected by the getRedObjects() method
for(int j = 0;j<allObj.size();j++){
Mat img = sceneImage(Rect(allObj[j]));
vector<Mat> warningVec = detectAndDisplayWarning(img);
if(warningVec.size()>0){
Rect box = allObj[j];
}
vector<Mat> noParkVec = detectAndDisplayNoParking(img);
if(noParkVec.size()>0){
Rect box = allObj[j];
}
vector<Mat> speedLitmitVec = detectAndDisplaySpeedLimit(img);
if(speedLitmitVec.size()>0){
Rect box = allObj[j];
for(int i = 0; i<speedLitmitVec.size();i++){
//get speed limit and skatch it in the image
int digit = getSpeedLimit(getDigits(speedLitmitVec[i]));
if(digit > 0){
Point point = box.tl();
point.y = point.y + 30;
cv::putText(sceneImage,
"SPEED LIMIT " + to_string(digit),
point,
cv::FONT_HERSHEY_COMPLEX_SMALL,
0.7,
cv::Scalar(0,255,0),
1,
cv::CV__CAP_PROP_LATEST);
}
}
}
}
imshow("currentobj",sceneImage);
waitKey(0);
}
}
/*
* detect the red object in the image given in the param,
* return a vector containing all the Rect of the red objects
*/
std::vector<cv::Rect> getRedObjects(cv::Mat image)
{
Mat3b res = image.clone();
std::vector<cv::Rect> result;
cvtColor(image, image, COLOR_BGR2HSV);
Mat1b mask1, mask2;
//ranges of red color
inRange(image, Scalar(0, 70, 50), Scalar(10, 255, 255), mask1);
inRange(image, Scalar(170, 70, 50), Scalar(180, 255, 255), mask2);
Mat1b mask = mask1 | mask2;
Mat nonZeroCoordinates;
vector<Point> pts;
findNonZero(mask, pts);
for (int i = 0; i < nonZeroCoordinates.total(); i++ ) {
cout << "Zero#" << i << ": " << nonZeroCoordinates.at<Point>(i).x << ", " << nonZeroCoordinates.at<Point>(i).y << endl;
}
int th_distance = 2; // radius tolerance
// Apply partition
// All pixels within the radius tolerance distance will belong to the same class (same label)
vector<int> labels;
// With lambda function (require C++11)
int th2 = th_distance * th_distance;
int n_labels = partition(pts, labels, [th2](const Point& lhs, const Point& rhs) {
return ((lhs.x - rhs.x)*(lhs.x - rhs.x) + (lhs.y - rhs.y)*(lhs.y - rhs.y)) < th2;
});
// You can save all points in the same class in a vector (one for each class), just like findContours
vector<vector<Point>> contours(n_labels);
for (int i = 0; i < pts.size(); ++i){
contours[labels[i]].push_back(pts[i]);
}
// Get bounding boxes
vector<Rect> boxes;
for (int i = 0; i < contours.size(); ++i)
{
Rect box = boundingRect(contours[i]);
if(contours[i].size()>500){//prima era 1000
boxes.push_back(box);
Rect enlarged_box = box + Size(100,100);
enlarged_box -= Point(30,30);
if(enlarged_box.x<0){
enlarged_box.x = 0;
}
if(enlarged_box.y<0){
enlarged_box.y = 0;
}
if(enlarged_box.height + enlarged_box.y > res.rows){
enlarged_box.height = res.rows - enlarged_box.y;
}
if(enlarged_box.width + enlarged_box.x > res.cols){
enlarged_box.width = res.cols - enlarged_box.x;
}
Mat img = res(Rect(enlarged_box));
result.push_back(enlarged_box);
}
}
Rect largest_box = *max_element(boxes.begin(), boxes.end(), [](const Rect& lhs, const Rect& rhs) {
return lhs.area() < rhs.area();
});
//draw the rects in case you want to see them
for(int j=0;j<=boxes.size();j++){
if(boxes[j].area() > largest_box.area()/3){
rectangle(res, boxes[j], Scalar(0, 0, 255));
Rect enlarged_box = boxes[j] + Size(20,20);
enlarged_box -= Point(10,10);
rectangle(res, enlarged_box, Scalar(0, 255, 0));
}
}
rectangle(res, largest_box, Scalar(0, 0, 255));
Rect enlarged_box = largest_box + Size(20,20);
enlarged_box -= Point(10,10);
rectangle(res, enlarged_box, Scalar(0, 255, 0));
return result;
}
/*
* code for detect the speed limit sign , it draws a circle around the speed limit signs
*/
vector<Mat> detectAndDisplaySpeedLimit( Mat frame )
{
std::vector<Rect> signs;
vector<Mat> result;
Mat frame_gray;
cvtColor( frame, frame_gray, CV_BGR2GRAY );
//normalizes the brightness and increases the contrast of the image
equalizeHist( frame_gray, frame_gray );
//-- Detect signs
speed_limit_cascade.detectMultiScale( frame_gray, signs, 1.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
cout << speed_limit_cascade.getFeatureType();
for( size_t i = 0; i < signs.size(); i++ )
{
Point center( signs[i].x + signs[i].width*0.5, signs[i].y + signs[i].height*0.5 );
ellipse( frame, center, Size( signs[i].width*0.5, signs[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
Mat resultImage = frame(Rect(center.x - signs[i].width*0.5,center.y - signs[i].height*0.5,signs[i].width,signs[i].height));
result.push_back(resultImage);
}
return result;
}
/*
* code for detect the warning sign , it draws a circle around the warning signs
*/
vector<Mat> detectAndDisplayWarning( Mat frame )
{
std::vector<Rect> signs;
vector<Mat> result;
Mat frame_gray;
cvtColor( frame, frame_gray, CV_BGR2GRAY );
equalizeHist( frame_gray, frame_gray );
//-- Detect signs
warning_cascade.detectMultiScale( frame_gray, signs, 1.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
cout << warning_cascade.getFeatureType();
Rect previus;
for( size_t i = 0; i < signs.size(); i++ )
{
Point center( signs[i].x + signs[i].width*0.5, signs[i].y + signs[i].height*0.5 );
Rect newRect = Rect(center.x - signs[i].width*0.5,center.y - signs[i].height*0.5,signs[i].width,signs[i].height);
if((previus & newRect).area()>0){
previus = newRect;
}else{
ellipse( frame, center, Size( signs[i].width*0.5, signs[i].height*0.5), 0, 0, 360, Scalar( 0, 0, 255 ), 4, 8, 0 );
Mat resultImage = frame(newRect);
result.push_back(resultImage);
previus = newRect;
}
}
return result;
}
/*
* code for detect the no parking sign , it draws a circle around the no parking signs
*/
vector<Mat> detectAndDisplayNoParking( Mat frame )
{
std::vector<Rect> signs;
vector<Mat> result;
Mat frame_gray;
cvtColor( frame, frame_gray, CV_BGR2GRAY );
equalizeHist( frame_gray, frame_gray );
//-- Detect signs
no_parking_cascade.detectMultiScale( frame_gray, signs, 1.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
cout << no_parking_cascade.getFeatureType();
Rect previus;
for( size_t i = 0; i < signs.size(); i++ )
{
Point center( signs[i].x + signs[i].width*0.5, signs[i].y + signs[i].height*0.5 );
Rect newRect = Rect(center.x - signs[i].width*0.5,center.y - signs[i].height*0.5,signs[i].width,signs[i].height);
if((previus & newRect).area()>0){
previus = newRect;
}else{
ellipse( frame, center, Size( signs[i].width*0.5, signs[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 0 ), 4, 8, 0 );
Mat resultImage = frame(newRect);
result.push_back(resultImage);
previus = newRect;
}
}
return result;
}
/*
* train the classifier for digit recognition, this could be done only one time, this method save the result in a file and
* it can be used in the next executions
* in order to train user must enter manually the corrisponding digit that the program shows, press space if the red box is just a point (false positive)
*/
void trainDigitClassifier(){
Mat thr,gray,con;
Mat src=imread("/Users/giuliopettenuzzo/Desktop/all_numbers.png",1);
cvtColor(src,gray,CV_BGR2GRAY);
threshold(gray,thr,125,255,THRESH_BINARY_INV); //Threshold to find contour
imshow("ci",thr);
waitKey(0);
thr.copyTo(con);
// Create sample and label data
vector< vector <Point> > contours; // Vector for storing contour
vector< Vec4i > hierarchy;
Mat sample;
Mat response_array;
findContours( con, contours, hierarchy,CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE ); //Find contour
for( int i = 0; i< contours.size(); i=hierarchy[i][0] ) // iterate through first hierarchy level contours
{
Rect r= boundingRect(contours[i]); //Find bounding rect for each contour
rectangle(src,Point(r.x,r.y), Point(r.x+r.width,r.y+r.height), Scalar(0,0,255),2,8,0);
Mat ROI = thr(r); //Crop the image
Mat tmp1, tmp2;
resize(ROI,tmp1, Size(10,10), 0,0,INTER_LINEAR ); //resize to 10X10
tmp1.convertTo(tmp2,CV_32FC1); //convert to float
imshow("src",src);
int c=waitKey(0); // Read corresponding label for contour from keyoard
c-=0x30; // Convert ascii to intiger value
response_array.push_back(c); // Store label to a mat
rectangle(src,Point(r.x,r.y), Point(r.x+r.width,r.y+r.height), Scalar(0,255,0),2,8,0);
sample.push_back(tmp2.reshape(1,1)); // Store sample data
}
// Store the data to file
Mat response,tmp;
tmp=response_array.reshape(1,1); //make continuous
tmp.convertTo(response,CV_32FC1); // Convert to float
FileStorage Data("TrainingData.yml",FileStorage::WRITE); // Store the sample data in a file
Data << "data" << sample;
Data.release();
FileStorage Label("LabelData.yml",FileStorage::WRITE); // Store the label data in a file
Label << "label" << response;
Label.release();
cout<<"Training and Label data created successfully....!! "<<endl;
imshow("src",src);
waitKey(0);
}
/*
* get digit from the image given in param, using the classifier trained before
*/
string getDigits(Mat image)
{
Mat thr1,gray1,con1;
Mat src1 = image.clone();
cvtColor(src1,gray1,CV_BGR2GRAY);
threshold(gray1,thr1,125,255,THRESH_BINARY_INV); // Threshold to create input
thr1.copyTo(con1);
// Read stored sample and label for training
Mat sample1;
Mat response1,tmp1;
FileStorage Data1("TrainingData.yml",FileStorage::READ); // Read traing data to a Mat
Data1["data"] >> sample1;
Data1.release();
FileStorage Label1("LabelData.yml",FileStorage::READ); // Read label data to a Mat
Label1["label"] >> response1;
Label1.release();
Ptr<ml::KNearest> knn(ml::KNearest::create());
knn->train(sample1, ml::ROW_SAMPLE,response1); // Train with sample and responses
cout<<"Training compleated.....!!"<<endl;
vector< vector <Point> > contours1; // Vector for storing contour
vector< Vec4i > hierarchy1;
//Create input sample by contour finding and cropping
findContours( con1, contours1, hierarchy1,CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE );
Mat dst1(src1.rows,src1.cols,CV_8UC3,Scalar::all(0));
string result;
for( int i = 0; i< contours1.size(); i=hierarchy1[i][0] ) // iterate through each contour for first hierarchy level .
{
Rect r= boundingRect(contours1[i]);
Mat ROI = thr1(r);
Mat tmp1, tmp2;
resize(ROI,tmp1, Size(10,10), 0,0,INTER_LINEAR );
tmp1.convertTo(tmp2,CV_32FC1);
Mat bestLabels;
float p=knn -> findNearest(tmp2.reshape(1,1),4, bestLabels);
char name[4];
sprintf(name,"%d",(int)p);
cout << "num = " << (int)p;
result = result + to_string((int)p);
putText( dst1,name,Point(r.x,r.y+r.height) ,0,1, Scalar(0, 255, 0), 2, 8 );
}
imwrite("dest.jpg",dst1);
return result ;
}
/*
* from the digits detected, it returns a speed limit if it is detected correctly, -1 otherwise
*/
int getSpeedLimit(string numbers){
if ((numbers.find("30") != std::string::npos) || (numbers.find("03") != std::string::npos)) {
return 30;
}
if ((numbers.find("50") != std::string::npos) || (numbers.find("05") != std::string::npos)) {
return 50;
}
if ((numbers.find("80") != std::string::npos) || (numbers.find("08") != std::string::npos)) {
return 80;
}
if ((numbers.find("70") != std::string::npos) || (numbers.find("07") != std::string::npos)) {
return 70;
}
if ((numbers.find("90") != std::string::npos) || (numbers.find("09") != std::string::npos)) {
return 90;
}
if ((numbers.find("100") != std::string::npos) || (numbers.find("001") != std::string::npos)) {
return 100;
}
if ((numbers.find("130") != std::string::npos) || (numbers.find("031") != std::string::npos)) {
return 130;
}
return -1;
}
/*
* load all the image in the file with the path hard coded below
*/
vector<Mat> loadAllImage(){
vector<cv::String> fn;
glob("/Users/giuliopettenuzzo/Desktop/T1/dataset/*.jpg", fn, false);
vector<Mat> images;
size_t count = fn.size(); //number of png files in images folder
for (size_t i=0; i<count; i++)
images.push_back(imread(fn[i]));
return images;
}
maybe you should try implementing the ransac algorithm, if you are using color images, migt be a good idea (if you are in europe) to get the red channel only since the speed limits are surrounded by a red cricle (or a thin white i think also).
For that you need to filter the image to get the edges, (canny filter).
Here are some useful links:
OpenCV detect partial circle with noise
https://hal.archives-ouvertes.fr/hal-00982526/document
Finally for the numbers detection i think its ok. Other approach is to use something like Viola-Jones algorithm to detect the signals, with pretrained existing models... It's up to you!
I work on a method that detect an object from my webcam (which is already color filtered) with OpenCV. It already works fine, but after two minutes the fps rate decreases for some reason.
Some facts:
OS: Windows 8.1
OpenCV: 2.4.10
IDE: Visual Studio 2013
I think it has something to do with the memory allocation, because as I said it works fine for the first two minutes.
In order to find out if it is computer related, I already tested it on another pc, but there also the same error occurs.
void findAndDrawRect(Mat* imgThreshold, Mat* imgOriginal, SerialPort^ arduino){
// contains rectangles - each rectangle exists of points- of imgThreshold
std::vector<std::vector<cv::Point> > contours;
// contains hierarchical information from black-n white image
std::vector<cv::Vec4i> hierarchy;
cv::findContours(*imgThreshold,
contours,
hierarchy,
CV_RETR_TREE,
CV_CHAIN_APPROX_SIMPLE,
cv::Point(0, 0));
std::vector<std::vector<cv::Point> > contours_poly(contours.size());
// holds place for rectangle of each contour on the image
cv::Rect boundRect;
// tempory variable to identify largest rectangle
double refArea = 0;
double area = 0;
// pointer to largest rectangle
cv::Rect* obj = NULL;
for (int i = 0; i < contours.size(); i++)
{
// approximates a polygonal curve for each contour of third grade
cv::approxPolyDP(cv::Mat(contours[i]), contours_poly[i], 3, true);
// converts polynom of thrid grade to a rectangle
boundRect = cv::boundingRect(cv::Mat(contours_poly[i]));
// calculation of the area
area = boundRect.width * boundRect.height;
// if area is less than MIN_OBJECT_AREA, the rectangle is probably a noise
// if area is larger than MAX_OBJECT_AREA, size of rectangle is larger than 2/3 of screen = filter fail or approching cyclist is too near
if (area > MIN_OBJECT_AREA && area<MAX_OBJECT_AREA && area>refArea){
refArea = area;
obj = &boundRect;
}
}
// if rectangle exists, estimate the distance to the object and send warnings to the helmet
if (obj != NULL){
//calculation of distance to object
double currentDistance = (objRealWidth * focal_length) / obj->width;
// focal length = current measured width of A4 Paper in pixel * distance to camera(measured) / real width of A4 (measured)
focal_length = (measuredPixelWidth * distanceToCam) / objRealWidth;
// printf("width: %d \n", obj->width);
// convert from cm to m
currentDistance /= 100;
sprintf_s(Distance, "Distance :%f m", currentDistance);
putText(*imgOriginal, Distance, Point(0, 50), 1, 1, Scalar(0, 255, 0), 2);
rectangle(*imgOriginal, obj->tl(), obj->br(), cv::Scalar(0, 0, 255), 2, 8, 0);
switch (modus){
case 0: // NOISE
arduino->WriteLine("1");
objDetected = '1';
break;
case 1: // VIBRATION
arduino->WriteLine("2");
objDetected = '2';
break;
case 2: // LED
arduino->WriteLine("3");
objDetected = '3';
break;
case 3:
//measurement without alerting subject
arduino->WriteLine("4");
objDetected = '4';
break;
default:
break;
}
}
else{
// no object recognized
arduino->WriteLine("0");
objDetected = '0';
}
}
After this part of my function, I just draw a rectangle over the picture, which is referenced by the variable obj in the last line of the code snippet.
EDIT: The global variable 'objDetected' is used for communication via named pipes:
for (;;){
// This call blocks until a client process reads all the data
//const wchar_t *data = L"*** Hello Pipe ONUR ***";
char *data = &objDetected;
DWORD numBytesWritten = 0;
result = WriteFile(
pipe, // handle to our outbound pipe
data, // data to send
1, // length of data to send (bytes) + terminating !
&numBytesWritten, // will store actual amount of data sent
NULL // not using overlapped IO
);
I really don't know how to remove this problem, since I tried everything I could think of. E.g.: I removed all unused vector entries during calculation, but it doesn't help, so I removed it from the code)
If there any questions feel free to ask and thank you in advance!
Greetings,
a frustrated OpenCV Newbie!
The complete error:
OpenCV Error: Assertion failed (nimages > 0 && nimages ==
(int)imagePoints1.total() && (!imgPtMat2 || nimages ==
(int)imagePoints2.total())) in collectCalibrationData, file C:\OpenCV
\sources\modules\calib3d\src\calibration.cpp, line 3164
The code:
cv::VideoCapture kalibrowanyPlik; //the video
cv::Mat frame;
cv::Mat testTwo; //undistorted
cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) << 2673.579, 0, 1310.689, 0, 2673.579, 914.941, 0, 0, 1);
cv::Mat distortMat = (cv::Mat_<double>(1, 4) << -0.208143, 0.235290, 0.001005, 0.001339);
cv::Mat intrinsicMatrix = (cv::Mat_<double>(3, 3) << 1, 0, 0, 0, 1, 0, 0, 0, 1);
cv::Mat distortCoeffs = cv::Mat::zeros(8, 1, CV_64F);
//there are two sets for testing purposes. Values for the first two came from GML camera calibration app.
std::vector<cv::Mat> rvecs;
std::vector<cv::Mat> tvecs;
std::vector<std::vector<cv::Point2f> > imagePoints;
std::vector<std::vector<cv::Point3f> > objectPoints;
kalibrowanyPlik.open("625.avi");
//cv::namedWindow("Distorted", CV_WINDOW_AUTOSIZE); //gotta see things
//cv::namedWindow("Undistorted", CV_WINDOW_AUTOSIZE);
int maxFrames = kalibrowanyPlik.get(CV_CAP_PROP_FRAME_COUNT);
int success = 0; //so we can do the calibration only after we've got a bunch
for(int i=0; i<maxFrames-1; i++) {
kalibrowanyPlik.read(frame);
std::vector<cv::Point2f> corners; //creating these here so they're effectively reset each time
std::vector<cv::Point3f> objectCorners;
int sizeX = kalibrowanyPlik.get(CV_CAP_PROP_FRAME_WIDTH); //imageSize
int sizeY = kalibrowanyPlik.get(CV_CAP_PROP_FRAME_HEIGHT);
cv::cvtColor(frame, frame, CV_BGR2GRAY); //must be gray
cv::Size patternsize(9,6); //interior number of corners
bool patternfound = cv::findChessboardCorners(frame, patternsize, corners, cv::CALIB_CB_ADAPTIVE_THRESH + cv::CALIB_CB_NORMALIZE_IMAGE + cv::CALIB_CB_FAST_CHECK); //finding them corners
if(patternfound == false) { //gotta know
qDebug() << "failure";
}
if(patternfound) {
qDebug() << "success!";
std::vector<cv::Point3f> objectCorners; //low priority issue - if I don't do this here, it becomes empty. Not sure why.
for(int y=0; y<6; ++y) {
for(int x=0; x<9; ++x) {
objectCorners.push_back(cv::Point3f(x*28,y*28,0)); //filling the array
}
}
cv::cornerSubPix(frame, corners, cv::Size(11, 11), cv::Size(-1, -1),
cv::TermCriteria(CV_TERMCRIT_EPS + CV_TERMCRIT_ITER, 30, 0.1));
cv::cvtColor(frame, frame, CV_GRAY2BGR); //I don't want gray lines
imagePoints.push_back(corners); //filling array of arrays with pixel coord array
objectPoints.push_back(objectCorners); //filling array of arrays with real life coord array, or rather copies of the same thing over and over
cout << corners << endl << objectCorners;
cout << endl << objectCorners.size() << "___" << objectPoints.size() << "___" << corners.size() << "___" << imagePoints.size() << endl;
cv::drawChessboardCorners(frame, patternsize, cv::Mat(corners), patternfound); //drawing.
if(success > 5) {
double rms = cv::calibrateCamera(objectPoints, corners, cv::Size(sizeX, sizeY), intrinsicMatrix, distortCoeffs, rvecs, tvecs, cv::CALIB_USE_INTRINSIC_GUESS);
//error - caused by passing CORNERS instead of IMAGEPOINTS. Also, imageSize is 640x480, and I've set the central point to 1310... etc
cout << endl << intrinsicMatrix << endl << distortCoeffs << endl;
cout << "\nrms - " << rms << endl;
}
success = success + 1;
//cv::imshow("Distorted", frame);
//cv::imshow("Undistorted", testTwo);
}
}
I've done a little bit of reading (This was an especially informative read), including over a dozen threads made here on StackOverflow, and all I found is that this error is produced by either by uneven imagePoints and objectPoints or by them being partially null or empty or zero (and links to tutorials that don't help). None of that is the case - the output from .size() check is:
54___7___54___7
For objectCorners (real life coords), objectPoints (number of arrays inserted) and the same for corners (pixel coords) and imagePoints. They're not empty either, the output is:
(...)
277.6792, 208.92903;
241.83429, 208.93048;
206.99866, 208.84637;
(...)
84, 56, 0;
112, 56, 0;
140, 56, 0;
168, 56, 0;
(...)
A sample frame:
I know it's a mess, but so far I'm trying to complete the code rather than get an accurate reading.
Each one hs exactly 54 lines of that. Does anyone have any ideas on what is causing the error? I'm using OpenCV 2.4.8 and Qt Creator 5.4 on Windows 7.
First of all, corners and imagePoints have to be switched, as you have aready noticed.
In most cases (if not all), size <= 25 is enough to get a good result. Focal length around 633 is not wierd, it means the focal length is 633 * sensor size. The CCD or CMOS size must be somewhere on the INSTRUCTIONS along with your camera. Find it out , times 633, the result is your focal length.
One suggestion to reduce the number of images used: using images taken from different viewpoints. 10 images from 10 different viewpoints bring much better result than 100 images from the same ( or nearby ) viewpoints. That is one of the reasons why video is not a good input. I guess with your code, all the images passed to calibratecamera may be from nearby viewpoints. If so, the calibration accuracy degrades.