How to remove all vertical and horizontal lines that form boxes/tables
I have searched and tried.. But can't make it work
Have tried to search for it the last couple of days.. have found a few examples which doesn't work.. Have tried to get the pieces together..
cv:Mat img = cv::imread(input, CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat grad;
cv::Mat morphKernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(3, 3));
cv::morphologyEx(img, grad, cv::MORPH_GRADIENT, morphKernel);
cv::Mat res;
cv::threshold(grad, res, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
// find contours
cv::Mat mask = cv::Mat::zeros(res.size(), CV_8UC1);
std::vector<std::vector<cv::Point>> contours;
std::vector<cv::Vec4i> hierarchy;
cv::findContours(res, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
for(int i = 0; i < contours.size(); i++){
cv::Mat approx;
double peri = cv::arcLength(contours[i], true);
cv::approxPolyDP(contours[i], approx, 0.04 * peri, true);
int num_vertices = approx.rows;
if(num_vertices == 4){
cv::Rect rect = cv::boundingRect(contours[i]);
// this is a rectangle
}
}
You could try something like that :
threshold your image
compute connected components
remove particules for which at least 3 of 4 bounding box tops are in touch with particule
This should give you something like that :
Here is the associated source code :
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <limits>
using namespace cv;
struct BBox {
BBox() :
_xMin(std::numeric_limits<int>::max()),
_xMax(std::numeric_limits<int>::min()),
_yMin(std::numeric_limits<int>::max()),
_yMax(std::numeric_limits<int>::min())
{}
int _xMin;
int _xMax;
int _yMin;
int _yMax;
};
int main()
{
// read input image
Mat inputImg = imread("test3_1.tif", IMREAD_GRAYSCALE);
// create binary image
Mat binImg;
threshold(inputImg, binImg, 254, 1, THRESH_BINARY_INV);
// compute connected components
Mat labelImg;
const int nbComponents = connectedComponents(binImg, labelImg, 8, CV_32S);
// compute associated bboxes
std::vector<BBox> bboxColl(nbComponents);
for (int y = 0; y < labelImg.rows; ++y) {
for (int x = 0; x < labelImg.cols; ++x) {
const int curLabel = labelImg.at<int>(y, x);
BBox& curBBox = bboxColl[curLabel];
if (curBBox._xMin > x)
curBBox._xMin = x;
if (curBBox._xMax < x)
curBBox._xMax = x;
if (curBBox._yMin > y)
curBBox._yMin = y;
if (curBBox._yMax < y)
curBBox._yMax = y;
}
}
// parse all labels
std::vector<bool> lutTable(nbComponents);
for (int i=0; i<nbComponents; ++i) {
// check current label width
const BBox& curBBox = bboxColl[i];
if (curBBox._xMax - curBBox._xMin > labelImg.cols * 0.3)
lutTable[i] = false;
else
lutTable[i] = true;
}
// create output image
Mat resImg(binImg);
MatConstIterator_<int> iterLab = labelImg.begin<int>();
MatIterator_<unsigned char> iterRes = resImg.begin<unsigned char>();
while (iterLab != labelImg.end<int>()) {
if (lutTable[*iterLab] == true)
*iterRes = 1;
else
*iterRes = 0;
++iterLab;
++iterRes;
}
// write result
imwrite("resImg3_1.tif", resImg);
}
I simply remove all labels for which with is greater than 30% of image total width. Your image is quite noisy so I can't use bounding box tops touches as said before, sorry...
Don't know if this will match with all your images but you could add some geometrical filters to improve this first version.
Regards,
You can use LineSegmentDetector for this purpose:
import numpy as np
import cv2
image = cv2.imread("image.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# This is the detector, you might have to play with the parameters
lsd = cv2.createLineSegmentDetector(0, _scale=0.6)
lines, widths, _, _ = lsd.detect(gray)
if lines is not None:
for i in range(0, len(lines)):
l = lines[i][0]
# Much slower version of Euclidean distance
if np.sqrt((l[0]-l[2])**2 + (l[1]-l[3])**2) > 50:
# You might have to tweak the threshold as well for other images
cv2.line(image, (l[0], l[1]), (l[2], l[3]), (255, 255, 255), 3,
cv2.LINE_AA)
cv2.imwrite("result.png", image)
Output:
As you can see, the lines aren't completely removed in the top image so I am leaving the tweaking part to you. Hope it helps!
I'd like to use this answer box to make a few comments.
First off, its way easier to see progress, if you can easily see what the output looks like visually. With that in mind, here is an update to your code with an emphasis on viewing interim results. I'm using VS Studio Community 2017, and OpenCV version 4.0.1 (64bit) in Win10 for anyone who wants to repeat this exercise. There were a few routines used that required updates for OpenCV 4...
#include "pch.h"
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
int main()
{
cv::Mat img = cv::imread("0zx9Q.png", cv::IMREAD_GRAYSCALE ); // --> Contour size = 0x000000e7 hex (231 each)
// cv::Mat img = cv::imread("0zx9Q.png", cv::IMREAD_REDUCED_GRAYSCALE_2); // --> Contour size = 0x00000068 hex (104 each)
// cv::Mat img = cv::imread("0zx9Q.png", cv::IMREAD_REDUCED_GRAYSCALE_4); // --> Contour size = 0x0000001f hex (31 each)
// cv::Mat img = cv::imread("0zx9Q.png", cv::IMREAD_REDUCED_GRAYSCALE_8); // --> Contour size = 0x00000034 hex (52 each)
if (!img.data) // Check for invalid input
{
std::cout << "Could not open or find the image" << std::endl;
return -1;
}
// cv::namedWindow("Display Window - GrayScale Image", cv::WINDOW_NORMAL); // Create a window for display.
// cv::imshow("Display Window - GrayScale Image", img); // Show our image inside it.
// cv::waitKey(0); // Wait for a keystroke in the window
cv::Mat imgOriginal = cv::imread("0zx9Q.png", cv::IMREAD_UNCHANGED);
cv::namedWindow("Display Window of Original Document", cv::WINDOW_NORMAL); // Create a window for display.
cv::Mat grad;
cv::Mat morphKernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(25, 25));
// MORPH_ELLIPSE, contourSize: 0x00000005 when 60,60... but way slow...
// MORPH_ELLIPSE, contourSize: 0x00000007 when 30,30...
// MORPH_ELLIPSE, contourSize: 0x00000007 when 20,20...
// MORPH_ELLIPSE, contourSize: 0x0000000a when 15,15...
// MORPH_ELLIPSE, contourSize: 0x0000007a when 5,5...
// MORPH_ELLIPSE, contourSize: 0x000000e7 when 3,3 and IMREAD_GRAYSCALE
// MORPH_CROSS, contourSize: 0x0000008e when 5,5
// MORPH_CROSS, contourSize: 0x00000008 when 25,25
// MORPH_RECT, contourSize: 0x00000007 when 25,25
cv::morphologyEx(img, grad, cv::MORPH_GRADIENT, morphKernel);
cv::Mat res;
cv::threshold(grad, res, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
// find contours
cv::Mat mask = cv::Mat::zeros(res.size(), CV_8UC1);
std::vector<std::vector<cv::Point>> contours;
std::vector<cv::Vec4i> hierarchy;
cv::findContours(res, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
int contourSize = contours.size();
std::cout << " There are a total of " << contourSize << " contours. \n";
for (int i = 0; i < contourSize; i++) {
cv::Mat approx;
double peri = cv::arcLength(contours[i], true);
cv::approxPolyDP(contours[i], approx, 0.04 * peri, true);
int num_vertices = approx.rows;
std::cout << " Contour # " << i << " has " << num_vertices << " vertices.\n";
if (num_vertices == 4) {
cv::Rect rect = cv::boundingRect(contours[i]);
cv::rectangle(imgOriginal, rect, cv::Scalar(255, 0, 0), 4);
}
}
cv::imshow("Display Window of Original Document", imgOriginal); // Show our image inside it.
cv::waitKey(0); // Wait for a keystroke in the window
}
With that said, the parameters for getStructuringElement() matter huge. I spent a lot of time trying different choices, with very mixed results. And it turns out there are a whole lot of findContours() responses that don't have four vertices. I suspect the whole findContours() approach is probably flawed. I often would get false rectangles identified around text characters in words and phrases. Additionally the lighter lines surrounding some boxed areas would be ignored.
Instead, I think I'd be looking hard at straight line detection, via techniques discussed here, if such a response exists for a C++ and not python. Perhaps here, or here? I hoping line detection techniques would ultimately get better results. And hey, if the documents / images selected always include a white background, it would be pretty easy to solid rectangle them OUT of the image, via LineTypes: cv::FILLED
Info here provided, not as an answer to the posted question but as a methodology to visually determine success in the future.
I have to write a program that detect 3 types of road signs (speed limit, no parking and warnings). I know how to detect a circle using HoughCircles but I have several images and the parameters for HoughCircles are different for each image. There's a general way to detect circles without changing parameters for each image?
Moreover I need to detect triangle (warning signs) so I'm searching for a general shape detector. Have you any suggestions/code that can help me in this task?
Finally for detect the number on speed limit signs I thought to use SIFT and compare the image with some templates in order to identify the number on the sign. Could it be a good approach?
Thank you for the answer!
I know this is a pretty old question but I had been through the same problem and now I show you how I solved it.
The following images show some of the most accurate results that are displayed by the opencv program.
In the following images the street signs detected are circled with three different colors that distinguish the three kinds of street signs (warning, no parking, speed limit).
Red for warning signs
Blue for no parking signs
Fuchsia for speed limit signs
The speed limit value is written in green above the speed limit signs
[![example][1]][1]
[![example][2]][2]
[![example][3]][3]
[![example][4]][4]
As you can see the program performs quite well, it is able to detect and distinguish the three kinds of sign and to recognize the speed limit value in case of speed limit signs. Everything is done without computing too many false positives when, for instance, in the image there are some signs that do not belong to one of the three categories.
In order to achieve this result the software computes the detection in three main steps.
The first step involves a color based approach where the red objects in the image are detected and their region are extract to be analyzed. This step is particularly useful in order to prevent the detection of false positives, because only a small part of the image is processed.
The second step works with a machine learning algorithm: in particular we use a Cascade Classifier to compute the detection. This operation firstly requires to train the classifiers and on a later stage to use them to detect the signs.
In the last step the speed limit values inside the speed limit signs are read, also in this case through a machine learning algorithm but using the k-nearest neighbor algorithm.
Now we are going to see in detail each step.
COLOR BASED STEP
Since the street signs are always circled by a red frame, we can afford to take out and analyze only the regions where the red objects are detected.
In order to select the red objects, we consider all the ranges of the red color: even if this may produce some false positives, they will be easily discarded in the next steps.
inRange(image, Scalar(0, 70, 50), Scalar(10, 255, 255), mask1);
inRange(image, Scalar(170, 70, 50), Scalar(180, 255, 255), mask2);
In the image below we can see an example of the red objects detected with this method.
After having found the red pixels we can gather them to find the regions using a clustering algorithm, I use the method
partition(<#_ForwardIterator __first#>, _ForwardIterator __last, <#_Predicate __pred#>)
After the execution of this method we can save all the points in the same cluster in a vector (one for each cluster) and extract the bounding boxes which represent the
regions to be analyzed in the next step.
HAAR CASCADE CLASSIFIERS FOR SIGNS DETECTION
This is the real detection step where the street signs are detected. In order to perform a cascade classifier the first step consist in building a dataset of positives and negatives images. Now I explain how I have built my own datasets of images.
The first thing to note is that we need to train three different Haar cascades in order to distinguish between the three kind of signs that we have to detect, hence we must repeat the following steps for each of the three kinds of sign.
We need two datasets: one for the positive samples (which must be a set of images that contains the road signs that we are going to detect) and another one for the negative samples which can be any kind of image without street signs.
After collecting a set of 100 images for the positive samples and a set of 200 images for the negatives in two different folders, we need to write two text files:
Signs.info which contains a list of file names like the one below,
one for each positive sample in the positive folder.
pos/image_name.png 1 0 0 50 45
Here, the numbers after the name represent respectively the number
of street signs in the image, the coordinate of the upper left
corner of the street sign, his height and his width.
Bg.txt which contains a list of file names like the one below, one
for each sign in the negative folder.
neg/street15.png
With the command line below we generate the .vect file which contains all the information that the software retrieves from the positive samples.
opencv_createsamples -info sign.info -num 100 -w 50 -h 50 -vec signs.vec
Afterwards we train the cascade classifier with the following command:
opencv_traincascade -data data -vec signs.vec -bg bg.txt -numPos 60 -numNeg 200 -numStages 15 -w 50 -h 50 -featureType LBP
where the number of stages indicates the number of classifiers that will be generated in order to build the cascade.
At the end of this process we gain a file cascade.xml which will be used from the CascadeClassifier program in order to detect the objects in the image.
Now we have trained our algorithm and we can declare a CascadeClassifier for each kind of street sign, than we detect the signs in the image through
detectMultiScale(<#InputArray image#>, <#std::vector<Rect> &objects#>)
this method creates a Rect around each object that has been detected.
It is important to note that exactly as every machine learning algorithm, in order to perform well, we need a large number of samples in the dataset. The dataset that I have built, is not extremely large, thus in some situations it is not able to detect all the signs. This mostly happens when a small part of the street sign is not visible in the image like in the warning sign below:
I have expanded my dataset up to the point where I have obtained a fairly accurate result without
too many errors.
SPEED LIMIT VALUE DETECTION
Like for the street signs detection also here I used a machine learning algorithm but with a different approach. After some work, I realized that an OCR (tesseract) solution does not perform well, so I decided to build my own ocr software.
For the machine learning algorithm I took the image below as training data which contains some speed limit values:
The amount of training data is small. But, since in speed limit signs all letters have the same font, it is not a huge problem.
To prepare the data for training, I made a small code in OpenCV. It does the following things:
It loads the image on the left;
It selects the digits (obviously by contour finding and applying constraints on area and height of letters to avoid false detections).
It draws the bounding rectangle around one letter and it waits for the key to be manually pressed. This time the user presses the digit key corresponding to the letter in box by himself.
Once the corresponding digit key is pressed, it saves 100 pixel values in an array and the correspondent manually entered digit in another array.
Eventually it saves both the arrays in separate txt files.
Following the manual digit classification all the digits in the train data( train.png) are manually labeled, and the image will look like the one below.
Now we enter into training and testing part.
For training we do as follows:
Load the txt files we already saved earlier
Create an instance of classifier that we are going to use ( KNearest)
Then we use KNearest.train function to train the data
Now the detection:
We load the image with the speed limit sign detected
Process the image as before and extract each digit using contour methods
Draw bounding box for it, then resize to 10x10, and store its pixel values in an array as done earlier.
Then we use KNearest.find_nearest() function to find the nearest item to the one we gave.
And it recognizes the correct digit.
I tested this little OCR on many images, and just with this small dataset I have obtained an accuracy of about 90%.
CODE
Below I post all my openCv c++ code in a single class, following my instruction you should be able to achive my result.
#include "opencv2/objdetect/objdetect.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
#include <cmath>
#include <stdlib.h>
#include "opencv2/core/core.hpp"
#include "opencv2/highgui.hpp"
#include <string.h>
#include <opencv2/ml/ml.hpp>
using namespace std;
using namespace cv;
std::vector<cv::Rect> getRedObjects(cv::Mat image);
vector<Mat> detectAndDisplaySpeedLimit( Mat frame );
vector<Mat> detectAndDisplayNoParking( Mat frame );
vector<Mat> detectAndDisplayWarning( Mat frame );
void trainDigitClassifier();
string getDigits(Mat image);
vector<Mat> loadAllImage();
int getSpeedLimit(string speed);
//path of the haar cascade files
String no_parking_signs_cascade = "/Users/giuliopettenuzzo/Desktop/cascade_classifiers/no_parking_cascade.xml";
String speed_signs_cascade = "/Users/giuliopettenuzzo/Desktop/cascade_classifiers/speed_limit_cascade.xml";
String warning_signs_cascade = "/Users/giuliopettenuzzo/Desktop/cascade_classifiers/warning_cascade.xml";
CascadeClassifier speed_limit_cascade;
CascadeClassifier no_parking_cascade;
CascadeClassifier warning_cascade;
int main(int argc, char** argv)
{
//train the classifier for digit recognition, this require a manually train, read the report for more details
trainDigitClassifier();
cv::Mat sceneImage;
vector<Mat> allImages = loadAllImage();
for(int i = 0;i<=allImages.size();i++){
sceneImage = allImages[i];
//load the haar cascade files
if( !speed_limit_cascade.load( speed_signs_cascade ) ){ printf("--(!)Error loading\n"); return -1; };
if( !no_parking_cascade.load( no_parking_signs_cascade ) ){ printf("--(!)Error loading\n"); return -1; };
if( !warning_cascade.load( warning_signs_cascade ) ){ printf("--(!)Error loading\n"); return -1; };
Mat scene = sceneImage.clone();
//detect the red objects
std::vector<cv::Rect> allObj = getRedObjects(scene);
//use the three cascade classifier for each object detected by the getRedObjects() method
for(int j = 0;j<allObj.size();j++){
Mat img = sceneImage(Rect(allObj[j]));
vector<Mat> warningVec = detectAndDisplayWarning(img);
if(warningVec.size()>0){
Rect box = allObj[j];
}
vector<Mat> noParkVec = detectAndDisplayNoParking(img);
if(noParkVec.size()>0){
Rect box = allObj[j];
}
vector<Mat> speedLitmitVec = detectAndDisplaySpeedLimit(img);
if(speedLitmitVec.size()>0){
Rect box = allObj[j];
for(int i = 0; i<speedLitmitVec.size();i++){
//get speed limit and skatch it in the image
int digit = getSpeedLimit(getDigits(speedLitmitVec[i]));
if(digit > 0){
Point point = box.tl();
point.y = point.y + 30;
cv::putText(sceneImage,
"SPEED LIMIT " + to_string(digit),
point,
cv::FONT_HERSHEY_COMPLEX_SMALL,
0.7,
cv::Scalar(0,255,0),
1,
cv::CV__CAP_PROP_LATEST);
}
}
}
}
imshow("currentobj",sceneImage);
waitKey(0);
}
}
/*
* detect the red object in the image given in the param,
* return a vector containing all the Rect of the red objects
*/
std::vector<cv::Rect> getRedObjects(cv::Mat image)
{
Mat3b res = image.clone();
std::vector<cv::Rect> result;
cvtColor(image, image, COLOR_BGR2HSV);
Mat1b mask1, mask2;
//ranges of red color
inRange(image, Scalar(0, 70, 50), Scalar(10, 255, 255), mask1);
inRange(image, Scalar(170, 70, 50), Scalar(180, 255, 255), mask2);
Mat1b mask = mask1 | mask2;
Mat nonZeroCoordinates;
vector<Point> pts;
findNonZero(mask, pts);
for (int i = 0; i < nonZeroCoordinates.total(); i++ ) {
cout << "Zero#" << i << ": " << nonZeroCoordinates.at<Point>(i).x << ", " << nonZeroCoordinates.at<Point>(i).y << endl;
}
int th_distance = 2; // radius tolerance
// Apply partition
// All pixels within the radius tolerance distance will belong to the same class (same label)
vector<int> labels;
// With lambda function (require C++11)
int th2 = th_distance * th_distance;
int n_labels = partition(pts, labels, [th2](const Point& lhs, const Point& rhs) {
return ((lhs.x - rhs.x)*(lhs.x - rhs.x) + (lhs.y - rhs.y)*(lhs.y - rhs.y)) < th2;
});
// You can save all points in the same class in a vector (one for each class), just like findContours
vector<vector<Point>> contours(n_labels);
for (int i = 0; i < pts.size(); ++i){
contours[labels[i]].push_back(pts[i]);
}
// Get bounding boxes
vector<Rect> boxes;
for (int i = 0; i < contours.size(); ++i)
{
Rect box = boundingRect(contours[i]);
if(contours[i].size()>500){//prima era 1000
boxes.push_back(box);
Rect enlarged_box = box + Size(100,100);
enlarged_box -= Point(30,30);
if(enlarged_box.x<0){
enlarged_box.x = 0;
}
if(enlarged_box.y<0){
enlarged_box.y = 0;
}
if(enlarged_box.height + enlarged_box.y > res.rows){
enlarged_box.height = res.rows - enlarged_box.y;
}
if(enlarged_box.width + enlarged_box.x > res.cols){
enlarged_box.width = res.cols - enlarged_box.x;
}
Mat img = res(Rect(enlarged_box));
result.push_back(enlarged_box);
}
}
Rect largest_box = *max_element(boxes.begin(), boxes.end(), [](const Rect& lhs, const Rect& rhs) {
return lhs.area() < rhs.area();
});
//draw the rects in case you want to see them
for(int j=0;j<=boxes.size();j++){
if(boxes[j].area() > largest_box.area()/3){
rectangle(res, boxes[j], Scalar(0, 0, 255));
Rect enlarged_box = boxes[j] + Size(20,20);
enlarged_box -= Point(10,10);
rectangle(res, enlarged_box, Scalar(0, 255, 0));
}
}
rectangle(res, largest_box, Scalar(0, 0, 255));
Rect enlarged_box = largest_box + Size(20,20);
enlarged_box -= Point(10,10);
rectangle(res, enlarged_box, Scalar(0, 255, 0));
return result;
}
/*
* code for detect the speed limit sign , it draws a circle around the speed limit signs
*/
vector<Mat> detectAndDisplaySpeedLimit( Mat frame )
{
std::vector<Rect> signs;
vector<Mat> result;
Mat frame_gray;
cvtColor( frame, frame_gray, CV_BGR2GRAY );
//normalizes the brightness and increases the contrast of the image
equalizeHist( frame_gray, frame_gray );
//-- Detect signs
speed_limit_cascade.detectMultiScale( frame_gray, signs, 1.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
cout << speed_limit_cascade.getFeatureType();
for( size_t i = 0; i < signs.size(); i++ )
{
Point center( signs[i].x + signs[i].width*0.5, signs[i].y + signs[i].height*0.5 );
ellipse( frame, center, Size( signs[i].width*0.5, signs[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 255 ), 4, 8, 0 );
Mat resultImage = frame(Rect(center.x - signs[i].width*0.5,center.y - signs[i].height*0.5,signs[i].width,signs[i].height));
result.push_back(resultImage);
}
return result;
}
/*
* code for detect the warning sign , it draws a circle around the warning signs
*/
vector<Mat> detectAndDisplayWarning( Mat frame )
{
std::vector<Rect> signs;
vector<Mat> result;
Mat frame_gray;
cvtColor( frame, frame_gray, CV_BGR2GRAY );
equalizeHist( frame_gray, frame_gray );
//-- Detect signs
warning_cascade.detectMultiScale( frame_gray, signs, 1.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
cout << warning_cascade.getFeatureType();
Rect previus;
for( size_t i = 0; i < signs.size(); i++ )
{
Point center( signs[i].x + signs[i].width*0.5, signs[i].y + signs[i].height*0.5 );
Rect newRect = Rect(center.x - signs[i].width*0.5,center.y - signs[i].height*0.5,signs[i].width,signs[i].height);
if((previus & newRect).area()>0){
previus = newRect;
}else{
ellipse( frame, center, Size( signs[i].width*0.5, signs[i].height*0.5), 0, 0, 360, Scalar( 0, 0, 255 ), 4, 8, 0 );
Mat resultImage = frame(newRect);
result.push_back(resultImage);
previus = newRect;
}
}
return result;
}
/*
* code for detect the no parking sign , it draws a circle around the no parking signs
*/
vector<Mat> detectAndDisplayNoParking( Mat frame )
{
std::vector<Rect> signs;
vector<Mat> result;
Mat frame_gray;
cvtColor( frame, frame_gray, CV_BGR2GRAY );
equalizeHist( frame_gray, frame_gray );
//-- Detect signs
no_parking_cascade.detectMultiScale( frame_gray, signs, 1.1, 3, 0|CV_HAAR_SCALE_IMAGE, Size(30, 30) );
cout << no_parking_cascade.getFeatureType();
Rect previus;
for( size_t i = 0; i < signs.size(); i++ )
{
Point center( signs[i].x + signs[i].width*0.5, signs[i].y + signs[i].height*0.5 );
Rect newRect = Rect(center.x - signs[i].width*0.5,center.y - signs[i].height*0.5,signs[i].width,signs[i].height);
if((previus & newRect).area()>0){
previus = newRect;
}else{
ellipse( frame, center, Size( signs[i].width*0.5, signs[i].height*0.5), 0, 0, 360, Scalar( 255, 0, 0 ), 4, 8, 0 );
Mat resultImage = frame(newRect);
result.push_back(resultImage);
previus = newRect;
}
}
return result;
}
/*
* train the classifier for digit recognition, this could be done only one time, this method save the result in a file and
* it can be used in the next executions
* in order to train user must enter manually the corrisponding digit that the program shows, press space if the red box is just a point (false positive)
*/
void trainDigitClassifier(){
Mat thr,gray,con;
Mat src=imread("/Users/giuliopettenuzzo/Desktop/all_numbers.png",1);
cvtColor(src,gray,CV_BGR2GRAY);
threshold(gray,thr,125,255,THRESH_BINARY_INV); //Threshold to find contour
imshow("ci",thr);
waitKey(0);
thr.copyTo(con);
// Create sample and label data
vector< vector <Point> > contours; // Vector for storing contour
vector< Vec4i > hierarchy;
Mat sample;
Mat response_array;
findContours( con, contours, hierarchy,CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE ); //Find contour
for( int i = 0; i< contours.size(); i=hierarchy[i][0] ) // iterate through first hierarchy level contours
{
Rect r= boundingRect(contours[i]); //Find bounding rect for each contour
rectangle(src,Point(r.x,r.y), Point(r.x+r.width,r.y+r.height), Scalar(0,0,255),2,8,0);
Mat ROI = thr(r); //Crop the image
Mat tmp1, tmp2;
resize(ROI,tmp1, Size(10,10), 0,0,INTER_LINEAR ); //resize to 10X10
tmp1.convertTo(tmp2,CV_32FC1); //convert to float
imshow("src",src);
int c=waitKey(0); // Read corresponding label for contour from keyoard
c-=0x30; // Convert ascii to intiger value
response_array.push_back(c); // Store label to a mat
rectangle(src,Point(r.x,r.y), Point(r.x+r.width,r.y+r.height), Scalar(0,255,0),2,8,0);
sample.push_back(tmp2.reshape(1,1)); // Store sample data
}
// Store the data to file
Mat response,tmp;
tmp=response_array.reshape(1,1); //make continuous
tmp.convertTo(response,CV_32FC1); // Convert to float
FileStorage Data("TrainingData.yml",FileStorage::WRITE); // Store the sample data in a file
Data << "data" << sample;
Data.release();
FileStorage Label("LabelData.yml",FileStorage::WRITE); // Store the label data in a file
Label << "label" << response;
Label.release();
cout<<"Training and Label data created successfully....!! "<<endl;
imshow("src",src);
waitKey(0);
}
/*
* get digit from the image given in param, using the classifier trained before
*/
string getDigits(Mat image)
{
Mat thr1,gray1,con1;
Mat src1 = image.clone();
cvtColor(src1,gray1,CV_BGR2GRAY);
threshold(gray1,thr1,125,255,THRESH_BINARY_INV); // Threshold to create input
thr1.copyTo(con1);
// Read stored sample and label for training
Mat sample1;
Mat response1,tmp1;
FileStorage Data1("TrainingData.yml",FileStorage::READ); // Read traing data to a Mat
Data1["data"] >> sample1;
Data1.release();
FileStorage Label1("LabelData.yml",FileStorage::READ); // Read label data to a Mat
Label1["label"] >> response1;
Label1.release();
Ptr<ml::KNearest> knn(ml::KNearest::create());
knn->train(sample1, ml::ROW_SAMPLE,response1); // Train with sample and responses
cout<<"Training compleated.....!!"<<endl;
vector< vector <Point> > contours1; // Vector for storing contour
vector< Vec4i > hierarchy1;
//Create input sample by contour finding and cropping
findContours( con1, contours1, hierarchy1,CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE );
Mat dst1(src1.rows,src1.cols,CV_8UC3,Scalar::all(0));
string result;
for( int i = 0; i< contours1.size(); i=hierarchy1[i][0] ) // iterate through each contour for first hierarchy level .
{
Rect r= boundingRect(contours1[i]);
Mat ROI = thr1(r);
Mat tmp1, tmp2;
resize(ROI,tmp1, Size(10,10), 0,0,INTER_LINEAR );
tmp1.convertTo(tmp2,CV_32FC1);
Mat bestLabels;
float p=knn -> findNearest(tmp2.reshape(1,1),4, bestLabels);
char name[4];
sprintf(name,"%d",(int)p);
cout << "num = " << (int)p;
result = result + to_string((int)p);
putText( dst1,name,Point(r.x,r.y+r.height) ,0,1, Scalar(0, 255, 0), 2, 8 );
}
imwrite("dest.jpg",dst1);
return result ;
}
/*
* from the digits detected, it returns a speed limit if it is detected correctly, -1 otherwise
*/
int getSpeedLimit(string numbers){
if ((numbers.find("30") != std::string::npos) || (numbers.find("03") != std::string::npos)) {
return 30;
}
if ((numbers.find("50") != std::string::npos) || (numbers.find("05") != std::string::npos)) {
return 50;
}
if ((numbers.find("80") != std::string::npos) || (numbers.find("08") != std::string::npos)) {
return 80;
}
if ((numbers.find("70") != std::string::npos) || (numbers.find("07") != std::string::npos)) {
return 70;
}
if ((numbers.find("90") != std::string::npos) || (numbers.find("09") != std::string::npos)) {
return 90;
}
if ((numbers.find("100") != std::string::npos) || (numbers.find("001") != std::string::npos)) {
return 100;
}
if ((numbers.find("130") != std::string::npos) || (numbers.find("031") != std::string::npos)) {
return 130;
}
return -1;
}
/*
* load all the image in the file with the path hard coded below
*/
vector<Mat> loadAllImage(){
vector<cv::String> fn;
glob("/Users/giuliopettenuzzo/Desktop/T1/dataset/*.jpg", fn, false);
vector<Mat> images;
size_t count = fn.size(); //number of png files in images folder
for (size_t i=0; i<count; i++)
images.push_back(imread(fn[i]));
return images;
}
maybe you should try implementing the ransac algorithm, if you are using color images, migt be a good idea (if you are in europe) to get the red channel only since the speed limits are surrounded by a red cricle (or a thin white i think also).
For that you need to filter the image to get the edges, (canny filter).
Here are some useful links:
OpenCV detect partial circle with noise
https://hal.archives-ouvertes.fr/hal-00982526/document
Finally for the numbers detection i think its ok. Other approach is to use something like Viola-Jones algorithm to detect the signals, with pretrained existing models... It's up to you!
I'm playing around with OpenCV and I want to know how you would build a simple version of a perspective transform program. I have a image of a parallelogram and each corner of it consists of a pixel with a specific color, which is nowhere else in the image. I want to iterate through all pixels and find these 4 pixels. Then I want to use them as corner points in a new image in order to warp the perspective of the original image. In the end I should have a zoomed on square.
Point2f src[4]; //Is this the right datatype to use here?
int lineNumber=0;
//iterating through the pixels
for(int y = 0; y < image.rows; y++)
{
for(int x = 0; x < image.cols; x++)
{
Vec3b colour = image.at<Vec3b>(Point(x, y));
if(color.val[1]==245 && color.val[2]==111 && color.val[0]==10) {
src[lineNumber]=this pixel // something like Point2f(x,y) I guess
lineNumber++;
}
}
}
/* I also need to get the dst points for getPerspectiveTransform
and afterwards warpPerspective, how do I get those? Take the other
points, check the biggest distance somehow and use it as the maxlength to calculate
the rest? */
How should you use OpenCV in order to solve the problem? (I just guess I'm not doing it the "normal and clever way") Also how do I do the next step, which would be using more than one pixel as a "marker" and calculate the average point in the middle of multiple points. Is there something more efficient than running through each pixel?
Something like this basically:
Starting from an image with colored circles as markers, like:
Note that is a png image, i.e. with a loss-less compression which preserves the actual color. If you use a lossy compression like jpeg the colors will change a little, and you cannot segment them with an exact match, as done here.
You need to find the center of each marker.
Segment the (known) color, using inRange
Find all connected components with the given color, with findContours
Find the largest blob, here done with max_element with a lambda function, and distance. You can use a for loop for this.
Find the center of mass of the largest blob, here done with moments. You can use a loop also here, eventually.
Add the center to your source vertices.
Your destination vertices are just the four corners of the destination image.
You can then use getPerspectiveTransform and warpPerspective to find and apply the warping.
The resulting image is:
Code:
#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>
using namespace std;
using namespace cv;
int main()
{
// Load image
Mat3b img = imread("path_to_image");
// Create a black output image
Mat3b out(300,300,Vec3b(0,0,0));
// The color of your markers, in order
vector<Scalar> colors{ Scalar(0, 0, 255), Scalar(0, 255, 0), Scalar(255, 0, 0), Scalar(0, 255, 255) }; // red, green, blue, yellow
vector<Point2f> src_vertices(colors.size());
vector<Point2f> dst_vertices = { Point2f(0, 0), Point2f(0, out.rows - 1), Point2f(out.cols - 1, out.rows - 1), Point2f(out.cols - 1, 0) };
for (int idx_color = 0; idx_color < colors.size(); ++idx_color)
{
// Detect color
Mat1b mask;
inRange(img, colors[idx_color], colors[idx_color], mask);
// Find connected components
vector<vector<Point>> contours;
findContours(mask, contours, RETR_EXTERNAL, CHAIN_APPROX_NONE);
// Find largest
int idx_largest = distance(contours.begin(), max_element(contours.begin(), contours.end(), [](const vector<Point>& lhs, const vector<Point>& rhs) {
return lhs.size() < rhs.size();
}));
// Find centroid of largest component
Moments m = moments(contours[idx_largest]);
Point2f center(m.m10 / m.m00, m.m01 / m.m00);
// Found marker center, add to source vertices
src_vertices[idx_color] = center;
}
// Find transformation
Mat M = getPerspectiveTransform(src_vertices, dst_vertices);
// Apply transformation
warpPerspective(img, out, M, out.size());
imshow("Image", img);
imshow("Warped", out);
waitKey();
return 0;
}
I want to blend two images using multiband blending but I am not clear to the input parameter of this function:
void detail::Blender::prepare(const std::vector<Point>& corners, const std::vector<Size>& sizes)
In my case ,I just input two warped images with black gap, and with masks all white.(forgive me can not add pictures...)
And I set the two corners (0.0,0.0),because the warped images has been registered.
but my result is not good enough.with obvious seam in the result
can someone tell me why?How can I solve this problem?
I'm not sure what do you mean when you say "my result is not good enough". It's better to watch that result, but I'll try to guess. My main part of code, which makes panorama, looks like this:
void makePanorama(Rect bounding_box, vector<Mat> images, vector<Mat> homographies, vector<vector<Point>> corners) {
detail::MultiBandBlender blender;
blender.prepare(bounding_box);
Mat mask, bigImage, curImage;
for (int i = 0; i < (int)images.size(); ++i) {
warpPerspective(images[i], curImage, homographies[i],
bounding_box.size(), INTER_LINEAR, ORDER_TRANSPARENT);
mask = makeMask(curImage.size(), corners[i], homographies[i]);
blender.feed(curImage.clone(), mask, Point(0, 0));
}
blender.blend(bigImage, mask);
bigImage.convertTo(bigImage, (bigImage.type() / 8) * 8);
imshow("Result", bigImage);
waitKey();
}
So, prepare blender and then loop: warp image, make the mask after warped image and feed blender. At the end, turn this blender on and that's all. I met two problems, which influence on my result badly. May be you have one of them or both.
The first is type. My images had CV_16SC3, and after blending you need to convert blended image type into unsigned one. Like this
bigImage.convertTo(bigImage, (bigImage.type() / 8) * 8);
If you not, the result image would be gray.
The second is borders. In the beginning, my function makeMask was calculating non-black area of warped images. As a result, the one could see borders of the warped images on the blended image. The solution is to make mask smaller than non-black warped image area. So, my function makeMask is looks like this:
Mat makeMask(Size sz, vector<Point2f> imageCorners, Mat homorgaphy) {
Scalar white(255, 255, 255);
Mat mask = Mat::zeros(sz, CV_8U);
Point2f innerPoint;
vector<Point2f> transformedCorners(4);
perspectiveTransform(imageCorners, transformedCorners, homorgaphy);
// Calculate inner point
for (auto& point : transformedCorners)
innerPoint += point;
innerPoint.x /= 4;
innerPoint.y /= 4;
// Make indent for each corner
vector<Point> corners;
for (int ind = 0; ind < 4; ++ind) {
Point2f direction = innerPoint - transformedCorners[ind];
double normOfDirection = norm(direction);
corners[ind].x += settings.indent * direction.x / normOfDirection;
corners[ind].y += settings.indent * direction.y / normOfDirection;
}
// Draw borders
Point prevPoint = corners[3];
for (auto& point : corners) {
line(mask, prevPoint, point, white);
prevPoint = point;
}
// Fill with white
floodFill(mask, innerPoint, white);
return mask;
}
I took this pieces of code from my real code, so I could possibly forget to specify something. But I hope, the idea of how to work with MultiBandBlender is clear.
I'm looking for a way to place on image on top of another image at a set location.
I have been able to place images on top of each other using cv::addWeighted but when I searched for this particular problem, there wasn't any posts that I could find relating to C++.
Quick Example:
200x200 Red Square & 100x100 Blue Square
&
Blue Square on the Red Square at 70x70 (From top left corner Pixel of Blue Square)
You can also create a Mat that points to a rectangular region of the original image and copy the blue image to that:
Mat bigImage = imread("redSquare.png", -1);
Mat lilImage = imread("blueSquare.png", -1);
Mat insetImage(bigImage, Rect(70, 70, 100, 100));
lilImage.copyTo(insetImage);
imshow("Overlay Image", bigImage);
Building from beaker answer, and generalizing to any input images size, with some error checking:
cv::Mat bigImage = cv::imread("redSquare.png", -1);
const cv::Mat smallImage = cv::imread("blueSquare.png", -1);
const int x = 70;
const int y = 70;
cv::Mat destRoi;
try {
destRoi = bigImage(cv::Rect(x, y, smallImage.cols, smallImage.rows));
} catch (...) {
std::cerr << "Trying to create roi out of image boundaries" << std::endl;
return -1;
}
smallImage.copyTo(destRoi);
cv::imshow("Overlay Image", bigImage);
Check cv::Mat::operator()
Note: Probably this will still fail if the 2 images have different formats, e.g. if one is color and the other grayscale.
Suggested explicit algorithm:
1 - Read two images. E.g., bottom.ppm, top.ppm,
2 - Read the location for overlay. E.g., let the wanted top-left corner of "top.ppm" on "bottom.ppm" be (x,y) where 0 < x < bottom.height() and 0 < y < bottom.width(),
3 - Finally, nested loop on the top image to modify the bottom image pixel by pixel:
for(int i=0; i<top.height(); i++) {
for(int j=0; j<top.width(), j++) {
bottom(x+i, y+j) = top(i,j);
}
}
return bottom image.