Problem: Watershed algorithm
I started an image-processing app project using OpenCV 4.5.3 and Swift (with C++). I have been fighting with the watershed algorithm for a really long time, and I have no clue what I did wrong.
Error:
libc++abi.dylib: terminating with uncaught exception of type cv::Exception: OpenCV(4.5.3) /Volumes/build-storage/build/master_iOS-mac/opencv/modules/imgproc/src/segmentation.cpp:161: error: (-215:Assertion failed) src.type() == CV_8UC3 && dst.type() == CV_32SC1 in function 'watershed'
terminating with uncaught exception of type cv::Exception: OpenCV(4.5.3) /Volumes/build-storage/build/master_iOS-mac/opencv/modules/imgproc/src/segmentation.cpp:161: error: (-215:Assertion failed) src.type() == CV_8UC3 && dst.type() == CV_32SC1 in function 'watershed'
In the documentation of OpenCV's watershed we can find:
@param image Input 8-bit 3-channel image.
@param markers Input/output 32-bit single-channel image (map) of markers. It should have the same size as image.
Code
+(UIImage *) watershed:(UIImage *)src{
cv::Mat img, mask;
UIImageToMat(src, img);
// Change the background from white to black, since that will help later to extract
// better results during the use of Distance Transform
cv::inRange(img, cv::Scalar(255,255,255), cv::Scalar(255,255,255), mask);
img.setTo(cv::Scalar(0,0,0), mask);
// Create a kernel that we will use to sharpen our image
// an approximation of second derivative, a quite strong kernel
cv::Mat kernel = (cv::Mat_<float>(3,3) <<
1, 1, 1,
1, -8, 1,
1, 1, 1);
// do the laplacian filtering as it is
// well, we need to convert everything in something more deeper then CV_8U
// because the kernel has some negative values,
// and we can expect in general to have a Laplacian image with negative values
// BUT a 8bits unsigned int (the one we are working with) can contain values from 0 to 255
// so the possible negative number will be truncated
cv::Mat lapl;
cv::filter2D(img, lapl, CV_32F, kernel);
cv::Mat sharp;
img.convertTo(sharp, CV_32F);
cv::Mat result = sharp - lapl;
// convert back to 8bits gray scale
result.convertTo(result, CV_8UC3);
lapl.convertTo(lapl, CV_8UC3);
cv::Mat bw;
cv::cvtColor(result, bw, cv::COLOR_BGR2GRAY);
cv::threshold(bw, bw, 40, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
// Perform the distance transform algorithm
cv::Mat dist;
cv::distanceTransform(bw, dist, cv::DIST_L2, cv::DIST_MASK_3);
// Normalize the distance image for range = {0.0, 1.0}
// so we can visualize and threshold it
cv::normalize(dist, dist, 0, 1.0, cv::NORM_MINMAX);
// Threshold to obtain the peaks
// This will be the markers for the foreground objects
cv::threshold(dist, dist, 0.4, 1.0, cv::THRESH_BINARY);
// Dilate a bit the dist image
cv::Mat kernel1 = cv::Mat::ones(3, 3, CV_8U);
dilate(dist, dist, kernel1);
// Create the CV_8U version of the distance image
// It is needed for findContours()
cv::Mat dist_8u;
dist.convertTo(dist_8u, CV_8U);
// Find total markers
std::vector<std::vector<cv::Point> > contours;
findContours(dist_8u, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
// Create the marker image for the watershed algorithm
cv::Mat markers = cv::Mat::zeros(dist.size(), CV_32S);
// Draw the foreground markers
for (size_t i = 0; i < contours.size(); i++)
{
drawContours(markers, contours, static_cast<int>(i), cv::Scalar(static_cast<int>(i)+1), -1);
}
// Draw the background marker
circle(markers, cv::Point(5,5), 3, cv::Scalar(255), -1);
cv::Mat markers8u;
markers.convertTo(markers8u, CV_8U, 10);
// Perform the watershed algorithm
watershed(result, markers);
return MatToUIImage(result);
}
As far as I can see, the variables have the proper types, as required by the function's description:
result.convertTo(result, CV_8UC3);
cv::Mat markers = cv::Mat::zeros(dist.size(), CV_32S);
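convertTo changes only the depth (element type) of an image; it can neither add channels nor reduce an image to fewer channels. An image loaded with UIImageToMat usually has 4 channels, so after result.convertTo(result, CV_8UC3) it is still a 4-channel image and the assertion in watershed fails.
The key in this case is to use: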
cvtColor(src, src, COLOR_BGRA2BGR); // change 4 to 3 channels
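To make the point about convertTo concrete, here is a minimal sketch in plain C++ (no OpenCV needed) that models how OpenCV packs depth and channel count into a type code. The names makeType, typeAfterConvertTo and watershedAccepts are hypothetical helpers written for this illustration, not OpenCV functions; the encoding (depth in the low 3 bits, channels minus one above them) and the depth values 0 for CV_8U and 4 for CV_32S mirror OpenCV's own macros.

```cpp
#include <cassert>

// Depth codes mirroring OpenCV's constants (CV_8U == 0, CV_32S == 4).
const int kDepth8U = 0;
const int kDepth32S = 4;

// Mirrors CV_MAKETYPE: depth in the low 3 bits, (channels - 1) above them.
int makeType(int depth, int channels) {
    assert(channels >= 1);
    return (depth & 7) + ((channels - 1) << 3);
}

int depthOf(int type) { return type & 7; }
int channelsOf(int type) { return (type >> 3) + 1; }

// Hypothetical model of convertTo: only the depth of the requested type is
// used, the channel count always comes from the source image.
int typeAfterConvertTo(int sourceType, int requestedType) {
    return makeType(depthOf(requestedType), channelsOf(sourceType));
}

// Hypothetical model of the assertion quoted in the error message:
// src.type() == CV_8UC3 && dst.type() == CV_32SC1.
bool watershedAccepts(int imageType, int markersType) {
    return imageType == makeType(kDepth8U, 3) &&
           markersType == makeType(kDepth32S, 1);
}
```

With this model, typeAfterConvertTo(makeType(kDepth8U, 4), makeType(kDepth8U, 3)) still yields the 4-channel type, which is why the conversion in the question does not satisfy the assertion while an explicit cvtColor does.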
I am trying to make an average of two blobs in OpenCV. To achieve that I was planning to use watershed algorithm on the image preprocessed in the following way:
cv::Mat common, diff, processed, result;
cv::bitwise_and(blob1, blob2, common); //calc common area of the two blobs
cv::absdiff(blob1, blob2, diff); //calc area where they differ
cv::distanceTransform(diff, processed, CV_DIST_L2, 3); //idea here is that the highest intensity
//will be in the middle of the differing area
cv::normalize(processed, processed, 0, 255, cv::NORM_MINMAX, CV_8U); //convert floats to bytes
cv::Mat watershedMarkers, watershedOutline;
common.convertTo(watershedMarkers, CV_32S, 1. / 255, 1); //change background to label 1, common area to label 2
watershedMarkers.setTo(0, processed); //set 0 (unknown) for area where blobs differ
cv::cvtColor(processed, processed, CV_GRAY2RGB); //watershed wants 3 channels
cv::watershed(processed, watershedMarkers);
cv::rectangle(watershedMarkers, cv::Rect(0, 0, watershedMarkers.cols, watershedMarkers.rows), 1); //remove the outline
//draw the boundary in red (for debugging)
watershedMarkers.convertTo(watershedOutline, CV_16S);
cv::threshold(watershedOutline, watershedOutline, 0, 255, CV_THRESH_BINARY_INV);
watershedOutline.convertTo(watershedOutline, CV_8U);
processed.setTo(cv::Scalar(CV_RGB(255, 0, 0)), watershedOutline);
//convert computed labels back to mask (blob), less relevant but shows my ultimate goal
watershedMarkers.convertTo(watershedMarkers, CV_8U);
cv::threshold(watershedMarkers, watershedMarkers, 1, 0, CV_THRESH_TOZERO_INV);
cv::bitwise_not(watershedMarkers * 255, result);
My problem with the results is that the calculated boundary is (almost) always adjacent to the area common to both blobs. Here are the pictures:
Input markers (black = 0, gray = 1, white = 2)
Watershed input image (distance transform result) with resulting outline drawn in red:
I would expect the boundary to run along the maximum-intensity region of the input (that is, along the middle of the differing area). Instead, as you can see, it mostly follows the edge of the area marked as 2, shifting slightly to touch the background (marked as 1). Am I doing something wrong here, or did I misunderstand how watershed works?
Starting from this image:
You can get the correct result simply by passing an all-zero image to the watershed algorithm. The "basin" is then filled with "water" equally from each "side" (just remember to remove the outer border, which the watershed algorithm sets to -1 by default):
Code:
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main()
{
Mat1b img = imread("path_to_image", IMREAD_GRAYSCALE);
Mat1i markers(img.rows, img.cols, int(0));
markers.setTo(1, img == 128);
markers.setTo(2, img == 255);
Mat3b image(markers.rows, markers.cols, Vec3b(0,0,0));
markers.convertTo(markers, CV_32S);
watershed(image, markers);
Mat3b result;
cvtColor(img, result, COLOR_GRAY2BGR);
result.setTo(Scalar(0, 0, 255), markers == -1);
imshow("Result", result);
waitKey();
return(0);
}
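Why a flat image puts the boundary in the middle can be illustrated with a toy model. The sketch below is plain C++ and is not OpenCV's implementation: it assumes that on an image with no intensity differences the flooding degenerates into a breadth-first expansion from every marker at the same speed, and it works on a single row of pixels. The function name floodFlat1D is hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Toy 1-D model of flooding a flat image.
// Input labels: 0 = unknown, > 0 = marker label.
// Output: every unknown cell takes the label of the front that reaches it
// first; a cell reached by two different fronts at the same step becomes -1.
std::vector<int> floodFlat1D(std::vector<int> labels) {
    const int n = static_cast<int>(labels.size());
    std::vector<int> arrival(n, -1);  // step at which each cell was labelled
    std::queue<int> frontier;
    for (int i = 0; i < n; ++i) {
        assert(labels[i] >= 0);
        if (labels[i] > 0) {
            arrival[i] = 0;
            frontier.push(i);
        }
    }
    while (!frontier.empty()) {
        const int i = frontier.front();
        frontier.pop();
        if (labels[i] == -1) continue;  // boundary cells do not spread
        const int neighbours[2] = {i - 1, i + 1};
        for (int j : neighbours) {
            if (j < 0 || j >= n) continue;
            if (labels[j] == 0) {
                labels[j] = labels[i];
                arrival[j] = arrival[i] + 1;
                frontier.push(j);
            } else if (labels[j] > 0 && labels[j] != labels[i] &&
                       arrival[j] == arrival[i] + 1) {
                labels[j] = -1;  // two fronts met here at the same step
            }
        }
    }
    return labels;
}
```

For the row {1, 1, 0, 0, 0, 0, 0, 2, 2} the model returns {1, 1, 1, 1, -1, 2, 2, 2, 2}: the boundary lands halfway between the two markers, which is the behaviour the all-zero input image is meant to produce.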
I'm currently working on a project that uses a Lacatan Banana, and I would like to know how to further separate the foreground from the background:
I already got a segmented image of it using erosion, dilation, and thresholding only. The problem is that it is still not properly segmented.
Here is my code:
cv::Mat imggray, imgthresh, fg, bgt, bg;
cv::cvtColor(src, imggray, CV_BGR2GRAY); //Grayscaling the image from RGB color space
cv::threshold(imggray, imgthresh, 0, 255, CV_THRESH_BINARY_INV | CV_THRESH_OTSU); //Create an inverted binary image from the grayscaled image
cv::erode(imgthresh, fg, cv::Mat(), cv::Point(-1, -1), 1); //erosion of the binary image and setting it as the foreground
cv::dilate(imgthresh, bgt, cv::Mat(), cv::Point(-1, -1), 4); //dilation of the binary image to reduce the background region
cv::threshold(bgt, bg, 1, 128, CV_THRESH_BINARY); //we get the background by setting the threshold to 1
cv::Mat markers = cv::Mat::zeros(src.size(), CV_32SC1); //initializing the markers with a size same as the source image and setting its data type as 32-bit Single channel
cv::add(fg, bg, markers); //setting the foreground and background as markers
cv::Mat mask = cv::Mat::zeros(markers.size(), CV_8UC1);
markers.convertTo(mask, CV_8UC1); //converting the 32-bit single channel marker to a 8-bit single channel
cv::Mat mthresh;
cv::threshold(mask, mthresh, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU); //threshold further the mask to reduce the noise
// cv::erode(mthresh,mthresh,cv::Mat(), cv::Point(-1,-1),2);
cv::Mat result;
cv::bitwise_and(src, src, result, mthresh); //use the mask to subtrack the banana from the background
for (int x = 0; x < result.rows; x++) { //changing the black background to white
for (int y = 0; y < result.cols; y++) {
if (result.at<Vec3b>(x, y) == Vec3b(0, 0, 0)){
result.at<Vec3b>(x, y)[0] = 255;
result.at<Vec3b>(x, y)[1] = 255;
result.at<Vec3b>(x, y)[2] = 255;
}
}
}
This is my result:
Since the background is close to gray, try using the hue and saturation channels instead of the grayscale image.
You can get them easily:
cv::Mat hsv;
cv::cvtColor(src, hsv, CV_BGR2HSV);
std::vector<cv::Mat> channels;
cv::split(hsv, channels); // split the HSV image, not the BGR source
cv::Mat hue = channels[0];
cv::Mat saturation = channels[1];
// If you want to combine those channels, use this code.
cv::Mat hs = cv::Mat::zeros(src.size(), CV_8U);
for(int r=0; r<src.rows; r++) {
    for(int c=0; c<src.cols; c++) {
        int hp = hue.at<uchar>(r,c);
        int sp = saturation.at<uchar>(r,c);
        hs.at<uchar>(r, c) = static_cast<uchar>((hp+sp)>>1); // average of hue and saturation
    }
}
adaptiveThreshold() should work better than a plain fixed-level threshold(), because it does not consider absolute color levels, but rather the change in color within a small area around the point being checked.
Try replacing your thresholding with an adaptive one.
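The difference between the two kinds of thresholding can be shown on a single row of pixels. The following is a simplified sketch in plain C++, not OpenCV code: it assumes a mean-based adaptive threshold in inverted binary mode (a pixel becomes 255 when it is at most the local mean minus a constant c), and it simply shrinks the window at the borders, which is an assumption of this sketch. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Fixed-level inverted threshold: 255 where the value is at most `level`.
std::vector<int> globalThresholdInv(const std::vector<int>& values, int level) {
    std::vector<int> out;
    for (int v : values) out.push_back(v <= level ? 255 : 0);
    return out;
}

// Mean-based adaptive inverted threshold on one row: each pixel is compared
// with the mean of its neighbourhood (2 * radius + 1 wide, clipped at the
// borders) minus the constant c.
std::vector<int> adaptiveMeanThresholdInv(const std::vector<int>& values,
                                          int radius, double c) {
    assert(radius >= 0);
    const int n = static_cast<int>(values.size());
    std::vector<int> out(n, 0);
    for (int i = 0; i < n; ++i) {
        const int lo = std::max(0, i - radius);
        const int hi = std::min(n - 1, i + radius);
        double sum = 0.0;
        for (int j = lo; j <= hi; ++j) sum += values[j];
        const double localLevel = sum / (hi - lo + 1) - c;
        out[i] = values[i] <= localLevel ? 255 : 0;
    }
    return out;
}
```

On the row {100, 110, 120, 90, 140, 150, 160}, a brightening background with one dark pixel, a fixed level of 125 marks the first four pixels, while the adaptive version with radius 1 and c = 10 marks only the dark pixel at index 3.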
Use a top-hat instead of just erosion/dilation. It will take care of the background variations at the same time.
Then, in your case, a simple threshold should be good enough for an accurate segmentation. Otherwise, you can combine it with a watershed.
(I will share some images asap).
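As a rough illustration of what a top-hat does, here is a one-row sketch in plain C++ (not OpenCV code). It assumes the "white" top-hat, defined as the signal minus its morphological opening (erosion followed by dilation with the same window), and it clips the window at the borders, which is a simplification of this sketch. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sliding minimum (erosion) or maximum (dilation) over a window of
// 2 * radius + 1 samples, clipped at the borders.
std::vector<int> slidingExtreme(const std::vector<int>& values, int radius,
                                bool takeMin) {
    assert(radius >= 0);
    const int n = static_cast<int>(values.size());
    std::vector<int> out(n, 0);
    for (int i = 0; i < n; ++i) {
        const int lo = std::max(0, i - radius);
        const int hi = std::min(n - 1, i + radius);
        int best = values[lo];
        for (int j = lo + 1; j <= hi; ++j) {
            best = takeMin ? std::min(best, values[j]) : std::max(best, values[j]);
        }
        out[i] = best;
    }
    return out;
}

// White top-hat: original minus opening. The slowly varying background is
// removed and only details narrower than the window remain.
std::vector<int> topHat1D(const std::vector<int>& values, int radius) {
    const std::vector<int> eroded = slidingExtreme(values, radius, true);
    const std::vector<int> opened = slidingExtreme(eroded, radius, false);
    std::vector<int> out(values.size(), 0);
    for (std::size_t i = 0; i < values.size(); ++i) out[i] = values[i] - opened[i];
    return out;
}
```

For the row {10, 11, 12, 30, 14, 15, 16} and radius 1 the result is {0, 0, 0, 16, 0, 0, 1}: the rising background is almost entirely gone (the last value is a border effect of the clipped window) and the narrow peak survives, so a single fixed threshold is enough afterwards. For dark objects on a bright background the mirrored operation (closing minus original) would be the one to try.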
Thanks guys, I tried to apply your advice and was able to come up with this
However, as you can see, there are still bits of the background. Any ideas how to "clean" these further? I tried thresholding further, but the bits were still there. The code I came up with is below; I apologize in advance if the variable names and coding style are somewhat confusing, as I didn't have the time to sort them out properly.
#include <stdio.h>
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/opencv.hpp>
#include <opencv2/highgui.hpp>
using namespace cv;
using namespace std;
Mat COLOR_MAX(Scalar(65, 255, 255));
Mat COLOR_MIN(Scalar(15, 45, 45));
int main(int argc, char** argv){
Mat src,hsv_img,mask,gray_img,initial_thresh;
Mat second_thresh,add_res,and_thresh,xor_thresh;
Mat result_thresh,rr_thresh,final_thresh;
// Load source Image
src = imread("sample11.jpg");
imshow("Original Image", src);
cvtColor(src,hsv_img,CV_BGR2HSV);
imshow("HSV Image",hsv_img);
//imwrite("HSV Image.jpg", hsv_img);
inRange(hsv_img,COLOR_MIN,COLOR_MAX, mask);
imshow("Mask Image",mask);
cvtColor(src,gray_img,CV_BGR2GRAY);
adaptiveThreshold(gray_img, initial_thresh, 255,ADAPTIVE_THRESH_GAUSSIAN_C,CV_THRESH_BINARY_INV,257,2);
imshow("AdaptiveThresh Image", initial_thresh);
add(mask,initial_thresh,add_res);
erode(add_res, add_res, Mat(), Point(-1, -1), 1);
dilate(add_res, add_res, Mat(), Point(-1, -1), 5);
imshow("Bitwise Res",add_res);
threshold(gray_img,second_thresh,170,255,CV_THRESH_BINARY_INV | CV_THRESH_OTSU);
imshow("TreshImge", second_thresh);
bitwise_and(add_res,second_thresh,and_thresh);
imshow("andthresh",and_thresh);
bitwise_xor(add_res, second_thresh, xor_thresh);
imshow("xorthresh",xor_thresh);
bitwise_or(and_thresh,xor_thresh,result_thresh);
imshow("Result image", result_thresh);
bitwise_and(add_res,result_thresh,final_thresh);
imshow("Final Thresh",final_thresh);
erode(final_thresh, final_thresh, Mat(), Point(-1,-1),5);
bitwise_and(src,src,rr_thresh,final_thresh);
imshow("Segmented Image", rr_thresh);
imwrite("Segmented Image.jpg", rr_thresh);
waitKey(0);
return 0;
}
I am a beginner in OpenCV. I need to remove the horizontal and vertical lines in the image so that only the text remains (the lines were causing trouble when extracting text with OCR). I am trying to extract text from the Nutrient Fact Table. Can anyone help me?
This was an interesting question, so I gave it a shot. Below I will show you how to extract and remove horizontal and vertical lines. You could extrapolate from it. Also, for sake of saving time, I did not preprocess your image to crop out the background as one should, which is an avenue for improvement.
The result:
The code (edit: added vertical lines):
#include <iostream>
#include <opencv2/opencv.hpp>
using namespace std;
using namespace cv;
int main(int, char** argv)
{
// Load the image
Mat src = imread(argv[1]);
// Check if image is loaded fine
if(!src.data)
{
cerr << "Problem loading image!!!" << endl;
return -1; //nothing to process without an image
}
Mat gray;
if (src.channels() == 3)
{
cvtColor(src, gray, CV_BGR2GRAY);
}
else
{
gray = src;
}
//inverse binary img
Mat bw;
//this will hold the result, image to be passed to OCR
Mat fin;
//I find OTSU binarization best for text.
//Would perform better if background had been cropped out
threshold(gray, bw, 0, 255, THRESH_BINARY_INV | THRESH_OTSU);
threshold(gray, fin, 0, 255, THRESH_BINARY | THRESH_OTSU);
imshow("binary", bw);
Mat dst;
Canny( fin, dst, 50, 200, 3 );
Mat str = getStructuringElement(MORPH_RECT, Size(3,3));
dilate(dst, dst, str, Point(-1, -1), 3);
imshow("dilated_canny", dst);
//bitwise_and w/ canny image helps w/ background noise
bitwise_and(bw, dst, dst);
imshow("and", dst);
Mat horizontal = dst.clone();
Mat vertical = dst.clone();
fin = ~dst;
//horizontal (declared above) will be reduced to the horizontal lines only
//Selected this value arbitrarily
int horizontalsize = horizontal.cols / 30;
Mat horizontalStructure = getStructuringElement(MORPH_RECT, Size(horizontalsize,1));
erode(horizontal, horizontal, horizontalStructure, Point(-1, -1));
dilate(horizontal, horizontal, horizontalStructure, Point(-1, -1), 1);
imshow("horizontal_lines", horizontal);
//Need to find horizontal contours, so as to not damage letters
vector<Vec4i> hierarchy;
vector<vector<Point> >contours;
findContours(horizontal, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_NONE);
for (const auto& c : contours)
{
Rect r = boundingRect(c);
float percentage_height = (float)r.height / (float)src.rows;
float percentage_width = (float)r.width / (float)src.cols;
//These exclude contours that probably are not dividing lines
if (percentage_height > 0.05)
continue;
if (percentage_width < 0.50)
continue;
//fills in line with white rectangle
rectangle(fin, r, Scalar(255,255,255), CV_FILLED);
}
int verticalsize = vertical.rows / 30;
Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1,verticalsize));
erode(vertical, vertical, verticalStructure, Point(-1, -1));
dilate(vertical, vertical, verticalStructure, Point(-1, -1), 1);
imshow("vertical_lines", vertical);
findContours(vertical, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_NONE);
for (const auto& c : contours)
{
Rect r = boundingRect(c);
float percentage_height = (float)r.height / (float)src.rows;
float percentage_width = (float)r.width / (float)src.cols;
//These exclude contours that probably are not dividing lines
if (percentage_width > 0.05)
continue;
if (percentage_height < 0.50)
continue;
//fills in line with white rectangle
rectangle(fin, r, Scalar(255,255,255), CV_FILLED);
}
imshow("Result", fin);
waitKey(0);
return 0;
}
The limitation of this approach is that the lines need to be straight. Due to the curve in the bottom line, it cuts slightly into the "E" in "Energy". Perhaps with a Hough line detection, as suggested (I've never used it), a similar but more robust approach could be devised. Also, filling in the lines with rectangles is probably not the best approach.
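The contour filter used in both loops above can be isolated into two small predicates, which makes the thresholds easier to tune and test. This is a sketch in plain C++ that only reproduces the two percentage checks from the code (at most 5% of the image in one direction, at least 50% in the other); the Box struct and the function names are hypothetical and stand in for the bounding rectangle and the image size.

```cpp
#include <cassert>

// Width and height in pixels, standing in for a bounding rectangle or an
// image size.
struct Box {
    int width;
    int height;
};

// A contour counts as a horizontal dividing line when it is thin (at most 5%
// of the image height) and long (at least 50% of the image width).
bool isHorizontalDivider(Box contourBox, Box image) {
    assert(image.width > 0 && image.height > 0);
    const float heightFraction = static_cast<float>(contourBox.height) / image.height;
    const float widthFraction = static_cast<float>(contourBox.width) / image.width;
    return heightFraction <= 0.05f && widthFraction >= 0.50f;
}

// The same test with the roles of width and height swapped.
bool isVerticalDivider(Box contourBox, Box image) {
    assert(image.width > 0 && image.height > 0);
    const float heightFraction = static_cast<float>(contourBox.height) / image.height;
    const float widthFraction = static_cast<float>(contourBox.width) / image.width;
    return widthFraction <= 0.05f && heightFraction >= 0.50f;
}
```

In a 1000 x 800 image, a 700 x 10 box passes the horizontal test and an 8 x 600 box passes the vertical one, while a letter-sized 40 x 30 box passes neither and is therefore left untouched.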
I've been following this tutorial to get the skew angle of an image. It seems like HoughLinesP is struggling to find lines when characters are a bit scattered on the target image.
This is my input image:
This is the lines the HoughLinesP has found:
It's not really getting most of the lines, and it seems pretty obvious to me why: I've set minLineLength to (size.width / 2.f). The point is that, because it found so few lines, the skew angle also turns out to be wrong (-3.15825 in this case, when it should be something close to 0.5).
I've tried to erode my input file to make characters get closer and in this case it seems to work out, but I don't feel this is best approach for situations akin to it.
This is my eroded input image:
This is the lines the HoughLinesP has found:
This time it found a skew angle of -0.2185 degrees, which is what I was expecting, but on the other hand it loses the vertical space between lines, which in my humble opinion isn't a good thing.
Is there another way to pre-process this kind of image so that HoughLinesP gets better results for scattered characters?
Here is the source code I'm using:
#include <iostream>
#include <opencv2/opencv.hpp>
using namespace std;
static cv::Scalar randomColor( cv::RNG& rng )
{
int icolor = (unsigned) rng;
return cv::Scalar( icolor&255, (icolor>>8)&255, (icolor>>16)&255 );
}
void rotate(cv::Mat& src, double angle, cv::Mat& dst)
{
int len = std::max(src.cols, src.rows);
cv::Point2f pt(len/2., len/2.);
cv::Mat r = cv::getRotationMatrix2D(pt, angle, 1.0);
cv::warpAffine(src, dst, r, cv::Size(len, len));
}
double compute_skew(cv::Mat& src)
{
// Random number generator
cv::RNG rng( 0xFFFFFFFF );
cv::Size size = src.size();
cv::bitwise_not(src, src);
std::vector<cv::Vec4i> lines;
cv::HoughLinesP(src, lines, 1, CV_PI/180, 100, size.width / 2.f, 20);
cv::Mat disp_lines(size, CV_8UC3, cv::Scalar(0, 0, 0));
double angle = 0.;
unsigned nb_lines = lines.size();
for (unsigned i = 0; i < nb_lines; ++i)
{
cv::line(disp_lines, cv::Point(lines[i][0], lines[i][1]),
cv::Point(lines[i][2], lines[i][3]), randomColor(rng));
angle += atan2((double)lines[i][3] - lines[i][1],
(double)lines[i][2] - lines[i][0]);
}
angle /= nb_lines; // mean angle, in radians.
std::cout << angle * 180 / CV_PI << std::endl;
cv::imshow("HoughLinesP", disp_lines);
cv::waitKey(0);
return angle * 180 / CV_PI;
}
int main()
{
// Load in grayscale.
cv::Mat img = cv::imread("IMG_TESTE.jpg", 0);
cv::Mat rotated;
double angle = compute_skew(img);
rotate(img, angle, rotated);
//Show image
cv::imshow("Rotated", rotated);
cv::waitKey(0);
}
Cheers
I'd suggest finding individual components first (i.e., the lines and the letters), for example using cv::threshold and cv::findContours.
Then, you could drop the individual components that are narrow (i.e., the letters). You can do this using cv::floodFill for example. This should leave you with the lines only.
Effectively, getting rid of the letters might provide easier input for the Hough transform.
Try to detect groups of characters as blocks, then find contours of these blocks. Below I've done it using blurring, a morphological opening and a threshold operation.
Mat im = imread("yCK4t.jpg", 0);
Mat blurred;
GaussianBlur(im, blurred, Size(5, 5), 2, 2);
Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
Mat morph;
morphologyEx(blurred, morph, CV_MOP_OPEN, kernel);
Mat bw;
threshold(morph, bw, 0, 255, CV_THRESH_BINARY_INV | CV_THRESH_OTSU);
Mat cont = Mat::zeros(im.rows, im.cols, CV_8U);
vector<vector<Point>> contours;
vector<Vec4i> hierarchy;
findContours(bw, contours, hierarchy, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, Point(0, 0));
for(int idx = contours.empty() ? -1 : 0; idx >= 0; idx = hierarchy[idx][0]) // skip the loop when nothing was found
{
drawContours(cont, contours, idx, Scalar(255, 255, 255), 1);
}
Then use the Hough line transform on the contour image.
With an accumulator threshold of 80, I get the following lines, which result in an angle of -3.81. This is high because of the outlier line that is almost vertical. With this approach, the majority of the lines will have similar angle values, except for a few outliers. Detecting and discarding the outliers will give you a better approximation of the angle.
HoughLinesP(cont, lines, 1, CV_PI/180, 80, size.width / 4.0f, size.width / 8.0f);
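One simple way to keep such outliers from distorting the result is to take the median of the segment angles instead of their mean. The sketch below is plain C++ and swaps in the median as one possible outlier-tolerant estimate; it is not the method used in the question. Segment is a hypothetical stand-in for cv::Vec4i (x1, y1, x2, y2), and the function names are made up for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// End points of a line segment: x1, y1, x2, y2 (same layout as cv::Vec4i).
using Segment = std::array<int, 4>;

double segmentAngleDegrees(const Segment& s) {
    const double pi = std::acos(-1.0);
    return std::atan2(static_cast<double>(s[3] - s[1]),
                      static_cast<double>(s[2] - s[0])) * 180.0 / pi;
}

// Mean angle, as in the question; a single stray segment shifts it a lot.
double meanAngleDegrees(const std::vector<Segment>& segments) {
    assert(!segments.empty());
    double sum = 0.0;
    for (const Segment& s : segments) sum += segmentAngleDegrees(s);
    return sum / segments.size();
}

// Median angle; it stays close to the majority even with a few outliers.
double medianAngleDegrees(const std::vector<Segment>& segments) {
    assert(!segments.empty());
    std::vector<double> angles;
    for (const Segment& s : segments) angles.push_back(segmentAngleDegrees(s));
    std::sort(angles.begin(), angles.end());
    const std::size_t mid = angles.size() / 2;
    if (angles.size() % 2 == 1) return angles[mid];
    return (angles[mid - 1] + angles[mid]) / 2.0;
}
```

With three nearly horizontal segments of about 0.57 degrees and one almost vertical segment, the mean lands near -22 degrees while the median stays at about 0.57 degrees. Both functions require at least one segment, so the case where HoughLinesP returns no lines has to be handled by the caller.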