How can I fill gaps in a binary image in OpenCV? - c++

I have some thresholded images of hand-drawn figures (circuits) but there are some parts where I need to have gaps closed between two points, as I show in the following image:
Binary image
I tried closing (dilation followed by erosion), but it is not working: it doesn't fill the gaps, and it makes the resistors and other components unrecognizable. I couldn't find a morph size and number of iterations that give a good result without affecting the rest of the picture. It's important not to affect the components too much.
I can't use hough lines because the gaps are not always in lines.
Result after closing:
Result after closing
int morph_size1 = 2;
Mat element1 = getStructuringElement(MORPH_RECT, Size(2 * morph_size1 + 1, 2 * morph_size1 + 1), Point(morph_size1, morph_size1));
Mat dst1; // result matrix
// Note: each pass starts again from the original binary image,
// so only the last iteration count (i = 2) actually takes effect.
for (int i = 1; i < 3; i++)
{
    morphologyEx(binary, dst1, MORPH_CLOSE, element1, Point(-1, -1), i);
}
imshow("closing", dst1);
Any idea?
Thanks in advance.

My proposal:
find the endpoints of the breaks by means of morphological thinning (select the white pixels having only one white neighbor);
in small neighborhoods around every endpoint, find the closest other endpoint by searching in growing circles* up to a limit radius;
draw a thick segment between them.
*In this step, it is very important to look only for endpoints in a different connected component, to avoid linking a piece to itself; so you need blob labelling as well.
In this thinning, there are more breaks than in your original picture because I erased the boxes.
Of course, you draw the filling segments in the original image.
This process cannot be perfect, as sometimes endpoints will be missing, and sometimes unwanted endpoints will be considered.
As a refinement, you can try to estimate the direction at each endpoint and only search in an angular sector.
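A minimal C++ sketch of this proposal; the endpoint test, the search radius (maxDist) and the segment thickness are assumptions to tune, and the input is assumed to be an already-thinned 8-bit image with white strokes. For brevity it detects and draws in the same image, while in practice you would detect on the thinned image and draw into the original, as noted above:

#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

// Bridge small gaps: find skeleton endpoints, then connect each endpoint
// to the nearest endpoint belonging to a different blob.
void bridgeGaps(cv::Mat& binary, double maxDist = 20.0, int thickness = 2)
{
    // Label the blobs so we never link a stroke to itself.
    cv::Mat labels;
    cv::connectedComponents(binary, labels, 8, CV_32S);

    // An endpoint is a white pixel with exactly one white 8-neighbor.
    std::vector<cv::Point> endpoints;
    for (int y = 1; y < binary.rows - 1; y++)
        for (int x = 1; x < binary.cols - 1; x++)
        {
            if (!binary.at<uchar>(y, x)) continue;
            int neighbors = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    if ((dy != 0 || dx != 0) && binary.at<uchar>(y + dy, x + dx))
                        neighbors++;
            if (neighbors == 1)
                endpoints.push_back(cv::Point(x, y));
        }

    // For every endpoint, draw a thick segment to the closest endpoint
    // of another component, if one lies within maxDist.
    for (size_t i = 0; i < endpoints.size(); i++)
    {
        int best = -1;
        double bestDist = maxDist;
        for (size_t j = 0; j < endpoints.size(); j++)
        {
            if (labels.at<int>(endpoints[j]) == labels.at<int>(endpoints[i]))
                continue; // same blob: would link a piece to itself
            cv::Point d = endpoints[i] - endpoints[j];
            double dist = std::sqrt((double)d.x * d.x + (double)d.y * d.y);
            if (dist < bestDist) { bestDist = dist; best = (int)j; }
        }
        if (best >= 0)
            cv::line(binary, endpoints[i], endpoints[best], cv::Scalar(255), thickness);
    }
}

Note that once two blobs have been bridged, the labels are stale, so chains of breaks may need a second pass (or a relabelling) to be fully closed.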

My suggestion is to use a custom convolution filter (cv::filter2D) like the one below (can be larger):
0 0 1/12 0 0
0 0 2/12 0 0
1/12 2/12 0 2/12 1/12
0 0 2/12 0 0
0 0 1/12 0 0
The idea is to fill gaps when there are two line segments near each other. You can also use custom structuring elements to obtain the same effect.
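A hedged sketch of how that could be wired up with cv::filter2D; the float conversion and the 0.4 response threshold are assumptions to tune:

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat binary = cv::imread("circuit.png", cv::IMREAD_GRAYSCALE);
    binary.convertTo(binary, CV_32F, 1.0 / 255.0); // work in the 0..1 range

    // The cross-shaped kernel from above: a gap pixel flanked by strokes
    // on two opposite sides receives a response of about 0.5.
    float k[25] = {
        0,      0,      1/12.f, 0,      0,
        0,      0,      2/12.f, 0,      0,
        1/12.f, 2/12.f, 0,      2/12.f, 1/12.f,
        0,      0,      2/12.f, 0,      0,
        0,      0,      1/12.f, 0,      0 };
    cv::Mat kernel(5, 5, CV_32F, k);

    cv::Mat response, filled;
    cv::filter2D(binary, response, -1, kernel);

    // Keep the original strokes and add pixels with a strong response.
    cv::threshold(response, filled, 0.4, 1.0, cv::THRESH_BINARY);
    cv::max(binary, filled, binary);

    cv::imshow("gaps filled", binary);
    cv::waitKey(0);
    return 0;
}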

Related

extract points that describe lines in a drawing

I have a black and white image of line drawings that may look something like the following:
but can be much more complicated.
I want to convert this drawing into a std::vector<std::vector<cv::Point>> (pretty much a contour). The problem is that I can't simply call findContours: it would give me an ordered set of points that outline the shape, whereas I want ordered sets of points (maybe even multiple of them) that give me back the line itself.
This is very close to the contour problem, except that "half" the number of points is needed.
I have written a method that finds the points that are "connectors" and the points that are "endPoints".
So in the example, there are 3 endPoints, and 1 connector.
My algorithm is as follows:
get actual contours = [[contour]]
classify points to get [endPoints], [connections] map
criticalSeen = 0
for each contour in contours
    for each current point in contour
        If seen[current point] continue;
        If criticalSeen >= 1
            append current point to currPoints
            seen[current point] = true
        If criticalSeen == 0
            continue
        If criticalSeen == 2
            criticalSeen = 1
            append currPoints to result
            reset currPoints to empty
return result
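For reference, one common way to implement such a classifier on a one-pixel-wide skeleton is a per-pixel neighbor count (a sketch under that assumption, not necessarily the method the poster wrote): a white pixel with exactly one white neighbor is an endPoint, and one with three or more is a connector.

#include <opencv2/opencv.hpp>
#include <vector>

// Classify skeleton pixels by their number of white 8-neighbors.
void classifyPoints(const cv::Mat& skeleton, // 8-bit, white = 255, 1 px wide
                    std::vector<cv::Point>& endPoints,
                    std::vector<cv::Point>& connectors)
{
    cv::Mat bin = skeleton / 255; // 0/1 image
    // Correlating with a 3x3 kernel of ones (center 0) yields, at every
    // pixel, the number of white pixels among its 8 neighbors.
    cv::Mat kernel = (cv::Mat_<float>(3, 3) << 1, 1, 1,
                                               1, 0, 1,
                                               1, 1, 1);
    cv::Mat counts;
    cv::filter2D(bin, counts, CV_8U, kernel);

    for (int y = 0; y < bin.rows; y++)
        for (int x = 0; x < bin.cols; x++)
        {
            if (!bin.at<uchar>(y, x)) continue;
            int n = counts.at<uchar>(y, x);
            if (n == 1)      endPoints.push_back(cv::Point(x, y));
            else if (n >= 3) connectors.push_back(cv::Point(x, y));
        }
}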

Choose rectangles for maximizing the area

I've got a 2D binary matrix of arbitrary size. I want to find a set of rectangles in this matrix that maximizes the covered area. The constraints are:
Rectangles may only cover "0"-fields in the matrix and no "1"-fields.
Each rectangle has to have a given distance from the next rectangle.
So let me illustrate this a bit further by this matrix:
1 0 0 1
0 0 0 0
0 0 1 0
0 0 0 0
0 1 0 0
Let the minimal distance between two rectangles be 1. Consequently, the optimal solution is to choose the rectangles with corners (1,0)-(3,1) and (1,3)-(4,3). These rectangles are at least 1 field apart from each other and they do not lie on "1"-fields. Additionally, this solution has the maximum area (6+4=10).
If the minimal distance were 2, the optimum would be (1,0)-(4,0) and (1,3)-(4,3) with area 4+4=8.
So far, I have managed to find such rectangles following this post:
Find largest rectangle containing only zeros in an N×N binary matrix
I saved all these rectangles in a list:
list<rectangle> rectangles;
with
struct rectangle {
int i,j; // bottom left corner of rectangle
int width,length; // width=size in neg. i direction, length=size in pos. j direction
};
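For completeness, a compact sketch of the histogram-based technique from the linked post (per-column runs of zeros per row, plus a stack-based largest-rectangle-in-histogram pass); it returns just the best area, and all names are mine:

#include <algorithm>
#include <stack>
#include <vector>

// Largest all-zero rectangle in a binary matrix, via the classic
// "largest rectangle in a histogram" sweep over the rows.
int maximalRectangle(const std::vector<std::vector<int>>& m)
{
    if (m.empty()) return 0;
    int rows = (int)m.size(), cols = (int)m[0].size(), best = 0;
    std::vector<int> height(cols + 1, 0); // sentinel column of height 0

    for (int i = 0; i < rows; i++)
    {
        // height[j] = consecutive zeros ending at row i in column j
        for (int j = 0; j < cols; j++)
            height[j] = (m[i][j] == 0) ? height[j] + 1 : 0;

        // Largest rectangle under this row's histogram.
        std::stack<int> s; // column indices with increasing heights
        for (int j = 0; j <= cols; j++)
        {
            while (!s.empty() && height[s.top()] >= height[j])
            {
                int h = height[s.top()]; s.pop();
                int w = s.empty() ? j : j - s.top() - 1;
                best = std::max(best, h * w);
            }
            s.push(j);
        }
    }
    return best;
}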
So far I have only thought about brute-force methods, but of course I am not happy with that.
I hope you can give me some hints and tips on how to find the corresponding rectangles in my list, and I hope my problem is clear to you.
The following counterexample shows that even brute-force checking of all combinations of maximal-area rectangles can fail to find the optimum:
110
000
110
In the above example, there are 2 maximal-area rectangles, each of area 3, one vertical and one horizontal. You can't pick both, so if you are restricted to choosing a subset of these rectangles, the best you can do is to pick (either) one for a total area of 3. But if you instead picked the vertical area-3 rectangle, and then also took the non-maximal 1x2 rectangle consisting of just the leftmost two 0s, you could get a better total area of 5. (That's for a minimum separation distance of 0; if the minimum separation distance is 1, as in your own example, then you could instead pick just the leftmost 0 as a 1x1 rectangle for a total area of 4, which is still better than 3.)
For the special case when the separation distance is 0, there's a trivial algorithm: you can simply put a 1x1 rectangle on every single 0 in the matrix. When the separation distance is strictly greater than 0, I don't yet see a fast algorithm, though I'm less sure that the problem is NP-hard now than I was a few minutes ago...

HOG: What is done in the contrast-normalization step?

According to the HOG process, as described in the paper Histogram of Oriented Gradients for Human Detection (see link below), the contrast normalization step is done after the binning and the weighted vote.
There is something I don't understand: if I have already computed the cells' weighted gradients, how can normalizing the image's contrast help me now?
As far as I understand, contrast normalization is done on the original image, whereas the gradients are the X,Y derivatives I already computed from the ORIGINAL image. So if I normalized the contrast and wanted it to take effect, I would have to compute everything again.
Is there something I don't understand well?
Should I normalize the cells' values?
Or is the normalization in HOG not about contrast at all, but about the histogram values (the counts in each bin)?
Link to the paper:
http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf
The contrast normalization is achieved by normalization of each block's local histogram.
The whole HOG extraction process is well explained here: http://www.geocities.ws/talh_davidc/#cst_extract
When you normalize the block histogram, you actually normalize the contrast in this block, if your histogram really contains the sum of magnitudes for each direction.
The term "histogram" is confusing here, because you do not count how many pixels has direction k, but instead you sum the magnitudes of such pixels. Thus you can normalize the contrast after computing the block's vector, or even after you computed the whole vector, assuming that you know in which indices in the vector a block starts and a block ends.
The steps of the algorithm, as I understand them (this worked for me with a 95% success rate):
Define the following parameters (In this example, the parameters are like HOG for Human Detection paper):
A cell size in pixels (e.g. 6x6)
A block size in cells (e.g. 3x3 ==> Means that in pixels it is 18x18)
Block overlapping rate (e.g. 50% ==> Means that both block width and block height in pixels have to be even. It is satisfied in this example, because the cell width and cell height are even (6 pixels), making the block width and height also even)
Detection window size. The size must be divisible by half the block size without remainder (so the blocks can be placed exactly within it with 50% overlap). For example, the block width is 18 pixels, so the window width must be a multiple of 9 (e.g. 9, 18, 27, 36, ...). Same for the window height. In our example, the window width is 63 pixels and the window height is 126 pixels.
Calculate gradient:
Compute the X difference using convolution with the vector [-1 0 1]
Compute the Y difference using convolution with the transpose of the above vector
Compute the gradient magnitude in each pixel using sqrt(diffX^2 + diffY^2)
Compute the gradient direction in each pixel using atan(diffY / diffX). Note that atan returns values between -90 and 90, while you will probably want values between 0 and 180, so just flip the negative values by adding 180 degrees to them. Note that in HOG for Human Detection, they use unsigned directions (between 0 and 180). If you want to use signed directions (0 to 360), you need a little more effort, based on the signs of diffX and diffY: if both are positive, atan gives values between 0 and 90, so leave them as they are; if both are negative, you get the same range of values, so add 180 to flip the direction to the other side; if diffX is positive and diffY is negative, you get values between -90 and 0, so leave them (you can add 360 if you want them positive); if diffY is positive and diffX is negative, you again get the same range, so add 180 to flip the direction to the other side.
"Bin" the directions. For example, 9 unsigned bins: 0-20, 20-40, ..., 160-180. You can easily achieve that by dividing each value by 20 and flooring the result. Your new binned directions will be between 0 and 8.
Do for each block separately, using copies of the original matrix (because some blocks are overlapping and we do not want to destroy their data):
Split to cells
For each cell, create a vector with 9 members (one per bin). For each bin index, store the sum of the magnitudes of all the pixels in the cell with that binned direction. We have 6x6 = 36 pixels in a cell in total. So, for example, if 2 pixels have direction 0, with magnitudes 0.231 and 0.13, you should write the value 0.361 (= 0.231 + 0.13) at index 0 of your vector.
Concatenate all the vectors of all the cells in the block into a large vector. This vector size should of course be NUMBER_OF_BINS * NUMBER_OF_CELLS_IN_BLOCK. In our example, it is 9 * (3 * 3) = 81.
Now, normalize this vector. Use k = sqrt(v[0]^2 + v[1]^2 + ... + v[n]^2 + eps^2) (I used eps = 1). After you have computed k, divide each value in the vector by k, and your vector is normalized (see the sketch after this list).
Create final vector:
Concatenate all the vectors of all the blocks into 1 large vector. In my example, the size of this vector was 6318
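A minimal C++ sketch of the block normalization from the step above (eps = 1 as in the example; the function name is mine):

#include <cmath>
#include <vector>

// L2-normalize one block vector in place:
// k = sqrt(v[0]^2 + ... + v[n]^2 + eps^2), then divide every entry by k.
void normalizeBlock(std::vector<float>& v, float eps = 1.0f)
{
    float sumSq = eps * eps;
    for (float x : v) sumSq += x * x;
    const float k = std::sqrt(sumSq);
    for (float& x : v) x /= k;
}

In the example above you would call this on each 81-entry block vector before concatenating, or on each 81-entry slice of the final 6318-entry vector; both give the same result.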

Searching jpeg/bmp/pdf image for straight lines, circles and text

I want to create an image parser that reads an image containing the following:
1. Straight Lines
2. Circles
3. Arcs
4. Text
I am open to solutions for any image format: JPEG, BMP, or PDF.
I have looked at the QImage documentation. It can provide me with pixel data that I can store as a 2D matrix. For the moment I shall assume there are only two colours, black and white: white represents an empty pixel and black represents a drawn pixel.
So I will have a sparse matrix like
0 1 1 1 0 0 0
0 0 0 0 0 0 1
0 1 1 0 0 0 1
1 0 0 1 0 0 1
1 0 0 1 0 0 0
0 1 1 0 0 0 0
Now I want to decode this matrix and search for the elements. Searching for horizontal and vertical lines is easy because for each element I can just scan its neighbouring row elements and column elements.
How can I search for other elements (angled lines, circles, arcs and possibly text)?
For text, I read that QImage has a text() function, but I don't know which input file types it works for.
Is there any other library that I can consider?
Please note that I just want to be able to read the image, processing does not need to be done.
Is there any other way I can accomplish this? Or am I being too ambitious?
Thanks
Take a look at the OpenCV library.
It provides most of the standard algorithms used in image detection and vision and the code quality of its implementation is quite high in general.
Notice though that this is a very difficult problem in general, so you will probably need to do a fair amount of research before getting satisfactory solutions.
One interesting way of tackling this would be with machine learning systems, such as neural networks and genetic algorithms. Neural nets in particular are very good at pattern matching and are often seen being used for tasks such as handwriting recognition.
There's a lot of information on this if you search for it. Here's one such article that is an introduction to NNs.
If your input images are always black and white, I don't think it would be too difficult to adapt a code example to get it working.
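For the line and circle parts specifically, OpenCV's Hough transforms are the usual starting point; a minimal sketch (the file name and all parameters are placeholders to tune):

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // Load as grayscale and invert, so drawn (black) pixels become white.
    cv::Mat img = cv::imread("drawing.png", cv::IMREAD_GRAYSCALE);
    cv::Mat bw;
    cv::threshold(img, bw, 128, 255, cv::THRESH_BINARY_INV);

    // Straight line segments.
    std::vector<cv::Vec4i> lines;
    cv::HoughLinesP(bw, lines, 1, CV_PI / 180, 50, 30, 5);

    // Circles (HoughCircles runs its own edge detection internally).
    std::vector<cv::Vec3f> circles;
    cv::HoughCircles(img, circles, cv::HOUGH_GRADIENT, 1, 20, 100, 30, 5, 100);

    for (const cv::Vec4i& l : lines)
        std::cout << "line (" << l[0] << "," << l[1] << ") -> ("
                  << l[2] << "," << l[3] << ")\n";
    for (const cv::Vec3f& c : circles)
        std::cout << "circle centre (" << c[0] << "," << c[1]
                  << ") radius " << c[2] << "\n";
    return 0;
}

Arcs and text are harder: neither transform covers them, and text in particular is an OCR problem of its own.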
I suggest the Viola-Jones object detection framework.
Though the approach is usually applied to face detection, the original article discusses general object detection, such as your text, circles and lines.

openCV filter image - replace kernel with local maximum

Some details about my problem:
I'm trying to implement a corner detector in OpenCV (my own algorithm, as opposed to the built-in ones: Canny, Harris, etc.).
I've got a matrix filled with response values: the bigger the response value, the higher the probability that a corner was detected.
The problem is that several corners are detected in the neighbourhood of a point where there is really only one. I need to reduce the number of falsely detected corners.
Exact problem:
I need to walk through the matrix with a kernel, find the maximum value within each kernel window, keep that maximum, and set all the other values in the window to zero.
Are there built-in OpenCV functions to do this?
This is how I would do it:
Create a kernel, it defines a pixels neighbourhood.
Create a new image by dilating your image using this kernel. This dilated image contains the maximum neighbourhood value for every point.
Do an equality comparison between these two arrays. Wherever they are equal is a valid neighbourhood maximum, and is set to 255 in the comparison array.
Multiply the comparison array, and the original array together (scaling appropriately).
This is your final array, containing only neighbourhood maxima.
This is illustrated by these zoomed in images:
9 pixel by 9 pixel original image:
After processing with a 5 by 5 pixel kernel, only the local neighbourhood maxima remain (i.e. maxima separated by more than 2 pixels from any pixel with a greater value):
There is one caveat. If two nearby maxima have the same value then they will both be present in the final image.
Here is some Python code that does it; it should be very easy to convert to C++:
import cv

# Load the response image as grayscale.
im = cv.LoadImage('fish2.png', cv.CV_LOAD_IMAGE_GRAYSCALE)
maxed = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)
comp = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)
# Create a 5x5 rectangular kernel anchored at (2, 2).
kernel = cv.CreateStructuringElementEx(5, 5, 2, 2, cv.CV_SHAPE_RECT)
# Dilation replaces each pixel by the maximum value in its neighbourhood.
cv.Dilate(im, maxed, element=kernel, iterations=1)
# comp becomes 255 where a pixel equals its neighbourhood maximum.
cv.Cmp(im, maxed, comp, cv.CV_CMP_EQ)
# Keep only those maxima (the 1/255 factor undoes the mask value of 255).
cv.Mul(im, comp, im, 1/255.0)
cv.ShowImage("local max only", im)
cv.WaitKey(0)
I didn't realise until now, but this is what @sansuiso suggested in his/her answer.
This is possibly better illustrated with this image, before:
after processing with a 5 by 5 kernel:
The solid regions are due to the shared local-maximum values.
I would suggest an original 2-step procedure (there may exist more efficient approaches) that uses OpenCV built-in functions:
Step 1 : morphological dilation with a square kernel (corresponding to your neighborhood). This step gives you another image, after replacing each pixel value by the maximum value inside the kernel.
Step 2 : test if the cornerness value of each pixel of the original response image is equal to the max value given by the dilation step. If not, then obviously there exists a better corner in the neighborhood.
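A minimal C++ sketch of this two-step procedure (a 5x5 square neighbourhood, matching the Python example above):

#include <opencv2/opencv.hpp>

// Keep a pixel only if it equals the maximum of its ksize x ksize
// neighbourhood; set every other pixel to zero.
cv::Mat localMaxima(const cv::Mat& response, int ksize = 5)
{
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT,
                                               cv::Size(ksize, ksize));
    cv::Mat dilated;
    cv::dilate(response, dilated, kernel); // step 1: neighbourhood maximum

    cv::Mat mask;
    cv::compare(response, dilated, mask, cv::CMP_EQ); // step 2: 255 where local max

    cv::Mat result = cv::Mat::zeros(response.size(), response.type());
    response.copyTo(result, mask);
    return result;
}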
If you are looking for some built-in functionality, FilterEngine will help you make a custom filter (kernel).
http://docs.opencv.org/modules/imgproc/doc/filtering.html#filterengine
Also, I would recommend some kind of noise reduction, usually blur, before all processing. That is unless you really want the image raw.
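For that noise-reduction step, a small Gaussian blur before computing the response would be typical (the 3x3 kernel size is an assumption, and image stands for the raw input):

cv::Mat smoothed;
cv::GaussianBlur(image, smoothed, cv::Size(3, 3), 0); // sigma derived from kernel size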