Extracting segments from a list of 8-connected pixels - c++

Current situation: I'm trying to extract segments from an image. Thanks to OpenCV's findContours() method, I now have a list of 8-connected points for every contour. However, these lists are not directly usable, because they contain a lot of duplicates.
The problem: Given a list of 8-connected points, which can contain duplicates, extract segments from it.
Possible solutions:
At first, I used openCV's approxPolyDP() method. However, the results are pretty bad... Here is the zoomed contours:
Here is the result of approxPolyDP(): (9 segments! Some overlap)
but what I want is more like:
It's bad because approxPolyDP() can convert something that "looks like several segments" into "several segments". However, what I have is a list of points that tends to iterate several times over itself.
For example, if my points are:
0 1 2 3 4 5 6 7 8
9
Then the list of points will be 0 1 2 3 4 5 6 7 8 7 6 5 4 3 2 1 9... And if the number of points becomes large (>100), then the segments extracted by approxPolyDP() are unfortunately not exact duplicates (i.e. they overlap each other but are not strictly equal, so I can't just say "remove duplicates", as I could for pixels, for example).
Perhaps I've got a solution, but it's pretty long (though interesting). First of all, for every 8-connected list, I create a sparse matrix (for efficiency) and set a matrix value to 1 if the pixel belongs to the list. Then I create a graph, with nodes corresponding to pixels and edges between neighbouring pixels. This also means that I add all the missing edges between pixels (the complexity is small, thanks to the sparse matrix). Then I remove all possible "squares" (4 neighbouring nodes), which is possible because I am already working on pretty thin contours. Then I can launch a minimal spanning tree algorithm. And finally, I can approximate every branch of the tree with OpenCV's approxPolyDP().
To sum up: I've got a tedious method, that I've not yet implemented as it seems error-prone. However, I ask you, people at Stack Overflow: are there other existing methods, possibly with good implementations?
Edit: To clarify, once I have a tree, I can extract "branches" (branches start at leaves or nodes linked to 3 or more other nodes) Then, the algorithm in openCV's approxPolyDP() is the Ramer–Douglas–Peucker algorithm, and here is the Wikipedia picture of what it does:
With this picture, it is easy to understand why it fails when points may be duplicates of each other
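To make the graph idea above a bit more concrete, here is a minimal Python sketch (not part of my actual code). It collapses the contour walk into a set of distinct pixels, links 8-connected neighbours, and then lists the leaves and the nodes linked to 3 or more other nodes. The function names and the toy contour are made up for illustration.

```python
def build_pixel_graph(points):
    """One node per distinct pixel, one edge per pair of 8-connected neighbours."""
    pixels = set(points)  # duplicates from walking back over the contour vanish here
    graph = {p: set() for p in pixels}
    for (x, y) in pixels:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                q = (x + dx, y + dy)
                if q != (x, y) and q in pixels:
                    graph[(x, y)].add(q)
    return graph


def leaves_and_junctions(graph):
    """Leaves have exactly one neighbour, junctions have three or more."""
    leaves = sorted(p for p, nbrs in graph.items() if len(nbrs) == 1)
    junctions = sorted(p for p, nbrs in graph.items() if len(nbrs) >= 3)
    return leaves, junctions


# Toy contour: a horizontal run walked out and back, plus a vertical spur at x = 4.
run = [(x, 0) for x in range(9)]
walk = run + run[::-1] + [(4, 1), (4, 2), (4, 3)]
graph = build_pixel_graph(walk)
leaves, junctions = leaves_and_junctions(graph)
print(leaves)
print(junctions)
```

Note that the junction shows up as a small cluster of pixels rather than a single node, which is exactly why I want to remove the "squares" before running the spanning tree.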
Another edit: In my method, there is something that may be interesting to note. When you consider points located in a grid (like pixels), then generally, the minimal spanning tree algorithm is not useful because there are many possible minimal trees
X-X-X-X
|
X-X-X-X
is fundamentally very different from
X-X-X-X
| | | |
X X X X
but both are minimal spanning trees
However, in my case, my nodes rarely form clusters because they are supposed to be contours, and there is already a thinning algorithm that runs beforehand in the findContours().
Answer to Tomalak's comment:
If DP algorithm returns 4 segments (the segment from the point 2 to the center being there twice) I would be happy! Of course, with good parameters, I can get to a state where "by chance" I have identical segments, and I can remove duplicates. However, clearly, the algorithm is not designed for it.
Here is a real example with far too many segments:

Using Mathematica 8, I created a morphological graph from the list of white pixels in the image. It is working fine on your first image:
Create the morphological graph:
graph = MorphologicalGraph[binaryimage];
Then you can query the graph properties that are of interest to you.
This gives the names of the vertices in the graph:
vertex = VertexList[graph]
The list of the edges:
EdgeList[graph]
And this gives the positions of the vertices:
pos = PropertyValue[{graph, #}, VertexCoordinates] & /@ vertex
This is what the results look like for the first image:
In[21]:= vertex = VertexList[graph]
Out[21]= {1, 3, 2, 4, 5, 6, 7, 9, 8, 10}
In[22]:= EdgeList[graph]
Out[22]= {1 \[UndirectedEdge] 3, 2 \[UndirectedEdge] 4, 3 \[UndirectedEdge] 4,
3 \[UndirectedEdge] 5, 4 \[UndirectedEdge] 6, 6 \[UndirectedEdge] 7,
6 \[UndirectedEdge] 9, 8 \[UndirectedEdge] 9, 9 \[UndirectedEdge] 10}
In[26]:= pos = PropertyValue[{graph, #}, VertexCoordinates] & /@ vertex
Out[26]= {{54.5, 191.5}, {98.5, 149.5}, {42.5, 185.5},
{91.5, 138.5}, {132.5, 119.5}, {157.5, 72.5},
{168.5, 65.5}, {125.5, 52.5}, {114.5, 53.5},
{120.5, 29.5}}
According to the documentation, http://reference.wolfram.com/mathematica/ref/MorphologicalGraph.html, the command MorphologicalGraph first computes the skeleton by morphological thinning:
skeleton = Thinning[binaryimage, Method -> "Morphological"]
Then the vertices are detected; they are the branch points and the end points:
verteximage = ImageAdd[
MorphologicalTransform[skeleton, "SkeletonEndPoints"],
MorphologicalTransform[skeleton, "SkeletonBranchPoints"]]
And then the vertices are linked after analysis of their connectivity.
For example, one could start by breaking the structure around the vertex and then look for the connected components, revealing the edges of the graph:
comp = MorphologicalComponents[
ImageSubtract[
skeleton,
Dilation[verteximage, CrossMatrix[1]]]];
Colorize[comp]
The devil is in the details, but that sounds like a solid starting point if you wish to develop your own implementation.
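If you are not using Mathematica, the "break the structure around the vertices, then look for connected components" step can be sketched in plain Python as below. This is only an illustration under my own assumptions: the skeleton is given as a set of (x, y) pixels, the vertex pixels are already known, and the function names are invented here. Removing each vertex together with its 4 neighbours plays the role of the dilation with a cross-shaped element.

```python
from collections import deque


def label_components(pixels):
    """Label the 8-connected components of a set of (x, y) pixels."""
    labels = {}
    count = 0
    for start in sorted(pixels):
        if start in labels:
            continue
        count += 1
        labels[start] = count
        queue = deque([start])
        while queue:
            x, y = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    q = (x + dx, y + dy)
                    if q in pixels and q not in labels:
                        labels[q] = count
                        queue.append(q)
    return labels, count


def split_skeleton(skeleton, vertices):
    """Remove every vertex and its 4 neighbours, then label what is left."""
    removed = set()
    for (x, y) in vertices:
        removed.update([(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])
    return label_components(set(skeleton) - removed)


# Toy T-shaped skeleton: a horizontal bar with a vertical spur at x = 5.
skeleton = [(x, 0) for x in range(11)] + [(5, y) for y in range(1, 6)]
vertices = [(0, 0), (10, 0), (5, 5), (5, 0)]  # three end points and one branch point
labels, count = split_skeleton(skeleton, vertices)
print(count)
```

Each remaining component corresponds to one edge of the graph; on a real skeleton the branch points tend to be small clusters of pixels, so the removal radius may need tuning.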

Try mathematical morphology. First you need to dilate or close your image to fill holes.
cvDilate(pimg, pimg, NULL, 3);
cvErode(pimg, pimg, NULL);
I got this image
The next step should be applying a thinning algorithm. Unfortunately, it's not implemented in OpenCV (MATLAB has bwmorph with the thin argument). For example, with MATLAB I refined the image to this one:
However OpenCV has all needed basic morphological operations to implement thinning (cvMorphologyEx, cvCreateStructuringElementEx, etc).
Another idea.
The distance transform is said to be very useful for such tasks, and it may well be.
Consider the cvDistTransform function. It produces an image like this:
Then using something like cvAdaptiveThreshold:
That's the skeleton. I guess you can iterate over all connected white pixels, find curves and filter out small segments.
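To show what the distance transform idea looks like without OpenCV, here is a small pure-Python sketch. It is not what cvDistTransform does internally: I've swapped in a city-block distance computed by breadth-first search from the background, and the function name and toy image are made up. It assumes the image contains at least one background pixel.

```python
from collections import deque


def distance_transform(image):
    """City-block distance from every pixel to the nearest background (0) pixel."""
    h, w = len(image), len(image[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for r in range(h):
        for c in range(w):
            if image[r][c] == 0:
                dist[r][c] = 0
                queue.append((r, c))
    # Multi-source BFS: distances grow outwards from the background.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and dist[rr][cc] is None:
                dist[rr][cc] = dist[r][c] + 1
                queue.append((rr, cc))
    return dist


image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
dist = distance_transform(image)
# Thresholding the distance map keeps only the ridge in the middle of the blob.
ridge = [(r, c) for r in range(len(dist)) for c in range(len(dist[0])) if dist[r][c] == 2]
print(ridge)
```

On a thick stroke the ridge of the distance map runs along its centre line, which is why thresholding it gives something close to a skeleton.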

I've implemented a similar algorithm before, and I did it in a sort of incremental least-squares fashion. It worked fairly well. The pseudocode is somewhat like:
L = empty set of line segments
for each white pixel p
    line = new line containing only p
    C = empty set of points
    P = set of all neighboring pixels of p
    while P is not empty
        n = first point in P
        add n to C
        remove n from P
        line' = line with n added to it
        perform a least squares fit of line'
        if MSE(line') < max_mse and d(line, n) < max_distance
            line = line'
            add all neighbors of n that are not in C to P
    if size(line) > min_num_points
        add line to L
where MSE(line) is the mean-square-error of the line (sum over all points in the line of the squared distance to the best fitting line) and d(line,n) is the distance from point n to the line. Good values for max_distance seem to be a pixel or so and max_mse seems to be much less, and will depend on the average size of the line segments in your image. 0.1 or 0.2 pixels have worked in fairly large images for me.
I had been using this on actual images pre-processed with the Canny operator, so the only results I have are of that. Here's the result of the above algorithm on an image:
It's possible to make the algorithm fast, too. The C++ implementation I have (closed source enforced by my job, sorry, else I would give it to you) processed the above image in about 20 milliseconds. That includes application of the Canny operator for edge detection, so it should be even faster in your case.
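Since I can't share the implementation, here is a rough Python sketch of just the acceptance test from the pseudocode. It is an assumption-laden illustration, not my production code: the fit is an orthogonal (total) least-squares fit, whose mean squared error is the smaller eigenvalue of the 2x2 covariance matrix of the points, and the names line_fit_mse and try_extend are invented here. The d(line, n) distance check is left out for brevity.

```python
import math


def line_fit_mse(points):
    """Mean squared perpendicular distance of the points to their best-fit line."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Smaller eigenvalue of the covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return half_trace - root


def try_extend(line_points, candidate, max_mse=0.2):
    """Return the extended line and True if the candidate keeps the fit tight."""
    extended = line_points + [candidate]
    if line_fit_mse(extended) < max_mse:
        return extended, True
    return line_points, False


line = [(0, 0), (1, 0), (2, 0), (3, 0)]
line, accepted_collinear = try_extend(line, (4, 0))  # stays on the line
line, accepted_outlier = try_extend(line, (5, 4))    # far off the line
print(accepted_collinear, accepted_outlier, len(line))
```

Recomputing the sums from scratch for every candidate is wasteful; keeping running sums of x, y, x*x, y*y and x*y makes each test constant time.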

You can start by extracting straight lines from your contours image using HoughLinesP, which is provided with OpenCV:
HoughLinesP(InputArray image, OutputArray lines, double rho, double theta, int threshold, double minLineLength = 0, double maxLineGap = 0)
If you choose threshold = 1 and a small minLineLength, you can even obtain all single elements. Be careful though, since it yields many results if you have many edge pixels.

Related

Linear interpolation of two vector arrays with different lengths

I have two curves: one hand-drawn, and one that is a smoothed version of the hand-drawn curve.
The data of each curve is stored in 2 separate vector arrays.
The time delta is also stored in the hand-drawn curve's vector, so I can replay the drawing process so that it looks natural.
Now I need to transfer the time delta from Curve 1 (raw input) to Curve 2 (the already smoothed curve).
Sometimes the size of the first vector is larger and sometimes smaller than the second vector
(depending on the input draw speed).
So my question is: how do I fill the vector PenSmoot.time with the correct values?
Case 1: Input vector is larger
PenInput.time[0] = 0 PenSmoot.time[0] = 0
PenInput.time[1] = 5 PenSmoot.time[1] = ?
PenInput.time[2] = 12 PenSmoot.time[2] = ?
PenInput.time[3] = 2 PenSmoot.time[3] = ?
PenInput.time[4] = 50 PenSmoot.time[4] = ?
PenInput.time[5] = 100
PenInput.time[6] = 20
PenInput.time[7] = 3
PenInput.time[8] = 9
PenInput.time[9] = 33
Case 2: Input vector is smaller
PenInput.time[0] = 0 PenSmoot.time[0] = 0
PenInput.time[1] = 5 PenSmoot.time[1] = ?
PenInput.time[2] = 12 PenSmoot.time[2] = ?
PenInput.time[3] = 2 PenSmoot.time[3] = ?
PenInput.time[4] = 50 PenSmoot.time[4] = ?
PenSmoot.time[5] = ?
PenSmoot.time[6] = ?
PenSmoot.time[7] = ?
PenSmoot.time[8] = ?
PenSmoot.time[9] = ?
Simplified representation:
PenInput holds the whole data of a drawn curve (raw input)
PenInput.x // X coordinate
PenInput.y // Y coordinate
PenInput.pressure // The pressure of the pen
PenInput.timetotl // Total elapsed time
PenInput.timepart // Time fragments
PenSmoot holds the data of the massaged (smoothed, evenly distributed) curve of PenInput
PenSmoot.x // X coordinate
PenSmoot.y // Y coordinate
PenSmoot.pressure // Unknown - The pressure of the pen
PenSmoot.timetotl // Unknown - Total elapsed time
PenSmoot.timepart // Unknown - Time fragments
This is the struct that I have.
struct Pencil
{
sf::VertexArray vertices;
std::vector<int> pressure;
std::vector<sf::Int32> timetotl;
std::vector<sf::Int32> timepart;
};
[This answer has been extensively revised based on editing to the question.]
Okay, it seems to me that you just about need to interpolate the time stamps in parallel with the points.
I'm going to guess that the incoming data is something on the order of an array of points (e.g., X, Y coordinates) and an array of time deltas with the same number of each, so time-delta N tells you the time it took to get from point N-1 to point N.
When you interpolate the points, you're probably going to want to do it intelligently. For example, in the shape shown in the question, we have what look like two nearly straight lines, one with positive slope, and the other with negative slope. According to the picture, that's composed of 263 points. We could reduce that to three points and still have a fairly reasonable representation of the original shape by choosing the two end-points plus one point where the two lines meet.
We probably don't need to go quite that far though. Especially taking time into account, we'd probably want to use at least 7 points for the output--one for each end-point of each colored segment. That would give us 6 straight line segments. Let's say those are at points 0, 30, 140, 180, 200, 250, and 263.
We'd then use exactly the same segmentation on the time deltas. Add up the deltas from 0 to 30 to get an average speed for the first segment. Add up the deltas for 31 through 140 to get an average speed for the second segment (and so on to the end).
Increasing the number of points works out roughly the same way. We need to look at exactly which input points were used to create a pair of output points. For a simplistic example, let's assume we produced output that was precisely double the number of input points. We'd then interpolate time deltas exactly halfway between each pair of input points.
In the case shown in the question, we start with unevenly distributed inputs, but produce evenly distributed outputs. So the second output point might be an average of the first four input points. The next output point might be an average of three input points (and so on). In many cases, it's likely that neither end-point of a segment in the output corresponds precisely to any point in the input.
That's fine too. We interpolate between two points of the input to figure out the time hack for the starting point of the output segment. Likewise for the ending point. Then we can compute the total time it should have taken to travel between them based on the time delta between the points.
If you want to get fancy, you could use a higher order interpolation instead of linear. That does require more input points per interpolation, but it looks like you probably have plenty to do something like a quadratic or cubic interpolation (in most cases). This is likely to make the most difference at transitions--places where the "pen" was accelerating or decelerating quickly. In such a place, linear interpolation can give somewhat misleading results (though, given the number of points you seem to be working with, it may not make enough difference to notice).
As an illustration, let's consider a straight line. We're going to start from 5 input points, and produce 7 output points.
So, the input points are [0, 2, 7, 10, 15], and the associated time deltas are [0, 1, 4, 8, 3].
So, our total distance traveled is 15, and we want our output points to be evenly distributed. Seven output points span six intervals, so the distance between output points will be 15/6 = 2.5.
So, obviously the first output point and time are both 0. The second output point is at 2.5. To compute the output time, we take the entirety of the time to the second input point (0->2), which is 1, plus the fraction of the second input segment that we cover, 0.5 / (7-2), multiplied by that segment's time delta of 4. That interpolated section gives 0.4, so our first output time delta is 1.4.
The next output point should be at a distance of 5.0. Since the second input segment goes from 2 to 7, our entire second output segment will lie within the second input segment. So, we take 2.5 / (7-2), telling us that this output segment occupies 0.5 of the input segment. We then multiply that by the time for the second input segment to get the time delta for the second output segment: 0.5 * 4 = 2.0.
[...and it continues on the same way until we reach the end.]
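For what it's worth, the linear version of this can be sketched in a few lines of Python (the question is C++, but the logic carries over directly). This is only an illustration under stated assumptions: positions are distances along the path, deltas[i] is the time taken to reach input point i from point i-1, and the spacing is taken as the path length divided by the number of output intervals (n_out - 1). The function name is made up.

```python
def resample_time_deltas(positions, deltas, n_out):
    """Linearly interpolate cumulative time at n_out evenly spaced positions."""
    # Turn the per-segment deltas into cumulative time stamps.
    cumulative = []
    total = 0.0
    for d in deltas:
        total += d
        cumulative.append(total)

    step = (positions[-1] - positions[0]) / (n_out - 1)
    out_positions = [positions[0] + i * step for i in range(n_out)]

    out_times = []
    seg = 0  # index of the input segment [positions[seg], positions[seg + 1]]
    for p in out_positions:
        while seg < len(positions) - 2 and p > positions[seg + 1]:
            seg += 1
        span = positions[seg + 1] - positions[seg]
        frac = (p - positions[seg]) / span if span else 0.0
        out_times.append(cumulative[seg] + frac * (cumulative[seg + 1] - cumulative[seg]))

    # Convert the interpolated time stamps back into deltas.
    out_deltas = [0.0] + [b - a for a, b in zip(out_times, out_times[1:])]
    return out_positions, out_deltas


out_positions, out_deltas = resample_time_deltas([0, 2, 7, 10, 15], [0, 1, 4, 8, 3], 7)
print(out_positions)
print(out_deltas)
```

A handy sanity check is that the output deltas add up to the same total time as the input deltas, whatever the number of output points.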

Error in calculating exact nearest neighbors in radius with FLANN

I am trying to find the exact number of neighbouring nodes in a large 3D point dataset. The goal is, for each point of the dataset, to retrieve all the neighbours within a given radius. FLANN claims that for low-dimensional data it can retrieve the exact neighbours, but compared with a brute-force search this does not seem to be the case. The neighbours are essential for further calculations, so I need the exact number. I tried increasing the radius a little, but that does not seem to be the problem. Is anyone aware of how to calculate the exact neighbours with FLANN or another C++ library?
The code:
// All nodes to be tested for inclusion in support domain.
flann::Matrix<double> query_nodes = flann::Matrix<double>(&nodes_pos[0].x, nodes_pos.size(), 3);
// Set default search parameters
flann::SearchParams search_parameters = flann::SearchParams();
search_parameters.checks = -1;
search_parameters.sorted = false;
search_parameters.use_heap = flann::FLANN_True;
flann::KDTreeSingleIndexParams index_parameters = flann::KDTreeSingleIndexParams();
flann::KDTreeSingleIndex<flann::L2_3D<double> > index(query_nodes, index_parameters);
index.buildIndex();
// FLANN's L2 distance is the squared Euclidean distance, so the search radius is squared as well.
double l2_radius = (this->support_layer_*grid.spacing)*(this->support_layer_*grid.spacing);
double extension = l2_radius/10.;
l2_radius+= extension;
index.radiusSearch(query_nodes, indices, dists, l2_radius, search_parameters);
Try nanoflann. It is designed for low dimensional spaces and gives exact nearest neighbors. Furthermore, it is just one header file that you can either "install" or just copy to your project.
You should check page 6+ from the flann-manual, to fine-tune your search parameters, such as target_precision, which should be set to 1, for "maximum" accuracy.
That parameter is often found as epsilon (ε) in Approximate Nearest Neighbor Search (ANNS), which is used in high dimensional spaces in order to (try to) beat the curse of dimensionality. FLANN is usually used in 128 dimensions, not 3, as far as I can tell, which may explain the bad performance you are experiencing.
A c++ library that works well in 3 dimensions is CGAL. However, it's much larger than FLANN, because it is a library for computational geometry, thus it provides functionality for many problems, not just NNS.
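Whichever library you settle on, a brute-force reference is useful for checking the neighbour counts on a small subset of the data. Here is a minimal sketch in Python (easy to port to C++); the function name is made up, and it compares squared distances against the squared radius, which is the convention the question's code already follows. Points lying exactly on the radius are included here; a library may treat that boundary differently, which is worth checking when counts disagree.

```python
def brute_force_radius_search(points, radius):
    """For every 3D point, list the indices of all points within the radius."""
    r2 = radius * radius
    neighbours = []
    for xi, yi, zi in points:
        found = []
        for j, (xj, yj, zj) in enumerate(points):
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            if d2 <= r2:  # the query point itself is always included
                found.append(j)
        neighbours.append(found)
    return neighbours


points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
neighbours = brute_force_radius_search(points, 1.5)
print(neighbours)
```

This is quadratic in the number of points, so it is only meant as a correctness check, not as a replacement for the tree search.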

Find optimal route in farm land-dynamic programming/Dijkstra's

I was trying to solve a question on InterviewStreet (the competition has since ended). The problem is to build a ditch from a pond to a farm, given a N*M grid of elevations. The pond and the farm are one of the tiles within the N*M grid and won't be the same tile.
The elevations are numbers between 0 and 9. Additionally, you are given the coordinates of the pond and the farm (1-indexed, row followed by column), which each take up exactly one tile on the grid. You are to write a program that, given this data, computes the minimum cost to build an irrigation ditch.
More specifically, the input that will be fed into your program will be formatted as follows:
N M
pondLocationX pondLocationY
farmLocationX farmLocationY
elevationX1Y1elevationX1Y2...elevationX1YM
elevationX2Y1elevationX2Y2...elevationX2YM
.
.
.
elevationXNY1elevationXNY2...elevationXNYM
where pondLocationX and farmLocationX are integers in the interval [1, N], and pondLocationY and farmLocationY are integers in the interval [1, M], and all elements are integers in the interval [0, 9]. Note that a single space separates the X and Y coordinates of the farm and pond, but there are no spaces separating the elevations.
Given such an input, your program should print out the minimum cost to build an irrigation ditch from the pond to the farm. The constraints are as follows. The pond and farm will not be at the same location. The elevation of all tiles except for the pond can be increased or decreased at a cost of one for every unit of change (you may leave the elevation the same for a cost of 0). N and M will each be at most 300. After paying for any excavation that is necessary, you can build a ditch at 0 additional cost if there is a sequence of tiles starting at the pond and ending at the farm such that the following are true:
(Contiguous path) Each tile in the sequence is adjacent to the previous tile (no diagonal adjacency -- tiles in the interior of the map have exactly 4 adjacent tiles)
(Downhill path) Each tile in the sequence, including the pond and farm, has an elevation that is at most that of the previous tile in the sequence.
For example, if the input is the following:
3 5
1 1
3 4
27310
21171
77721
then we can build an irrigation ditch at a cost of just 4, since it suffices to lower the tile at location (1, 3) from 3 to 1 (cost 2), raise the tile at position (1, 5) from 0 to 1 (cost 1), and lower the farm, which is at location (3, 4), from 2 to 1 (cost 1). Note that you cannot travel diagonally to get from (2, 3) to (3, 4) in one step.
Solution:
I think this is a variation of Dijkstra's algorithm, i.e. use the farm as the source node, and stop when you calculate the shortest path to the pond. The "adjacent" tiles are your neighbours, and your edge weights are the differences in your elevations.
However, you can modify the weights in two ways: if you are higher than your neighbour, you can either 1) decrease your height to match your neighbour's or 2) increase your neighbour's height to match yours. This effect can percolate outwards, and I'm not able to capture this in the algorithm.
How can I adjust Dijkstra's algorithm to accommodate the fact that the weights can be changed?
Use Dijkstra's algorithm on the 3D grid N*M*10. Two vertices (x,y,z) and (x',y',z') are connected (with a directed arc) if (x,y) and (x',y') are adjacent and z' is not greater than z. The cost of the arc is the absolute difference between z' and the initial height at (x',y'). Then find the shortest path from the pond (with its initial height) to the farm (at any z coordinate).
It is possible that the minimal path found in this way passes twice through the same point (x,y). For example, it could pass first through (x,y,z') and then through (x,y,z''). But if this happens, you can remove the part of the path from (x,y,z') to (x,y,z''), since replacing (x,y,z') with (x,y,z'') costs no more than the path from (x,y,z') to (x,y,z''). So you can assume that for every point (x,y) the path uses only a single value of z.
So the path you have found is the solution to the given problem.
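Here is a minimal Python sketch of this layered search, written under a few assumptions of my own: coordinates are 0-indexed (row, column), the grid is a list of rows of integers, the search never re-enters the pond tile (its elevation is fixed), and the function name is made up. It stops as soon as any (farm, z) state is popped from the heap, since that is the cheapest way to reach the farm at any height.

```python
import heapq


def min_ditch_cost(grid, pond, farm):
    """Dijkstra over states (row, col, z), where z is the final height of the tile."""
    n, m = len(grid), len(grid[0])
    start = (pond[0], pond[1], grid[pond[0]][pond[1]])  # the pond keeps its height
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, (r, c, z) = heapq.heappop(heap)
        if cost > best.get((r, c, z), float("inf")):
            continue  # stale heap entry
        if (r, c) == farm:
            return cost
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < n and 0 <= cc < m) or (rr, cc) == pond:
                continue
            for z2 in range(z + 1):  # the next tile may not end up higher than this one
                new_cost = cost + abs(z2 - grid[rr][cc])
                state = (rr, cc, z2)
                if new_cost < best.get(state, float("inf")):
                    best[state] = new_cost
                    heapq.heappush(heap, (new_cost, state))
    return None


rows = ["27310", "21171", "77721"]
grid = [[int(ch) for ch in row] for row in rows]
cost = min_ditch_cost(grid, pond=(0, 0), farm=(2, 3))
print(cost)
```

With N and M up to 300 there are at most 900,000 states, each with up to 40 outgoing arcs, so the search stays tractable.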

openCV filter image - replace kernel with local maximum

Some details about my problem:
I'm trying to implement a corner detector in OpenCV (a different algorithm from the built-in ones: Canny, Harris, etc.).
I've got a matrix filled with the response values. The larger the response value, the higher the probability that a corner was detected.
My problem is that several corners are detected in the neighbourhood of a point (where there is only one). I need to reduce the number of falsely detected corners.
Exact problem:
I need to walk through the matrix with a kernel, calculate the maximum value within every kernel window, keep that maximum value, and set the other values in the window to zero.
Are there built-in OpenCV functions to do this?
This is how I would do it:
Create a kernel; it defines a pixel's neighbourhood.
Create a new image by dilating your image using this kernel. This dilated image contains the maximum neighbourhood value for every point.
Do an equality comparison between these two arrays. Wherever they are equal is a valid neighbourhood maximum, and is set to 255 in the comparison array.
Multiply the comparison array, and the original array together (scaling appropriately).
This is your final array, containing only neighbourhood maxima.
This is illustrated by these zoomed in images:
9 pixel by 9 pixel original image:
After processing with a 5 by 5 pixel kernel, only the local neighbourhood maxima remain (i.e. maxima separated by more than 2 pixels from a pixel with a greater value):
There is one caveat. If two nearby maxima have the same value then they will both be present in the final image.
Here is some Python code that does it, it should be very easy to convert to c++:
import cv
im = cv.LoadImage('fish2.png',cv.CV_LOAD_IMAGE_GRAYSCALE)
maxed = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)
comp = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_8U, 1)
#Create a 5*5 kernel anchored at 2,2
kernel = cv.CreateStructuringElementEx(5, 5, 2, 2, cv.CV_SHAPE_RECT)
cv.Dilate(im, maxed, element=kernel, iterations=1)
cv.Cmp(im, maxed, comp, cv.CV_CMP_EQ)
cv.Mul(im, comp, im, 1/255.0)
cv.ShowImage("local max only", im)
cv.WaitKey(0)
I didn't realise until now, but this is what @sansuiso suggested in his/her answer.
This is possibly better illustrated with this image, before:
after processing with a 5 by 5 kernel:
solid regions are due to the shared local maxima values.
I would suggest an original 2-step procedure (there may exist more efficient approaches), that uses opencv built-in functions :
Step 1 : morphological dilation with a square kernel (corresponding to your neighborhood). This step gives you another image, after replacing each pixel value by the maximum value inside the kernel.
Step 2 : test if the cornerness value of each pixel of the original response image is equal to the max value given by the dilation step. If not, then obviously there exists a better corner in the neighborhood.
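The two steps can be illustrated without any OpenCV calls; the following plain-Python sketch works on a list-of-lists response matrix. It is only meant to show the logic: the function name is made up, the window is a square of side 2*radius+1 clipped at the image border, and a real implementation would use the library's dilation and comparison functions instead of nested loops.

```python
def local_maxima(response, radius):
    """Keep a value only where it equals the maximum of its square neighbourhood."""
    h, w = len(response), len(response[0])
    # Step 1: grey-scale dilation, i.e. the neighbourhood maximum at every pixel.
    dilated = [
        [
            max(
                response[rr][cc]
                for rr in range(max(0, r - radius), min(h, r + radius + 1))
                for cc in range(max(0, c - radius), min(w, c + radius + 1))
            )
            for c in range(w)
        ]
        for r in range(h)
    ]
    # Step 2: compare with the original and zero out everything that is not a maximum.
    return [
        [response[r][c] if response[r][c] == dilated[r][c] else 0 for c in range(w)]
        for r in range(h)
    ]


response = [
    [1, 2, 1, 0, 0],
    [2, 9, 3, 0, 0],
    [1, 3, 8, 0, 0],
    [0, 0, 0, 0, 5],
    [0, 0, 0, 0, 0],
]
result = local_maxima(response, 1)
for row in result:
    print(row)
```

In this toy matrix the 8 is suppressed because the 9 lies inside its window, while the isolated 5 survives.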
If you are looking for some built-in functionality, FilterEngine will help you make a custom filter (kernel).
http://docs.opencv.org/modules/imgproc/doc/filtering.html#filterengine
Also, I would recommend some kind of noise reduction, usually blur, before all processing. That is unless you really want the image raw.

Writing robust (color and size invariant) circle detection with OpenCV (based on Hough transform or other features)

I wrote the following very simple python code to find circles in an image:
import cv
import numpy as np
WAITKEY_DELAY_MS = 10
STOP_KEY = 'q'
cv.NamedWindow("image - press 'q' to quit", cv.CV_WINDOW_AUTOSIZE);
cv.NamedWindow("post-process", cv.CV_WINDOW_AUTOSIZE);
key_pressed = False
while key_pressed != STOP_KEY:
    # grab image
    orig = cv.LoadImage('circles3.jpg')
    # create tmp images
    grey_scale = cv.CreateImage(cv.GetSize(orig), 8, 1)
    processed = cv.CreateImage(cv.GetSize(orig), 8, 1)
    cv.Smooth(orig, orig, cv.CV_GAUSSIAN, 3, 3)
    cv.CvtColor(orig, grey_scale, cv.CV_RGB2GRAY)
    # do some processing on the grey scale image
    cv.Erode(grey_scale, processed, None, 10)
    cv.Dilate(processed, processed, None, 10)
    cv.Canny(processed, processed, 5, 70, 3)
    cv.Smooth(processed, processed, cv.CV_GAUSSIAN, 15, 15)
    storage = cv.CreateMat(orig.width, 1, cv.CV_32FC3)
    # these parameters need to be adjusted for every single image
    HIGH = 50
    LOW = 140
    try:
        # extract circles
        cv.HoughCircles(processed, storage, cv.CV_HOUGH_GRADIENT, 2, 32.0, HIGH, LOW)
        for i in range(0, len(np.asarray(storage))):
            print "circle #%d" %i
            Radius = int(np.asarray(storage)[i][0][2])
            x = int(np.asarray(storage)[i][0][0])
            y = int(np.asarray(storage)[i][0][1])
            center = (x, y)
            # green dot on center and red circle around
            cv.Circle(orig, center, 1, cv.CV_RGB(0, 255, 0), -1, 8, 0)
            cv.Circle(orig, center, Radius, cv.CV_RGB(255, 0, 0), 3, 8, 0)
            cv.Circle(processed, center, 1, cv.CV_RGB(0, 255, 0), -1, 8, 0)
            cv.Circle(processed, center, Radius, cv.CV_RGB(255, 0, 0), 3, 8, 0)
    except:
        print "nothing found"
        pass
    # show images
    cv.ShowImage("image - press 'q' to quit", orig)
    cv.ShowImage("post-process", processed)
    cv_key = cv.WaitKey(WAITKEY_DELAY_MS)
    key_pressed = chr(cv_key & 255)
As you can see from the following two examples, the 'circle finding quality' varies quite a lot:
CASE1:
CASE2:
Case1 and Case2 are basically the same image, but still the algorithm detects different circles. If I present the algorithm an image with differently sized circles, the circle detection might even fail completely. This is mostly due to the HIGH and LOW parameters which need to be adjusted individually for each new picture.
Therefore my question: What are the various possibilities of making this algorithm more robust? It should be size and color invariant so that different circles with different colors and in different sizes are detected. Maybe using the Hough transform is not the best way of doing things? Are there better approaches?
The following is based on my experience as a vision researcher. From your question you seem to be interested in possible algorithms and methods rather than only a working piece of code. First I give a quick and dirty Python script for your sample images, and some results are shown to prove it could possibly solve your problem. After getting these out of the way, I try to answer your questions regarding robust detection algorithms.
Quick Results
Some sample images (all the images apart from yours are downloaded from flickr.com and are CC licensed) with the detected circles (without changing/tuning any parameters, exactly the following code is used to extract the circles in all the images):
Code (based on the MSER Blob Detector)
And here is the code:
import cv2
import math
import numpy as np
d_red = cv2.cv.RGB(150, 55, 65)
l_red = cv2.cv.RGB(250, 200, 200)
orig = cv2.imread("c.jpg")
img = orig.copy()
img2 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
detector = cv2.FeatureDetector_create('MSER')
fs = detector.detect(img2)
fs.sort(key = lambda x: -x.size)
def supress(x):
    for f in fs:
        distx = f.pt[0] - x.pt[0]
        disty = f.pt[1] - x.pt[1]
        dist = math.sqrt(distx*distx + disty*disty)
        if (f.size > x.size) and (dist<f.size/2):
            return True
sfs = [x for x in fs if not supress(x)]
for f in sfs:
    cv2.circle(img, (int(f.pt[0]), int(f.pt[1])), int(f.size/2), d_red, 2, cv2.CV_AA)
    cv2.circle(img, (int(f.pt[0]), int(f.pt[1])), int(f.size/2), l_red, 1, cv2.CV_AA)
h, w = orig.shape[:2]
vis = np.zeros((h, w*2+5), np.uint8)
vis = cv2.cvtColor(vis, cv2.COLOR_GRAY2BGR)
vis[:h, :w] = orig
vis[:h, w+5:w*2+5] = img
cv2.imshow("image", vis)
cv2.imwrite("c_o.jpg", vis)
cv2.waitKey()
cv2.destroyAllWindows()
As you can see it's based on the MSER blob detector. The code doesn't preprocess the image apart from the simple mapping into grayscale. Thus missing those faint yellow blobs in your images is expected.
Theory
In short: you don't tell us what you know about the problem apart from giving two sample images with no description of them. Here I explain why, in my humble opinion, it is important to have more information about the problem before asking what the efficient methods to attack it are.
Back to the main question: what is the best method for this problem?
Let's look at this as a search problem. To simplify the discussion assume we are looking for circles with a given size/radius. Thus, the problem boils down to finding the centers. Every pixel is a candidate center, therefore, the search space contains all the pixels.
P = {p1, ..., pn}
P: search space
p1...pn: pixels
To solve this search problem two other functions should be defined:
E(P) : enumerates the search space
V(p) : checks whether the item/pixel has the desirable properties, the items passing the check are added to the output list
Assuming the complexity of the algorithm doesn't matter, an exhaustive or brute-force search can be used, in which E takes every pixel and passes it to V. In real-time applications it's important to reduce the search space and optimize the computational efficiency of V.
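As a toy illustration of this E/V split (not a practical detector), here is a brute-force search in plain Python. Everything in it is an assumption made for the example: the edge map is a set of (x, y) pixels, the radius is known, V accepts a candidate when at least 90% of the sampled ring positions around it are edge pixels, and the function names are invented.

```python
import math


def ring_offsets(radius, step_deg):
    """Integer pixel offsets sampled on a circle of the given radius."""
    return [
        (
            int(math.floor(radius * math.cos(math.radians(d)) + 0.5)),
            int(math.floor(radius * math.sin(math.radians(d)) + 0.5)),
        )
        for d in range(0, 360, step_deg)
    ]


def enumerate_candidates(width, height):
    """E(P): exhaustively enumerate every pixel as a candidate centre."""
    for y in range(height):
        for x in range(width):
            yield (x, y)


def is_circle_center(p, edges, offsets, min_fraction=0.9):
    """V(p): accept p if enough of the ring around it lies on edge pixels."""
    hits = sum((p[0] + dx, p[1] + dy) in edges for dx, dy in offsets)
    return hits >= min_fraction * len(offsets)


# Synthetic edge map: a single circle of radius 5 centred at (10, 10).
edges = {(10 + dx, 10 + dy) for dx, dy in ring_offsets(5, 1)}
offsets = ring_offsets(5, 5)
centers = [p for p in enumerate_candidates(21, 21) if is_circle_center(p, edges, offsets)]
print(centers)
```

The 0.9 threshold inside V is exactly the kind of hand-tuned decision rule discussed next.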
We are getting closer to the main question: how could we define V? To be more precise, which properties of the candidates should be measured, and how should we solve the dichotomy problem of splitting them into desirable and undesirable? The most common approach is to find some properties which can be used to define simple decision rules based on the measurement of those properties. This is what you're doing by trial and error. You're programming a classifier by learning from positive and negative examples. This is because the methods you're using have no idea what you want to do. You have to adjust/tune the parameters of the decision rule and/or preprocess the data such that the variation in the properties (of the desirable candidates) used by the method for the dichotomy problem is reduced. You could use a machine learning algorithm to find the optimal parameter values for a given set of examples. There's a whole host of learning algorithms, from decision trees to genetic programming, that you can use for this problem. You could also use a learning algorithm to find the optimal parameter values for several circle detection algorithms and see which one gives better accuracy. This puts the main burden on the learning algorithm; you just need to collect sample images.
The other approach to improving robustness, which is often overlooked, is to utilize extra readily available information. If you know the color of the circles, then with virtually zero extra effort you could improve the accuracy of the detector significantly. If you knew the position of the circles on the plane and you wanted to detect the imaged circles, you should remember that the transformation between these two sets of positions is described by a 2D homography, and the homography can be estimated using only four points. Then you could improve the robustness to have a rock-solid method. The value of domain-specific knowledge is often underestimated. Look at it this way: in the first approach we try to approximate some decision rules based on a limited number of samples. In the second approach we know the decision rules and only need to find a way to effectively utilize them in an algorithm.
Summary
To summarize, there are two approaches to improve the accuracy / robustness of the solution:
Tool-based: finding an easier to use algorithm / with fewer number of parameters / tweaking the algorithm / automating this process by using machine learning algorithms
Information-based: are you using all the readily available information? In the question you don't mention what you know about the problem.
For the two images you have shared I would use a blob detector, not the HT method. For background subtraction I would suggest trying to estimate the color of the background, since in the two images it does not vary while the color of the circles does, and most of the area is bare.
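One way the background estimate could look, as a sketch only: take the per-channel median as the background colour and flag pixels that are far from it. The function name, the use of the median, and the tolerance value are all assumptions; the median only works because the background is assumed to cover more than half of the image.

```python
import numpy as np

def foreground_mask(image, tol=30.0):
    """Return (mask, background) for an H x W x 3 colour image.

    background: per-channel median colour (assumes the background
                covers most of the image).
    mask:       True where a pixel differs from the background by more
                than `tol` in Euclidean colour distance (hypothetical
                tolerance, to be tuned).
    """
    img = np.asarray(image, dtype=float)
    background = np.median(img.reshape(-1, img.shape[2]), axis=0)
    distance = np.sqrt(((img - background) ** 2).sum(axis=2))
    return distance > tol, background
```

The resulting mask could then be handed to whatever blob detector or connected-component labelling you prefer.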
This is a great modelling problem. I have the following recommendations/ ideas:
Split the image to RGB then process.
Pre-processing.
Dynamic parameter search.
Add constraints.
Be sure about what you are trying to detect.
In more detail:
1: As noted in other answers, converting straight to grayscale discards too much information - any circles with a similar brightness to the background will be lost. Much better to consider the colour channels either in isolation or in a different colour space. There are pretty much two ways to go here: perform HoughCircles on each pre-processed channel in isolation, then combine results, or process the channels, combine them, and then run HoughCircles. In my attempt below, I've tried the second method: splitting to RGB channels, processing, then combining. Be wary of over-saturating the image when combining; I use cv.And to avoid this issue (at this stage my circles are always black rings/discs on a white background).
2: Pre-processing is quite tricky, and something it's often best to play around with. I've made use of AdaptiveThreshold, which is a really powerful convolution method that can enhance edges in an image by thresholding pixels based on their local average (similar processes also occur in the early pathway of the mammalian visual system). This is also useful as it reduces some noise. I've used dilate/erode with only one pass, and I've kept the other parameters how you had them. It seems using Canny before HoughCircles does help a lot with finding 'filled circles', so it's probably best to keep it in. This pre-processing is quite heavy and can lead to false positives with somewhat more 'blobby circles', but in our case this is perhaps desirable?
3: As you've noted HoughCircles parameter param2 (your parameter LOW) needs to be adjusted for each image in order to get an optimal solution, in fact from the docs:
The smaller it is, the more false circles may be detected.
Trouble is, the sweet spot is going to be different for every image. I think the best approach here is to set a condition and search through different param2 values until this condition is met. Your images show non-overlapping circles, and when param2 is too low we typically get loads of overlapping circles. So I suggest searching for the:
maximum number of non-overlapping, and non-contained circles
So we keep calling HoughCircles with different values of param2 until this is met. I do this in my example below, just by incrementing param2 until it reaches the threshold assumption. It would be way faster (and fairly easy to do) to perform a binary search to find when this is met, but you need to be careful with exception handling, as opencv often throws errors for innocent-looking values of param2 (at least on my installation). A different condition that would be very useful to match against would be the number of circles.
4: Are there any more constraints we can add to the model? The more we can tell our model, the easier we make its task of detecting circles. For example, do we know:
The number of circles. - even an upper or lower bound is helpful.
Possible colours of the circles, or of the background, or of 'non-circles'.
Their sizes.
Where they can be in an image.
5: Some of the blobs in your images could only loosely be called circles! Consider the two 'non-circular blobs' in your second image: my code can't find them (good!), but... if I 'photoshop' them so they are more circular, my code can find them... Maybe if you want to detect things that are not circles, a different approach such as Tim Lukins's may be better.
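The binary search mentioned in point 3 could be sketched as follows. This is not what the code at the end does (that one increments param2 step by step). Here `detect` is a hypothetical callback that wraps your HoughCircles call and returns a list of `(x, y, radius)` tuples, and the search assumes that once the circles stop overlapping at some param2 they stay non-overlapping for every higher value, which may not hold on every image.

```python
def circles_overlap(circles):
    """True if any two circles collide or one contains the other."""
    for i, (x1, y1, r1) in enumerate(circles):
        for (x2, y2, r2) in circles[i + 1:]:
            if ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 < r1 + r2:
                return True
    return False

def smallest_clean_param2(detect, lo=1, hi=300):
    """Binary search for the smallest param2 giving no overlaps.

    detect(param2) is a hypothetical wrapper around the detector; the
    search range [lo, hi] is an assumption. Returns None if even `hi`
    still produces overlapping circles.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if circles_overlap(detect(mid)):
            lo = mid + 1   # too permissive: raise the threshold
        else:
            best = mid     # acceptable: try to go lower
            hi = mid - 1
    return best
```

If your installation raises for some param2 values, the wrapper passed as `detect` is the natural place to catch that and fall back to a neighbouring value.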
Problems
By doing heavy pre-processing (AdaptiveThreshold and Canny) there can be a lot of distortion to features in an image, which may lead to false circle detection or incorrect radius reporting. For example, a large solid disc can appear as a ring after processing, so HoughCircles may find the inner ring. Furthermore, even the docs note that:
...usually the function detects the circles’ centers well, however it may fail to find the correct radii.
If you need more accurate radii detection, I suggest the following approach (not implemented):
On the original image, ray-trace from reported centre of circle, in an expanding cross (4 rays: up/down/left/right)
Do this separately in each RGB channel
Combine this info for each channel for each ray in a sensible fashion (i.e. flip, offset, scale, etc. as necessary)
Take the average for the first few pixels on each ray, and use this to detect where a significant deviation on the ray occurs.
These 4 points are estimates of points on the circumference.
Use these four estimates to determine a more accurate radius, and centre position(!).
This could be generalised by using an expanding ring instead of four rays.
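A minimal single-channel sketch of the four-ray idea above, using NumPy only. It leaves out the per-channel combination step, and the function names, the number of reference pixels, and the deviation threshold are assumptions to be tuned. The centre is corrected using the midpoints of the horizontal and vertical chords, and the radius is the mean distance from the corrected centre to the four edge estimates.

```python
import numpy as np

def ray_edge_distance(channel, cx, cy, dx, dy, n_ref=3, thresh=40.0):
    """Walk from (cx, cy) in direction (dx, dy); return the step count at
    which the pixel deviates from the mean of the first n_ref pixels by
    more than thresh, or None if the image border is reached first."""
    h, w = channel.shape
    ref = []
    x, y, step = cx, cy, 0
    while 0 <= x < w and 0 <= y < h:
        value = float(channel[y, x])
        if step < n_ref:
            ref.append(value)
        elif abs(value - np.mean(ref)) > thresh:
            return step
        x, y, step = x + dx, y + dy, step + 1
    return None

def refine_circle(channel, cx, cy):
    """Return a refined (cx, cy, radius), or None if a ray finds no edge."""
    right = ray_edge_distance(channel, cx, cy, 1, 0)
    left = ray_edge_distance(channel, cx, cy, -1, 0)
    down = ray_edge_distance(channel, cx, cy, 0, 1)
    up = ray_edge_distance(channel, cx, cy, 0, -1)
    if None in (right, left, down, up):
        return None
    # The chord midpoints give the corrected centre coordinates.
    new_cx = cx + (right - left) / 2.0
    new_cy = cy + (down - up) / 2.0
    points = [(cx + right, cy), (cx - left, cy), (cx, cy + down), (cx, cy - up)]
    radius = np.mean([np.hypot(px - new_cx, py - new_cy) for px, py in points])
    return new_cx, new_cy, float(radius)
```

The starting point must lie inside the disc, at least `n_ref` pixels from its edge, for the reference average to make sense.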
Results
The code at the end does pretty well quite a lot of the time; these examples were done with the code as shown:
Detects all circles in your first image:
How the pre-processed image looks before canny filter is applied (different colour circles are highly visible):
Detects all but two (blobs) in second image:
Altered second image (blobs are circle-afied, and large oval made more circular, thus improving detection), all detected:
Does pretty well in detecting centres in this Kandinsky painting (I cannot find concentric rings due to the boundary condition).
Code:
import cv
import numpy as np
output = cv.LoadImage('case1.jpg')
orig = cv.LoadImage('case1.jpg')
# create tmp images
rrr=cv.CreateImage((orig.width,orig.height), cv.IPL_DEPTH_8U, 1)
ggg=cv.CreateImage((orig.width,orig.height), cv.IPL_DEPTH_8U, 1)
bbb=cv.CreateImage((orig.width,orig.height), cv.IPL_DEPTH_8U, 1)
processed = cv.CreateImage((orig.width,orig.height), cv.IPL_DEPTH_8U, 1)
storage = cv.CreateMat(orig.width, 1, cv.CV_32FC3)
def channel_processing(channel):
    cv.AdaptiveThreshold(channel, channel, 255, adaptive_method=cv.CV_ADAPTIVE_THRESH_MEAN_C, thresholdType=cv.CV_THRESH_BINARY, blockSize=55, param1=7)
    #mop up the dirt
    cv.Dilate(channel, channel, None, 1)
    cv.Erode(channel, channel, None, 1)
def inter_centre_distance(x1,y1,x2,y2):
    return ((x1-x2)**2 + (y1-y2)**2)**0.5
def colliding_circles(circles):
    for index1, circle1 in enumerate(circles):
        for circle2 in circles[index1+1:]:
            x1, y1, Radius1 = circle1[0]
            x2, y2, Radius2 = circle2[0]
            #collision or containment:
            if inter_centre_distance(x1,y1,x2,y2) < Radius1 + Radius2:
                return True
    return False
def find_circles(processed, storage, LOW):
    try:
        cv.HoughCircles(processed, storage, cv.CV_HOUGH_GRADIENT, 2, 32.0, 30, LOW)#, 0, 100) great to add circle constraint sizes.
    except:
        #HoughCircles can throw for innocent looking values of param2, so retry with the next one
        LOW += 1
        print 'try'
        return find_circles(processed, storage, LOW)
    circles = np.asarray(storage)
    print 'number of circles:', len(circles)
    #raise param2 until no circles collide
    if colliding_circles(circles):
        LOW += 1
        storage = find_circles(processed, storage, LOW)
    print 'c', LOW
    return storage
def draw_circles(storage, output):
    circles = np.asarray(storage)
    print len(circles), 'circles found'
    for circle in circles:
        Radius, x, y = int(circle[0][2]), int(circle[0][0]), int(circle[0][1])
        #centre dot in green, circumference in red
        cv.Circle(output, (x, y), 1, cv.CV_RGB(0, 255, 0), -1, 8, 0)
        cv.Circle(output, (x, y), Radius, cv.CV_RGB(255, 0, 0), 3, 8, 0)
#split image into RGB components
cv.Split(orig,rrr,ggg,bbb,None)
#process each component
channel_processing(rrr)
channel_processing(ggg)
channel_processing(bbb)
#combine images using logical 'And' to avoid saturation
cv.And(rrr, ggg, rrr)
cv.And(rrr, bbb, processed)
cv.ShowImage('before canny', processed)
# cv.SaveImage('case3_processed.jpg',processed)
#use canny, as HoughCircles seems to prefer ring like circles to filled ones.
cv.Canny(processed, processed, 5, 70, 3)
#smooth to reduce noise a bit more
cv.Smooth(processed, processed, cv.CV_GAUSSIAN, 7, 7)
cv.ShowImage('processed', processed)
#find circles, with parameter search
storage = find_circles(processed, storage, 100)
draw_circles(storage, output)
# show images
cv.ShowImage("original with circles", output)
cv.SaveImage('case1.jpg',output)
cv.WaitKey(0)
Ah, yes… the old colour/size invariants for circles problem (AKA the Hough transform is too specific and not robust)...
In the past I have relied much more on the structural and shape analysis functions of OpenCV instead. You can get a very good idea of what is possible from the "samples" folder, particularly fitellipse.py and squares.py.
For your elucidation, I present a hybrid version of these examples, based on your original source. The contours detected are in green and the fitted ellipses in red.
It's not quite there yet:
The pre-processing steps need a bit of tweaking to detect the more faint circles.
You could test the contour further to determine if it is a circle or not...
Good luck!
import cv
import numpy as np
# grab image
orig = cv.LoadImage('circles3.jpg')
# create tmp images
grey_scale = cv.CreateImage(cv.GetSize(orig), 8, 1)
processed = cv.CreateImage(cv.GetSize(orig), 8, 1)
cv.Smooth(orig, orig, cv.CV_GAUSSIAN, 3, 3)
cv.CvtColor(orig, grey_scale, cv.CV_RGB2GRAY)
# do some processing on the grey scale image
cv.Erode(grey_scale, processed, None, 10)
cv.Dilate(processed, processed, None, 10)
cv.Canny(processed, processed, 5, 70, 3)
cv.Smooth(processed, processed, cv.CV_GAUSSIAN, 15, 15)
#storage = cv.CreateMat(orig.width, 1, cv.CV_32FC3)
storage = cv.CreateMemStorage(0)
contours = cv.FindContours(processed, storage, cv.CV_RETR_EXTERNAL)
# N.B. 'processed' image is modified by this!
#contours = cv.ApproxPoly (contours, storage, cv.CV_POLY_APPROX_DP, 3, 1)
# If you wanted to reduce the number of points...
cv.DrawContours (orig, contours, cv.RGB(0,255,0), cv.RGB(255,0,0), 2, 3, cv.CV_AA, (0, 0))
def contour_iterator(contour):
    while contour:
        yield contour
        contour = contour.h_next()
for c in contour_iterator(contours):
    # Number of points must be more than or equal to 6 for cv.FitEllipse2
    if len(c) >= 6:
        # Copy the contour into an array of (x,y)s
        PointArray2D32f = cv.CreateMat(1, len(c), cv.CV_32FC2)
        for (i, (x, y)) in enumerate(c):
            PointArray2D32f[0, i] = (x, y)
        # Fits ellipse to current contour.
        (center, size, angle) = cv.FitEllipse2(PointArray2D32f)
        # Convert ellipse data from float to integer representation.
        center = (cv.Round(center[0]), cv.Round(center[1]))
        size = (cv.Round(size[0] * 0.5), cv.Round(size[1] * 0.5))
        # Draw ellipse
        cv.Ellipse(orig, center, size, angle, 0, 360, cv.RGB(255,0,0), 2,cv.CV_AA, 0)
# show images
cv.ShowImage("image - press 'q' to quit", orig)
#cv.ShowImage("post-process", processed)
cv.WaitKey(-1)
EDIT:
Just an update to say that I believe a major theme to all these answers is that there are a host of further assumptions and constraints that can be applied to what you seek to recognise as circular. My own answer makes no pretences at this - neither in the low-level pre-processing or the high-level geometric fitting. The fact that many of the circles are not really that round due to the way they are drawn or the non-affine/projective transforms of the image, and with the other properties in how they are rendered/captured (colour, noise, lighting, edge thickness) - all result in any number of possible candidate circles within just one image.
There are much more sophisticated techniques, but they will cost you. Personally I like @fraxel's idea of using the adaptive threshold: it is fast, reliable and reasonably robust. You can then further test the final contours (e.g. using Hu moments) or the fittings with a simple ratio test of the ellipse axes - e.g. if ((min(size)/max(size))>0.7).
As ever with Computer Vision there is the tension between pragmatism, principle, and parsimony. As I am fond of telling people who think that CV is easy, it is not - it is in fact famously an AI-complete problem. The best you can often hope for outside of this is something that works most of the time.
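The ratio test can be wrapped in a small helper, sketched here in plain Python. The name `looks_circular` is hypothetical; `size` is the pair of axis lengths as returned by the ellipse fit, and 0.7 is just the example cut-off from above, not a tuned value.

```python
def looks_circular(size, min_ratio=0.7):
    """True if the fitted ellipse is close enough to a circle.

    size: (axis_a, axis_b) of the fitted ellipse, in any order.
    min_ratio: minimum minor/major axis ratio (0.7 is only an example).
    """
    minor, major = min(size), max(size)
    if major <= 0:
        return False  # degenerate fit
    return (minor / float(major)) > min_ratio
```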
Looking through your code, I noticed the following:
Greyscale conversion. I understand why you're doing it, but realize that you're throwing
away information there. As you see in the "post-process" images, your yellow circles are
the same intensity as the background, just in a different color.
Edge detection after noise removal (erode/dilate). This shouldn't be necessary; Canny ought to take care of this.
Canny edge detection. Your "open" circles have two edges, an inner and outer edge. Since they're fairly close, the Canny gauss filter might add them together. If it doesn't, you'll have two edges close together. I.e. before Canny, you have open and filled circles. Afterwards, you have 0/2 and 1 edge, respectively. Since Hough calls Canny again, in the first case the two edges might be smoothed together (depending on the initial width), which is why the core Hough algorithm can treat open and filled circles the same.
So, my first recommendation would be to change the grayscale mapping. Don't use intensity, but use hue/saturation/value. Also, use a differential approach - you're looking for edges. So, compute a HSV transform, smooth a copy, and then take the difference between the original and smoothed copy. This will get you dH, dS, dV values (local variation in Hue, Saturation, Value) for each point. Square and add to get a one-dimensional image, with peaks near all edges (inner and outer).
My second recommendation would be local normalization, but I'm not sure if that's even necessary. The idea is that you don't care particularly much about the exact value of the edge signal you got out, it should really be binary anyway (edge or not). Therefore, you can normalize each value by dividing by a local average (where local is in the order of magnitude of your edge size).
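Both recommendations can be sketched with NumPy alone. This assumes the image has already been converted to a float HSV array of shape (H, W, 3); the function names and window sizes are hypothetical, a simple box filter stands in for whatever smoothing you prefer, and the wrap-around of hue is ignored for brevity.

```python
import numpy as np

def box_blur(img, k):
    """Mean filter with a (2k+1) x (2k+1) window, edges replicated."""
    extra = ((0, 0),) * (img.ndim - 2)
    padded = np.pad(img, ((k, k), (k, k)) + extra, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    ii = np.pad(ii, ((1, 0), (1, 0)) + extra, mode="constant")
    n = 2 * k + 1
    h, w = img.shape[:2]
    total = ii[n:n + h, n:n + w] - ii[:h, n:n + w] - ii[n:n + h, :w] + ii[:h, :w]
    return total / float(n * n)

def edge_energy(hsv, k=2):
    """dH^2 + dS^2 + dV^2, where d is original minus smoothed copy."""
    hsv = np.asarray(hsv, dtype=float)
    diff = hsv - box_blur(hsv, k)
    return (diff ** 2).sum(axis=2)

def local_normalize(energy, k=8, eps=1e-6):
    """Divide each value by its local average (eps avoids division by 0)."""
    return energy / (box_blur(energy, k) + eps)
```

The normalised map can then be thresholded to get the binary edge image that the later stages expect.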
The Hough transform uses a "model" to find certain features in a (typically) edge-detected image, as you may know. In the case of HoughCircles that model is a perfect circle. This means there probably doesn't exist a combination of parameters that will make it detect the more erratically and ellipse shaped circles in your picture without increasing the number of false positives. On the other hand, due to the underlying voting mechanism, a non-closed perfect circle or a perfect circle with a "dent" might consistently show up. So depending on your expected output you may or may not want to use this method.
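To illustrate the voting mechanism, here is a toy accumulator for circles of one known radius, written with NumPy. It is a sketch of the general idea, not how OpenCV's gradient-based HoughCircles is implemented; the function name and the number of sampled angles are assumptions.

```python
import numpy as np

def hough_centres_fixed_radius(edges, r, n_angles=90):
    """Accumulate centre votes for circles of radius r.

    edges: 2D boolean array, True at edge pixels.
    Every edge pixel votes for all centres lying at distance r from it;
    a true centre collects votes from every edge pixel on its circle.
    """
    h, w = edges.shape
    acc = np.zeros((h, w), dtype=int)
    ys, xs = np.nonzero(edges)
    for theta in np.linspace(0, 2 * np.pi, n_angles, endpoint=False):
        cx = np.round(xs - r * np.cos(theta)).astype(int)
        cy = np.round(ys - r * np.sin(theta)).astype(int)
        ok = (cx >= 0) & (cx < w) & (cy >= 0) & (cy < h)
        # np.add.at handles repeated indices correctly.
        np.add.at(acc, (cy[ok], cx[ok]), 1)
    return acc
```

Because votes simply add up, an arc covering three quarters of a circle still produces a clear peak near the true centre, while a distorted or elliptical outline spreads its votes over many cells.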
That said, there are a few things I see which might help you on your way with this function:
HoughCircles calls Canny internally, so I guess you can leave that call out.
param1 (which you call HIGH) is typically initialised around a value of 200. It is used as a parameter to the internal call to Canny: cv.Canny(processed, cannied, HIGH, HIGH/2). It might help to run Canny yourself like this to see how setting HIGH affects the image being worked with by the Hough transform.
param2 (which you call LOW) is typically initialised around a value of 100. It is the voting threshold for the Hough transform's accumulators. Setting it higher means more false negatives; setting it lower means more false positives. I believe this is the first one you want to start fiddling around with.
Ref: http://docs.opencv.org/3.0-beta/modules/imgproc/doc/feature_detection.html#houghcircles
Update re: filled circles: After you've found the circle shapes with the Hough transform you can test if they are filled by sampling the boundary colour and comparing it to one or more points inside the supposed circle. Alternatively you can compare one or more points inside the supposed circle to a given background colour. The circle is filled if the former comparison succeeds, or in the case of the alternative comparison if it fails.
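The first variant of that test (boundary colour versus interior colour) might look like the sketch below. The function name, the sampling radii, the number of samples, and the tolerance are all assumptions, and it expects a colour image as an (H, W, 3) array plus a circle whose reported radius lands on the drawn outline.

```python
import numpy as np

def is_filled(image, cx, cy, r, tol=40.0, n_samples=16):
    """Compare the mean colour on the boundary with the mean colour
    halfway to the centre; similar colours suggest a filled circle."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape[:2]
    angles = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)

    def mean_colour(radius):
        # Sample points on a ring of the given radius, clipped to the image.
        xs = np.clip(np.round(cx + radius * np.cos(angles)).astype(int), 0, w - 1)
        ys = np.clip(np.round(cy + radius * np.sin(angles)).astype(int), 0, h - 1)
        return img[ys, xs].mean(axis=0)

    difference = np.atleast_1d(mean_colour(r) - mean_colour(0.5 * r))
    return bool(np.linalg.norm(difference) < tol)
```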
Ok looking at the images. I suggest using **Active Contours**
Active Contours
The good thing about active contours is that they fit almost perfectly to any given shape, be it squares or triangles, and in your case they are the perfect candidates.
If you are able to extract the centres of the circles, that is great. Active contours always need a starting point from which they can either grow or shrink to fit. The starting points need not coincide exactly with the centres; a little offset will still be OK.
And in your case, if you let the contours grow from the centre outwards, they will come to rest at the circle boundaries.
Note that active contours that grow or shrink use balloon energy which means you can set the direction of contours, inwards or outwards.
You would probably need to use the gradient image in grey scale. But still you can try in colour as well. If it works!
And if you do not provide centres, throw in lots of active contours and make them grow/shrink. Contours that settle down are kept; unsettled ones are thrown away. This is a brute-force approach. It will be CPU intensive, and it will require more careful work to make sure you keep the correct contours and throw out the bad ones.
I hope this way you can solve the problem.