I'm new to OpenCV and was wondering if anybody could direct me to the most suitable algorithm(s) to tackle the challenge of identifying the locations of circles and crosses in images that look like the following . .
Sometimes there are lines connecting . .
They might even be hand drawn like this one . .
So far I have looked at the template matching example, but it is probably not the correct approach, and it doesn't scale the sizes of the templates to the images.
So given the following observations . . .
The crosses and circles may overlap.
If the diagram is in colour,
the colours will be identical for crosses and identical for circles.
Sometimes they will be joined by lines, sometimes not.
There may be other shape symbols in the plots.
The symbols will be of similar size and shape, but might not be computer-generated, so they will not necessarily be identical.
Where should I begin my adventure?
Not an easy task.
For the colored case, you should start by separating the color planes. There is some chance that you can get the markers apart.
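If the markers really do come in two fixed colors, a minimal sketch of that color-separation idea might look like this (the hue/saturation ranges and file name are assumptions to tune, not from the question):

```python
import cv2

# Hypothetical sketch: isolate crosses and circles by color in HSV space.
# The ranges below are placeholders and must be tuned to the actual plot colors.
img = cv2.imread("plot.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# e.g. red crosses (red wraps around the hue axis, hence two ranges)
cross_mask = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255)) | \
             cv2.inRange(hsv, (170, 80, 80), (180, 255, 255))

# e.g. blue circles
circle_mask = cv2.inRange(hsv, (100, 80, 80), (130, 255, 255))
```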
But for the b&w case, there is no escape, you must go deeper.
I would try to detect the grid lines first, for example using a Hough line detector, as accurately as possible. Then erase those lines.
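A minimal sketch of that grid-removal step, assuming dark grid lines on a light background (all thresholds and the file name are illustrative):

```python
import cv2
import numpy as np

# Detect long grid lines with the probabilistic Hough transform, then
# over-paint them with the background color to erase them.
gray = cv2.imread("plot.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(gray, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=100, maxLineGap=5)

cleaned = gray.copy()
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        # erase each detected grid line by painting it white
        cv2.line(cleaned, (x1, y1), (x2, y2), 255, 3)
```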
Then try to find the crosses, which are short oblique line segments (most of the time broken by the previous operations).
The circles might be detected by a Hough circle detector, using a small range of radii.
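And a sketch of the circle detection with a constrained radius range; the 5-15 px radii and the other parameters are assumptions to tune per plot:

```python
import cv2
import numpy as np

# HoughCircles restricted to a narrow radius range for the circle markers.
gray = cv2.imread("plot_without_grid.png", cv2.IMREAD_GRAYSCALE)
blurred = cv2.medianBlur(gray, 5)
circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                           param1=100, param2=20, minRadius=5, maxRadius=15)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        print("circle at", (x, y), "radius", r)
```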
Alternatively, a ridge or edge detector can be used to get short segments and short curved arcs. You may have to add some filtering criteria to avoid the joining lines.
As I said, it's not an easy task.
One possible way could be machine learning. I am thinking of a cascade classifier (a.k.a. the Viola-Jones method), which is quite good for object detection. It's fairly easy to implement with OpenCV, but you need to understand how it works and to gather a large number of samples.
You can try a couple of ideas:
1) An FFT can help to remove the grid (see the sketch after this list).
2) The markers (crosses and circles) are objects whose gradient angles differ from the right angles of the grid, which may help to localize them.
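A rough sketch of idea 1: a regular grid puts strong peaks along the two frequency axes of the Fourier spectrum, so suppressing narrow bands there (while keeping the low-frequency center) attenuates the grid. The band width and file name are arbitrary choices here:

```python
import cv2
import numpy as np

# Suppress the frequency bands occupied by horizontal/vertical grid lines,
# then transform back. The 2-pixel band width is an arbitrary choice.
gray = cv2.imread("plot.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
spectrum = np.fft.fftshift(np.fft.fft2(gray))
rows, cols = gray.shape
cy, cx = rows // 2, cols // 2

mask = np.ones((rows, cols), np.float32)
mask[cy - 2:cy + 3, :] = 0              # band carrying vertical grid lines
mask[:, cx - 2:cx + 3] = 0              # band carrying horizontal grid lines
mask[cy - 2:cy + 3, cx - 2:cx + 3] = 1  # keep the DC / low-frequency center

degridded = np.abs(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))
```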
Related
I have an application where I have to detect the presence of some items in a scene. The items can be rotated and a little scaled (bigger or smaller). I've tried using keypoint detectors but they're not fast and accurate enough. So I've decided to first detect edges in the template and the search area, using Canny ( or a faster edge detection algo ), and then match the edges to find the position, orientation, and size of the match found.
All this needs to be done in less than a second.
I've tried using matchTemplate() and matchShapes(), but the former is NOT scale and rotation invariant, and the latter doesn't work well with the actual images. Rotating the template image in order to match is also time-consuming.
So far I have been able to detect the edges of the template but I don't know how to match them with the scene.
I've already gone through the following, but wasn't able to get them to work (they either use an old version of OpenCV, or just don't work with images other than those in the demo):
https://www.codeproject.com/Articles/99457/Edge-Based-Template-Matching
Angle and Scale Invariant template matching using OpenCV
https://answers.opencv.org/question/69738/object-detection-kinect-depth-images/
Can someone please suggest an approach for this? Or a code snippet, if possible?
This is my sample input image (the parts to detect are marked in red).
These are some software packages that do this, and also show how I want it to work:
This topic is what I have actually been dealing with for a year on a project, so I will try to explain my approach and how I do it. I assume that you have already done the preprocessing steps (filters, brightness, exposure, calibration, etc.). And be sure you clean the noise from the image.
Note: In my approach, I collect data from the contours of a reference image, which contains my desired object. Then I compare these data with the other contours in the big image.
Use Canny edge detection and find the contours on the reference image. You need to be sure that it doesn't miss any parts of the contours; if it does, the preprocessing probably has some problems. The other important point is that you need to find an appropriate mode for findContours, because every mode has different properties, so you need to find the one appropriate for your case. At the end, eliminate the unwanted contours and keep only the ones that are okay for you.
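A minimal sketch of this step (OpenCV 4.x return signature; thresholds and file name are illustrative):

```python
import cv2

# Canny edges + findContours on the reference image; the retrieval mode
# (RETR_EXTERNAL / RETR_LIST / RETR_TREE) is the parameter to experiment with.
ref = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(ref, 50, 150)
contours, hierarchy = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
# keep only contours that are long enough to matter (cutoff is illustrative)
contours = [c for c in contours if cv2.arcLength(c, False) > 50]
```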
After getting the contours from the reference, you can find the length of every contour using the output array of findContours(). You can compare these values with the contours in your big image and eliminate the contours that differ too much.
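A hedged sketch of that length comparison (the 30% tolerance is an assumption):

```python
import cv2

def filter_by_length(reference_contour, scene_contours, tol=0.3):
    """Keep scene contours whose perimeter is within `tol` of the reference's."""
    ref_len = cv2.arcLength(reference_contour, True)
    return [c for c in scene_contours
            if abs(cv2.arcLength(c, True) - ref_len) / ref_len < tol]
```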
minAreaRect fits a minimal enclosing rotated rectangle to each contour. In my case, this function is very good to use. I get two parameters using this function (see the sketch after these two items):
a) Calculate the short and long edges of the fitted rectangle and compare the values with those of the other contours in the big image.
b) Calculate the percentage of blackness or whiteness (if your image is grayscale, get the percentage of pixels close to white or black) and compare at the end.
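A sketch of the two measures in this step, assuming a grayscale scene image (the 128 darkness cutoff is arbitrary):

```python
import cv2
import numpy as np

def rect_features(contour, gray_image):
    """Short/long edge of the fitted rotated rectangle plus the fraction of
    near-black pixels inside the contour (both measures are illustrative)."""
    (cx, cy), (w, h), angle = cv2.minAreaRect(contour)
    short_edge, long_edge = sorted((w, h))
    mask = np.zeros(gray_image.shape, np.uint8)
    cv2.drawContours(mask, [contour], -1, 255, -1)   # fill the contour
    darkness = float(np.mean(gray_image[mask == 255] < 128))
    return short_edge, long_edge, darkness
```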
matchShapes can be applied at the end to the remaining contours, or you can apply it to all contours (I suggest the first approach). Each contour is just an array, so you can hold the reference contours in an array and compare them with the others at the end. Doing the three steps above and then applying matchShapes works very well for me.
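A possible sketch of that final filtering with cv2.matchShapes (the cutoff is an assumption; lower scores mean more similar shapes):

```python
import cv2

def filter_by_shape(reference_contour, candidates, max_score=0.1):
    """Keep contours whose Hu-moment shape distance to the reference is small."""
    return [c for c in candidates
            if cv2.matchShapes(reference_contour, c,
                               cv2.CONTOURS_MATCH_I1, 0.0) < max_score]
```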
I don't think matchTemplate is good to use directly. I draw every contour onto a separate blank black image (a zero Mat) as a template image and then compare it with the others. Using the reference template image directly doesn't give good results.
OpenCV has some good algorithms for finding circles, convexity, etc. If your situation relates to them, you can also use them as a step.
At the end, you just gather all the data and values, and you can make a table in your mind. The rest is a kind of statistical analysis.
Note: I think the most important part is the preprocessing. So be sure that you have a clean, almost noiseless image and reference.
Note: Training can be a good solution for your case if you just want to know whether the objects exist or not. But if you are trying to do something for an industrial application, this is totally the wrong way. I tried the YOLO and Haar-cascade training algorithms several times and also trained some objects with them. The experience I got is this: they can find objects almost correctly, but the center coordinates, rotation results, etc. will not be totally correct even if your calibration is correct. On the other hand, training time and collecting data are painful.
You have rather bad image quality and very bad lighting conditions, so you have only two ways:
1. Use filters -> binary threshold -> findContours -> matchShapes. But this is a very unstable algorithm for your object type and image quality. You will get a lot of wrong contours, and it's hard to filter them.
2. Haar cascades -> cut the bounding box -> check the shape inside.
All "special points / edge matching" algorithms will not work in such bad conditions.
I'm using the ORB algorithm to detect and get the coordinates of the crossing of the rope shown in the image, which is represented by the red dot. I want to detect the coordinates of the four points surrounding the crossing, represented by the blue dots. All four points are at the same distance from the red spot.
Any idea how to get their coordinates by making use of the red spot's coordinates?
Thank you in advance.
Although you're using ORB, you're still going to need an algorithm to segment the rope from the background, or at least some technique to identify image chunks that belong to the rope and that are equidistant from the red dot. There are a number of options to explore.
It's important to consider your lighting & imaging as separate problems to be solved if this is meant to be a real-world application. This looks a bit like a problem for a class rather than for an application you'll sell and support, but you should still consider lighting:
Will your algorithm(s) still work when light level is reduced?
How will detection be affected by changes in camera pose relative to the surface where the rope will be located?
If you'll be detecting "black" rope, will the algorithm also be required to detect rope of different colors? dirty rope? rope on different backgrounds?
Since your object of interest is rope, you have to consider a class of algorithms suitable for detection of non-rigid objects. Always consider the simplest solution first!
Connected Components
Connected components labeling is a traditional image processing algorithm and still suitable as the starting point for many applications. The last I knew, this was implemented in OpenCV as findContours(). This can also be called "blob finding" or some variant thereof.
https://en.wikipedia.org/wiki/Connected-component_labeling
https://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html?highlight=findcontours
Depending on lighting, you may have to take different steps to binarize the image before running connected components. As a start, convert the color image to grayscale, which will simplify the task significantly.
Try a manual threshold since you can quickly test a number of values to see the effect. Don't be too discouraged if the binarization isn't quite right--this can often be fixed with preprocessing.
If a range of manual thresholds works (e.g. 52 - 76 in an 8-bit grayscale range), then use an algorithm that will automatically calculate the threshold for you: Otsu, entropy-based methods, etc., will all offer comparable performance. Whichever technique works best, the code/algorithm can be tweaked further to optimize for your rope application.
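A minimal sketch of that binarization plus the connected-components idea discussed above, assuming the dark rope sits on a lighter background (the file name is a placeholder):

```python
import cv2
import numpy as np

# Binarize with Otsu (THRESH_BINARY_INV makes the dark rope the white
# foreground), then label connected components and keep the largest one,
# which should be the rope.
gray = cv2.imread("rope.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

num, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # skip label 0 (background)
rope_mask = np.uint8(labels == largest) * 255
```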
If thresholding and binarization don't work--which for your rope application seems unlikely, at least how you've presented it--then switch to thinking in terms of gradient-based (edge-based, energy-based) techniques.
But assuming you can separate the rope from the background, you're still going to need a method to start at the red dot [within the rope] and move equal distances out to the blue points. More about that later after a discussion of other rope segmentation methods.
Note: connected components labeling can work in scenarios beyond just binarizing black & white images. If you can create a texture field or some other 2D representation of the image that makes it possible to distinguish the black rope from the relatively light background, you may be able to use a connected components algorithm. (Finding a "more complicated" or "more modern" algorithm isn't necessarily going to be the right approach.)
In a binarized image, blobs can be nested: on a white background you can have several black blobs, inside of one or more of which are white blobs, inside of which are black blobs, etc. An earlier version of OpenCV handled this reasonably well. (OpenCV is a nice starting point, and a touchpoint for many, but for a number of reasons it doesn't always compare favorably to other open source and commercial packages; popularity notwithstanding, OpenCV has some issues.)
Once you have a "blob" (a 4-connected region of pixels) in a 2D digital image, you can treat the blob as an object, at which point you have a number of options:
Edge tracing: trace around the inside and outside edges of the blob. From what I recall, OpenCV does (or at least should) have some relatively straightforward method to get the edges.
Split the blob into component blobs, each of which can be treated separately
Convert the blob to a polygon
...
A connected components algorithm should be high on the list of techniques to try if you have a non-rigid object.
Boolean Operations
Once you have the rope as a connected component (and possibly even without this), you can use boolean image operations to find the spots at the blue dots in your image:
Create a circular region in data, or even in the image
Find the intersection of the circle (an annulus) and the black region representing the rope. Using your original image, you should have four regions.
Find the center point of the intersection regions.
You could even try this without using connected components at all, but using connected components as part of the solution could make it more robust.
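A hedged sketch of that boolean intersection, reusing a binary rope mask from the segmentation step; the red-dot coordinates and the distance D are hypothetical placeholders:

```python
import cv2
import numpy as np

rope_mask = cv2.imread("rope_mask.png", cv2.IMREAD_GRAYSCALE)  # 255 = rope
red_dot = (240, 180)   # (x, y): hypothetical crossing location
D = 60                 # hypothetical distance to the blue points, in pixels

# thin ring of radius D around the red dot
ring = np.zeros(rope_mask.shape, np.uint8)
cv2.circle(ring, red_dot, D, 255, 2)

# intersect ring and rope, then take the centroid of each resulting piece
pieces = cv2.bitwise_and(rope_mask, ring)
n, labels, stats, centroids = cv2.connectedComponentsWithStats(pieces)
blue_points = [tuple(np.round(c).astype(int)) for c in centroids[1:]]  # skip background
```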
Polygon Simplification
If you have a blob, which in your application would be a connected set of black pixels representing the rope on the floor, then you can consider converting this blob to one or more polygons for further processing. There are advantages to working with polygons.
If you consider only the outside boundary of the rope, then you can see that the set of pixels defining the boundary represents a polygon. It's a polygon with a lot of points, and not a convex polygon, but a polygon nonetheless.
To simplify the polygon, you can use an algorithm such as Ramer-Douglas-Peucker:
https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm
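In OpenCV, approxPolyDP implements this simplification; a small sketch (taking the tolerance as 1% of the perimeter is an assumption):

```python
import cv2

def simplify_contour(contour, frac=0.01):
    """Ramer-Douglas-Peucker via cv2.approxPolyDP, with the tolerance taken
    as a fraction of the contour perimeter (tune per application)."""
    eps = frac * cv2.arcLength(contour, True)
    return cv2.approxPolyDP(contour, eps, True)
```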
Once you have a simplified polygon, you can try a few techniques to render useful data from the polygon
Angle Bisector Network
Triangulation (e.g. using ear clipping)
Triangulation is typically dependent on initial conditions, so the resulting triangulation can differ for slightly different polygons (that is, rope -> blob -> polygon -> simplified polygon). So in your application it might be useful to triangulate the dark rope region, and then to connect the center of one triangle to the center of the next nearest triangle. You'll also have to deal with crossings, such as the rope overlap. Ultimately this can yield a "skeletonization" of the rope. Speaking of which...
Skeletonization
If the rope problem was posed to you as a class exercise, then it may have been a prompt to try skeletonization. You can read about it here:
https://en.wikipedia.org/wiki/Topological_skeleton
Skeletonization and thinning have their own problems to solve, but you should dig into them a bit and see those problems for yourself.
The Medial Axis Transform (MAT) is a related concept. Long story there.
Edge-based techniques
There are a number of techniques to generate "edge images" based on edge strength, energy, entropy, etc. Making them robust takes a little effort. If you've had academic training in image processing you've likely heard of Harris, Sobel, Canny, and similar processing methods--none are magic bullets, but they're simple and dependable and will yield data you need.
An "edge image" consists of pixels representing the image gradient strength [and sometimes the gradient direction]. People may call this edge image something else, but it's the concept that matters.
What you then do with the edge data is another subject altogether. But one reason to think of edge images (or at least object borders) is that it reduces the amount of information your algorithm(s) will need to process.
Mean Shift (and related)
To get back to segmentation mentioned in the section on connected components, there are other methods for segmenting figures from a background: K-means, mean shift, and so on. You probably won't need any of those, but they're neat and worth studying.
Stroke Width Transform
This is an intriguing technique used to extract text from noisy backgrounds. Although it's intended for OCR, it could work for rope since the rope width is relatively constant, the rope shape varies, there are crossings, etc.
In short, and simplifying quite a bit, you can think of SWT as a means to find "strokes" (thick lines) by finding gradients antiparallel to each other. On either side of a stroke (or line), the edge gradient points normal to the object edge. The normal on one side of the stroke points opposite the direction of the normal on the other side of the stroke. By filtering for pixel-gradient pairs within a certain distance of each other, you can isolate certain strokes--even automatically. For your example the collection of points representing edge pairs for the rope would be much more common than other point pairs.
Non-Rigid Matching
There are techniques for matching non-rigid shapes, but they would not be worth exploring. If any of the techniques I mentioned above is unfamiliar to you, explore some of those first before you try any fancier algorithms.
CNNs, machine learning, etc.
Just don't even think of these methods as a starting point.
Other Considerations
If this were an application for industry, security, or whatnot, you'd have to determine how well your image processing worked under all environmental considerations. That's not an easy task, and can make all the difference between a setup that "works" in the lab and a setup that actually works in practice.
I hope that's of some help. Feel free to post a reply if I've confused more than helped, or if you want to explore some idea in more detail. Though I tried to touch on some common(ish) techniques, I didn't mention all the different ways of addressing this problem.
And briefly: once you have a skeleton, point network, or whatever representing a reduced data set for the rope and the red dot (the identified feature), a few techniques to find the items at the blue dots:
For a skeleton, trace along each "branch" of the rope outward from the red dot until the geodesic distance or straight-line 2D distance is the distance D that you want.
To use geometry, create a circle of width 1 - 2 pixels. Find the intersection of that circle and the rope. Find the center point of the intersections of circle and rope. (Also described above.)
Good luck!
I'm trying to track little white dots on the edges of a table. In most cases it works. I'm using the cornerHarris function as it's used in this tutorial: http://docs.opencv.org/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html .
Sometimes I have a problem: reflections of the light on the edges create points of interest that I have to ignore.
For example:
I'm searching for the two points nearest to the top corners. As you can see, on the right edge I have found the dots (red and green dots), while on the left edge, light noise is a problem (cyan and blue dots).
Does someone know a method to keep only the white dots in my picture? Thank you, and sorry for my English.
On the purely image processing side, I would recommend using some kind of shape feature analysis, like comparing the histogram in, say, an 8x8 window around your currently examined point of interest to precomputed histograms of the features you want.
This would mean that you first look for points with Harris corners, then compare the features to dismiss unwanted ones (Euclidean distance in the 8x8 = 64-D space?). This of course assumes the existence of a strong feature (read: taking time to find a good one).
It also assumes you already know what your feature points look like beforehand.
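A hedged sketch of that patch-comparison idea: detect Harris corners, then keep only corners whose 8x8 neighbourhood is close to a reference patch of the white dot. The reference patch, the corner threshold, and the distance cutoff are all assumptions to tune:

```python
import cv2
import numpy as np

gray = cv2.imread("table.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
harris = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
ys, xs = np.where(harris > 0.01 * harris.max())

ref_patch = np.full((8, 8), 255.0)   # hypothetical: a white-dot patch is near-white
keep = []
for x, y in zip(xs, ys):
    patch = gray[y - 4:y + 4, x - 4:x + 4]
    if patch.shape == (8, 8) and np.linalg.norm(patch - ref_patch) < 800:
        keep.append((x, y))
```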
An alternative, more on the computer vision side: use the geometry of your corner points' distribution to your advantage. You probably want a distorted rectangle, so make sure you find one! Surely you can compute a function that gives the validity of the last feature point given the 3 others? (Distance to the intersection of the 2 lines generated by the other 3 points...)
The typical and coolest approach would then be to apply RANSAC to it: try random (but not all!) combinations of your points, check which one fits best using that function, and consider those points good.
If you intend to track over time or over several images, you will have to tune it a bit, as RANSAC can occasionally fail (statistics of random combinations...), and you would then use points from your previous successful run to guesstimate the position.
Last idea for the moment: use some color-aware derivative technique. Do you compute Harris corners on the RGB image or on a version flattened to gray? Some gradient operators use color information as an extra cue to discern edges, and I'm not sure the corners you're finding use any of those. Then again, it might mean reimplementing the Harris corner algorithm (try it, it's fun, and not that hard if you have a good algebra library to do the heavy work).
I recommend the geometric fitting test, as it wisely uses model information about your system rather than assumptions about what reflections look like.
A really fun introduction to RANSAC: danielwedge.com/ransac/
Edit: Trusty Photoshop knows what I mean; I highlighted the invalid shapes.
Valid grid, Photoshop says so.
Invalid grid, logical, right?
I have been working with HoughLines in OpenCV and I can't seem to get a more accurate line reading; sometimes there are two duplicate lines on top of each other. I have looked at the tutorial on the OpenCV website, but it gives a similar result.
To remove those duplicate lines, there are two things that may help you:
Double edges may appear that may lead to duplicate lines. A sequence of blurring/dilating the input image would solve these issues.
Close lines that have almost the same slope can be removed by using a coarser angle resolution for the theta argument of the Hough lines method. For example, π/180 finds lines that differ by only one degree in their slope; using 5·π/180 instead finds lines at a 5-degree resolution (see the sketch below).
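A minimal sketch of both suggestions combined, blurring first and then using a coarser theta step so near-parallel duplicates vote for the same line (parameter values and the file name are illustrative):

```python
import cv2
import numpy as np

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)      # merges double edges
edges = cv2.Canny(blurred, 50, 150)

# theta step of 5 degrees instead of 1 degree
lines = cv2.HoughLines(edges, 1, 5 * np.pi / 180, 150)
```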
As an example, the following lines are detected by using the raw image and a 1-degree resolution:
After a bit of blurring and using a 3-degree resolution you can get a result like the following:
By changing the threshold, you can get more or less lines.
About fitting curves, which you asked about in the comments: yes, you can fit curves, but not with the Hough lines method. You need to find a parametric definition of that shape and run the voting procedure of the Hough transform yourself. The only other shape that OpenCV helps you find is the circle.
I am trying to think of the best method to detect rectangles in an image.
My initial thought is to use the Hough transform for lines, and to select combinations of lines where you have two lines intersected at both the lower portion and upper portion by the same two lines, but this is not sufficient.
Would using a corner detector along with the Hough transform work?
Check out /samples/c/squares.c in your OpenCV distribution. This example provides a square detector, and it should be a pretty good start.
My answer here also applies.
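A rough Python sketch of what squares.c does (Canny, contours, approxPolyDP, then keep convex quadrilaterals with near-right corner angles); all thresholds and the file name are illustrative:

```python
import cv2
import numpy as np

def angle_cos(p0, p1, p2):
    """Cosine of the angle at p1 formed by p0-p1-p2 (0 means a right angle)."""
    d1, d2 = (p0 - p1).astype(float), (p2 - p1).astype(float)
    return abs(np.dot(d1, d2) / (np.linalg.norm(d1) * np.linalg.norm(d2) + 1e-10))

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)
edges = cv2.dilate(edges, None)
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

rectangles = []
for c in contours:
    approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
    if len(approx) == 4 and cv2.contourArea(approx) > 1000 and cv2.isContourConvex(approx):
        pts = approx.reshape(-1, 2)
        # keep the quad only if every corner is close to 90 degrees
        if max(angle_cos(pts[i], pts[(i + 1) % 4], pts[(i + 2) % 4]) for i in range(4)) < 0.3:
            rectangles.append(approx)
```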
I don't think that currently there exists a simple and robust method to detect rectangles in an image. You have to deal with many problems such as the rectangles not being exactly rectangular but only approximately, partial occlusions, lighting changes, etc.
One possible direction is to do a segmentation of the image and then check how close each segment is to being a rectangle. Since you can't trust your segmentation algorithm, you can run it multiple times with different parameters.
Another direction is to try to parametrically fit a rectangle to the image such that the image gradient magnitude along the contour will be maximized.
If you choose to work on a parametric approach, notice that while the trivial way to parameterize a rectangle is by the locations of its four corners, which is 8 parameters, there are a few other representations that require fewer parameters.
There is an extension of Hough that can be useful.
http://en.wikipedia.org/wiki/Generalised_Hough_transform