I've created an OpenCV application for human detection in images.
I run my algorithm on the same image at several scales, and when a detection is made, I end up with the bounding box position and the scale at which it was found. I then want to transform that rectangle back to the original scale, given that both its position and size vary with the scale.
I've been trying to wrap my head around this and have gotten nowhere. It should be rather simple, but at the moment I am clueless.
Help, anyone?
OK, I got the answer elsewhere:
"What you should do is store the scale where you are at for each detection. Then transforming should be rather easy right. Imagine you have the following.
X and Y coordinates (center of bounding box) at scale 1/2 of the original. This means that you should multiply with the inverse of the scale to get the location in the original, which would be 2X, 2Y (again for the bounxing box center).
So first transform the center of the bounding box, than calculate the width and height of your bounding box in the original, again by multiplying with the inverse. Then from the center, your box will be +-width_double/2 and +-height_double/2."
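A minimal sketch of that recipe (assuming detections are stored as a cv::Rect plus the scale factor the image was resized by):

```cpp
#include <opencv2/opencv.hpp>

// Map a detection made on a rescaled image back to original coordinates.
// 'scale' is the factor the image was resized by (e.g. 0.5 for half size).
cv::Rect toOriginalScale(const cv::Rect& det, double scale) {
    double inv = 1.0 / scale;  // multiply by the inverse of the scale
    // Transform the center first...
    double cx = (det.x + det.width / 2.0) * inv;
    double cy = (det.y + det.height / 2.0) * inv;
    // ...then the width and height, and rebuild the box around the center.
    double w = det.width * inv;
    double h = det.height * inv;
    return cv::Rect(cvRound(cx - w / 2.0), cvRound(cy - h / 2.0),
                    cvRound(w), cvRound(h));
}
```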
I have an image of a circle, but my circle is not perfect.
First, I found the transition coordinates following this approach:
Detecting Circles without using Hough Circles
and then I used this formula: https://math.stackexchange.com/questions/675203/calculating-centre-of-rotation-given-point-coordinates-at-different-positions/1414344#1414344
Finally, I found the radii: the longest and the shortest.
Now I have this image:
But those are radii; I need the diameter. How do I find the diameter from the image?
Or: how can I find the mutually symmetric (diametrically opposite) point on a circle?
For this image, the approaches mentioned are overkill. Just find the bounding box of the non-black pixels. Because of sampling artifacts, the horizontal and vertical side lengths may differ by one or two pixels.
If I am right, the outer circle is 277 x 273 pixels. If you consider the difference to be significant, then this is an ellipse, not a circle.
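A minimal sketch of that, assuming a grayscale image where everything but the circle is (near-)black; the file name and threshold are placeholders:

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    cv::Mat gray = cv::imread("circle.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;

    // Anything brighter than near-black counts as part of the circle.
    cv::Mat mask = gray > 10;
    std::vector<cv::Point> pts;
    cv::findNonZero(mask, pts);
    if (pts.empty()) return 1;

    cv::Rect box = cv::boundingRect(pts);
    std::cout << "Bounding box: " << box.width << " x " << box.height
              << " px, diameter ~ " << std::max(box.width, box.height)
              << " px\n";
    return 0;
}
```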
Three ways to do this:
I think you need to measure it from the image:
Use edge detection (the blue line's width from left to right = the width of the bounding box of the blue pixels), then count pixels.
If you need to, convert to any unit you want, such as inches (using pixels per inch).
If your circle is not a perfect circle (it is stretched), measure in many directions so you can find its deviation too.
There is another way, called the Monte Carlo method:
First generate random x and y (inside the square), then evaluate whether each point (x, y) is inside the circle or not, and count the number of inside occurrences. You can then calculate the area of the circle from the ratio inside count / total count (and from the area, the diameter).
Without using random numbers:
Fill (color) the inside of the circle, then simply count the black pixels; this equals the area outside the circle, so total (square) area − black-pixel area = circle area, from which you calculate the diameter. A sketch of the area-to-diameter step follows below.
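Both the Monte Carlo count and the fill-and-count approach leave you with the circle's area in pixels; a minimal sketch of the final step, converting a pixel count to a diameter (the file name is a placeholder, and the circle interior is assumed white on black):

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>

int main() {
    cv::Mat img = cv::imread("circle.png", cv::IMREAD_GRAYSCALE);
    if (img.empty()) return 1;

    cv::Mat binary;
    cv::threshold(img, binary, 127, 255, cv::THRESH_BINARY);

    // Area of the circle in pixels = number of white pixels.
    double area = cv::countNonZero(binary);

    // Invert A = pi * r^2  =>  d = 2 * sqrt(A / pi).
    double diameter = 2.0 * std::sqrt(area / CV_PI);
    std::cout << "Estimated diameter: " << diameter << " px\n";
    return 0;
}
```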
I'm working on an OpenCV program to find the distance from the camera to a rectangle with a known aspect ratio. Finding the distance to a rectangle as seen from a forward-facing view works just fine:
The actual distance is very close to the distance calculated by this:
d = c · (w_target · p_image) / (2 · p_target · tan(θ_fov / 2))
Where w_target is the actual width (in inches) of the target, p_image is the pixel width of the overall image, p_target is the largest width (in pixels) of the detected quadrilateral, and θ_fov is the horizontal FOV of our webcam. The result is then multiplied by some constant c.
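For reference, a minimal sketch of that formula in code; the names follow the question, and the numbers in main() are made-up examples:

```cpp
#include <cmath>
#include <iostream>

double distanceToTarget(double w_target,   // real target width (inches)
                        double p_image,    // full image width (pixels)
                        double p_target,   // detected target width (pixels)
                        double theta_fov,  // horizontal FOV (radians)
                        double c = 1.0)    // empirical constant
{
    return c * (w_target * p_image) / (2.0 * p_target * std::tan(theta_fov / 2.0));
}

int main() {
    const double kPi = 3.14159265358979323846;
    // Example: 20 in wide target, 640 px image, target spans 80 px,
    // 60-degree FOV webcam.
    std::cout << distanceToTarget(20.0, 640.0, 80.0, 60.0 * kPi / 180.0)
              << " inches\n";
    return 0;
}
```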
The issue occurs when the target rectangle is viewed from a perspective that isn't forward-facing:
The difference in actual distance between these two orientations is small, but the detected distance differs by almost 2 feet.
What I'd like to know is how to calculate the distance consistently, accounting for different perspective angles. I've experimented with getPerspectiveTransform, but that requires me to know the resulting scale of the target - I only know the aspect ratio.
Here's what you know:
The distance between the top left and top right corners in inches (w_target)
The distance between those corners in pixels on a 2D plane (p_target)
So the trouble is that you're not accounting for the foreshortening of p_target when the rectangle is at an angle: the projected width shrinks by roughly cos θ, so at 60 degrees you lose about half of your pixels in p_target. Your formula still assumes the full w_target is visible, so you overestimate the distance.
To account for this, you should estimate the angle the box is turned. I'm not sure of an easy way to extract that information from getPerspectiveTransform, but it may be possible. You could also set up a constrained optimization whose decision variables are the distance and the skew angle, and enforce the known metric distance between the left and right points of the box.
Finally, no matter what you are doing, you should make sure your camera is calibrated. Depending on your application, you might be able to use AprilTags to just solve your problem.
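Building on the calibration point: since w_target plus the aspect ratio gives the rectangle's full metric size, cv::solvePnP can recover the pose (and hence a view-angle-independent distance) from the four detected corners. A minimal sketch; the target size, corner pixels, and intrinsics below are all made-up placeholders:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    // Hypothetical 20 x 12 in target, centered at the origin, z = 0.
    std::vector<cv::Point3f> object_pts = {
        {-10.f,  6.f, 0.f}, {10.f,  6.f, 0.f},
        { 10.f, -6.f, 0.f}, {-10.f, -6.f, 0.f}};

    // Detected corner pixels, in the same order (made-up values).
    std::vector<cv::Point2f> image_pts = {
        {250.f, 180.f}, {390.f, 190.f}, {385.f, 300.f}, {255.f, 295.f}};

    // Intrinsics would come from cv::calibrateCamera; placeholder here.
    cv::Mat K = (cv::Mat_<double>(3, 3) << 600, 0, 320,
                                           0, 600, 240,
                                           0, 0, 1);
    cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);

    cv::Mat rvec, tvec;
    if (cv::solvePnP(object_pts, image_pts, K, dist, rvec, tvec)) {
        // tvec is the target's position in camera coordinates, so its
        // norm is the distance regardless of viewing angle.
        std::cout << "Distance: " << cv::norm(tvec) << " in\n";
    }
    return 0;
}
```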
I am trying to develop a box-sorting application in Qt, using OpenCV. I want to measure the width and length of the box.
As shown in the image above, I want to detect only the outermost lines (i.e. the box edges), which will give me the width and length of the box, regardless of whatever is printed on it.
What I tried:
First I tried findContours() and selected the contour with the maximum area, but the contour of the outer edge is often not closed (broken somewhere in the Canny output) and hence does not get detected as a contour.
The Hough line transform gives me too many lines; I don't know how to extract just the four lines I am interested in.
So I tried my own algorithm:
Convert the image to grayscale.
Take one column of the image and compare every pixel with the next pixel in that column; if the difference in their values is greater than some threshold (say 100), that pixel belongs to an edge, so store it in an array. Doing this for all columns gives the upper line of the box, parallel to the x-axis.
Follow the same procedure, but from the last row upward (i.e. from bottom to top); this gives the lower line, parallel to the x-axis.
Likewise, find the lines parallel to the y-axis. Now I have four arrays of points, one for each side.
This gives me good results if the box is placed so that its sides are exactly parallel to the x- and y-axes. If the box is even slightly rotated, it gives me diagonal lines, which is to be expected, as shown in the image below.
As shown in the image below, I removed the first 10 and last 10 points from all four arrays (the ones responsible for the diagonal lines) and drew the lines; this is not going to work when the box is tilted more, and the measurements will also go wrong.
Now my question is:
Is there any simpler way in OpenCV to get only the outer edges (a rectangle) of the box and their dimensions, ignoring anything printed on the box, whatever its orientation?
I am not necessarily asking you to correct/improve my algorithm, but any suggestions on it are also welcome. Sorry for such a long post.
I would suggest the following steps:
1: Make a mask image by using cv::inRange() to select the background color. Then use cv::bitwise_not() to invert this mask. This will give you only the box.
2: If you're not concerned about shadows or depth effects making your measurement inaccurate, you can proceed right away with cv::findContours() again. Select the biggest contour and store its cv::RotatedRect (from cv::minAreaRect()).
3: This cv::RotatedRect has a size member that gives the width and the height of your box in pixels. A sketch of these steps follows below.
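A minimal sketch of steps 1-3; the file name and the cv::inRange bounds assume a light background and would need tuning for your scene:

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    cv::Mat img = cv::imread("box.png");  // hypothetical input image
    if (img.empty()) return 1;

    // Step 1: select the (assumed light) background, then invert, leaving
    // a white blob where the box is.
    cv::Mat mask;
    cv::inRange(img, cv::Scalar(180, 180, 180), cv::Scalar(255, 255, 255), mask);
    cv::bitwise_not(mask, mask);

    // Step 2: the biggest contour is the box.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return 1;
    auto biggest = std::max_element(
        contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b) {
            return cv::contourArea(a) < cv::contourArea(b);
        });

    // Step 3: the rotated rect's size is the box's pixel dimensions.
    cv::RotatedRect rect = cv::minAreaRect(*biggest);
    std::cout << "Box: " << rect.size.width << " x "
              << rect.size.height << " px\n";
    return 0;
}
```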
Since the box is placed in a contrasting background, you should be able to use Otsu thresholding.
threshold the image (use Otsu method)
filter out any stray pixels that are outside the box region (let's hope you don't get many such pixels and can easily remove them with a median or a morphological filter)
find contours
combine all contour points and get their convex hull (the idea here is to find the convex region that bounds all these contours in the box region, regardless of their connectivity)
apply a polygon approximation (approxPolyDP) to this convex hull and check if you get a quadrangle
if there are no perspective distortions, you should get a rectangle, otherwise you will have to correct it
if you get a rectangle, you have its dimensions. You can also find the minimum-area rectangle (minAreaRect) of the convex hull, which should directly give you a RotatedRect; a sketch of this pipeline follows below
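A minimal sketch of that pipeline, skipping the quadrangle check and going straight to minAreaRect (the file name, kernel size, and the assumption that the box thresholds to white are all placeholders):

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::Mat gray = cv::imread("box.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;

    // Otsu threshold; assumes the box contrasts with the background.
    cv::Mat bw;
    cv::threshold(gray, bw, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

    // Median filter to remove stray pixels outside the box region.
    cv::medianBlur(bw, bw, 5);

    // Find contours and pool all their points together.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bw, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    std::vector<cv::Point> all_points;
    for (const auto& c : contours)
        all_points.insert(all_points.end(), c.begin(), c.end());
    if (all_points.empty()) return 1;

    // Convex hull of everything, then its minimum-area rectangle.
    std::vector<cv::Point> hull;
    cv::convexHull(all_points, hull);
    cv::RotatedRect rect = cv::minAreaRect(hull);
    std::cout << "Box: " << rect.size.width << " x "
              << rect.size.height << " px\n";
    return 0;
}
```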
I have got the following image:
There are curves in the picture.
I would like to find the center of the circles that these curves lie on.
I tried OpenCV and the Hough circle transform, but got no results.
The natural candidate would be cvHoughCircles. Each part of each curve adds a "vote" for an X/Y/R triplet which identifies the centrepoint. Now, you only have part of the circles, so the number of votes is limited and the accuracy reduced, but you probably suspected as much.
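A minimal sketch using the current C++ equivalent, cv::HoughCircles; every parameter value below is a guess you would need to tune for your image:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::Mat gray = cv::imread("curves.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;
    cv::GaussianBlur(gray, gray, cv::Size(9, 9), 2);

    // Each result is (x, y, r). Lower the accumulator threshold, since
    // partial arcs cast fewer votes than full circles.
    std::vector<cv::Vec3f> circles;
    cv::HoughCircles(gray, circles, cv::HOUGH_GRADIENT,
                     1,             // dp: accumulator resolution
                     gray.rows / 8, // min distance between centers
                     100,           // Canny high threshold
                     20,            // accumulator threshold (low: partial arcs)
                     10, 200);      // min / max radius
    for (const auto& c : circles)
        std::cout << "center (" << c[0] << ", " << c[1]
                  << "), r = " << c[2] << "\n";
    return 0;
}
```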
Here's what I would try first:
Observe that if you draw rays from the true center of the circles, the local maxima of the image intensity along them occur at intervals that are independent of the ray orientation. These intervals are the differences between the lengths of the radii of consecutive circles.
So fix a number of ray directions, say 16 equally spaced in [0, π), and define a cost function parametrized on the (xc, yc) coordinates of the center and the r_i radii of the circles, with the cost equal to, for example, the variance, across the different rays, of the maxima locations. A sketch of evaluating such a cost follows below.
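A sketch of evaluating that cost for a single candidate center (the file name, the three-point local-maximum test, and the 1 px step size are simplistic placeholders; you would feed this into your optimizer of choice):

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>
#include <iostream>
#include <limits>
#include <vector>

// Distances from (xc, yc) at which the intensity is locally maximal
// along one ray in direction theta.
std::vector<double> maximaAlongRay(const cv::Mat& gray, double xc, double yc,
                                   double theta) {
    std::vector<double> profile;
    for (double t = 0.0;; t += 1.0) {
        int x = cvRound(xc + t * std::cos(theta));
        int y = cvRound(yc + t * std::sin(theta));
        if (x < 0 || y < 0 || x >= gray.cols || y >= gray.rows) break;
        profile.push_back(gray.at<uchar>(y, x));
    }
    std::vector<double> maxima;
    for (size_t i = 1; i + 1 < profile.size(); ++i)
        if (profile[i] > profile[i - 1] && profile[i] >= profile[i + 1])
            maxima.push_back(static_cast<double>(i));
    return maxima;
}

// Cost of a candidate center: for each k, the variance across rays of
// the k-th maximum's distance. Near the true center, every ray sees its
// maxima at the same radii, so the cost is small.
double centerCost(const cv::Mat& gray, double xc, double yc, int n_rays = 16) {
    std::vector<std::vector<double>> per_ray;
    size_t k_max = std::numeric_limits<size_t>::max();
    for (int i = 0; i < n_rays; ++i) {
        per_ray.push_back(maximaAlongRay(gray, xc, yc, CV_PI * i / n_rays));
        k_max = std::min(k_max, per_ray.back().size());
    }
    if (k_max == 0) return 1e9;  // some ray saw no maxima at all
    double cost = 0.0;
    for (size_t k = 0; k < k_max; ++k) {
        double mean = 0.0, var = 0.0;
        for (const auto& r : per_ray) mean += r[k];
        mean /= n_rays;
        for (const auto& r : per_ray) var += (r[k] - mean) * (r[k] - mean);
        cost += var / n_rays;
    }
    return cost;
}

int main() {
    cv::Mat gray = cv::imread("curves.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;
    // Evaluate at the image center as a demo; a real solver would
    // minimize centerCost over (xc, yc).
    std::cout << centerCost(gray, gray.cols / 2.0, gray.rows / 2.0) << "\n";
    return 0;
}
```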
Threshold the image
Erode it until there is little or no noise (small blobs)
Dilate it back
Find the big blob; if there are still some small blobs, select the one with the max area
Use cv::moments to find its centroid (a sketch follows below)
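A minimal sketch of those steps; the file name, threshold choice, kernel size, and iteration counts are all guesses:

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    cv::Mat gray = cv::imread("curves.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;

    // Threshold, then erode/dilate to remove small noise blobs.
    cv::Mat bw;
    cv::threshold(gray, bw, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::erode(bw, bw, kernel, cv::Point(-1, -1), 2);
    cv::dilate(bw, bw, kernel, cv::Point(-1, -1), 2);

    // Keep the biggest remaining blob.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bw, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return 1;
    auto biggest = std::max_element(
        contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b) {
            return cv::contourArea(a) < cv::contourArea(b);
        });

    // Centroid from the contour's moments.
    cv::Moments m = cv::moments(*biggest);
    if (m.m00 != 0)
        std::cout << "centroid: (" << m.m10 / m.m00 << ", "
                  << m.m01 / m.m00 << ")\n";
    return 0;
}
```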
Good morning everybody,
Today I want to ask about the topic "Image Manipulation in C++".
So far I am able to filter all the noisy stuff out of the picture and convert it to black and white.
But now I have two questions.
First question:
Below you see a screenshot of the image. What is the best way to find out how far to rotate the text? In the end it would be nice if the text were horizontal. Does anybody have a good link or an example?
Second question:
How should I go on? Do you think I should send the image to an "Optical Character Recognizer" (a), or should I extract each letter myself (b)?
If the answer is (a): what is the smallest OCR lib? All the libs I have found so far (like GOCR or Tesseract) seem overpowered and difficult to integrate into an existing project.
If the answer is (b): what is the best way to save each letter as its own image? Should I search for a white pixel and then go from pixel to pixel, saving the coordinates in a 2D array? And what about the letter "i"? ;)
Thanks to everybody who helps me find my way! Sorry for the strange English above; I'm still a language noob :-)
The usual name for the problem in your first question is "skew correction".
You may Google for it (there are lots of references). A nice paper here shows, for example, how to get this:
An easy way to start (but not as good as the previously mentioned paper) is to perform a Principal Component Analysis:
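A minimal sketch of the PCA idea, assuming white text on a black background: the first principal axis of the white-pixel coordinates points along the text line. With image y pointing down, you may need to negate the angle if it rotates the wrong way:

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>
#include <vector>

int main() {
    cv::Mat bw = cv::imread("text.png", cv::IMREAD_GRAYSCALE);
    if (bw.empty()) return 1;
    cv::threshold(bw, bw, 127, 255, cv::THRESH_BINARY);

    // Collect the coordinates of all white (text) pixels.
    std::vector<cv::Point> pts;
    cv::findNonZero(bw, pts);
    cv::Mat data(static_cast<int>(pts.size()), 2, CV_64F);
    for (int i = 0; i < data.rows; ++i) {
        data.at<double>(i, 0) = pts[i].x;
        data.at<double>(i, 1) = pts[i].y;
    }

    // PCA: the first eigenvector points along the text line.
    cv::PCA pca(data, cv::Mat(), cv::PCA::DATA_AS_ROW);
    double vx = pca.eigenvectors.at<double>(0, 0);
    double vy = pca.eigenvectors.at<double>(0, 1);
    double angle = std::atan2(vy, vx) * 180.0 / CV_PI;
    std::cout << "Skew: " << angle << " degrees\n";

    // Rotate back around the centroid to deskew.
    cv::Point2f center(static_cast<float>(pca.mean.at<double>(0, 0)),
                       static_cast<float>(pca.mean.at<double>(0, 1)));
    cv::Mat rot = cv::getRotationMatrix2D(center, angle, 1.0);
    cv::Mat deskewed;
    cv::warpAffine(bw, deskewed, rot, bw.size());
    cv::imwrite("deskewed.png", deskewed);
    return 0;
}
```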
For your first question:
First, remove any "specks" of noisy white pixels that aren't part of the letter sequence: apply a gentle low-pass filter (pixel color = average of surrounding pixels) followed by clamping the pixel values to pure black or pure white. This should get rid of the little "dot" underneath the "a" character in your image, and any other specks.
Now search for the following pixels:
xMin = white pixel with the lowest x value (white pixel closest to the left edge)
xMax = white pixel with the largest x value (white pixel closest to the right edge)
yMin = white pixel with the lowest y value (white pixel closest to the top edge)
yMax = white pixel with the largest y value (white pixel closest to the bottom edge)
with these four pixel values, form a bounding box: Rect(xMin, yMin, xMax - xMin, yMax - yMin) (note that cv::Rect takes a width and height, not the far corner).
compute the area of the bounding box and find the center.
using the center of the bounding box as the pivot, rotate the image by N degrees. (You can pick N; 1 degree would be an OK value.)
Repeat the process of finding xMin,xMax,yMin,yMax and recompute the area
Continue rotating by N degrees until you've rotated K degrees. Also rotate by -N degrees until you've rotated by -K degrees. (Where K is the max rotation... say 30 degrees). At each step recompute the area of the bounding box.
The rotation that produces the bounding box with the smallest area is likely the rotation that aligns the letters parallel to the bottom edge (horizontal alignment).
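A minimal sketch of that search, assuming white text on a black background, with N = 1 and K = 30 as suggested (for robustness you would pad the canvas first so rotations don't clip pixels):

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

// Area of the axis-aligned bounding box of the white pixels.
static double bboxArea(const cv::Mat& bw) {
    std::vector<cv::Point> pts;
    cv::findNonZero(bw, pts);
    if (pts.empty()) return 0;
    cv::Rect r = cv::boundingRect(pts);
    return static_cast<double>(r.width) * r.height;
}

int main() {
    cv::Mat bw = cv::imread("text.png", cv::IMREAD_GRAYSCALE);
    if (bw.empty()) return 1;
    cv::threshold(bw, bw, 127, 255, cv::THRESH_BINARY);

    cv::Point2f center(bw.cols / 2.f, bw.rows / 2.f);
    double best_angle = 0, best_area = bboxArea(bw);

    // Try -K..K degrees in steps of N; keep the rotation that produces
    // the smallest bounding box.
    for (double angle = -30; angle <= 30; angle += 1) {
        cv::Mat rot = cv::getRotationMatrix2D(center, angle, 1.0);
        cv::Mat rotated;
        cv::warpAffine(bw, rotated, rot, bw.size());
        double area = bboxArea(rotated);
        if (area < best_area) { best_area = area; best_angle = angle; }
    }
    std::cout << "Best rotation: " << best_angle << " degrees\n";
    return 0;
}
```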
You could measure the height from the bottom to each white pixel and find how much the text is leaning. It's a very simple approach, but it worked fine for me when I tried it.
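A minimal sketch of that idea, assuming white text on black: take the lowest white pixel in each column as a rough baseline and fit a line through those points with cv::fitLine; the line's slope gives the lean:

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>
#include <vector>

int main() {
    cv::Mat bw = cv::imread("text.png", cv::IMREAD_GRAYSCALE);
    if (bw.empty()) return 1;
    cv::threshold(bw, bw, 127, 255, cv::THRESH_BINARY);

    // For each column, record the lowest white pixel (rough baseline).
    std::vector<cv::Point2f> baseline;
    for (int x = 0; x < bw.cols; ++x)
        for (int y = bw.rows - 1; y >= 0; --y)
            if (bw.at<uchar>(y, x) > 0) {
                baseline.emplace_back(static_cast<float>(x),
                                      static_cast<float>(y));
                break;
            }
    if (baseline.size() < 2) return 1;

    // Robust line fit; the direction vector gives the lean angle.
    cv::Vec4f line;
    cv::fitLine(baseline, line, cv::DIST_L2, 0, 0.01, 0.01);
    double angle = std::atan2(line[1], line[0]) * 180.0 / CV_PI;
    std::cout << "Text leans by about " << angle << " degrees\n";
    return 0;
}
```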