I am creating a menu system for my game engine and want to know how to be able to detect when the mouse is over a button. This is simple enough to do when the button is a square, rectangle or circle but I was wondering how to handle irregular shaped buttons.
Is this possible and if it is, does the complexity mean that it is better to simply use a bounding area (square or circle)?
Make a bitmask out of the texture or surface data. Decide on a rule: for example, where the image is 100% transparent or a certain color, the bitmask pixel is set to 0; otherwise it is set to 1. Do the same for your cursor. To check for a collision, simply check whether any bitmask bits set to 1 overlap.
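A minimal sketch of the bitmask idea in Python, under a few assumptions of mine: the button texture is available as rows of (r, g, b, a) tuples, the rule is "alpha above a threshold means solid", and the cursor is reduced to a single hot-spot pixel rather than a full cursor mask. The names build_mask and hit_test are only illustrative.

```python
def build_mask(pixels, alpha_threshold=0):
    """Turn rows of (r, g, b, a) tuples into rows of 0/1 bits.

    A bit is 1 where the pixel's alpha is above alpha_threshold.
    """
    return [[1 if a > alpha_threshold else 0 for (_, _, _, a) in row]
            for row in pixels]


def hit_test(mask, button_x, button_y, mouse_x, mouse_y):
    """Return True if the mouse is over a solid pixel of the button.

    (button_x, button_y) is the top-left corner of the button on screen.
    """
    local_x = mouse_x - button_x
    local_y = mouse_y - button_y
    # Cheap bounding-box rejection first, then the per-pixel lookup.
    if local_y < 0 or local_y >= len(mask):
        return False
    if local_x < 0 or local_x >= len(mask[0]):
        return False
    return mask[local_y][local_x] == 1


# Example: a 3x3 plus-shaped button drawn at screen position (10, 20).
T = (0, 0, 0, 0)          # fully transparent
O = (255, 255, 255, 255)  # opaque
mask = build_mask([[T, O, T],
                   [O, O, O],
                   [T, O, T]])
print(hit_test(mask, 10, 20, 11, 20))  # True: top arm of the plus
print(hit_test(mask, 10, 20, 10, 20))  # False: transparent corner
```

The bounding-box check runs first, so the mask lookup only happens when the mouse is already inside the button's rectangle.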
The first thing that comes to mind is to use mathematical functions. If you know the equation of the curve, you can tell whether a point is over or under it by checking whether the point's y is greater or less than the right side of the equation evaluated at the point's x.
So if you have the simple curve y = x*x and want to check the point (x, y) = (1, 2), you substitute it and check:
y = 2
x*x = 1*1 = 1
y > x*x, so the point is over the curve. For the opposite situation, taking the point (2, 1), we get:
y = 1
x*x = 2*2 = 4
y < x*x, so the point is under the curve.
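The same check as a small Python sketch. The function name side_of_curve is my own, and points are taken as (x, y) with y compared against the curve's value at x.

```python
def side_of_curve(f, x, y):
    """Say whether the point (x, y) is over, under or on the curve y = f(x)."""
    fx = f(x)
    if y > fx:
        return "over"
    if y < fx:
        return "under"
    return "on"


def parabola(x):
    return x * x


print(side_of_curve(parabola, 1, 2))  # over
print(side_of_curve(parabola, 2, 1))  # under
```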
I am writing a disparity matching algorithm using block matching, but I am not sure how to find the corresponding pixel values in the secondary image.
Given a square window of some size, what techniques exist to find the corresponding pixels? Do I need to use feature matching algorithms or is there a simpler method, such as summing the pixel values and determining whether they are within some threshold, or perhaps converting the pixel values to binary strings where the values are either greater than or less than the center pixel?
I'm going to assume you're talking about Stereo Disparity, in which case you will likely want to use a simple Sum of Absolute Differences (read that wiki article before you continue here). You should also read this tutorial by Chris McCormick before you read more here.
side note: SAD is not the only method, but it's really common and should solve your problem.
You already have the right idea. Make windows, move windows, sum pixels, find minimums. So I'll give you what I think might help:
To start:
If you have color images, first you will want to convert them to grayscale (often loosely called black and white). In Python you might use a simple function like this per pixel, where x is a pixel that contains RGB.
def rgb_to_bw(x):
    return int(x[0]*0.299 + x[1]*0.587 + x[2]*0.114)
You will want this to be grayscale to make the SAD easier to compute. If you're wondering why you don't lose significant information from this, you might be interested in learning what a Bayer Filter is. The weights are the standard ITU-R BT.601 luma coefficients: green counts the most because the eye is most sensitive to it, which is also why the Bayer Filter, typically RGGB, has twice as many green sites as red or blue.
Calculating the SAD:
You already mentioned that you have a window of some size, which is exactly what you want to do. Let's say this window is n x n in size. You would also have some window in your left image WL and some window in your right image WR. The idea is to find the pair that has the smallest SAD.
So, for each left window pixel pl at some location (x,y) in the window, you would take the absolute value of its difference from the right window pixel pr, also located at (x,y). You would also want some running value, which is the sum of these absolute differences. In pseudocode:
SAD = 0
for x = 0 to n-1:
    for y = 0 to n-1:
        SAD = SAD + |pl(x,y) - pr(x,y)|
After you calculate the SAD for this pair of windows, WL and WR you will want to "slide" WR to a new location and calculate another SAD. You want to find the pair of WL and WR with the smallest SAD - which you can think of as being the most similar windows. In other words, the WL and WR with the smallest SAD are "matched". When you have the minimum SAD for the current WL you will "slide" WL and repeat.
Disparity is calculated by the distance between the matched WL and WR. For visualization, you can scale this distance to be between 0-255 and output that to another image. I posted 3 images below to show you this.
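Here is a rough Python sketch of the window sliding described above. It rests on assumptions of mine that are not stated in the question: the two images are already grayscale and rectified (so the match lies on the same row), they are stored as lists of rows, and a point in the left image appears shifted to the left in the right image. The names sad and disparity_at are only illustrative.

```python
def sad(left, right, lx, rx, y, n):
    """Sum of absolute differences between two n x n windows.

    The windows have their top-left corners at (lx, y) in the left image
    and (rx, y) in the right image.
    """
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(left[y + dy][lx + dx] - right[y + dy][rx + dx])
    return total


def disparity_at(left, right, x, y, n, max_disparity):
    """Slide WR along the row and return the shift with the smallest SAD."""
    best_d, best_sad = 0, None
    for d in range(max_disparity + 1):
        rx = x - d  # assumed search direction: leftwards in the right image
        if rx < 0:
            break
        score = sad(left, right, x, rx, y, n)
        if best_sad is None or score < best_sad:
            best_d, best_sad = d, score
    return best_d


# Tiny example: the right image is the left image shifted 2 pixels left.
left = [[0, 0, 0, 0, 9, 5, 0, 0],
        [0, 0, 0, 0, 9, 5, 0, 0]]
right = [[0, 0, 9, 5, 0, 0, 0, 0],
         [0, 0, 9, 5, 0, 0, 0, 0]]
print(disparity_at(left, right, 4, 0, 2, 4))  # 2
```

Running disparity_at for every window position in the left image and scaling the results to 0-255 gives the kind of disparity image described above.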
Typical Results:
Left Image:
Right Image:
Calculated Disparity (from the left image):
you can get test images here: http://vision.middlebury.edu/stereo/data/scenes2003/
Assume that I took two panoramic images with a vertical offset of H, each presented in equirectangular projection with size Xm by Ym. To do this, I placed my panoramic camera at a position, say A, and took an image, then moved the camera H metres up and took another image.
I know that a point in image 1 with coordinates (X1, Y1) is the same point in image 2 with coordinates (X2, Y2) (assuming that X1 = X2, as we have only a vertical offset).
My question is: how can I calculate the range of a selected point (whose position is known as (X1, Y1) in image 1 and (X2, Y2) in image 2) from point A, where the camera was when image 1 was taken?
Yes, you can do it. The key quantity is y, the focal length of your lens.
So, I think your question can be re-stated more simply by saying that if you move your camera (on the right in the diagram) up H metres, a point moves down p pixels in the image taken from the new location.
Picture it like this: imagine looking from the side, across you taking the picture.
If you know the micron spacing of the camera's CCD from its specification, you can convert p from pixels to metres to match the units of H.
Your range from the camera to the plane of the scene is given by x + y (both in red at the bottom), and
x=H/tan(alpha)
y=p/tan(alpha)
so your range is
R = x + y = H/tan(alpha) + p/tan(alpha)
and
alpha = tan inverse(p/y)
where y is the focal length of your lens. As y is likely to be something like 50mm, it is negligible, so, to a pretty reasonable approximation, your range is
H/tan(alpha)
and
alpha = tan inverse(p in metres/focal length)
Or, by similar triangles
Range = H x focal length of lens
--------------------------------
(Y2-Y1) x CCD photosite spacing
being very careful to put everything in metres.
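The similar-triangles formula above can be sketched in Python as follows. This assumes the simple lens geometry used in this answer; the function and parameter names are my own, and the numbers in the example are made up purely to show the units.

```python
def range_from_vertical_shift(baseline_m, focal_length_m, shift_px, pixel_pitch_m):
    """Range = H x focal length / (pixel shift x photosite spacing).

    Every length is in metres; shift_px is Y2 - Y1 in pixels.
    """
    shift_m = shift_px * pixel_pitch_m  # convert p from pixels to metres
    if shift_m == 0:
        raise ValueError("no shift measured: the point is effectively at infinity")
    return baseline_m * focal_length_m / shift_m


# Hypothetical numbers: camera raised 1 m, 50 mm lens,
# 5 micron photosite spacing, point moved 100 pixels.
print(range_from_vertical_shift(1.0, 0.050, 100, 5e-6))  # about 100 (metres)
```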
Here is a shot in the dark. Given my understanding of the problem at hand, you want to do something similar to computer stereo vision, so I point you to http://en.wikipedia.org/wiki/Computer_stereo_vision to start. I am not sure whether this is possible in the manner you are suggesting; it sounds like you may need some more physical constraints, but I do remember being able to correlate two 2D points in images after a strict translation. Think:
lambda [x, y, 1]^t = W [r1, tx; r2, ty; r3, tz] [x, y, z, 1]^t
where lambda is a scale factor, W is a 3x3 matrix covering the intrinsic parameters of your camera, r1, r2, and r3 are the row vectors that make up the 3x3 rotation matrix (in your case you can assume the identity matrix, since you have only applied a translation), and tx, ty, tz are your translation components.
Since you are looking at two 2d points at the same 3d point [x,y,z] this 3d point is shared by both 2d points. I cannot say if you can rationalize the actual x,y, and z values particularly for your depth calculation but this is where I would start.
Please, can anyone help me resolve my issue? I am working on an image-processing project and I am stuck at one point. I got this image after some processing, and for further processing I need to detect or crop only the deer and remove the rest of the image.
This is my initial image:
And my result should be something like this:
It would be even better if I could get only the single biggest blob in the image and save it as an image.
It looks like the deer in your image is pretty much connected and closed. What we can do is use regionprops to find all of the bounding boxes in your image. Once we do this, we can find the bounding box that gives the largest area, which will presumably be your deer. Once we find this bounding box, we can crop your image and focus on the deer entirely. As such, assuming your image is stored in im, do this:
im = im2bw(im); %// Just in case...
bound = regionprops(im, 'BoundingBox', 'Area');
%// Obtaining Bounding Box co-ordinates
bboxes = reshape([bound.BoundingBox], 4, []).';
%// Obtain the areas within each bounding box
areas = [bound.Area].';
%// Figure out which bounding box has the maximum area
[~,maxInd] = max(areas);
%// Obtain this bounding box
%// Ensure all floating point is removed
finalBB = floor(bboxes(maxInd,:));
%// Crop the image
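%// Clamp the start indices to 1, as floor can give 0 for a blob touching the top or left edge
out = im(max(finalBB(2),1):finalBB(2)+finalBB(4), max(finalBB(1),1):finalBB(1)+finalBB(3));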
%// Show the images
figure;
subplot(1,2,1);
imshow(im);
subplot(1,2,2);
imshow(out);
Let's go through this code slowly. We first convert the image to binary just in case. Your image may be an RGB image with intensities of 0 or 255... I can't say for sure, so let's just do a binary conversion just in case. We then call regionprops with the BoundingBox property to find every bounding box of every unique object in the image. This bounding box is the minimum spanning bounding box to ensure that the object is contained within it. Each bounding box is a 4 element array that is structured like so:
[x y w h]
Each bounding box is delineated by its origin at the top left corner of the box, denoted as x and y, where x is the horizontal co-ordinate while y is the vertical co-ordinate. x increases positively from left to right, while y increases positively from top to bottom. w,h are the width and height of the bounding box. Because these values are stored in a structure array, I extract them and place them into a single 1D vector, then reshape it so that it becomes an M x 4 matrix. Bear in mind that this is the only way that I know of to extract array values from each structure element efficiently without any for loops. This makes our search quicker. I have also done the same for the Area property. For each bounding box we have in our image, we also have the attribute of the total area encapsulated within the bounding box.
Thanks to @Shai for the spot: we can't simply use the bounding box co-ordinates to determine which object is biggest, as a thin diagonal line could produce a very large bounding box. As such, we also need to rely on the total area that the object takes up within the bounding box. Simply put, it's just the number of pixels that belong to the object.
Therefore, we search the entire area vector that we have created to see which object has the maximum area. This corresponds to your deer. Once we find this location, we extract the bounding box and use it to crop the image. Bear in mind that the bounding box values may be floating point numbers. As image co-ordinates are integer based, we need to remove the fractional parts before we crop. I decided to use floor. I then write code that displays the original image alongside the cropped result.
Bear in mind that this extracts only one object from the image, the largest. If you want to find multiple objects, check bwboundaries in MATLAB. Otherwise, I believe this should get you started.
Just for completeness, we get the following result:
While object detection is a very general CV task, you can start with something simple if the assumptions are strong enough and you can guarantee that the input images will contain a single prominent white blob well described by a bounding box.
One very simple idea is to subdivide the picture in 3x3=9 patches, calculate the statistics for each patch and compute some objective function. In the most simple case you just do a grid search over various partitions and select that with the highest objective metric. Here's an illustration:
If every line is a parameter x_1, x_2, y_1 and y_2, then you want to optimize the target function F over these four parameters
either by
grid search (try all x_i, y_i in some quantization steps)
genetic-algorithm-like random search
gradient descent (move every parameter in that direction that optimizes the target function)
The target function F can be defined over statistics of the patches, e.g. like this
F(9 patches) {
    brightest_patch = max(patches)
    others = patches \ brightest_patch
    score = brightness(brightest_patch) - 1/8 * brightness(others)
    return score
}
or anything else that incorporates relevant statistics of the patches as well as their size. This also lets you incorporate prior knowledge: if you expect the blob to appear in the middle of the image, you can define a "regularization" term that penalizes F if the parameters x_i and y_i deviate too much from the expected position.
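A small Python sketch of the grid-search variant, under assumptions of my own: the image is a list of rows of brightness values, "brightness" of a patch means its mean pixel value, and the 1/8 term is applied to the sum of the other eight patch means (so it is their average). No regularization term is included. The names score and grid_search are illustrative only.

```python
from itertools import combinations


def mean_brightness(img, x0, x1, y0, y1):
    """Mean pixel value of the patch with columns x0..x1-1 and rows y0..y1-1."""
    total = sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return total / ((x1 - x0) * (y1 - y0))


def score(img, x1, x2, y1, y2):
    """F: brightness of the brightest patch minus the average of the other 8."""
    h, w = len(img), len(img[0])
    xs = [0, x1, x2, w]
    ys = [0, y1, y2, h]
    means = [mean_brightness(img, xs[i], xs[i + 1], ys[j], ys[j + 1])
             for j in range(3) for i in range(3)]
    brightest = max(means)
    others = sum(means) - brightest
    return brightest - others / 8


def grid_search(img, step=1):
    """Try every (x1 < x2, y1 < y2) on a grid and keep the best-scoring one."""
    h, w = len(img), len(img[0])
    best = None
    for x1, x2 in combinations(range(step, w, step), 2):
        for y1, y2 in combinations(range(step, h, step), 2):
            s = score(img, x1, x2, y1, y2)
            if best is None or s > best[0]:
                best = (s, x1, x2, y1, y2)
    return best


# Example: a 6x6 black image with a white blob in rows 2-3, columns 2-4.
img = [[0] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3, 4):
        img[y][x] = 255
print(grid_search(img))  # (255.0, 2, 5, 2, 4)
```

The returned lines x1=2, x2=5, y1=2, y2=4 enclose exactly the blob. An exhaustive search like this is only practical for small images or coarse steps.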
Thanks to all who answered and commented on my question. With your help I got exactly the solution I needed. I am posting my final code and result for others.
img = im2bw(imread('deer.png'));
[L, num] = bwlabel(img, 4);
%%// Get biggest blob or object
count_pixels_per_obj = sum(bsxfun(@eq,L(:),1:num));
[~,ind] = max(count_pixels_per_obj);
biggest_blob = (L==ind);
%%// crop only deer
bound = regionprops(biggest_blob, 'BoundingBox');
%// Obtaining Bounding Box co-ordinates
bboxes = reshape([bound.BoundingBox], 4, []).';
%// Obtain this bounding box
%// Ensure all floating point is removed
finalBB = floor(bboxes);
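%// Clamp the start indices to 1, as floor can give 0 for a blob touching the top or left edge
out = biggest_blob(max(finalBB(2),1):finalBB(2)+finalBB(4),max(finalBB(1),1):finalBB(1)+finalBB(3));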
%%// Show images
figure;
imshow(out);
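For readers without MATLAB, the same "keep the biggest blob, then crop to its bounding box" idea can be sketched in plain Python. This is not a translation of the code above, just an illustration under my own assumptions: the binary image is a list of rows of 0/1 values, blobs are 4-connected (matching the 4 passed to bwlabel), and the function name largest_blob_crop is made up.

```python
from collections import deque


def largest_blob_crop(img):
    """Return the largest 4-connected blob of 1s, cropped to its bounding box."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] == 0 or seen[sy][sx]:
                continue
            # Flood fill (breadth-first) to collect one blob.
            blob, queue = [], deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):  # compare by pixel count, not box size
                best = blob
    if not best:
        return []
    ys = [y for y, _ in best]
    xs = [x for _, x in best]
    top, left = min(ys), min(xs)
    out = [[0] * (max(xs) - left + 1) for _ in range(max(ys) - top + 1)]
    for y, x in best:
        out[y - top][x - left] = 1
    return out


img = [[1, 0, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 0]]
print(largest_blob_crop(img))  # [[1, 1], [1, 0]]
```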
This is quite complicated to explain, so I will do my best, sorry if there is anything I missed out, let me know and I will rectify it.
My question is, I have been tasked to draw this shape,
(source: learnersdictionary.com)
This is to be done using C++ to write code that will calculate the points on this shape.
Important details.
User Input - Centre Point (X, Y), number of points to be shown, Font Size (influences radius)
Output - List of co-ordinates on the shape.
The overall aim once I have the points is to put them into a graph on Excel and it will hopefully draw it for me, at the user inputted size!
I know that the maximum Radius is 165mm and the minimum is 35mm. I have decided that my base Font Size shall be 20. I then did some thinking and came up with the equation.
Radius = (Chosen Font Size/20)*130. This is just an estimation; I realise it is probably not right, but I thought it could work at least as a template.
I then decided that I should create two different circles, with two different centre points, then link them together to create the shape. I thought that the INSIDE line will have to have a larger Radius and a centre point further along the X-Axis (Y staying constant), as then it could cut into the outside line.*
*(I know this is not what it looks like on the picture, just my chain of thought as it will still give the same shape)
So I defined 2nd Centre point as (X+4, Y). (Again, just estimation, thought it doesn't really matter how far apart they are).
I then decided Radius 2 = (Chosen Font Size/20)*165 (max radius)
So, I have my 2 Radii, and two centre points.
This is my code so far (it works, and everything is declared/inputted above)
for(int i=0; i<=n; i++) //output displayed to user
{
Xnew = -i*(Y+R1)/n; //calculate x coordinate
Ynew = pow((((Y+R1)*(Y+R1)) - (Xnew*Xnew)), 0.5); //calculate y coordinate
AND
for(int j=0; j<=n; j++)//calculation for angles and output displayed to user
{
Xnew2 = -j*(Y+R2)/((n)+((0.00001)*(n==0))); //calculate x coordinate
Ynew2 = Y*(pow(abs(1-(pow((Xnew2/X),2))),0.5));
if(abs(Ynew2) <= R1)
cout<<"\n("<<Xnew2<<", "<<Ynew2<<")"<<endl;
The problem I am having in drawing the crescent moon is that I cannot get the two circles to have the same starting point.
I have managed to get the results into Excel, and everything in that regard works. But when I plot the points on a graph in Excel, the two curves do not have the same starting points. It's essentially just two half circles, one smaller than the other (each stops at the Y axis, giving the half doughnut shape).
If this makes sense, I am trying to get two parts of circles to draw the shape such that they have the same start and end points.
If anyone has any suggestions on how to do this, that would be great; currently all I am getting is more of a 'half doughnut' shape, because the circles are not connected.
So, does anyone have any hints/tips/links they can share with me on how to fix this?
Thanks again. If there are any problems with the question, let me know and I will do my best to rectify them.
Cheers
Formula for points on a circle:
(x-h)^2+(y-k)^2=r^2
The center of the circle is at (h, k).
Solving for y:
y1,2 = k +/- sqrt(-x^2 + 2hx + r^2 - h^2)
So now if the inner circle has its center at (h, k), the half-moon will begin at h and will stretch to h - r2.
Now you need to solve the endpoint formula for the inner circle and the outer circle and plot it. Per x you should receive 4 points (solve the equation two times, each with two solutions).
I did not implement it, but this would be my train of thought...
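One way to get both arcs to share their start and end points is to find where the two circles cross and sample each arc between those two crossings. Below is a rough Python sketch of that idea (Python rather than the question's C++, simply to keep it short). It is my own take rather than the method described above, and it assumes the second centre sits at (cx + d, cy), directly to the right of the first. Note that the circles only cross when |r1 - r2| < d < r1 + r2, so with radii 130 and 165 an offset of 4 cannot work; the offset of 60 below is an arbitrary value chosen to satisfy that condition.

```python
import math


def crescent_points(cx, cy, r1, r2, d, n):
    """Return (outer, inner): two lists of n (x, y) points sharing end points.

    Circle 1 (outer edge) is centred at (cx, cy) with radius r1.
    Circle 2 (inner edge) is centred at (cx + d, cy) with radius r2.
    """
    if n < 2:
        raise ValueError("need at least 2 points per arc")
    if not abs(r1 - r2) < d < r1 + r2:
        raise ValueError("the circles must cross at two points")
    # The crossings are at (cx + a, cy + h) and (cx + a, cy - h).
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    h = math.sqrt(r1 * r1 - a * a)
    t1 = math.atan2(h, a)      # angle of the upper crossing seen from centre 1
    t2 = math.atan2(h, a - d)  # the same crossing seen from centre 2

    def arc(centre_x, radius, start):
        # Sweep counter-clockwise through the left side, from the upper
        # crossing (angle start) to the lower crossing (angle 2*pi - start).
        end = 2 * math.pi - start
        step = (end - start) / (n - 1)
        return [(centre_x + radius * math.cos(start + i * step),
                 cy + radius * math.sin(start + i * step)) for i in range(n)]

    return arc(cx, r1, t1), arc(cx + d, r2, t2)


outer, inner = crescent_points(0.0, 0.0, 130.0, 165.0, 60.0, 50)
print(outer[0], inner[0])    # same point, up to rounding
print(outer[-1], inner[-1])  # same point, up to rounding
```

Plotting outer followed by inner reversed as one series in Excel should then give a closed crescent outline.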
I was looking at a basic Box2D program, more specifically this one.
Everything is fairly simple and makes sense, except for this line:
Shape.SetAsBox((32.f/2)/SCALE, (32.f/2)/SCALE); // SCALE = 30
Now I know we divide by SCALE to scale 1m->30px but why is 32.f divided by 2? I don't understand why we divide by 2, if my box texture is 32x32 pixels.
From the manual:
groundBox.SetAsBox(50.0f, 10.0f);
The SetAsBox function takes the half-width and half-height (extents)
It is because the box is created around the center (0,0).
So,
x = (32.f/2)/SCALE;
y = (32.f/2)/SCALE;
SetAsBox(x,y);
will create a box with corners at (-x, -y), (-x, y), (x, -y), (x, y), so it will be of the expected size.
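To see the arithmetic, here is a tiny Python sketch (plain arithmetic, not Box2D itself; the helper names are made up) that converts a 32x32 pixel texture to half extents and back:

```python
SCALE = 30.0  # pixels per metre, as in the question


def box_half_extents(width_px, height_px, scale=SCALE):
    """Half-width and half-height in metres, the values SetAsBox expects."""
    return (width_px / 2) / scale, (height_px / 2) / scale


def box_corners(half_w, half_h):
    """Corners of a box centred on (0, 0) with the given half extents."""
    return [(-half_w, -half_h), (-half_w, half_h),
            (half_w, -half_h), (half_w, half_h)]


hx, hy = box_half_extents(32, 32)
corners = box_corners(hx, hy)
full_width_m = corners[3][0] - corners[0][0]  # x - (-x) = 2 * hx
print(full_width_m * SCALE)  # about 32 pixels, the texture width
```

Without the division by 2, the box would come out 64 pixels wide, twice the size of the texture.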
See section 2.2 of the manual: http://www.box2d.org/manual.html#_Toc258082968
The SetAsBox function takes the half-width and half-height (extents)
They consider the extent ("50 m in each direction") and not the width ("100 m wide"). Hence the factor of 2.