Let's say we have a triangle within an image. We zoom into the image, where the center of the zoom is where our cursor is.
The triangle needs to translate and scale along with the zoom of the image.
For example, in the original unzoomed image I have the points:
original image triangle: (212,162) , (172,162) , (192,122)
Then, after zooming in, we get the points:
2x zoom triangle: (231,173) , (151, 173) , (191,93)
Here is what I know. The x and y offsets from the original image to the new image are 97 and 76 respectively, and the image is scaled by a factor of 2. The actual image size (the number of pixels in x and y) remains the same.
I am able to correctly calculate the new point's location based on the original frame's points using
x = (og_x-ZoomOffsetX)*ZoomLevel + ZoomLevel/2;
y = (og_y-ZoomOffsetY)*ZoomLevel + ZoomLevel/2;
where og_x, og_y are x and y in the original frame, ZoomOffsetX and ZoomOffsetY are the offsets based on where we are zoomed in on the frame (relative to the original image), and ZoomLevel is the factor by which we are zoomed (relative to the original image), which ascends 2, 4, 8, ...
Then, the next set of points are
4x zoom triangle: (218,222), (58,222), (138, 62)
where the zoom is now at 4x from the original and the x and y offset are 158 and 107 respectively, relative to the original.
Then,
8x zoom triangle: (236,340), (-84,340), (76, 20)
where the zoom is now at 8x the original and the x and y offset are 183 and 120 respectively.
What do I need to know, or what parameters do I need, to compute the new (x,y) coordinates of the scaled and translated triangle based only on the immediately previous image? That is, I want the 8x zoom points computed from the 4x zoom points, rather than from the original image. I can't figure it out with the information that I have.
Note: I am actually not positive whether the offset is relative to the original image or the prior image. I am reading someone else's code and trying to understand it. ZoomLevel is definitely relative to the original image though.
Also, if it helps come up with a solution, this is all written in C++, and the zooming is done in a Qt widget, where the points are defined using QPointF from Qt.
These three links answered my question thoroughly.
https://medium.com/#benjamin.botto/zooming-at-the-mouse-coordinates-with-affine-transformations-86e7312fd50b
Zoom in on a fixed point using matrices
Zoom in on a point (using scale and translate)
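For reference, the formula from the question can be rearranged so that a point is mapped from the previous zoomed frame straight to the next one. This is only a sketch: it assumes the offsets are relative to the original image (the numbers posted in the question are consistent with that), and the names ZoomState, Point, fromOriginal and fromPrevious are made up for illustration. Substituting og_x = (x_prev - prevLevel/2)/prevLevel + prevOffsetX into the forward formula makes the half-pixel terms cancel, leaving x_next = (nextLevel/prevLevel)*x_prev + (prevOffsetX - nextOffsetX)*nextLevel.

```cpp
// Hypothetical container for one zoom state; the names are illustrative only.
struct ZoomState {
    double level;    // zoom factor relative to the original image (2, 4, 8, ...)
    double offsetX;  // x offset, assumed relative to the original image
    double offsetY;  // y offset, assumed relative to the original image
};

struct Point {
    double x;
    double y;
};

// Forward mapping from the question: original frame -> zoomed frame.
Point fromOriginal(Point og, const ZoomState& z) {
    return { (og.x - z.offsetX) * z.level + z.level / 2.0,
             (og.y - z.offsetY) * z.level + z.level / 2.0 };
}

// Map a point from the previous zoomed frame directly to the next zoomed frame.
// Needs both zoom levels and both offsets; the half-pixel terms cancel out.
Point fromPrevious(Point p, const ZoomState& prev, const ZoomState& next) {
    const double ratio = next.level / prev.level;
    return { ratio * p.x + (prev.offsetX - next.offsetX) * next.level,
             ratio * p.y + (prev.offsetY - next.offsetY) * next.level };
}
```

With the question's data, mapping the 4x point (218,222) using prev = {4, 158, 107} and next = {8, 183, 120} gives (236,340), which matches the 8x triangle. With QPointF the same arithmetic applies to the point's x() and y() values.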
I have a recorded camera ROS bag file from a RealSense camera. The camera intrinsics for the recorded settings are already known. The initial resolution of the image is 848*480. Because of a visual obstruction in the camera's FOV, I would like to crop out the top of the image so that it doesn't get detected by the visual SLAM algorithm I am using.
Since SLAM is heavily dependent on the camera intrinsics, I would like to know how the camera parameters f_x, f_y, c_x and c_y will change for:
Cropped Image
Resized Image (Image Scaling only)
There is no skew involved in the original camera parameters.
Will the new principal point c_x also change as Cropped_image_width?
I am a bit confused about how to calculate the new camera parameters. Am I correct in assuming the following for Case 1, the cropped case:
Cropping:
cx and cy are reduced by the number of pixels cropped off the left and top edges, respectively. Cropping off the right or bottom edges has no effect.
Scaling:
fx,fy are multiplied by the scaling factor
cx,cy are multiplied by the scaling factor
Remember, the principal point need not be in the center of the image.
Assuming top left image origin. Off-by-one/half (pixel) errors to be checked carefully against the specific scaling algorithm used, or just ignored.
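A minimal sketch of these two rules, assuming a top-left image origin and ignoring the half-pixel subtleties mentioned above. The Intrinsics struct and the function names are hypothetical, not part of any library.

```cpp
// Hypothetical pinhole intrinsics without skew.
struct Intrinsics {
    double fx;
    double fy;
    double cx;
    double cy;
};

// Cropping: only pixels removed from the left/top shift the principal point.
// Pixels removed from the right/bottom do not appear here at all.
Intrinsics cropIntrinsics(Intrinsics k, double croppedLeft, double croppedTop) {
    k.cx -= croppedLeft;
    k.cy -= croppedTop;
    return k;
}

// Scaling: focal lengths and principal point are multiplied by the scale factors.
Intrinsics scaleIntrinsics(Intrinsics k, double scaleX, double scaleY) {
    k.fx *= scaleX;
    k.cx *= scaleX;
    k.fy *= scaleY;
    k.cy *= scaleY;
    return k;
}
```

For example, with made-up intrinsics {fx=600, fy=600, cx=424, cy=240}, cropping 100 rows off the top leaves cx at 424 and moves cy to 140; fx and fy are unchanged.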
For the case no. 1, cx is the same as the original image, but cy changes as newHeight/2.
For the case no. 2, f changes as f x scale.
I'm currently writing a C++ raytracer and I've run into a classic raytracing problem. With a high vertical FOV, shapes become more distorted the closer they are to the edges.
I know why this distortion happens, but I don't know how to resolve it (of course, reducing the FOV is an option, but I think there is something to change in my code). I've been browsing different computing forums but didn't find any way to resolve it.
Here's a screenshot to illustrate my problem.
I think that the problem is that the view plane where I'm projecting my rays isn't actually flat, but I don't know how to resolve this. If you have any tip to resolve it, I'm open to suggestions.
I'm on a right-handed oriented system.
The Camera system vectors, Direction vector and Light vector are normalized.
If you need some code to check something, I'll put it in an answer with the part you ask.
Ray generation code:
// PixelScreenX = (pixelx + 0.5) / imageWidth
// PixelCameraX = (2 ∗ PixelScreenx − 1) ∗
// ImageAspectRatio ∗ tan(fov / 2)
float x = (2 * (i + 0.5f) / (float)options.width - 1) *
options.imageAspectRatio * options.scale;
// PixelScreeny = (pixely + 0.5) / imageHeight
// PixelCameraY = (1 − 2 ∗ PixelScreeny) ∗ tan(fov / 2)
float y = (1 - 2 * (j + 0.5f) / (float)options.height) * options.scale;
Vec3f dir;
options.cameraToWorld.multDirMatrix(Vec3f(x, y, -1), dir);
dir.normalize();
newColor = _renderer->castRay(options.orig, dir, objects, options);
There is nothing wrong with your projection. It produces exactly what it should produce.
Let's consider the following figure to see how all the quantities interact:
We have the camera position, the field of view (as an angle) and the image plane. The image plane is the plane that you are projecting your 3D scene onto. Essentially, this represents your screen. When you are viewing your rendering on the screen, your eye serves as the camera. It sees the projected image and if it is positioned at the right point, it will see exactly what it would see if the actual 3D scene was there (neglecting effects like depth of field etc.)
Obviously, you cannot modify your screen (you could change the window size but let's stick with a constant-size image plane). Then, there is a direct relationship between the camera's position and the field of view. As the field of view increases, the camera moves closer and closer to the image plane. Like this:
Thus, if you are increasing your field of view in code, you need to move your eye closer to the screen to get the correct perception. You can actually try that with your image. Move your eye very close to the screen (I'm talking about 3cm). If you look at the outer spheres now, they actually look like real balls again.
In summary, the field of view should approximately match the geometry of the viewing setup. For a given screen size and average viewing distance, this can be calculated easily. If your viewing setup does not match your assumptions in code, 3D perception will break down.
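As a rough sketch of that calculation: if the rendered image is h units tall on screen and the eye is d units in front of its center, the matching vertical field of view is 2*atan(h / (2*d)). The function names below are hypothetical, and the second helper assumes that options.scale in the question's code holds tan(fov / 2), as its comments suggest.

```cpp
#include <cmath>

// Vertical field of view (in radians) that matches a viewing setup.
// screenHeight and viewingDistance must use the same unit.
double matchedVerticalFov(double screenHeight, double viewingDistance) {
    return 2.0 * std::atan(screenHeight / (2.0 * viewingDistance));
}

// tan(fov / 2) for the same setup, which simplifies to h / (2 * d).
// Assumption: this is the quantity stored in options.scale in the question.
double scaleForViewingSetup(double screenHeight, double viewingDistance) {
    return screenHeight / (2.0 * viewingDistance);
}
```

For instance, an image 30 cm tall viewed from 60 cm gives tan(fov / 2) = 0.25, a vertical FOV of roughly 28 degrees, which is far narrower than the wide FOV that produces the stretched spheres.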
Assume that I took two panoramic images with a vertical offset of H, and each image is presented in equirectangular projection with size Xm by Ym. To do this, I placed my panoramic camera at a position, say A, and took an image, then moved the camera H metres up and took another image.
I know that a point in image 1 with coordinates (X1, Y1) is the same point in image 2 with coordinates (X2, Y2) (assuming that X1 = X2, as we have only a vertical offset).
My question is how I can calculate the range from point A (where the camera was when image 1 was taken) to the selected point, whose position is known as (X1, Y1) in image 1 and (X2, Y2) in image 2.
Yes, you can do it - hold on!!!
Key thing y = focal length of your lens - now I can do it!!!
So, I think your question can be re-stated more simply by saying that if you move your camera (on the right in the diagram) up H metres, a point moves down p pixels in the image taken from the new location.
Like this if you imagine looking from the side, across you taking the picture.
If you know the micron spacing of the camera's CCD from its specification, you can convert p from pixels to metres to match the units of H.
Your range from the camera to the plane of the scene is given by x + y (both in red at the bottom), and
x=H/tan(alpha)
y=p/tan(alpha)
so your range is
R = x + y = H/tan(alpha) + p/tan(alpha)
and
alpha = tan inverse(p/y)
where y is the focal length of your lens. As y is likely to be something like 50mm, it is negligible, so, to a pretty reasonable approximation, your range is
H/tan(alpha)
and
alpha = tan inverse(p in metres/focal length)
Or, by similar triangles
Range = (H x focal length of lens) / ((Y2-Y1) x CCD photosite spacing)
being very careful to put everything in metres.
Here is a shot in the dark. Given my understanding of the problem at hand, you want to do something similar to computer stereo vision, so I point you to http://en.wikipedia.org/wiki/Computer_stereo_vision to start. I'm not sure this is possible in the manner you are suggesting; it sounds like you may need some more physical constraints, but I do remember being able to correlate two 2D points in images after a strict translation. Think:
lambda[x,y,1]^t = W[r1, tx; r2, ty; r3, tz][x; y; z; 1]^t
Where lambda is a scale factor, W is a 3x3 matrix covering the intrinsic parameters of your camera, r1, r2, and r3 are row vectors that make up the 3x3 rotation matrix (in your case you can assume the identity matrix since you have only applied a translation), and tx, ty, tz are your translation components.
Since you are looking at two 2d points at the same 3d point [x,y,z] this 3d point is shared by both 2d points. I cannot say if you can rationalize the actual x,y, and z values particularly for your depth calculation but this is where I would start.
I have an app that finds an object in a frame and uses warpPerspective to correct the image to be square. In the course of doing so you specify an output image size. However, I want to know how to do so without harming its apparent size. How can I unwarp the 4-corners of the image without changing the size of the image? I don't need the image itself, I just want to measure its height and width in pixels within the original image.
Get a transform matrix that will square up the corners.
std::vector<cv::Point2f> transformedPoints;
cv::Mat M = cv::getPerspectiveTransform(points, objectCorners);
cv::perspectiveTransform(points, transformedPoints, M);
This will square up the image, but in terms of the objectCorners coordinate system, which runs from -0.5f to 0.5f, not the original image plane.
BoundingRect almost does what I want.
cv::Rect boundingRectangle = cv::boundingRect(points);
But as the documentation states
The function calculates and returns the minimal up-right bounding rectangle for the specified point set.
And what I want is the bounding rectangle after it has been squared-up, not without squaring it up.
As I understand your post, here is something that should help you.
OpenCV perspective transform example.
Update, in case that still doesn't help you find the height and width within the image:
Minimum bounding rect of the points
cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));
As the minAreaRect reference on OpenCV's website states
Finds a rotated rectangle of the minimum area enclosing the input 2D point set.
You can call box.size and get the width and height.
I have an image (let's say it's a simple rectangle) positioned on the left of my screen, which I can move up and down. When moving it upwards, I use some simple trigonometry to rotate it so that the rectangle "points" towards the upper right corner of the screen. When moving downwards, it points towards the lower left corner of the screen.
Given that my application uses the following coordinate system:
I use the following code to achieve the rotation:
// moving upwards
rotation = -atan2(position.y , res.x - position.x);
// moving downwards
rotation = atan2(res.y - position.y , res.x - position.x);
where res is the reference point and position is the position (upper left corner) of our rectangle image. (For information on atan2(): atan2() on cplusplus.com).
This works just fine: it rotates more when farther away from the reference point (res). However, let's say the image is all the way at the bottom of the screen. If we move it upwards, it will very suddenly rotate. I would like to 'inbetween' this rotation, so that it is smoothed out.
What I mean by suddenly rotating is this:
Let's say the rectangle is not moving in frame n: therefore its rotation is 0 degrees. I then press the up arrow, which makes it calculate the angle. In frame n+1, the angle is 30 degrees (for example). This is of course not very smooth.
Is my question clear? How do I go about this?
You can incrementally change the angle on each frame. For a very "smooth" rotation effect, you can use
target_angle = ...
current_angle += (target_angle - current_angle) * smoothing_factor
where smoothing_factor gives the rate at which current_angle should converge to target_angle. For example, a value of 1 would be instantaneous, a value of 0.1 would probably give a smooth effect.
By doing this you may encounter the wraparound issue whereby something like going from 10 degrees to 350 degrees would go the wrong way. In such a case, use
target_angle = ...
current_angle += diff(target_angle, current_angle) * smoothing_factor
where
diff(a, b) {
return atan2(sin(a - b), cos(a - b))
}
This nice angle difference formula is taken from another question.
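Since the question uses C++, here is a minimal sketch of the same idea. The names angleDiff and smoothAngle are hypothetical, angles are assumed to be in radians (which is what atan2 returns), and smoothAngle is meant to be called once per frame.

```cpp
#include <cmath>

// Signed shortest difference a - b, wrapped into [-pi, pi].
double angleDiff(double a, double b) {
    return std::atan2(std::sin(a - b), std::cos(a - b));
}

// Move current a fraction of the way towards target.
// smoothingFactor = 1 snaps immediately; something like 0.1 converges gradually.
double smoothAngle(double current, double target, double smoothingFactor) {
    return current + angleDiff(target, current) * smoothingFactor;
}
```

In the render loop this would look like rotation = smoothAngle(rotation, targetRotation, 0.1), where targetRotation is the value computed with atan2 as in the question. Going from 10 degrees towards 350 degrees then turns 20 degrees backwards instead of 340 degrees forwards.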