Why is iplRotate() not giving me correct results? - c++

sigh I'm sorry to say that I'm using Intel IPL (Image Processing Library) in some image processing code I'm working on. This is the tale of my struggle with getting my images to rotate correctly.
I have a source image. It has a size (w, h) which is not necessarily square.
It is going to be rotated by angle theta.
I've calculated the output size required to fit an image of size (w, h) rotated by angle theta. This size is (dw, dh). I've allocated a destination buffer with that size.
I want to rotate the source image by angle theta about the source image's center (w/2, h/2) and have that rotated image be centered in my destination buffer.
iplRotate() takes 2 shift parameters, xShift and yShift, which indicate the distance the image should be shifted along the x and y axes after the rotation is performed.
The problem is I cannot get iplRotate to center the rotated image in the destination image. It's always off center.
My best guess for what xShift and yShift should be is the following:
xShift = dw - w
yShift = dh - h
But this doesn't work, and I'm not sure what else to do to calculate xShift and yShift. Does anyone have any suggestions for how to use iplRotate to do what I want?
One last bit of info:
I've attempted to use iplGetRotateShift() to calculate xShift and yShift, again, to no avail. I would imagine that this would work:
iplGetRotateShift(dw / 2.0, dh / 2.0, theta, &xShift, &yShift);
But it does not.
Edit:
I rewrote the code using Intel IPP 6.0 instead of IPL and I'm seeing identical wrong results. I can't imagine that Intel got rotation wrong in 2 different libraries, so I must be doing something wrong.
Edit:
I tried the following (IPP) code that Dani van der Meer suggested:
xShift = (dw - w) / 2.0;
yShift = (dh - h) / 2.0;
ippiAddRotateShift(w / 2.0, h / 2.0, angle, &xShift, &yShift);
Unfortunately, still no luck. That does not work either.

When using iplGetRotateShift you need to specify the center of rotation in the source image. This works well if the sizes of the source and destination images are the same.
In your case you want an extra shift to center the image in your destination image:
xShift = (dw - w) / 2.0;
yShift = (dh - h) / 2.0;
To combine the two shifts you need to use ippiAddRotateShift instead of ippiGetRotateShift.
Note: These functions refer to the IPP library version 5.3 (the version I have). I am not sure that AddRotateShift is available in IPL. But you mentioned in the question that you tried the same using IPP, so hopefully you can use IPP instead of IPL.
You get something like this:
xShift = (dw - w) / 2.0;
yShift = (dh - h) / 2.0;
ippiAddRotateShift(w / 2.0, h / 2.0, angle, &xShift, &yShift);
If you use these shifts in the call to ippiRotate the image should be centered in the destination image.
I hope this helps.
EDIT:
Here is the code I used to test (the changes from w to dw and h to dh, and the rotation angle, are just random):
//Ipp8u* dst_p; Initialized somewhere else in the code
//Ipp8u* src_p; Initialized somewhere else in the code
int w = 1032;
int h = 778;
int dw = w - 40; // -40 is just a random change
int dh = h + 200; // 200 is just a random change
int src_step = w * 3;
int dst_step = dw * 3;
IppiSize src_size = { w, h };
IppiRect src_roi = { 0, 0, w, h };
IppiRect dst_rect = { 0, 0, dw, dh };
double xShift = ((double)dw - (double)w) / 2.0;
double yShift = ((double)dh - (double)h) / 2.0;
ippiAddRotateShift((double)w / 2, (double)h / 2, 37.0, &xShift, &yShift);
ippiRotate_8u_C3R(src_p, src_size, src_step, src_roi,
                  dst_p, dst_step, dst_rect, 37.0, xShift, yShift, IPPI_INTER_NN);

I've never used (or heard of) IPL before, so I'm merely guessing what the API does. But if iplRotate rotates about (0, 0), and if iplGetRotateShift does similarly, why not try rotating the other 3 corners of your original "box" (ignoring (0, 0), since that stays put): (w, 0), (0, h), and (w, h).
Your result will be a new box with some negative values. You want to "shift back" so whatever negative values you have will become zero, if you get what I mean.
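In case it helps, here is a minimal sketch of that idea in plain C++ (no IPL calls; the helper name is made up for illustration): rotate the three remaining corners about (0, 0), find the most negative x and y, and negate them to get the shifts.
#include <algorithm>
#include <cmath>

// Hypothetical helper (not an IPL function): given a (w x h) image rotated
// by theta (radians) about (0, 0), compute the shift that pulls the result
// back into non-negative coordinates.
void shiftForRotationAboutOrigin(double w, double h, double theta,
                                 double& xShift, double& yShift)
{
    const double c = std::cos(theta), s = std::sin(theta);
    // Rotated corners (w,0), (0,h), (w,h); (0,0) stays put.
    const double xs[3] = { w * c, -h * s, w * c - h * s };
    const double ys[3] = { w * s,  h * c, w * s + h * c };
    const double minX = std::min({ 0.0, xs[0], xs[1], xs[2] });
    const double minY = std::min({ 0.0, ys[0], ys[1], ys[2] });
    // Shift back so the most negative coordinate becomes zero.
    xShift = -minX;
    yShift = -minY;
}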

Based on my reading of the documentation I think you're using iplGetRotateShift wrong. In particular, I think you need to specify the center of rotation in the source image, not the destination image, thus:
iplGetRotateShift( w / 2.0, h / 2.0, angle, &xShift, &yShift );

If it's still not working for you, can we confirm that the assumptions are valid? In particular point 3, where you calculate dw and dh.
Mathematically
dw = w * |cos(theta)| + h * |sin(theta)|
dh = h * |cos(theta)| + w * |sin(theta)|
so if theta = pi/6, say, then with w = 100 and h = 150 we would presumably have dw = 162?
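As a quick numerical check of that formula (a standalone sketch, nothing IPL-specific):
#include <cmath>
#include <cstdio>

int main()
{
    const double w = 100.0, h = 150.0;
    const double pi = 3.14159265358979323846;
    const double theta = pi / 6.0; // 30 degrees

    // Bounding box of the rotated image.
    const double dw = w * std::fabs(std::cos(theta)) + h * std::fabs(std::sin(theta));
    const double dh = h * std::fabs(std::cos(theta)) + w * std::fabs(std::sin(theta));

    std::printf("dw = %.1f, dh = %.1f\n", dw, dh); // dw = 161.6, dh = 179.9
    return 0;
}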
Does the incorrect position you're getting vary with theta? Presumably it works with theta=0? What about theta=pi/2 and theta=pi/4?

Related

Converting Cartesian image to polar, appearance differences

I'm trying to do a polar transform on the first image below and end up with the second. However, my result is the third image. I have a feeling it has to do with what location I choose as my "origin", but am unsure.
radius = sqrt(width**2 + height**2)
nheight = int(ceil(radius)/2)
nwidth = int(ceil(radius/2))
for y in range(0, height):
    for x in range(0, width):
        t = int(atan(y/x))
        r = int(sqrt(x**2+y**2)/2)
        color = getColor(getPixel(pic, x, y))
        setColor( getPixel(radial,r,t), color)
There are a few differences / errors:
They use the centre of the image as the origin
They scale the axes appropriately. In your example, you're plotting your angle (between 0 and, in your case, pi) directly, instead of utilising the full height of the image.
You're using the wrong atan function (atan2 works a lot better in this situation :))
Not amazingly important, but you're rounding unnecessarily quite a lot, which throws off accuracy a little and can slow things down.
This is the code combining my suggested improvements. It's not massively efficient, but it should hopefully work :)
maxradius = sqrt(width**2 + height**2)/2
rscale = width / maxradius
tscale = height / (2*math.pi)
for y in range(0, height):
    dy = y - height/2
    for x in range(0, width):
        dx = x - width/2
        t = atan2(dy,dx)%(2*math.pi)
        r = sqrt(dx**2+dy**2)
        color = getColor(getPixel(pic, x, y))
        setColor( getPixel(radial,int(r*rscale),int(t*tscale)), color)
In particular, it fixes the above problems in the following ways:
We use dx = x - width / 2 as a measure of distance from the centre, and similarly with dy. We then use these in place of x, y throughout the computation.
We will have our r satisfying 0 <= r <= sqrt( (width/2)^2 + (height/2)^2 ), and our t eventually satisfying 0 <= t < 2*pi, so I create the appropriate scale factors to put r and t along the x and y axes respectively.
Normal atan can only distinguish based on gradients, and is computationally unstable near vertical lines... Instead, atan2 (see http://en.wikipedia.org/wiki/Atan2) solves both problems, and accepts (y,x) pairs to give an angle. atan2 returns an angle -pi < t <= pi, so we take the result modulo 2 * math.pi to get it into the range 0 <= t < 2*pi, ready for scaling.
I've only rounded at the end, when the new pixels get set.
Any questions, just ask!

Determining coordinates for mandelbrot zoom

I've got a Mandelbrot set that I want to zoom in on. The Mandelbrot is calculated from a center coordinate, a size, and a zoom level. The original Mandelbrot is centered around
real=-0.6 and im=0.4 with a size of 2 in both real and im.
I want to be able to click on a point in the image and calculate a new one, zoomed in around that point.
The window containing it is 800x800px, so I figured this would make a click in the lower right corner equal to a center of real=0.4 and im=-0.6, and a click in the upper left corner real=-1.6 and im=1.4.
I calculated it with:
for the real values
800a+b=0.4 => a=0.0025
0a+b=-1.6 => b=-1.6
for imaginary values
800c+d=-0.6 => c=-0.0025
0c+d=1.4 => d=1.4
However, this does not work if I continue with a Mandelbrot size of 2 and a zoom level of 2. Am I missing something about how the coordinates relate to the zoom level?
I had similar problems zooming in my C# Mandelbrot. My solution was to calculate the difference from the click position to the center as a percentage, multiply this by the maximum extent in coordinates (width / zoom * 0.5, with width = height and zoom = n * 100) from the center, and add this to your current value. So my code was this (assuming I get sx and sy as parameters from the click):
double[] o = new double[2];
double digressLRUD = width / zoom * 0.5; //max way up or down from the center in coordinates
double shiftCenterCursor_X = sx - width/2.0; //shift of cursor to center
double shiftCenterCursor_X_percentage = shiftCenterCursor_X / (width/2.0); //shift in percentage
o[0] = x + digressLRUD * shiftCenterCursor_X_percentage; //new position
double shiftCenterCursor_Y = sy - width/2.0;
double shiftCenterCursor_Y_percentage = shiftCenterCursor_Y / (width/2.0);
o[1] = y - digressLRUD * shiftCenterCursor_Y_percentage;
This works, but you'll have to update the zoom (I usually multiply it by 2).
Another point is to move the selected center to the center of the image. I did this using some calculations:
double maxRe = width / zoom;
double centerRe = reC - maxRe * 0.5;
double maxIm = height / zoom;
double centerIm = -imC - maxIm * 0.5;
This will give you the coordinates you have to pass to your algorithm so it'll render the selected place.

Ray Tracing: Sphere distortion due to Camera Movement

I am building a ray tracer from scratch. My question is:
When I change the camera coordinates, the sphere changes to an ellipse. I don't understand why this is happening.
Here are some images to show the artifacts:
Sphere: 1 1 -1 1.0 (Center, radius)
Camera: 0 0 5 0 0 0 0 1 0 45.0 1.0 (eyepos, lookat, up, foy, aspect)
But when I change the camera coordinates, the sphere looks distorted, as shown below:
Camera: -2 -2 2 0 0 0 0 1 0 45.0 1.0
I don't understand what is wrong. If someone can help that would be great!
I set my imagePlane as follows:
//Computing u,v,w axes coordinates of Camera as follows:
{
    Vector a = Normalize(eye - lookat); //Camera_eye - Camera_lookAt
    Vector b = up; //Camera Up Vector
    m_w = a;
    m_u = b.cross(m_w);
    m_u.normalize();
    m_v = m_w.cross(m_u);
}
After that I compute directions for each pixel from the Camera position (eye) as mentioned below:
//Then Computing direction as follows:
int half_w = m_width * 0.5;
int half_h = m_height * 0.5;
double half_fy = fovy() * 0.5;
double angle = tan( ( M_PI * half_fy) / (double)180.0 );
for(int k=0; k<pixels.size(); k++){
    double j = pixels[k].x(); //width
    double i = pixels[k].y(); //height
    double XX = aspect() * angle * ( (j - half_w ) / (double)half_w );
    double YY = angle * ( (half_h - i ) / (double)half_h );
    Vector dir = (m_u * XX + m_v * YY) - m_w ;
    directions.push_back(dir);
}
After that:
for each dir:
Ray ray(eye, dir);
int depth = 0;
t_color += Trace(g_primitive, ray, depth);
After playing around a lot, and with the help of the comments from all of you, I was able to get my ray tracer working properly. Sorry for answering late, but I would like to close this thread with a few remarks.
So, the above-mentioned code is perfectly correct. Based on my own assumptions (as mentioned in the comments above), I decided to set my camera parameters like that.
The problem I mentioned above is normal behaviour of the camera (as also mentioned in the comments).
I have got good results now, but there are a few things to check while coding a ray tracer:
1) Always make sure to take care of the radians-to-degrees (or vice versa) conversion while computing the FOV and ASPECT RATIO. I did it as follows:
double angle = tan((M_PI * 0.5 * fovy) / 180.0);
double y = angle;
double x = aspect * angle;
2) While computing triangle intersections, make sure to implement the cross product properly.
3) When intersecting different objects, make sure to find the intersection that is at the minimum distance from the camera.
Here's the result I got:
Above is a very simple model (courtesy of UC Berkeley), which I ray traced.
This is the correct behavior. Get a camera with a wide angle lens, put the sphere near the edge of the field of view and take a picture. Then in a photo app draw a circle on top of the photo of the sphere and you will see that it's not a circular projection.
This effect will be magnified by the fact that you set aspect to 1.0 but your image is not square.
A few things to fix:
A direction vector is (to - from). You have (from - to), so a is pointing backward. You'll want to add m_w at the end, rather than subtract it. Also, this fix will rotate your m_u, m_v by 180 degrees, which means you'll also need to change (j - half_w) to (half_w - j); see the sketch after this list.
Also, putting all the pixels and all the directions in lists is not as efficient as just looping over x,y values.
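To make the suggested changes concrete, here is a small self-contained sketch of the direction setup with that fix applied (the Vec3 type below is only a stand-in for the question's Vector class, and the camera/pixel values are arbitrary):
#include <cmath>
#include <cstdio>

// Minimal stand-in for the question's Vector type, just for this sketch.
struct Vec3 {
    double x, y, z;
    Vec3 operator+(const Vec3& o) const { return { x + o.x, y + o.y, z + o.z }; }
    Vec3 operator-(const Vec3& o) const { return { x - o.x, y - o.y, z - o.z }; }
    Vec3 operator*(double s) const { return { x * s, y * s, z * s }; }
    Vec3 cross(const Vec3& o) const {
        return { y * o.z - z * o.y, z * o.x - x * o.z, x * o.y - y * o.x };
    }
    Vec3 normalized() const {
        double len = std::sqrt(x * x + y * y + z * z);
        return { x / len, y / len, z / len };
    }
};

int main()
{
    const Vec3 eye = { 0, 0, 5 }, lookat = { 0, 0, 0 }, up = { 0, 1, 0 };
    const int width = 640, height = 480;
    const double fovy = 45.0, aspect = double(width) / height;
    const double pi = 3.14159265358979323846;

    // Forward-pointing basis: m_w goes from the eye toward the look-at point
    // ("to - from"), so the per-pixel direction ends with "+ m_w".
    const Vec3 m_w = (lookat - eye).normalized();
    const Vec3 m_u = up.cross(m_w).normalized();
    const Vec3 m_v = m_w.cross(m_u);

    const double half_w = width * 0.5, half_h = height * 0.5;
    const double angle = std::tan(pi * (fovy * 0.5) / 180.0);

    // One example pixel; note (half_w - j), which compensates for the flipped m_u.
    const double j = 400, i = 100;
    const double XX = aspect * angle * ((half_w - j) / half_w);
    const double YY = angle * ((half_h - i) / half_h);
    const Vec3 dir = (m_u * XX + m_v * YY + m_w).normalized();

    std::printf("dir = (%f, %f, %f)\n", dir.x, dir.y, dir.z);
    return 0;
}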

How to obtain the scale and rotation angle from LogPolar transform

I'm trying to use the LogPolar transform to obtain the scale and the rotation angle from two images. Below are two 300x300 sample images. The first rectangle is 100x100, and the second rectangle is 150x150, rotated by 45 degrees.
The algorithm:
Convert both images to LogPolar.
Find the translational shift using Phase Correlation.
Convert the translational shift to scale and rotation angle (how to do this?).
My code:
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgproc/imgproc_c.h>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
int main()
{
cv::Mat a = cv::imread("rect1.png", 0);
cv::Mat b = cv::imread("rect2.png", 0);
if (a.empty() || b.empty())
return -1;
cv::imshow("a", a);
cv::imshow("b", b);
cv::Mat pa = cv::Mat::zeros(a.size(), CV_8UC1);
cv::Mat pb = cv::Mat::zeros(b.size(), CV_8UC1);
IplImage ipl_a = a, ipl_pa = pa;
IplImage ipl_b = b, ipl_pb = pb;
cvLogPolar(&ipl_a, &ipl_pa, cvPoint2D32f(a.cols >> 1, a.rows >> 1), 40);
cvLogPolar(&ipl_b, &ipl_pb, cvPoint2D32f(b.cols >> 1, b.rows >> 1), 40);
cv::imshow("logpolar a", pa);
cv::imshow("logpolar b", pb);
cv::Mat pa_64f, pb_64f;
pa.convertTo(pa_64f, CV_64F);
pb.convertTo(pb_64f, CV_64F);
cv::Point2d pt = cv::phaseCorrelate(pa_64f, pb_64f);
std::cout << "Shift = " << pt
<< "Rotation = " << cv::format("%.2f", pt.y*180/(a.cols >> 1))
<< std::endl;
cv::waitKey(0);
return 0;
}
The log polar images:
For the sample images above, the translational shift is (16.2986, 36.9105). I have successfully obtained the rotation angle, which is 44.29. But I have difficulty calculating the scale. How do I convert the given translational shift to obtain the scale?
You have two images f1, f2 with f1(m, n) = f2(m/a, n/a), that is, f1 is scaled by a factor a.
In logarithmic coordinates that is equivalent to f1(log m, log n) = f2(log m − log a, log n − log a), where log a is the shift in your phase-correlated image.
Compare B. S. Reddy, B. N. Chatterji: "An FFT-Based Technique for Translation, Rotation and Scale-Invariant Image Registration", IEEE Transactions on Image Processing, Vol. 5, No. 8, 1996.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.185.4387&rep=rep1&type=pdf
Here is a Python version, which does the following:
ir = abs(ifft2((f0 * f1.conjugate()) / r0))
i0, i1 = numpy.unravel_index(numpy.argmax(ir), ir.shape)
angle = 180.0 * i0 / ir.shape[0]
scale = log_base ** i1
The scale factor is indeed obtained by exponentiating the shift. However, since you used a "magnitude scale parameter" of 40 for the cvLogPolar function, you now need to divide pt.x by 40 to get the correct value for the displacement:
Scale = exp( pt.x / 40) = exp(16.2986 / 40) = 1.503
The value of the "magnitude scale parameter" for the cvLogPolar function does not affect the displacement produced by the rotation angle pt.x, because according to the math, it cancels out. For that reason, your formula for the rotation gives the correct value.
On another note, I believe the formula for the rotation should actually be:
Rotation = pt.y*360/(a.cols)
But, for some strange reason, the ">> 1" that you added is causing the result to be multiplied by 2 (which I believe you compensated for by multiplying by 180 instead of 360?) Remove it, and you'll see what I mean.
Also, ">>1" is causing a division by 2 in:
cvPoint2D32f(a.cols >> 1, a.rows >> 1)
If you set the center parameter of the cvLogPolar function to the center of the image (which is what you want):
cvPoint2D32f(a.cols/2, a.rows/2)
and
cvPoint2D32f(b.cols/2, b.rows/2)
then, you'll also get the correct value for the rotation (i.e. the same value that you got), and for the scale.
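In code, with the numbers from the question (M = 40 passed to cvLogPolar, square 300x300 images), that works out as follows (just the arithmetic from this answer, not the full pipeline):
#include <cmath>
#include <cstdio>

int main()
{
    // Phase-correlation result from the question.
    const double shift_x = 16.2986, shift_y = 36.9105;

    const double M = 40.0;   // "magnitude scale parameter" passed to cvLogPolar
    const int cols = 300;    // image width (square 300x300 images)

    const double scale = std::exp(shift_x / M);        // ~1.503
    const double rotation = shift_y * 360.0 / cols;    // ~44.29 degrees

    std::printf("scale = %.3f, rotation = %.2f deg\n", scale, rotation);
    return 0;
}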
This thread was helpful in getting me started on rotation-invariant phase correlation, so I hope my input will help resolve any lingering issues.
We aim to calculate the scale and rotation (which is incorrectly calculated in the code). Let's start by gathering the equations from the logPolar docs. There they state the following:
(1) I = (dx,dy) = (x-center.x, y-center.y)
(2) rho = M * ln(magnitude(I))
(3) phi = Ky * angle(I)_0..360
Note: rho is pt.x and phi is pt.y in the code above
We also know that
(4) M = src.cols/ln(maxRadius)
(5) Ky = src.rows/360
First, let's solve for scale. Solving for magnitude(I) (i.e. scale) in equation 2, we get
(6) magnitude(I) = scale = exp(rho/M)
Then we substitute for M and simplify to get
(7) magnitude(I) = scale = exp(rho*ln(maxRadius)/src.cols) = pow(maxRadius, rho/src.cols)
Now let's solve for rotation. Solving for angle(I) (i.e. rotation) in equation 3, we get
(8) angle(I) = rotation = phi/Ky
Then we substitute for Ky and simplify to get
(9) angle(I) = rotation = phi*360/src.rows
So, scale and rotation can be calculated using equations 7 and 9, respectively. It might be worth noting that you should use equation 4 for calculating M, and Point2f center( (float)a.cols/2, (float)a.rows/2 ) for calculating the center, as opposed to what is in the code above. There are good bits of info in this logPolar example OpenCV code.
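Putting equations 4, 7 and 9 into code gives something like this (a sketch; the maxRadius value below is only a placeholder and should be whatever radius your log-polar transform was actually set up with):
#include <cmath>
#include <cstdio>

int main()
{
    // Phase-correlation shift (rho, phi) and source image size.
    const double rho = 16.2986, phi = 36.9105;
    const double src_cols = 300.0, src_rows = 300.0;

    // Placeholder: use the radius your log-polar transform was configured with.
    const double maxRadius = 150.0;

    const double M        = src_cols / std::log(maxRadius);   // equation 4
    const double scale    = std::exp(rho / M);                 // equation 7
    // Equivalently: std::pow(maxRadius, rho / src_cols)
    const double rotation = phi * 360.0 / src_rows;            // equation 9

    std::printf("M = %.3f, scale = %.3f, rotation = %.2f deg\n", M, scale, rotation);
    return 0;
}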
The values from phase correlation are rectangular coordinates, so (16.2986, 36.9105) is (x, y). The scale is calculated as
scale = log((x^2 + y^2)^0.5), which is approximately 1.6 (close to 1.5).
When we calculate the angle using the formula theta = arctan(y/x), we get approximately 66.
The theta value is way off the real value (45 in this case).

gluProject on NDS?

I've been struggling with this for a good while now. I'm trying to determine the screen coordinates of the vertices of a model on the screen of my NDS using devKitPro. The library seems to implement some functionality of OpenGL, but in particular, the gluProject function is missing, which would (I assume) allow me to do exactly that, easily.
I've been trying for a good while now to calculate the screen coordinates manually using the projection matrices that are stored in the DS's registers, but I haven't had much luck, even when trying to build the projection matrix from scratch based on OpenGL's documentation. Here is the code I'm trying to use:
void get2DPoint(v16 x, v16 y, v16 z, float &result_x, float &result_y)
{
//Wait for the graphics engine to be ready
/*while (*(int*)(0x04000600) & BIT(27))
continue;*/
//Read in the matrix that we're currently transforming with
double currentMatrix[4][4]; int i;
for (i = 0; i < 16; i++)
    currentMatrix[0][i] =
        (double(((int*)0x04000640)[i])) / (double(1<<12));
//Now this hurts-- take that matrix, and multiply it by the projection matrix, so we obtain
//proper screen coordinates.
double f = 1.0 / tan(70.0/2.0);
double aspect = 256.0/192.0;
double zNear = 0.1;
double zFar = 40.0;
double projectionMatrix[4][4] =
{
{ (f/aspect), 0.0, 0.0, 0.0 },
{ 0.0, f, 0.0, 0.0 },
{ 0.0, 0.0, ((zFar + zNear) / (zNear - zFar)), ((2*zFar*zNear)/(zNear - zFar)) },
{ 0.0, 0.0, -1.0, 0.0 },
};
double finalMatrix[4][4];
//Ugh...
int mx = 0; int my = 0;
for (my = 0; my < 4; my++)
    for (mx = 0; mx < 4; mx++)
        finalMatrix[mx][my] =
            currentMatrix[my][0] * projectionMatrix[0][mx] +
            currentMatrix[my][1] * projectionMatrix[1][mx] +
            currentMatrix[my][2] * projectionMatrix[2][mx] +
            currentMatrix[my][3] * projectionMatrix[3][mx];
double dx = ((double)x) / (double(1<<12));
double dy = ((double)y) / (double(1<<12));
double dz = ((double)z) / (double(1<<12));
result_x = dx*finalMatrix[0][0] + dy*finalMatrix[0][1] + dz*finalMatrix[0][2] + finalMatrix[0][3];
result_y = dx*finalMatrix[1][0] + dy*finalMatrix[1][1] + dz*finalMatrix[1][2] + finalMatrix[1][3];
result_x = ((result_x*1.0) + 4.0)*32.0;
result_y = ((result_y*1.0) + 4.0)*32.0;
printf("Result: %f, %f\n", result_x, result_y);
}
There are lots of shifts involved; the DS works internally using fixed-point notation, and I need to convert that to doubles to work with. What I'm getting seems to be somewhat correct -- the pixels are translated perfectly if I'm using a flat quad that's facing the screen, but the rotation is wonky. Also, since I'm going by the projection matrix (which accounts for the screen width/height?), the last steps I'm using don't seem right at all. Shouldn't the projection matrix be handling the scaling up to screen resolution for me?
I'm rather new to all of this. I've got a fair grasp of matrix math, but I'm not as skilled as I would like to be in 3D graphics. Does anyone here know a way, given the 3D, non-transformed coordinates of a model's vertices, and also given the matrices which will be applied to them, to actually come up with the screen coordinates, without using OpenGL's gluProject function? Can you see something blatantly obvious that I'm missing in my code? (I'll clarify when possible; I know it's rough, this is a prototype I'm working on, and cleanliness isn't a high priority.)
Thanks a bunch!
PS: As I understand it, currentMatrix, which I pull from the DS's registers, should be giving me the combined projection, translation, and rotation matrix, as it should be the exact matrix that's going to be used for the transformation by the DS's own hardware, at least according to the specs at GBATEK. In practice, it doesn't seem to actually have the projection applied to it, which I suppose has something to do with my issues. But I'm not sure, as calculating the projection myself isn't generating different results.
That is almost correct.
The correct steps are:
Multiply the Modelview with the Projection matrix (as you already did).
Extend your 3D vertex to a homogeneous coordinate by adding a W-component with value 1. E.g. your (x,y,z)-vector becomes (x,y,z,w) with w = 1.
Multiply this vector by the matrix product. Your matrix should be 4x4 and your vector of size 4. The result will be a vector of size 4 as well (don't drop w yet!). The result of this multiplication is your vector in clip space. FYI: You can already do a couple of very useful things here with this vector: test if the point is on the screen. The six conditions are:
x < -w : Point is outside the screen (left of the viewport)
x > w : Point is outside the screen (right of the viewport)
y < -w : Point is outside the screen (above the viewport)
y > w : Point is outside the screen (below the viewport)
z < -w : Point is outside the screen (closer than znear)
z > w : Point is outside the screen (beyond zfar)
Project your point into 2D space. To do this divide x and y by w:
x' = x / w;
y' = y / w;
If you're interested in the depth-value (e.g. what gets written to the zbuffer) you can project z as well:
z' = z / w
Note that the previous step won't work if w is zero. This case happens if your point is equal to the camera position. The best you can do in this case is to set x' and y' to zero (this will move the point to the center of the screen in the next step).
Final Step: Get the OpenGL viewport coordinates and apply it:
x_screen = viewport_left + (x' + 1) * viewport_width * 0.5;
y_screen = viewport_top + (y' + 1) * viewport_height * 0.5;
Important: The y coordinate of your screen may be upside down. Contrary to most other graphics APIs, in OpenGL y=0 denotes the bottom of the screen.
That's all.
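For reference, here is a compact C++ sketch of those steps (it assumes a row-major 4x4 matrix applied as M * v; adapt the layout to whatever your platform actually stores):
// Project a 3D point to screen coordinates using a combined
// modelview-projection matrix, following the steps above.
// The matrix is assumed row-major: m[row][col], applied as M * v.
bool projectPoint(const double m[4][4], double x, double y, double z,
                  double viewportLeft, double viewportTop,
                  double viewportWidth, double viewportHeight,
                  double& sx, double& sy)
{
    // Steps 2-3: extend to homogeneous coordinates and multiply -> clip space.
    const double cx = m[0][0]*x + m[0][1]*y + m[0][2]*z + m[0][3];
    const double cy = m[1][0]*x + m[1][1]*y + m[1][2]*z + m[1][3];
    const double cz = m[2][0]*x + m[2][1]*y + m[2][2]*z + m[2][3];
    const double cw = m[3][0]*x + m[3][1]*y + m[3][2]*z + m[3][3];

    // Clip tests: reject points outside the view frustum.
    if (cx < -cw || cx > cw || cy < -cw || cy > cw || cz < -cw || cz > cw)
        return false;
    if (cw == 0.0)   // point coincides with the camera
        return false;

    // Step 4: perspective divide -> normalized device coordinates in [-1, 1].
    const double ndcX = cx / cw;
    const double ndcY = cy / cw;

    // Final step: viewport transform. Remember that OpenGL's y axis points up,
    // so the result may need flipping depending on your screen convention.
    sx = viewportLeft + (ndcX + 1.0) * viewportWidth  * 0.5;
    sy = viewportTop  + (ndcY + 1.0) * viewportHeight * 0.5;
    return true;
}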
I'll add some more thoughts to Nils' thorough answer.
Don't use doubles. I'm not familiar with the NDS, but I doubt it has any hardware for double math.
I also suspect the modelview and projection matrices are already multiplied together if you are reading the hardware registers. I have yet to see a hardware platform that does not put the full MVP in the registers directly.
The matrix storage order in the registers may or may not be the same as OpenGL's. If it is not, the matrix-vector multiplication needs to be done in the other order.
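As a small illustration of that last point (a hypothetical flat 16-value layout, just to show the two conventions):
// Two ways to apply a 4x4 matrix stored as a flat array of 16 values to a
// homogeneous vector. If the registers hold the matrix in the transposed
// (column-major) order, the second form is the one that computes M * v.
void mulRowMajor(const float m[16], const float v[4], float out[4])
{
    for (int r = 0; r < 4; ++r)
        out[r] = m[r*4+0]*v[0] + m[r*4+1]*v[1] + m[r*4+2]*v[2] + m[r*4+3]*v[3];
}

void mulTransposed(const float m[16], const float v[4], float out[4])
{
    for (int r = 0; r < 4; ++r)
        out[r] = m[0*4+r]*v[0] + m[1*4+r]*v[1] + m[2*4+r]*v[2] + m[3*4+r]*v[3];
}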