Offset Euler Angles using rotation matrix - python-2.7

I'm looking for the correct way to apply an offset to a set of Euler rotations. I would like a transformation that maps a specific set of Euler angles (x1,y1,z1) to (0,0,0); all other Euler sequences (xi,yi,zi), once transformed, would then act as if (x1,y1,z1) were (0,0,0).
Background:
I'm using Oculus DK2 HMD to display a virtual environment (Vizard 5, Python 2.7.5) while using a Vicon motion capture system (Nexus 2) to update the position and orientation. I can get the Euler angles of the HMD (from a marker cluster, not the DK2 gyroscope) from the Vicon system, but when facing my desired (0,0,0) orientation the HMD has a non-zero rotation sequence.
Problem:
I'm having a hard time thinking about what transformation (a rotation matrix?) could take a sequence like (-105, 110, -30) and make it (0,0,0) without being a useless matrix of zeros (if the rotation matrix were all zeros, any sequence would be transformed to (0,0,0)). The formula I have in mind is (please ignore syntax):
[0,0,0] = (3x3)R * (3x1)[-105,110,-30]. What is R? R cannot be the 3x3 zero matrix.
Attempt:
I stupidly tried to simply subtract the offset Euler angles like so:
import viz
viz.go()
navigationNode = viz.addGroup() #create node
viewLink = viz.link(navigationNode, viz.MainView) #link it to the main view
#I am not showing how I get variables HMDX HMDY HMDZ but it's not important for this question
navigationNode.setEuler(HMDRZ-90,HMDRX+180,-1*(HMDRY)+30) #update the view orientation
Alas, that doesn't work. I'm sure this is possible; in fact, I've done similar things in the past, where I had to transform marker clusters into a rigid body's frame at a calibration position. But I just can't get around the trivial solution in this case (the zero matrix). If by chance anyone would explain this using quaternions, I can use those as well.

The right way to do this is to determine the transformation between the hmd orientation and the desired view orientation at a chosen moment. I call this time zero, even though time has little to do with it; really it's the "home" orientation. Once the transformation is known, it can be used to determine the view orientation from the hmd data, since the coordinate frames of the hmd and the view can be thought of as mounted on the same rigid body (meaning their relative transformation will not change with time).
Here is the math: with RhmdU0 the hmd rotation matrix at the home orientation and RdU0 the view orientation there, the relative transform RhmdU0^T * RdU0 is constant, so the view orientation at any time t is RdUt = RhmdUt * RhmdU0^T * RdU0.
Here is how I coded it (it works!):
import numpy as np
import math
#setup the desired calibration transformation (hmd orientation that will result in looking straight ahead)
#Euler angles for the hmd at desired "zero" orientation
a0 = -177.9*math.pi/180
b0 = 31.2*math.pi/180
g0 = 90.4*math.pi/180
#make the matrices
Ra0 = np.matrix([[1,0,0],[0, float(math.cos(a0)),float(-1*math.sin(a0))],[0,float(math.sin(a0)),float(math.cos(a0))]],dtype=np.float)
Rb0 = np.matrix([[math.cos(b0),0,math.sin(b0)],[0,1,0],[-1*math.sin(b0),0,math.cos(b0)]],dtype=np.float)
Rg0 = np.matrix([[math.cos(g0),-1*math.sin(g0),0],[math.sin(g0),math.cos(g0),0],[0,0,1]],dtype=np.float)
#the hmd rotation matrix in the "zero" orientation
global RhmdU0
RhmdU0 = Ra0*Rb0*Rg0
#the view orientation when the hmd is in the "zero" orientation (basically a zero degree turn about the Z axis)
global RdU0
RdU0 = np.matrix([[1,0,0],[0,1,0],[0,0,1]],dtype=np.float)
#this can be called in a loop, inputs are Euler angles of the hmd
def InverseK(at,bt,gt):
    global RdU0
    global RhmdU0
    #given the Euler sequence of the hmd in universal space, determine the viewpoint
    Rat = np.matrix([[1,0,0],[0, float(math.cos(at)), float(-1*math.sin(at))],[0,float(math.sin(at)),float(math.cos(at))]],dtype=np.float)
    Rbt = np.matrix([[math.cos(bt),0,math.sin(bt)],[0,1,0],[-1*math.sin(bt),0,math.cos(bt)]],dtype=np.float)
    Rgt = np.matrix([[math.cos(gt),-1*math.sin(gt),0],[math.sin(gt),math.cos(gt),0],[0,0,1]],dtype=np.float)
    RhmdUt = Rat*Rbt*Rgt
    RdUt = RhmdUt*RhmdU0.transpose()*RdU0
    #do inverse kinematics to get the Euler sequence out:
    beta = math.atan2(RdUt[0,2],math.sqrt(RdUt[1,2]**2+RdUt[2,2]**2))
    alpha = math.atan2(-1*RdUt[1,2]/math.cos(beta),RdUt[2,2]/math.cos(beta))
    gamma = math.atan2(-1*RdUt[0,1]/math.cos(beta),RdUt[0,0]/math.cos(beta))
    return(alpha,beta,gamma)
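Since the question mentions quaternions: the same correction can be composed with quaternions instead of matrix products. Below is a minimal sketch using SciPy's Rotation class (an assumption; SciPy is not part of Vizard, and Rotation requires scipy >= 1.2). The intrinsic 'XYZ' sequence is chosen to match the Rx*Ry*Rz products above, and RdU0 is taken as the identity as in the answer:

from scipy.spatial.transform import Rotation as R

#hmd orientation at the "zero"/home pose
r_hmd0 = R.from_euler('XYZ', [-177.9, 31.2, 90.4], degrees=True)
q_offset = r_hmd0.inv() #the fixed offset, computed once at calibration

def corrected_view(at, bt, gt):
    #current hmd orientation -> view orientation, as Euler angles in degrees
    r_hmd = R.from_euler('XYZ', [at, bt, gt], degrees=True)
    return (r_hmd * q_offset).as_euler('XYZ', degrees=True)

At the home pose this returns (0, 0, 0), and every other pose is reported relative to it, which is exactly what InverseK computes.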

Related

Estimate the camera pose in the reference system using one marker with ARUCO

I am currently working on a camera pose estimation project using only one marker with ARUCO.
I used ARUCO's marker detector to detect markers and get the marker's rvec and tvec. I understand these two vectors represent the transform from the marker to the camera, i.e. the marker's pose w.r.t. the camera. I form a 4 by 4 matrix called T_marker_camera using these two vectors.
Then, I set up a world frame (left handed) and get the marker's world pose, which is a 4 by 4 transform matrix.
I want to calculate the pose of the camera w.r.t the world frame, and I use the following formula to calculate it:
T_camera_world = T_marker_world * T_marker_camera_inv
Before I perform the above formula, I convert the OpenCV coordinates to the left handed one (flip the sign of x axis).
However, I didn't get the correct x, y, z of the camera w.r.t the world frame.
What did I miss to get the correct answer?
Thanks
The one equation you gave looks right, so the issue is probably somewhere that you didn't show/describe.
A fix in your notation will help clarify.
Write the pose/source frame on the right (input), the reference/destination frame on the left (output). Then your matrices "match up" like dominos.
rvec and tvec yield a matrix that should be called T_cam_marker.
If you want the pose of your camera in the world frame, that is
T_world_cam = T_world_marker * T_marker_cam
T_world_cam = T_world_marker * inv(T_cam_marker)
(equivalent to what you wrote, but domino)
Be sure that you do matrix multiplication, not element-wise multiplication.
To move between left-handed and right-handed coordinate systems, insert a matrix that maps coordinates accordingly. Frames:
OpenCV camera/screen: right-handed, {X right, Y down, Z far}
ARUCO (in OpenCV anyway): right-handed, {X right, Y far, Z up}, first corner is top left (-X+Y quadrant)
whatever leftie frame you have, let's say {X right, Y up, Z far} and it's a screen or something
The hand-change matrix for typical frames on screens is an identity but with the entry for Y being a -1. I don't know why you would flip the X but that's "equivalent", ignoring any rotations.
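Here is a short numpy sketch of that chain (the pose values are placeholders; in practice T_cam_marker comes from cv2.Rodrigues(rvec) and tvec):

import numpy as np

def pose_matrix(R, t):
    # build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

T_cam_marker = pose_matrix(np.eye(3), [0.0, 0.0, 1.0])    # placeholder values
T_world_marker = pose_matrix(np.eye(3), [2.0, 0.0, 0.0])  # placeholder values

# camera pose in the world, "domino" style (matrix product, not element-wise)
T_world_cam = T_world_marker.dot(np.linalg.inv(T_cam_marker))

# hand change for screen-like frames: identity with -1 in the Y entry
F = np.diag([1.0, -1.0, 1.0, 1.0])
T_lefty_cam = F.dot(T_world_cam)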

What order should the view matrix be calculated in?

Currently I have been calculating the view matrix like this:
viewMatrix = cameraRot * cameraTrans
and the model matrix like this:
modelMatrix = modelTrans * modelScale
where cameraTrans and modelTrans are translation matrices, modelScale is a scaling matrix, and cameraRot and modelRot are rotation matrices produced by quaternions.
Is this correct? I've been googling this for a few hours, and no one mentions the order for the view matrix, just the model matrix. It all seems to work, but I wrote the matrix and quaternion implementations myself, so I can't tell if this is a bug.
(Note: The matrices are row major)
Let us talk about transformations between coordinate systems. Suppose you have a point defined in a local system and you want to describe it in a global system. What you do is rotate this point, in order to align its axes, and then translate it to its final position. You can describe this mathematically by:
Pg = T*R*Pl, where M = T*R
In this way, M allows you to describe any point, defined in a local coordinate system, in a global coordinate system.
You can do the same with the camera, but what you really want is to do exactly the inverse of what was done before, i.e., you want to describe any point in the global coordinate system in the camera's local coordinate system:
Pc = X*Pg, but what is the value of X?
You know that:
Pg = Tc*Rc*Pc, so Pc = inv(Tc*Rc)*Pg
in other words:
X = inv(Tc*Rc) = inv(Rc) * inv(Tc)
Therefore, to describe a point, from its local coordinate system, to the camera coordinate system, you just need to concatenate those two matrices:
Pc = inv(Rc)*inv(Tc)*T*R*Pl, where
M' = inv(Rc)*inv(Tc)*T*R
Note that some systems (the glm library, for example) define this matrix (X) as lookAt, and its definition can be found here. I would suggest you read this article too.
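A small numpy sketch of that derivation (column-vector convention; helper names and values are made up for illustration):

import numpy as np

def translation(t):
    T = np.eye(4)
    T[:3, 3] = t
    return T

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[:2, :2] = [[c, -s], [s, c]]
    return R

Rc = rotation_z(0.3)               # camera rotation
Tc = translation([0.0, 2.0, 5.0])  # camera translation

# X = inv(Tc*Rc) = inv(Rc)*inv(Tc): a rotation inverts by transposing,
# a translation inverts by negating its offset
X = Rc.T.dot(translation([0.0, -2.0, -5.0]))

assert np.allclose(X, np.linalg.inv(Tc.dot(Rc)))  # sanity check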
What you have is correct.
modelMatrix = modelTranslation * modelRotation * modelScale; // M=TRS
viewMatrix = cameraOrientation * cameraTranslation; // V=OT
To make this easier to remember, first note that matrices are essentially applied backwards. Consider M=SRT: the translation applies to the cube first. Then, when the rotation applies, the cube rotates about the original pivot point rather than its own center. Finally, the scaling applies after the rotation, so the model is skewed. This is all hard to deal with; M=TRS is much easier for most purposes once you consider that. This is a little hard to describe in words, so let me know if you'd like some pictures.
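To see the difference numerically, here is a tiny numpy sketch (column-vector convention, made-up values; the rightmost factor applies first) comparing M=TRS with M=SRT under a non-uniform scale:

import numpy as np

S = np.diag([2.0, 1.0, 1.0, 1.0])  # non-uniform scale
theta = np.pi / 4
R = np.eye(4)
R[:2, :2] = [[np.cos(theta), -np.sin(theta)],
             [np.sin(theta),  np.cos(theta)]]
T = np.eye(4)
T[:3, 3] = [5.0, 0.0, 0.0]

M_trs = T.dot(R).dot(S)  # scale about the pivot, then rotate, then translate
M_srt = S.dot(R).dot(T)  # translate first, rotate about the wrong pivot,
                         # then scale after the rotation (shears the model)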

Resolving rotation matrices to obtain the angles

I have used this code as a basis to detect my rectangular target in a scene. I use ORB and the FLANN matcher. I have been able to successfully draw the bounding box of the detected target in my scene using the findHomography() and perspectiveTransform() functions.
The reference image (img_object in the above code) is a straight view of only the rectangular target. Now the target in my scene image may be tilted forwards or backwards. I want to find out the angle by which it has been tilted. I have read various posts and came to the conclusion that the homography returned by findHomography() can be decomposed into a rotation matrix and a translation vector. I have used code from https://gist.github.com/inspirit/740979 (recommended by this link), translated to C++. This is the Zhang SVD decomposition code from the camera calibration module of OpenCV. I got the complete explanation of this decomposition code from O'Reilly's Learning OpenCV book.
I also used solvePnP() on the keypoints returned by the matcher to cross-check the rotation matrix and the translation vector returned from the homography decomposition, but they do not seem to be the same.
I already have the measurements of the tilts of all my scene images. I found 2 ways to retrieve the angles from the rotation matrix to check how well they match my values.
Given a 3×3 rotation matrix

R = [ r11  r12  r13 ]
    [ r21  r22  r23 ]
    [ r31  r32  r33 ]

the 3 Euler angles are

theta_x = atan2(r32, r33)
theta_y = atan2(-r31, sqrt(r32^2 + r33^2))
theta_z = atan2(r21, r11)
The axis-angle representation: for a general rotation matrix R, the corresponding rotation axis u and rotation angle θ can be retrieved from:

cos(θ) = (trace(R) − 1) / 2
[u]x = (R − R^T) / (2 sin(θ))
I calculated the angles using both methods for the rotation matrices obtained from the homography decomposition and from solvePnP(). All the angles are different and give very unexpected values.
Is there a hole in my understanding? I do not understand where my calculations are wrong. Are there any alternatives I can use?
Why do you expect them to be the same? They are not the same thing at all.
The Euler angles are three angles of rotation about one axis at a time, starting from the world frame.
Rodrigues' formula gives the components of one vector in the world frame, and an angle of rotation about that vector.
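To make the distinction concrete, here is a small Python sketch of both extractions from the formulas above (0-indexed entries, so r32 becomes R[2,1]); for a general rotation the two sets of numbers are unrelated:

import numpy as np

def euler_angles(R):
    theta_x = np.arctan2(R[2, 1], R[2, 2])
    theta_y = np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2]))
    theta_z = np.arctan2(R[1, 0], R[0, 0])
    return theta_x, theta_y, theta_z

def axis_angle(R):
    theta = np.arccos((np.trace(R) - 1.0) / 2.0)
    K = (R - R.T) / (2.0 * np.sin(theta))  # the skew matrix [u]x
    u = np.array([K[2, 1], K[0, 2], K[1, 0]])
    return u, theta

# e.g. a 90-degree rotation about Z
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
print(euler_angles(Rz))  # (0.0, 0.0, pi/2)
print(axis_angle(Rz))    # (array([0., 0., 1.]), pi/2)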

Creating the camera transform and view matrices

This is a topic that I come across often online, but most websites do a poor job properly explaining this. I'm currently creating my own Camera class as part of a 3D renderer built from scratch (how better to understand what happens, right?). But I've hit a snag with creating the World and View matrices for the camera.
As I understand it, the world matrix of a camera is essentially the matrix that places the camera where it needs to be in world space, which is only necessary when you need to render something at its position and according to its orientation. The View matrix, on the other hand, is the matrix that takes the camera from that position to the origin of world space, facing along the z axis in one direction or another (negative for right-handed, positive for left-handed, I believe). Am I correct so far?
Given a camera with its position defined as m_Eye and a look-at point defined as m_LookAt, how do I generate the world matrix? More importantly, how do I generate the view matrix without having to perform an expensive inverse operation? I know that the rotation element's inverse is equal to its transpose, so I'm thinking that will factor into it. Regardless, this is the code I have been tinkering with. As an aside, I'm using a right-handed coordinate system.
The following code should generate the appropriate local coordinate axes:
m_W = AlgebraHelper::Normalize(m_Eye - m_LookAt);
m_U = AlgebraHelper::Normalize(m_Up.Cross(m_W));
m_V = m_W.Cross(m_U);
The following code is what I have, so far, for generating the World matrix (note, I also work with row-based matrices, so m_12 indicates the first row and second column):
Matrix4 matrix = Matrix4Identity;
matrix.m_11 = m_U.m_X;
matrix.m_12 = m_U.m_Y;
matrix.m_13 = m_U.m_Z;
matrix.m_21 = m_V.m_X;
matrix.m_22 = m_V.m_Y;
matrix.m_23 = m_V.m_Z;
matrix.m_31 = m_W.m_X;
matrix.m_32 = m_W.m_Y;
matrix.m_33 = m_W.m_Z;
matrix.m_41 = m_Eye.m_X;
matrix.m_42 = m_Eye.m_Y;
matrix.m_43 = m_Eye.m_Z;
Is this a good way to calculate the World matrix and, subsequently, how do I extract the View matrix?
Thank you in advance for any help in the matter.
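The question already contains the key fact: a rigid transform inverts cheaply because the rotation part inverts by transposing. Here is a numpy sketch of that (row-vector, row-major convention matching the code above; names are illustrative):

import numpy as np

def world_and_view(u, v, w, eye):
    # world: rows 1-3 are the camera basis vectors, row 4 is the position
    world = np.eye(4)
    world[0, :3] = u
    world[1, :3] = v
    world[2, :3] = w
    world[3, :3] = eye
    # view: rigid inverse of world; the rotation block transposes, and the
    # translation row is the eye expressed (negated) along the camera axes
    view = np.eye(4)
    view[:3, :3] = world[:3, :3].T
    view[3, :3] = [-np.dot(eye, u), -np.dot(eye, v), -np.dot(eye, w)]
    return world, view

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
w = np.array([0.0, 0.0, 1.0])
world, view = world_and_view(u, v, w, np.array([3.0, 4.0, 5.0]))
assert np.allclose(view, np.linalg.inv(world))  # sanity check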

Implementing a complex rotation-based camera

I am implementing a 3D engine for spatial visualisation, and am writing a camera with the following navigation features:
Rotate the camera (ie, analogous to rotating your head)
Rotate around an arbitrary 3D point (a point in space, which is probably not in the center of the screen; the camera needs to rotate around this keeping the same relative look direction, ie the look direction changes too. This does not look directly at the chosen rotation point)
Pan in the camera's plane (so move up/down or left/right in the plane orthogonal to the camera's look vector)
The camera is not supposed to roll - that is, 'up' remains up. Because of this I represent the camera with a location and two angles, rotations around the X and Y axes (Z would be roll.) The view matrix is then recalculated using the camera location and these two angles. This works great for pan and rotating the eye, but not for rotating around an arbitrary point. Instead I get the following behaviour:
The eye itself apparently moving further up or down than it should
The eye not moving up or down at all when m_dRotationX is 0 or pi. (Gimbal lock? How can I avoid this?)
The eye's rotation being inverted (changing the rotation makes it look further up when it should look further down, down when it should look further up) when m_dRotationX is between pi and 2pi.
(a) What is causing this 'drift' in rotation?
This may be gimbal lock. If so, the standard answer is 'use quaternions to represent rotation', said many times here on SO (1, 2, 3 for example), but unfortunately without concrete details (example; this is the best answer I've found so far, and it's rare). I've struggled to implement a camera using quaternions combining the above two types of rotations. I am, in fact, building a quaternion from the two rotations, but a commenter below said there was no reason to; it's fine to build the matrix immediately.
This occurs when changing the X and Y rotations (which represent the camera look direction) when rotating around a point, but does not occur simply when directly changing the rotations, i.e. rotating the camera around itself. To me, this doesn't make sense. It's the same values.
(b) Would a different approach (quaternions, for example) be better for this camera? If so, how do I implement all three camera navigation features above?
If a different approach would be better, then please consider providing a concrete implemented example of that approach. (I am using DirectX9 and C++, and the D3DX* library the SDK provides.) In this second case, I will add and award a bounty in a couple of days when I can add one to the question. This might sound like I'm jumping the gun, but I'm low on time and need to implement or solve this quickly (this is a commercial project with a tight deadline.) A detailed answer will also improve the SO archives, because most camera answers I've read so far are light on code.
Thanks for your help :)
Some clarifications
Thanks for the comments and answer so far! I'll try to clarify a few things about the problem:
The view matrix is recalculated from the camera position and the two angles whenever one of those things changes. The matrix itself is never accumulated (i.e. updated) - it is recalculated afresh. However, the camera position and the two angle variables are accumulated (whenever the mouse moves, for example, one or both of the angles will have a small amount added or subtracted, based on the number of pixels the mouse moved up-down and/or left-right onscreen.)
Commenter JCooper states I'm suffering from gimbal lock, and I need to:
add another rotation onto your transform that rotates the eyePos to be completely in the y-z plane before you apply the transformation, and then another rotation that moves it back afterward. Rotate around the y axis by the following angle immediately before and after applying the yaw-pitch-roll matrix (one of the angles will need to be negated; trying it out is the fastest way to decide which).
double fixAngle = atan2(oEyeTranslated.z,oEyeTranslated.x);
Unfortunately, when implementing this as described, my eye shoots off above the scene at a very fast rate due to one of the rotations. I'm sure my code is simply a bad implementation of this description, but I still need something more concrete. In general, I find unspecific text descriptions of algorithms are less useful than commented, explained implementations. I am adding a bounty for a concrete, working example that integrates with the code below (i.e. with the other navigation methods, too.) This is because I would like to understand the solution, as well as have something that works, and because I need to implement something that works quickly since I am on a tight deadline.
Please, if you answer with a text description of the algorithm, make sure it is detailed enough to implement ('Rotate around Y, then transform, then rotate back' may make sense to you but lacks the details to know what you mean. Good answers are clear, signposted, will allow others to understand even with a different basis, are 'solid weatherproof information boards.')
In turn, I have tried to be clear describing the problem, and if I can make it clearer please let me know.
My current code
To implement the above three navigation features, the following runs in a mouse-move event, based on the number of pixels the cursor has moved:
// Adjust this to change rotation speed when dragging (units are radians per pixel the mouse moves)
// This is both rotating the eye, and rotating around a point
static const double dRotatePixelScale = 0.001;
// Adjust this to change pan speed (units are meters per pixel the mouse moves)
static const double dPanPixelScale = 0.15;
switch (m_eCurrentNavigation) {
    case ENavigation::eRotatePoint: {
        // Rotating around m_oRotateAroundPos
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;
        // To rotate around the point, translate so the point is at (0,0,0) (this makes the point
        // the origin so the eye rotates around the origin), rotate, translate back.
        // However, the camera is represented as an eye plus two (X and Y) rotation angles,
        // and this needs to keep the same relative rotation.
        // Rotate the eye around the point
        const D3DXVECTOR3 oEyeTranslated = m_oEyePos - m_oRotateAroundPos;
        D3DXMATRIX oRotationMatrix;
        D3DXMatrixRotationYawPitchRoll(&oRotationMatrix, dX, dY, 0.0);
        D3DXVECTOR4 oEyeRotated;
        D3DXVec3Transform(&oEyeRotated, &oEyeTranslated, &oRotationMatrix);
        m_oEyePos = D3DXVECTOR3(oEyeRotated.x, oEyeRotated.y, oEyeRotated.z) + m_oRotateAroundPos;
        // Increment rotation to keep the same relative look angles
        RotateXAxis(dX);
        RotateYAxis(dY);
        break;
    }
    case ENavigation::ePanPlane: {
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dPanPixelScale;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dPanPixelScale;
        m_oEyePos += GetXAxis() * dX; // GetX/YAxis reads from the view matrix, so increments correctly
        m_oEyePos += GetYAxis() * -dY; // Inverted compared to screen coords
        break;
    }
    case ENavigation::eRotateEye: {
        // Rotate in radians around local (camera, not scene space) X and Y axes
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;
        RotateXAxis(dX);
        RotateYAxis(dY);
        break;
    }
}
The RotateXAxis and RotateYAxis methods are very simple:
void Camera::RotateXAxis(const double dRadians) {
    m_dRotationX += dRadians;
    m_dRotationX = fmod(m_dRotationX, 2 * D3DX_PI); // Keep in valid circular range
}
void Camera::RotateYAxis(const double dRadians) {
    m_dRotationY += dRadians;
    // Limit it so you don't rotate around when looking up and down
    m_dRotationY = std::min(m_dRotationY, D3DX_PI * 0.49); // Almost fully up
    m_dRotationY = std::max(m_dRotationY, D3DX_PI * -0.49); // Almost fully down
}
And to generate the view matrix from this:
void Camera::UpdateView() const {
    const D3DXVECTOR3 oEyePos(GetEyePos());
    const D3DXVECTOR3 oUpVector(0.0f, 1.0f, 0.0f); // Keep up "up", always.
    // Generate a rotation matrix via a quaternion
    D3DXQUATERNION oRotationQuat;
    D3DXQuaternionRotationYawPitchRoll(&oRotationQuat, m_dRotationX, m_dRotationY, 0.0);
    D3DXMATRIX oRotationMatrix;
    D3DXMatrixRotationQuaternion(&oRotationMatrix, &oRotationQuat);
    // Generate the view matrix by looking at a point 1 unit ahead of the eye
    // (transformed by the above rotation)
    D3DXVECTOR3 oForward(0.0, 0.0, 1.0);
    D3DXVECTOR4 oForward4;
    D3DXVec3Transform(&oForward4, &oForward, &oRotationMatrix);
    D3DXVECTOR3 oTarget = oEyePos + D3DXVECTOR3(oForward4.x, oForward4.y, oForward4.z); // eye pos + look vector = look target position
    D3DXMatrixLookAtLH(&m_oViewMatrix, &oEyePos, &oTarget, &oUpVector);
}
It seems to me that "Roll" shouldn't be possible given the way you form your view matrix. Regardless of all the other code (some of which does look a little funny), the call D3DXMatrixLookAtLH(&m_oViewMatrix, &oEyePos, &oTarget, &oUpVector); should create a matrix without roll when given [0,1,0] as an 'Up' vector unless oTarget-oEyePos happens to be parallel to the up vector. This doesn't seem to be the case since you're restricting m_dRotationY to be within (-.49pi,+.49pi).
Perhaps you can clarify how you know that 'roll' is happening. Do you have a ground plane and the horizon line of that ground plane is departing from horizontal?
As an aside, in UpdateView, the D3DXQuaternionRotationYawPitchRoll seems completely unnecessary since you immediately turn around and change it into a matrix. Just use D3DXMatrixRotationYawPitchRoll as you did in the mouse event. Quaternions are used in cameras because they're a convenient way to accumulate rotations happening in eye coordinates. Since you're only using two axes of rotation in a strict order, your way of accumulating angles should be fine. The vector transformation of (0,0,1) isn't really necessary either. The oRotationMatrix should already have those values in the (_31,_32,_33) entries.
Update
Given that it's not roll, here's the problem: you create a rotation matrix to move the eye in world coordinates, but you want the pitch to happen in camera coordinates. Since roll isn't allowed and yaw is performed last, yaw is always the same in both the world and camera frames of reference. Consider the following cases:
Your code works fine for local pitch and yaw because those are accomplished in camera coordinates.
But when you rotate around a reference point, you are creating a rotation matrix that is in world coordinates and using that to rotate the camera center. This works okay if the camera's coordinate system happens to line up with the world's. However, if you don't check to see if you're up against the pitch limit before you rotate the camera position, you will get crazy behavior when you hit that limit. The camera will suddenly start to skate around the world--still 'rotating' around the reference point, but no longer changing orientation.
If the camera's axes don't line up with the world's, strange things will happen. In the extreme case, the camera won't move at all because you're trying to make it roll.
The above is what would normally happen, but since you handle the camera orientation separately, the camera doesn't actually roll.
Instead, it stays upright, but you get strange translation going on.
One way to handle this would be to (1) always put the camera into a canonical position and orientation relative to the reference point, (2) make your rotation, and then (3) put it back when you're done (e.g., similar to the way that you translate the reference point to the origin, apply the yaw-pitch rotation, and then translate back). Thinking more about it, however, this probably isn't the best way to go.
Update 2
I think that Generic Human's answer is probably the best. The question remains as to how much pitch should be applied if the rotation is off-axis, but for now, we'll ignore that. Maybe it'll give you acceptable results.
The essence of the answer is this: Before mouse movement, your camera is at c1 = m_oEyePos and being oriented by M1 = D3DXMatrixRotationYawPitchRoll(&M_1,m_dRotationX,m_dRotationY,0). Consider the reference point a = m_oRotateAroundPos. From the point of view of the camera, this point is a'=M1(a-c1).
You want to change the orientation of the camera to M2 = D3DXMatrixRotationYawPitchRoll(&M_2,m_dRotationX+dX,m_dRotationY+dY,0). [Important: Since you won't allow m_dRotationY to fall outside of a specific range, you should make sure that dY doesn't violate that constraint.] As the camera changes orientation, you also want its position to rotate around a to a new point c2. This means that a won't change from the perspective of the camera. I.e., M1(a-c1)==M2(a-c2).
So we solve for c2 (remember that the transpose of a rotation matrix is the same as the inverse):
M2^T * M1 * (a - c1) == (a - c2), so
c2 == a - M2^T * M1 * (a - c1)
Now if we look at this as a transformation being applied to c1, then we can see that it is first negated, then translated by a, then rotated by M1, then rotated by M2T, negated again, and then translated by a again. These are transformations that graphics libraries are good at and they can all be squished into a single transformation matrix.
@Generic Human deserves credit for the answer, but here's code for it. Of course, you need to implement the function to validate a change in pitch before it's applied, but that's simple. This code probably has a couple of typos since I haven't tried to compile it:
case ENavigation::eRotatePoint: {
    const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
    double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;
    dY = validatePitch(dY); // dY needs to be kept within bounds so that m_dRotationY is within bounds
    D3DXMATRIX oRotationMatrix1; // The camera orientation before the mouse-change
    D3DXMatrixRotationYawPitchRoll(&oRotationMatrix1, m_dRotationX, m_dRotationY, 0.0);
    D3DXMATRIX oRotationMatrix2; // The camera orientation after the mouse-change
    D3DXMatrixRotationYawPitchRoll(&oRotationMatrix2, m_dRotationX + dX, m_dRotationY + dY, 0.0);
    D3DXMATRIX oRotationMatrix2Inv; // The inverse of the new orientation
    D3DXMatrixTranspose(&oRotationMatrix2Inv, &oRotationMatrix2); // Transpose is the same in this case
    D3DXMATRIX oScaleMatrix; // Negative scaling matrix for negating the translation
    D3DXMatrixScaling(&oScaleMatrix, -1, -1, -1);
    D3DXMATRIX oTranslationMatrix; // Translation by the reference point
    D3DXMatrixTranslation(&oTranslationMatrix,
        m_oRotateAroundPos.x, m_oRotateAroundPos.y, m_oRotateAroundPos.z);
    D3DXMATRIX oTransformMatrix; // The full transform for the eyePos.
    // We assume the matrix multiply protects against variable aliasing
    D3DXMatrixMultiply(&oTransformMatrix, &oScaleMatrix, &oTranslationMatrix);
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oRotationMatrix1);
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oRotationMatrix2Inv);
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oScaleMatrix);
    D3DXMatrixMultiply(&oTransformMatrix, &oTransformMatrix, &oTranslationMatrix);
    D3DXVECTOR4 oEyeFinal;
    D3DXVec3Transform(&oEyeFinal, &m_oEyePos, &oTransformMatrix);
    m_oEyePos = D3DXVECTOR3(oEyeFinal.x, oEyeFinal.y, oEyeFinal.z);
    // Increment rotation to keep the same relative look angles
    RotateXAxis(dX);
    RotateYAxis(dY);
    break;
}
I think there is a much simpler solution that lets you sidestep all rotation issues.
Notation: A is the point we want to rotate around, C is the original camera location, M is the original camera rotation matrix that maps global coordinates to the camera's local viewport.
Make a note of the local coordinates of A, which are equal to A' = M × (A - C).
Rotate the camera like you would in normal "eye rotation" mode. Update the view matrix M so that it is modified to M2 and C remains unchanged.
Now we would like to find C2 such that A' = M2 × (A - C2).
This is easily done by the equation C2 = A - M2^-1 × A'.
Voilà, the camera has been rotated and because the local coordinates of A are unchanged, A remains at the same location and the same scale and distance.
As an added bonus, the rotation behavior is now consistent between "eye rotation" and "point rotation" mode.
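A minimal numpy sketch of those steps (assuming M is a 3×3 rotation matrix mapping global coordinates into the camera's local frame; the names are illustrative). Because M2 is a rotation, its inverse is just its transpose, so no matrix inversion is needed:

import numpy as np

def rotate_around_point(A, C, M, M2):
    # A: rotate-around point, C: old camera position,
    # M: old rotation, M2: new rotation after the "eye rotation" update
    A_local = M.dot(A - C)       # step 1: note A's local coordinates
    C2 = A - M2.T.dot(A_local)   # step 3: solve A_local = M2*(A - C2) for C2
    return C2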
You rotate around the point by repeatedly applying small rotation matrices. This probably causes the drift (small precision errors add up), and I bet you will not really trace a perfect circle after some time. Since the angles for the view are simple one-dimensional doubles, they have much less drift.
A possible fix would be to store a dedicated yaw/pitch and relative position from the point when you enter that view mode, and to use those to do the math. This requires a bit more bookkeeping, since you need to update them when moving the camera. Note that it will also make the camera move if the point moves, which I think is an improvement.
If I understand correctly, you are satisfied with the rotation component of the final matrix (save for the inverted rotation controls in problem #3), but not with the translation part. Is that so?
The problem seems to come from the fact that you are treating them differently: you recalculate the rotation part from scratch every time, but accumulate the translation part (m_oEyePos). Other comments mention precision problems, but it's actually more significant than just floating-point precision: accumulating rotations from small yaw/pitch values is simply not the same, mathematically, as making one big rotation from the accumulated yaw/pitch. Hence the rotation/translation discrepancy. To fix this, try recalculating the eye position from scratch simultaneously with the rotation part, similarly to how you find "oTarget = oEyePos + ...":
oEyePos = m_oRotateAroundPos - dist * D3DXVECTOR3(oForward4.x, oForward4.y, oForward4.z)
dist can be fixed or calculated from the old eye position. That will keep the rotation point in the screen center. In the more general case (which you are interested in), -dist * oForward here should be replaced by the old/initial m_oEyePos - m_oRotateAroundPos, multiplied by the old/initial camera rotation to bring it into camera space (finding a constant offset vector in the camera's coordinate system), then multiplied by the inverted new camera rotation to get the new direction in the world.
This will, of course, be subject to gimbal lock when the pitch is straight up or down. You'll need to define precisely what behavior you expect in these cases to solve this part. On the other hand, locking at m_dRotationX=0 or =pi is rather strange (this is yaw, not pitch, right?) and might be related to the above.