I have implemented camera rotation around a centre entity and now want to add camera translation. I cannot just do centre.xy += mouse.delta.xy: if the camera is rotated facing the z axis and I drag to the right, this will move the camera towards me (because the x axis is being incremented). In this instance, centre.z would need to be increased instead. I suppose I need to incorporate the camera's pitch, yaw and roll into this calculation, but I'm not sure how to go about it. Any suggestions or links?
I also tried playing around with ray casting (which I have implemented), in place of the mouse delta, but to no avail.
EDIT - simple method:
val right = Vector3f(viewMatrix.m00(), viewMatrix.m01(), viewMatrix.m02()).mul(lmb.delta.x)
val up = Vector3f(viewMatrix.m10(), viewMatrix.m11(), viewMatrix.m12()).mul(lmb.delta.y)
val delta = right.add(up)
center.add(delta)
You did not write a lot about how you represent your camera, but I assume the following:
The camera is represented by a focus point centre and three Euler angles that describe the rotation about that focus point. Probably also a distance to the focus point.
I'll explain two ways - one rather simple and one more sophisticated.
Simple Way
Let's recap what you were trying to do:
centre.xy += mouse.delta.xy
This fails when the camera is not aligned with the coordinate system. A more general formulation of this approach would be:
centre += mouse.delta.x * right + mouse.delta.y * up
Here, right is a world-space vector pointing to the right side of the screen and up is a world-space vector pointing upwards. Depending on your mouse delta, you may instead want a down vector.
So, where do we get those vectors from? Easy: the view matrix has all we need. The first three entries of its first row are the right vector, and the first three entries of its second row are the up vector. So simply get the view matrix, read those vectors, and update the focus centre. You might also want to apply a scale factor to the mouse delta.
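As a rough sketch of the simple way (in Python with plain lists, not tied to any math library), the update could look like this. The helper names yaw_view_rotation and pan are made up for illustration, and the sketch assumes a camera that is only yawed about the world Y axis, with the rotation part of its view matrix stored row by row.

```python
import math

def yaw_view_rotation(yaw):
    """Rotation part of a view matrix for a camera yawed about world Y.

    Assumption: rows hold the camera's right, up and backward axes
    expressed in world space.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, 0.0, -s],    # right
        [0.0, 1.0, 0.0], # up
        [s, 0.0, c],     # backward (the camera looks along the negated row)
    ]

def pan(centre, view_rotation, dx, dy, scale=1.0):
    """centre += scale * (dx * right + dy * up), returned as a new list."""
    right, up = view_rotation[0], view_rotation[1]
    return [c + scale * (dx * r + dy * u) for c, r, u in zip(centre, right, up)]

# A camera yawed by -90 degrees looks along +x, so its right vector is +z:
rotation = yaw_view_rotation(-math.pi / 2)
moved = pan([0.0, 0.0, 0.0], rotation, 2.0, 0.0)
```

With this camera a drag along screen x changes centre.z and leaves centre.x alone, which is the behaviour the question asks for. Whether dy needs its sign flipped depends on the direction in which your mouse y grows.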
More Sophisticated
In many applications, the panning functionality is designed in a way such that a certain 3D point under the mouse stays under the mouse during panning. This can be achieved in the following way:
First, we need the depth of the 3D point that we want to keep under the mouse. Two common options are the depth of the focus point or the actual depth of the 3D scene under the mouse (which you get from the depth map). I will explain the former.
We first need this depth in Normalized Device Coordinates. To do this, we first calculate the view-projection matrix:
VP = ProjectionMatrix * ViewMatrix
Then, we transform the focus point into clip space:
focusClip = VP * (focus, 1)
(focus, 1) is a 4D vector with a 1 as its last component. Finally, we derive NDC depth as
focusDepthNDC = focusClip.z / focusClip.w
OK, now we have the depth, so we can calculate the 3D point that we want to keep under the mouse. First, let's invert the view-projection matrix, because this allows us to go from clip space to world space:
VPInv = inverse(VP)
Then, the point under the mouse is (I'll call it x):
x = VPInv * (mouseStartNDC.x, mouseStartNDC.y, focusDepthNDC, 1)
mouseStartNDC is the mouse position before the shift. Keep in mind that this needs to be in normalized device coordinates. If you only have screen space coordinates, then:
ndcX = 2 * screenX / windowWidth - 1
ndcY = -2 * screenY / windowHeight + 1
x is again a 4D vector. Do the perspective divide:
x *= 1.0 / x.w
Now we have our 3D point. We just need to find a shift of the camera that keeps this position under the mouse at the mouse location after the shift:
newX = VPInv * (mouseEndNDC.x, mouseEndNDC.y, focusDepthNDC, 1)
Do the perspective divide again:
newX *= 1.0 / newX.w
And finally update your camera center:
centre += (x - newX).xyz
This approach works with any camera model that you can express in matrix form.
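To make the steps above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration rather than part of the question's code: the helper names, an OpenGL-style projection (camera looking down -z), column vectors, and a camera that is only yawed and sits at a fixed distance behind the focus point.

```python
import math

def mat_mul(a, b):
    """Product of two 4x4 matrices stored as nested row lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def mat_vec(m, v):
    """Product of a 4x4 matrix and a 4D column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def inverse(m):
    """Gauss-Jordan inverse of a 4x4 matrix with partial pivoting."""
    aug = [list(m[i]) + [1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(4):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[4:] for row in aug]

def perspective(fov_y, aspect, near, far):
    """OpenGL-style perspective projection matrix."""
    t = 1.0 / math.tan(fov_y / 2.0)
    return [
        [t / aspect, 0.0, 0.0, 0.0],
        [0.0, t, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def view_matrix(centre, yaw, distance):
    """View matrix of a yawed camera placed `distance` behind `centre`."""
    c, s = math.cos(yaw), math.sin(yaw)
    axes = [(c, 0.0, -s), (0.0, 1.0, 0.0), (s, 0.0, c)]  # right, up, backward
    eye = [centre[i] + distance * axes[2][i] for i in range(3)]
    rows = [list(a) + [-sum(a[i] * eye[i] for i in range(3))] for a in axes]
    return rows + [[0.0, 0.0, 0.0, 1.0]]

def unproject(vp_inv, ndc_x, ndc_y, ndc_depth):
    """NDC point -> world space, including the perspective divide."""
    p = mat_vec(vp_inv, [ndc_x, ndc_y, ndc_depth, 1.0])
    return [c / p[3] for c in p[:3]]

def pan(centre, yaw, distance, proj, start_ndc, end_ndc):
    """Return (new centre, grabbed world point) for a drag from start_ndc to end_ndc."""
    vp = mat_mul(proj, view_matrix(centre, yaw, distance))
    focus_clip = mat_vec(vp, list(centre) + [1.0])
    focus_depth = focus_clip[2] / focus_clip[3]   # NDC depth of the focus point
    vp_inv = inverse(vp)
    x = unproject(vp_inv, start_ndc[0], start_ndc[1], focus_depth)
    new_x = unproject(vp_inv, end_ndc[0], end_ndc[1], focus_depth)
    return [c + a - b for c, a, b in zip(centre, x, new_x)], x

proj = perspective(math.radians(90.0), 1.0, 0.1, 100.0)
new_centre, grabbed = pan([1.0, 2.0, 3.0], 0.6, 5.0, proj, (0.1, -0.2), (0.4, 0.3))
```

After the update, projecting grabbed with the view matrix built from new_centre should land on the end mouse position, which is a handy sanity check. In real code you would most likely take the matrix inverse from your math library instead of writing one by hand.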
I have this function to get a 2D pixel location from a 3D coordinate position. The x, y, z inputs are pre-transform coordinates (in the range 1 to -1). This is a model-view architecture with the camera permanently at -3.5,0,0 looking at 0,0,0, while the object/scene coordinates are transformed by a horizontal xz rotation, a vertical y rotation, etc. to produce the final frame.
This function is mostly used to overlay 2D text on top of the 3D scene, where the 2D text is positioned relative to the underlying 3D scene.
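void My3D::Get2Dfrom3Dx(float x, float y, float z, float* psx, float* psy) {
    // Note: the declaration of screenCoord is not part of the posted snippet;
    // it is presumably filled from (x, y, z) before this point.
    XMVECTOR xmScreenCoord = XMLoadFloat3( (XMFLOAT3*) &screenCoord);
    XMMATRIX xmWorldViewProjection = XMLoadFloat4x4( (XMFLOAT4X4*) &m_WorldViewProjection);
    // Transforms the point and divides the result by w in one step.
    XMVECTOR result = XMVector3TransformCoord( xmScreenCoord, xmWorldViewProjection);
    XMStoreFloat3( (XMFLOAT3*) &screenCoord, result);
    // Map [-1, 1] to pixel coordinates; y is flipped so that it grows downward.
    screenCoord.x = ((screenCoord.x + 1.0f) / 2.0f) * m_nCurrWidth;
    screenCoord.y = ((-screenCoord.y + 1.0f) / 2.0f) * m_nCurrHeight;
    *psx = screenCoord.x;
    *psy = screenCoord.y;
}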
This function works perfectly when the scene is fully or mostly visible (the eyeat between -4 and -1.5).
I have a nagging problem with text showing up mirrored, at a 3D position where it should not be.
This happens when, for example, I'm viewing the image from below (60+ degrees upward, below the object) and zooming (moving the eyeat location closer, to say -.5,0,0). The text should not be visible, as it should be behind the eye (note that eyeat is not past 0,0,0, which really messes the image up), but somehow the above function produces screen x y coordinates that fall within the viewport in situations where they should not.
I think there must be a simple solution to this side effect, but I can't find it. Hopefully someone has seen this 2D mirroring effect before and knows the simple tweak.
I realize I could go down a more complex path of determining whether the view vector is opposite the target point and filtering that way, but I think there should be a simpler solution.
Again, the camera is permanently on the line from -3.5,0,0 to, say, -.5,0,0 as the world is transformed around it.
The problem lies in the way the projection works. Basically, the perspective projection will divide the x and y coordinates by the z coordinate. That's how you get the effect of perspective, i.e., that things that are farther away (larger z coordinate) appear smaller on screen. One issue with this perspective division is (simplified) that it doesn't work correctly for stuff that's behind the camera. Stuff behind the camera will have a negative z coordinate. When you divide x and y by a negative value, you'll have your point reflected around the origin. Which is exactly what you see. Since stuff that's located behind the camera is not going to be visible anyways, one way to solve this problem is to simply clip all geometry before dividing by z such that everything that has a negative z value is cut off and removed.
I assume the division in your code happens inside XMVector3TransformCoord(). As you note yourself, the text should not be visible in the problematic cases anyway. So I suggest you simply check whether the text is behind the camera and don't render it if it is. One way to do so would be to check the result of transforming your world-space position with the xmWorldViewProjection matrix and only continue if it happens to be in front of the camera. The check needs the homogeneous clip-space coordinates of your point, that is, the transformed position before the division by w; I assume here that xmScreenCoord holds them. The point will be in front of the camera iff the z coordinate of xmScreenCoord is larger than zero. So I guess you'd want to do something like
if (XMVectorGetZ(xmScreenCoord) > 0)
{
…
}
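To illustrate why the mirroring happens and how a sign test removes it, here is a small Python sketch. It is not DirectXMath code: project_to_screen is a made-up helper, the matrix is applied to column vectors, and the test is done on the clip-space w (the value the divide uses) rather than on z. Which of the two fits best depends on your projection matrix.

```python
def project_to_screen(world_view_proj, point, width, height):
    """Return pixel coordinates, or None if the point is behind the camera.

    Assumption: world_view_proj is a 4x4 nested list applied to column
    vectors, and clip-space w is positive for points in front of the camera.
    """
    x, y, z = point
    clip = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in world_view_proj]
    if clip[3] <= 0.0:
        # Dividing by a negative w would mirror the point through the
        # screen centre, so reject it before the divide.
        return None
    ndc_x, ndc_y = clip[0] / clip[3], clip[1] / clip[3]
    return ((ndc_x + 1.0) / 2.0 * width, (-ndc_y + 1.0) / 2.0 * height)

# Direct3D-style projection for a camera at the origin looking down +z.
near, far = 0.1, 100.0
wvp = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, far / (far - near), -near * far / (far - near)],
    [0.0, 0.0, 1.0, 0.0],
]
in_front = project_to_screen(wvp, (0.0, 0.0, 5.0), 800, 600)
behind = project_to_screen(wvp, (1.0, 1.0, -5.0), 800, 600)
```

Without the check, the second point would divide to (-0.2, -0.2), which lies inside the viewport even though the point is behind the camera.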
Sidenote due to the discussion in the comments below: When one wants to solve a problem involving the projections of objects on screen, one can often avoid explicitly computing the projection by instead transforming the problem into its dual and working directly in projective space on homogeneous coordinates. Since your problem is about placing text in 2D on screen, however, I don't think this is an option here.
You could place the geometry for drawing your text in clip space directly. You would start again by computing the clip-space coordinates of the 3D point to which you want your 2D text attached (by multiplying it with m_WorldViewProjection but not dividing by w). You can then generate homogeneous coordinates for the text geometry by simply offsetting the x and y coordinates from that point to get the corners of a quad, or whatever you need to construct. If you also scale the size of the quad by the w coordinate of the point, you will get a quad at that position that always projects to the same size on screen (since the premultiplication with w effectively cancels out the projection).
However, all you're effectively doing then is leaving the application of the projection and the necessary clipping to the GPU. If you want to render a large number of quads, that might be an option to consider, as it could be done completely on the GPU, e.g., using a geometry shader. However, if you just have a few text elements, it would be much simpler and probably also more efficient to just skip the drawing of text elements that would be behind the camera, as described above…
Michael's response was very helpful in keeping me on the path that the solution should be a simple comparison. In my case, I had to re-evaluate the coordinates by applying only the World transform, rather than the full WorldViewProjection. I call the result TargetTransformed. My comparison value was then simply the Eye/Camera location (this never gets adjusted, except for zoom, as the world is transformed around the Eye). Again, note that my Camera in this case is at -3.5,0,0 looking at 0,0,0 (center of model, really 8,0,0, thus a line through the center), so I had to compare the x component, not the z component. I add a bit of fudge (.1F), as the mirror artifact happens when the target is significantly behind the camera. In that case I return the final screenCoord locations translated way out into outer space (by -8000) to guarantee they are not seen in the viewport.
if ((Eye.x + 0.1F) > TargetTransformed.x)
{
    screenCoord.x += -8000;
    screenCoord.y += -8000;
    //TRACE("point is behind camera.\n");
    *psx = screenCoord.x;
    *psy = screenCoord.y;
}
else
{
    *psx = screenCoord.x;
    *psy = screenCoord.y;
}
And for completeness, my project has 2 view models. a) The first looks along a line through the center of the model; it can be translated to look from any direction and offset by screen x and y. This view model works fine with the code above. b) The second targets the camera at a focal point of the model and then allows full rotation around that arbitrary point (not the center of the model), which requires calculating a tricky additional translation matrix and vector that I call TargetViewTranslation. For this additional translation, the comparison also has to take the z component of the additional transform into account.
if ((Eye.x + 0.1F - m_structTargetViewTranslation.Z) > TargetTransformed.x)
{
    screenCoord.x += -8000;
    screenCoord.y += -8000;
    //TRACE("point is behind camera.\n");
    *psx = screenCoord.x;
    *psy = screenCoord.y;
}
else
{
    *psx = screenCoord.x;
    *psy = screenCoord.y;
}
And success, my mirrored text problem is resolved. Hopefully this helps others with the same problem. The takeaways: one may need to transform the test point by only the World transform, the check should be a simple comparison, and the location of the camera determines whether the x or z component is used. If you are translating the world in any additional ways, that translation can also affect which component is compared. Using TRACE and looking at the x y z values was helpful in figuring out which components I needed in my specific case.
I'm creating a 360° image player using the Oculus Rift SDK.
The scene consists of a cube with the camera placed at its center, able only to rotate in yaw, pitch and roll.
I've drawn the object using OpenGL, with a 2D texture for each face of the cube to create the 360° effect.
I would like to find the portion of the original texture that is actually shown in the Oculus viewport at a given instant.
Up to now, my approach has been to find the approximate pixel position of some significant points of the viewport (i.e. the central point and the corners) using the Euler angles, in order to identify the corresponding areas in the original textures.
Considering all the problems with Euler angles, this does not seem the smartest way to do it.
Is there a better approach?
Edit
I wrote a small example that can be run in the render loop:
// Keep the orientation from the Oculus (Point 1)
OVR::Matrix4f rotation = Matrix4f(hmdState.HeadPose.ThePose);
// Find the view vector for a certain point in the viewport, in this case the center (Point 2)
FovPort fov_viewport = FovPort::CreateFromRadians(hmdDesc.CameraFrustumHFovInRadians, hmdDesc.CameraFrustumVFovInRadians);
Vector2f temp2f = fov_viewport.TanAngleToRendertargetNDC(Vector2f(0.0,0.0)); // these values are the tangents at the center
Vector3f vector_view = Vector3f(temp2f.x, temp2f.y, -1.0); // add the third component: the direction the camera faces
vector_view.Normalize();
// Apply the rotation (Point 3)
Vector3f final_vect = rotation.Transform(vector_view); // seems the right operation
// An example to check if we are looking at the front face (Partial point 4)
if (abs(final_vect.z) > abs(final_vect.x) && abs(final_vect.z) > abs(final_vect.y) && final_vect.z < 0) {
    system("pause");
}
Is it right to consider the entire viewport, or should this be done for each eye separately?
How can a point of the viewport other than the center be specified? I don't really understand which values should be passed to TanAngleToRendertargetNDC().
You can get a full rotation matrix by passing the camera pose quaternion to the OVR::Matrix4 constructor.
You can take any 2D position in the eye viewport and convert it to its camera space 3D coordinate by using the fovPort tan angles. Normalize it and you get the direction vector in camera space for this pixel.
If you apply the rotation matrix gotten earlier to this direction vector you get the actual direction of that ray.
Now you have to convert from this direction to your texture UV. The component with the highest absolute value in the direction vector will give you the face of the cube it's looking at. The remaining components can be used to find the actual 2D location on the texture. This depends on how your cube faces are oriented, if they are x-flipped, etc.
If you are working on the rendering part of the viewer, you will want to do this in a shader. If the goal is to find where the user is looking in the original image, or the extent of their field of view, then only a handful of rays would suffice, as you wrote.
edit
Here is a bit of code to go from tan angles to camera space coordinates.
float u = (x / eyeWidth) * (leftTan + rightTan) - leftTan;
float v = (y / eyeHeight) * (upTan + downTan) - upTan;
float w = 1.0f;
x and y are pixel coordinates, eyeWidth and eyeHeight are eye buffer size, and *Tan variables are the fovPort values. I first express the pixel coordinate in [0..1] range, then scale that by the total tan angle for the direction, and then recenter.
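Putting the two ends of this together, a sketch in Python might look like the following. The function names are made up, and several conventions are assumptions you would have to adapt: the camera looks down -z (as in the question's code), pixel y grows downward, and the per-face flips that depend on how the cube textures are laid out are left out. The head rotation would be applied to the direction between the two calls.

```python
import math

def pixel_to_direction(x, y, eye_width, eye_height,
                       left_tan, right_tan, up_tan, down_tan):
    """Unit direction in camera space for a pixel of the eye buffer."""
    u = (x / eye_width) * (left_tan + right_tan) - left_tan
    v = (y / eye_height) * (up_tan + down_tan) - up_tan
    # Assumption: -z is forward and +y is up, so v is negated.
    d = (u, -v, -1.0)
    length = math.sqrt(sum(c * c for c in d))
    return tuple(c / length for c in d)

def cube_face(direction):
    """Return (face name, u, v) with u and v in [0, 1] on that face."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    # The component with the largest magnitude selects the face.
    if ax >= ay and ax >= az:
        face, major, a, b = ("+x" if x > 0 else "-x"), ax, z, y
    elif ay >= az:
        face, major, a, b = ("+y" if y > 0 else "-y"), ay, x, z
    else:
        face, major, a, b = ("+z" if z > 0 else "-z"), az, x, y
    # Scale the two remaining components to the face and map [-1, 1] to [0, 1].
    return face, (a / major + 1.0) / 2.0, (b / major + 1.0) / 2.0

centre_dir = pixel_to_direction(500, 500, 1000, 1000, 1.0, 1.0, 1.0, 1.0)
face, tex_u, tex_v = cube_face(centre_dir)
```

With an unrotated head and symmetric tan angles, the centre pixel ends up in the middle of the -z face, which matches the "front face" test in the question.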
In OpenGL I'm trying to create a free-flight camera. My problem is the rotation about the Y axis. The camera should always rotate about the world Y axis and not about its local one. I have tried several matrix multiplications, but all without success. Using
camMatrix = camMatrix * yrotMatrix
rotates the camera about its local axis, and using
camMatrix = yrotMatrix * camMatrix
rotates the camera about the world axis, but always around the origin. However, the rotation center should be the camera. Does anybody have an idea?
One of the more tricky aspects of 3D programming is getting complex transformations right.
In OpenGL, every point is transformed with the model/view matrix and then with the projection matrix.
The model/view matrix takes each point and translates it to where it should be from the point of view of the camera. The projection matrix converts the point's coordinates so that the X and Y coordinates can be mapped to the window easily.
To get the model/view matrix right, you have to start with an identity matrix (one that doesn't change the vertices), then apply the transforms for the camera's position and orientation, then those for the object's position and orientation, in reverse order.
Another thing you need to keep in mind is, rotations are always about an axis that is centered on the origin (0,0,0). So when you apply a rotate transform for the camera, whether you are turning it (as you would turn your head) or orbiting it around the origin (as the Earth orbits the Sun) depends on whether you have previously applied a translation transform.
So if you want to both rotate and orbit the camera, you need to:
Apply the rotation(s) to orient the camera
Apply translation(s) to position it
Apply rotation(s) to orbit the camera round the origin
(Optionally) apply translation(s) to move the camera, in its set orientation, so that it orbits a point other than (0,0,0).
Things can get more complex if you, say, want to point the camera at a point that is not (0,0,0) and also orbit that point at a set distance, while also being able to pitch or yaw the camera. See here for an example in WebGL. Look for GLViewerBase.prototype.display.
The Red Book covers transforms in much more detail.
Also note gluLookAt, which you can use to point the camera at something, without having to use rotations.
Rather than doing this using matrices, you might find it easier to create a camera class which stores a position and orthonormal n, u and v axes, and rotate them appropriately, e.g. see:
https://github.com/sgolodetz/hesperus2/blob/master/Shipwreck/MapEditor/GUI/Camera.java
and
https://github.com/sgolodetz/hesperus2/blob/master/Shipwreck/MapEditor/Math/MathUtil.java
Then you write things like:
if(m_keysDown[TURN_LEFT])
{
    m_camera.rotate(new Vector3d(0,0,1), deltaAngle);
}
When it comes time to set the view for the camera, you do:
gl.glLoadIdentity();
glu.gluLookAt(m_position.x, m_position.y, m_position.z,
              m_position.x + m_nVector.x, m_position.y + m_nVector.y, m_position.z + m_nVector.z,
              m_vVector.x, m_vVector.y, m_vVector.z);
If you're wondering how to rotate about an arbitrary axis like (0,0,1), see MathUtil.rotate_about_axis in the above code.
If you don't want to transform based on the camera from the previous frame, my suggestion might be just to throw out the matrix compounding and recalc it every frame. I don't think there's a way to do what you want with a single matrix, as that stores the translation and rotation together.
I guess if you just want a pitch/yaw camera only, just store those values as two floats, and then rebuild the matrix based on that. Maybe something like pseudocode:
onFrameUpdate() {
    newPos = camMatrix * (0,0,speed) //move forward along the camera axis
    yaw += mouse_move_x;   //horizontal mouse movement turns about the world Y axis
    pitch += mouse_move_y; //vertical mouse movement tilts about the X axis
    camMatrix = identity.translate(newPos)
    camMatrix = rotate(camMatrix, (0,1,0), yaw)
    camMatrix = rotate(camMatrix, (1,0,0), pitch)
}
"rotates the camera along the world axis, but always around the origin. However, the rotation center should be the camera. Somebody an idea?"
I assume the matrix is stored in memory this way (the numbers are the element indices, as if the matrix were a linear 1D array):
0 1 2 3 //row 0
4 5 6 7 //row 1
8 9 10 11 //row 2
12 13 14 15 //row 3
Solution:
Store the last row of the camera matrix in a temporary variable.
Set the last row of the camera matrix to (0, 0, 0, 1).
Use camMatrix = yrotMatrix * camMatrix.
Restore the last row of the camera matrix from the temporary variable.
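The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are made up, camMatrix is taken to be the camera's world transform, and the matrix is written in column-vector math, so the translation sits in the last column of the nested list (the elements that occupy indices 12 to 14 in OpenGL's memory layout).

```python
import math

def mat_mul(a, b):
    """Product of two 4x4 matrices stored as nested row lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def rotation_y(angle):
    """Rotation about the world Y axis (column-vector convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c, 0.0, s, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [-s, 0.0, c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def rotate_about_world_y_at_camera(cam, angle):
    """Rotate cam about the world Y axis through the camera's own position."""
    # 1. Store the translation part.
    position = [cam[i][3] for i in range(3)]
    # 2. Clear it, so the camera temporarily sits at the origin.
    rot_only = [row[:3] + [0.0] for row in cam[:3]] + [[0.0, 0.0, 0.0, 1.0]]
    # 3. Apply the world-space rotation.
    rotated = mat_mul(rotation_y(angle), rot_only)
    # 4. Restore the translation.
    for i in range(3):
        rotated[i][3] = position[i]
    return rotated

cam = [
    [1.0, 0.0, 0.0, 5.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 1.0],
]
turned = rotate_about_world_y_at_camera(cam, math.pi / 2)
```

The camera keeps its position (5, 1, 2) while its axes turn. Multiplying yrotMatrix * camMatrix directly would also swing that position around the origin, which is the behaviour the question complains about.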
I am currently struggling to find a formula to rotate my OpenGL "camera" (I tried to do it via a scene rotation, but have the same issue).
Basically my camera is at a given position, looking at a given point (both passed to gluLookAt), and I would like to rotate the camera, upwards for example, while still looking at the same point.
What is the right process?
What input data should I use to decide the amount of movement: the change in 2D mouse coordinates, or the change in 3D unprojected mouse coordinates?
The trick is to see that a camera-rotation is the same as a scene rotation if you do it at the correct position. Move the camera into the point around which you want to rotate, then rotate the camera, then move back out by the same distance you moved in.
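One way to read that recipe, sketched in Python: express the eye relative to the pivot, rotate, and add the pivot back. The function name is made up, and the sketch rotates about a vertical (world Y) axis only; an upward rotation would use the camera's right axis instead.

```python
import math

def orbit_about_point(eye, pivot, angle):
    """Rotate eye about a world-Y axis through pivot by angle (radians)."""
    # 1. Move into the pivot: express the eye relative to it.
    rel = [e - p for e, p in zip(eye, pivot)]
    # 2. Rotate about the Y axis, which now passes through the origin.
    c, s = math.cos(angle), math.sin(angle)
    rotated = [c * rel[0] + s * rel[2], rel[1], -s * rel[0] + c * rel[2]]
    # 3. Move back out by the same offset.
    return [r + p for r, p in zip(rotated, pivot)]

new_eye = orbit_about_point([1.0, 0.0, 5.0], [1.0, 0.0, 0.0], math.pi / 2)
```

The new eye position keeps its distance to the pivot, so passing it to gluLookAt together with the unchanged look-at point keeps the camera aimed at the same spot.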
The amount by which you rotate depends on your application. Take Google Earth as an example: if you are close to the surface the rotation is small (in absolute terms); if you are far from the surface it is large.
If you're creating an orbiting camera (orbiting around LookAt) for OpenGL, I suggest you build it from this data:
LookAtPosition - 3D vector
CamUp - 3D unit vector
RelativeCamPosition - 3D unit vector
CamDistance - decimal number
LookAtPosition is the point you'll be looking at. CamUp is the vector that points up from the camera; you can see it on this image. It's best to initialize the camera with no rotation, so that CamUp = [0,1,0]. Note that it's a unit vector, so its magnitude/size/length is always 1. RelativeCamPosition is again a unit vector. You get it by taking the vector from LookAt to the camera and dividing it by its magnitude, which you save in CamDistance. In the initialized state it might look like this:
LookAtPosition = [0,0,0]
CamUp = [0,1,0]
RelativeCamPosition = [1,0,0]
CamDistance = 10
You can now get camera position by
CamPosition = LookAtPosition + RelativeCamPosition * CamDistance
But you need to rotate the camera around, right? Well, there's a reason for the unit vectors: they are easy to use in calculations. I believe you use angles for rotating, so you only need sine and cosine. The rotate function might look like this:
Rotate(angleX, angleY){
    RelativeCamPosition.x = sin(angleX)*cos(angleY);
    RelativeCamPosition.z = cos(angleX)*cos(angleY);
    RelativeCamPosition.y = sin(angleY);
}
where angleX and angleY are absolute (NOT RELATIVE) rotations in the horizontal and vertical directions. You should always use absolute rotations, because floating-point errors can accumulate when adding increments. Anyway, I just made those calculations on a scrap of paper, so I hope they're all right.
Edit: I've just noticed that these formulas assume a particular initial state: with angleX = angleY = 0 they give RelativeCamPosition = [0,0,1], so the [1,0,0] state I wrote above corresponds to angleX = 90 degrees. However, it shouldn't be hard to edit them so that they work for an arbitrary initial state.
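For reference, here is the same idea as a small runnable Python sketch. The function name is made up, angles are in radians, and the formulas are the ones from the Rotate function above, so zero angles put the camera on the +z side of the look-at point.

```python
import math

def orbit_camera_position(look_at, angle_x, angle_y, distance):
    """CamPosition = LookAtPosition + RelativeCamPosition * CamDistance."""
    relative = (
        math.sin(angle_x) * math.cos(angle_y),  # x
        math.sin(angle_y),                      # y
        math.cos(angle_x) * math.cos(angle_y),  # z
    )
    return tuple(l + r * distance for l, r in zip(look_at, relative))

start = orbit_camera_position((0.0, 0.0, 0.0), 0.0, 0.0, 10.0)
side = orbit_camera_position((0.0, 0.0, 0.0), math.pi / 2, 0.0, 10.0)
```

Because the relative vector is rebuilt from the absolute angles every time, it stays a unit vector and the camera stays exactly CamDistance away from the look-at point.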
Heyo,
I'm currently working on a project where I need to place the camera such that the full motion of a character would be viewable without moving the camera. I have the position where the character starts, as well as the maximum distance that the character will travel in all three directions (X,Y, & Z). I also have the field of view (which is 90 degrees).
Is there an equation that'll figure out where I need to place the camera so it won't have to move to see the full motion?
Note: this is using OpenGL.
Clarification: The camera should be "in front" of the character that's in the motion, not above.
It'll also be moving along a ground plane.
If you make a bounding sphere of the points, all you need to do is keep the camera at a distance greater than or equal to the radius of the bounding sphere / sin(FOV/2).
For example, if you have a bounding sphere with radius Radius, and a specified Field of View FOV, your camera just needs to be at a point "Dist" away, pointing towards the center of the bounding sphere.
The equation for calculating the distance is:
Dist = Radius / sin( FOV/2 );
This will work in 3D, for a camera at any orientation.
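As a quick worked example in Python (the function names are made up, and the bounding radius assumes the character can travel up to the given maximum in both directions along each axis from its start point):

```python
import math

def bounding_radius(max_dx, max_dy, max_dz):
    """Radius of a sphere around the start point containing all reachable points."""
    return math.sqrt(max_dx ** 2 + max_dy ** 2 + max_dz ** 2)

def camera_distance(radius, fov_degrees):
    """Dist = Radius / sin(FOV / 2)."""
    return radius / math.sin(math.radians(fov_degrees) / 2.0)

radius = bounding_radius(3.0, 4.0, 0.0)
dist = camera_distance(radius, 90.0)
```

For a 90 degree field of view the factor 1 / sin(45°) is about 1.414, so the camera has to be roughly 1.4 radii away from the centre of the sphere. If the horizontal and vertical fields of view differ, using the smaller one is the safe choice.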
Simply having the maximum range of (X, Y, Z) is not on its own sufficient, because the viewing port is essentially pyramid shaped, with the apex of the pyramid being at the eye position.
For the sake of argument, let's assume that all movement is in the (X, Z) plane (i.e. the ground), and the eye is directly above the origin 10m along the Y axis.
Assuming a square viewport, with your 90˚ field of view you'd be able to see ±10m along both the X and Z axes, but only for objects that are on the ground (Y = 0). As soon as they come off the ground, your view is reduced. If an object is 1m off the ground, then your (X, Z) extent is only ±9m.
Clearly a real camera could be placed anywhere in the scene, facing any direction. Even the "roll" angle of the camera could change how much is visible. There are actually infinitely many such camera points, so you will need to constrain your criteria somewhat.
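The numbers in this example follow from simple trigonometry, sketched here in Python. The function name is made up, and the setup is the one assumed above: a camera looking straight down from a given height with a square viewport.

```python
import math

def visible_half_extent(camera_height, object_height, fov_degrees):
    """Half-width of the visible square at the object's height."""
    depth = camera_height - object_height  # distance from the eye along the view axis
    return depth * math.tan(math.radians(fov_degrees) / 2.0)

on_ground = visible_half_extent(10.0, 0.0, 90.0)
raised = visible_half_extent(10.0, 1.0, 90.0)
```

Since tan(45°) is 1, the visible extent at 90 degrees simply equals the remaining distance to the camera, which is where the ±10m and ±9m come from.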
Take the line segment from the start point to the end point. Construct a plane orthogonal to this segment through its midpoint. Then position the camera somewhere in this plane, looking at the point where the plane intersects the segment, at a distance from that point of more than the value below. The up vector of the camera must lie in the plane, and the horizontal field of view must be 90 degrees.
distance = sqrt(dx^2 + dy^2 + dz^2) / 2
At exactly this distance, the start point and the end point lie on the left and right borders of the viewport, vertically centered; farther away, they move inward.
Another solution might be to write a function that takes the startpoint, the endpoint, and the desired position of both points on the screen. Then just solve the projection equation for the camera transformation.
It depends. For example, if the object is gonna move in a plane, you can just place the camera outside a ball circumscribing its movement area (this relies on the FOV being 90, which is a fortunate angle).
If the object is gonna move in 3D, it's much more difficult. It would help if you'd specify the region where the object moves (cube vs. ball...) and the direction you want to see it from.