How to do panning/zooming operations in DirectX/C++

For learning purposes I'm trying to create a simple drawing program. Right now I'm trying to add panning and zooming features.
This is how I update my camera; m_rightward and m_upward are set to 1 or 0 depending on which of the WASD keys is pressed.
void Camera::updateCamera(int width, int height)
{
    this->height = height;
    this->width = width;
    cc.client_height = this->height;
    cc.client_width = this->width;
    cc.m_time = ::GetTickCount();

    // panning and zooming happen here
    new_pos = world_cam.getTranslation() + world_cam.getZDirection() * (m_forward * 0.1f);
    new_pos = new_pos + world_cam.getXDirection() * (m_rightward * 20);
    new_pos = new_pos + world_cam.getYDirection() * (m_upward * 20);
    world_cam.setTranslation(new_pos);

    temp = world_cam;
    temp.inverse();
    cc.m_view = temp;

    cc.m_proj.setOrthoLH(width, height, -1.0f, 1.0f);
}
And this is how I fill my line strip to draw:
void AppWindow::onLeftMouseDown(const Point& mouse_pos)
{
    Point pos = screenToClient(mouse_pos);
    clickedPoint = Vector3D(pos.m_x, pos.m_y, 0.0f);
    if (isLPressed)
    {
        if (counter == 0)
        {
            LineStrip.push_back(Line(clickedPoint));
            counter++;
        }
        else
        {
            LineStrip.back().addPoint(clickedPoint);
            counter++;
        }
    }
}
And this is the vertex shader:
VS_OUTPUT vsmain(VS_INPUT input)
{
    VS_OUTPUT output = (VS_OUTPUT) 0;
    float4 new_pos = mul(input.position, m_world);
    float clip_x = (new_pos.x / client_width) * 2.0 - 1.0;
    float clip_y = 1.0 - (new_pos.y / client_height) * 2.0;
    output.position = float4(clip_x, clip_y, 0, 1);
    output.color = input.color;
    return output;
}
Right now I can draw lines as I wanted. The problem is I can't make panning and zooming work; the view just doesn't move at all. I'm guessing the way I fill the line strip and the vertex shader output are wrong. If any of you can help me out of this I will be very glad.
Thank you.

Your main issue here stems from a misunderstanding of how 3D space works.
There are five main spaces used in the 3D rendering pipeline: Local Space, World Space, View Space, Projection Space/Homogeneous Clip Space, and Screen Space. The matrices used to convert between them are named for the space they convert to:
The World Matrix converts from Local Space to World Space.
The View Matrix converts from World Space to View Space.
The Projection Matrix converts from View Space to Projection Space.
The rasterizer does the final transformation based on the settings in the D3D11_VIEWPORT structure, specified in the ID3D11DeviceContext::RSSetViewports call.
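For example, a viewport covering the whole client area might be filled in like this (a minimal sketch; the device context and size variables are assumptions, not names from the question):
#include <d3d11.h>

// Sketch: a viewport matching the client area, for the rasterizer stage.
D3D11_VIEWPORT vp = {};
vp.TopLeftX = 0.0f;
vp.TopLeftY = 0.0f;
vp.Width    = static_cast<float>(clientWidth);   // assumed variables
vp.Height   = static_cast<float>(clientHeight);
vp.MinDepth = 0.0f;
vp.MaxDepth = 1.0f;
deviceContext->RSSetViewports(1, &vp);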
Your system is simple enough that you don't need a world matrix. Since you presumably are making a 2D drawing program with the 3D API, you can specify the points for each line directly in world space.
In order to support camera operations, however, you need a view matrix, and in order to do the rendering properly, you will need a projection matrix and a D3D11_VIEWPORT structure with the size of the client area filled in.
In your code you create an orthographic projection matrix, which is perfect (though your near and far plane specifications should be the minimum and maximum zoom you support). You don't, however, create the view matrix properly. The view matrix is the inverse of the camera's world matrix, and it can be created directly with XMMatrixLookAtLH or XMMatrixLookToLH (LookTo is likely the better option for your system).
Finally, when you update the shader's constant buffer with the new matrix, multiply the view and projection matrices together before writing the new combined matrix down to the buffer.
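A minimal sketch of those two steps with DirectXMath (all member and field names here are assumptions; cc mirrors the question's constant buffer, but with a single combined matrix):
#include <DirectXMath.h>
using namespace DirectX;

// Sketch: build view and projection, combine them once on the CPU, and
// store the result for the constant-buffer upload.
void updateViewProjection(Constants& cc, float width, float height,
                          XMFLOAT3 camPos, XMFLOAT3 camDir,
                          float nearZ, float farZ)
{
    XMMATRIX view = XMMatrixLookToLH(XMLoadFloat3(&camPos),
                                     XMLoadFloat3(&camDir),
                                     XMVectorSet(0.0f, 1.0f, 0.0f, 0.0f));
    XMMATRIX proj = XMMatrixOrthographicLH(width, height, nearZ, farZ);

    // One combined matrix; transpose for HLSL's default column_major packing.
    XMStoreFloat4x4(&cc.m_viewprojection, XMMatrixTranspose(view * proj));
}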
As a consequence of this new complexity, you no longer need the majority of the code in the vertex shader - you can do the multiplication directly:
VS_OUTPUT vsmain(VS_INPUT input)
{
    VS_OUTPUT output = (VS_OUTPUT) 0;
    output.position = mul(input.position, m_viewprojection);
    output.color = input.color;
    return output;
}
The final stage is adding new points. Because of the added view and projection matrices, the coordinates you get from the mouse down event are not world space coordinates. You need to convert them to clip space, then multiply them by the inverse of the combined view/projection matrix to convert them into the real coordinates in world space. Then you can add the new line segment to the strip, just as you do in your code now.
The conversion to clip space is basic algebra. The transformation the rasterizer performs is (note the Y flip: screen Y grows downward, while projection-space Y grows upward):
[ScreenX, ScreenY] = [(ProjX + 1) * (ScreenWidth / 2), (1 - ProjY) * (ScreenHeight / 2)]
To convert back to projection space:
[ProjX, ProjY] = [ScreenX * (2 / ScreenWidth) - 1, 1 - ScreenY * (2 / ScreenHeight)]
From there, you can do a matrix-vector multiplication to get the X and Y coordinates in world space. Since this app is 2D (presumably), you can ignore the lost Z component and just take X and Y. If you ever convert to 3D however, you will need to project a ray from the X and Y coordinates out into the world to find the object to select, or somehow fix the Z coordinate with a different algorithm.
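A sketch of that unprojection for the 2D case (an editor's illustration with DirectXMath; the function and parameter names are assumptions, not the asker's API):
#include <DirectXMath.h>
using namespace DirectX;

// Sketch: convert a mouse click in client coordinates to world space.
// With an orthographic camera the Z component can simply be taken as 0.
XMFLOAT3 screenToWorld(float sx, float sy, float clientW, float clientH,
                       const XMMATRIX& view, const XMMATRIX& proj)
{
    // Screen -> projection (clip) space, including the Y flip.
    float px = sx * (2.0f / clientW) - 1.0f;
    float py = 1.0f - sy * (2.0f / clientH);

    // Projection -> world: multiply by the inverse of view * projection.
    XMMATRIX invViewProj = XMMatrixInverse(nullptr, view * proj);
    XMVECTOR world = XMVector3TransformCoord(XMVectorSet(px, py, 0.0f, 1.0f),
                                             invViewProj);
    XMFLOAT3 out;
    XMStoreFloat3(&out, world);
    return out;
}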

Related

OpenGL - Why is my ray picking not working?

I recently set up a project that uses OpenGL (via the C# wrapper library OpenTK) which should do the following:
Create a perspective projection camera - this camera will be used to make the user rotate,move etc. to look at my 3d models.
Draw some 3d objects.
Use 3d ray picking via unproject to let the user pick points/models in the 3d view.
The last step (ray picking) looks ok on my 3d preview (GLControl) but returns invalid results like Vector3d (1,86460186949617; -45,4086124979203; -45,0387025610247). I have no idea why this is the case!
I am using the following code to set up my viewport:
this.RenderingControl.MakeCurrent();
int w = RenderingControl.Width;
int h = RenderingControl.Height;
// Use all of the glControl painting area
GL.Viewport(0, 0, w, h);
GL.MatrixMode(MatrixMode.Projection);
GL.LoadIdentity();
Matrix4 p = Matrix4.CreatePerspectiveFieldOfView(MathHelper.PiOver4, w / (float)h, 0.1f, 64.0f);
GL.LoadMatrix(ref p);
I use this method for unprojecting:
/// <summary>
/// This method maps screen coordinates to viewport coordinates.
/// </summary>
/// <param name="screen"></param>
/// <param name="view"></param>
/// <param name="projection"></param>
/// <param name="view_port"></param>
/// <returns></returns>
private Vector3d UnProject(Vector3d screen, Matrix4d view, Matrix4d projection, int[] view_port)
{
    Vector4d pos = new Vector4d();
    // Map x and y from window coordinates to the range -1 to 1
    pos.X = (screen.X - (float)view_port[0]) / (float)view_port[2] * 2.0f - 1.0f;
    pos.Y = (screen.Y - (float)view_port[1]) / (float)view_port[3] * 2.0f - 1.0f;
    pos.Z = screen.Z * 2.0f - 1.0f;
    pos.W = 1.0f;
    Vector4d pos2 = Vector4d.Transform(pos, Matrix4d.Invert(Matrix4d.Mult(view, projection)));
    Vector3d pos_out = new Vector3d(pos2.X, pos2.Y, pos2.Z);
    return pos_out / pos2.W;
}
I use this code to position my camera (including rotation) and do the ray picking:
// Clear buffers
GL.Clear(ClearBufferMask.ColorBufferBit | ClearBufferMask.DepthBufferBit);
// Apply camera
GL.MatrixMode(MatrixMode.Modelview);
Matrix4d mv = Matrix4d.LookAt(EyePosition, Vector3d.Zero, Vector3d.UnitY);
GL.LoadMatrix(ref mv);
GL.Translate(0, 0, ZoomFactor);
// Rotation animation
if (RotationAnimationActive)
{
    CameraRotX += 0.05f;
}
if (CameraRotX >= 360)
{
    CameraRotX = 0;
}
GL.Rotate(CameraRotX, Vector3.UnitY);
GL.Rotate(CameraRotY, Vector3.UnitX);
// Apply useful rotation
GL.Rotate(50, 90, 30, 0f);
// Draw Axes
drawAxes();
// Draw vertices of my 3d objects ...
// Picking Test
int x = MouseX;
int y = MouseY;
int[] viewport = new int[4];
Matrix4d modelviewMatrix, projectionMatrix;
GL.GetDouble(GetPName.ModelviewMatrix, out modelviewMatrix);
GL.GetDouble(GetPName.ProjectionMatrix, out projectionMatrix);
GL.GetInteger(GetPName.Viewport, viewport);
// get depth of clicked pixel
float[] t = new float[1];
GL.ReadPixels(x, RenderingControl.Height - y, 1, 1, OpenTK.Graphics.OpenGL.PixelFormat.DepthComponent, PixelType.Float, t);
var res = UnProject(new Vector3d(x, viewport[3] - y, t[0]), modelviewMatrix, projectionMatrix, viewport);
GL.Begin(BeginMode.Lines);
GL.Color3(Color.Yellow);
GL.Vertex3(0, 0, 0);
GL.Vertex3(res);
Debug.WriteLine(res.ToString());
GL.End();
I get the following result from my ray picker:
Clicked Position = (1,86460186949617; -45,4086124979203; -45,0387025610247)
This vector is shown as the yellow line on the attached screenshot.
Why are the Y and Z positions not in the range -1/+1? Where do these values like -45 come from, and why is the ray rendered correctly on the screen?
If you have only a tip about what could be broken I would also appreciate your reply!
Screenshot:
If you break down the transform from screen to world into individual matrices, print out the inverses of the M, V, and P matrices, and print out the intermediate result of each (matrix inverse) * (point) calculation from screen to world/model, then I think you'll see the problem. Or at least you'll see that there is a problem with using the inverse of the M-V-P matrix and then intuitively grasp the solution. Or maybe just read the list of steps below and see if that helps.
Here's the approach I've used (a condensed GLM sketch follows the list of steps):
1. Convert the 2D vector for mouse position in screen/control/widget coordinates to the 4D vector (mouse.x, mouse.y, 0, 1).
2. Transform the 4D vector from screen coordinates to Normalized Device Coordinates (NDC) space. That is, multiply the inverse of your NDC-to-screen matrix [or equivalent equations] by (mouse.x, mouse.y, 0, 1) to yield a 4D vector in NDC coordinate space: (nx, ny, 0, 1).
3. In NDC coordinates, define two 4D vectors: the source (near point) of the ray as (nx, ny, -1, 1) and a far point at (nx, ny, +1, 1).
4. Multiply each 4D vector by the inverse of the (perspective) projection matrix.
5. Convert each 4D vector to a 3D vector (i.e. divide through by the fourth component, often called "w"). *
6. Multiply the 3D vectors by the inverse of the view matrix.
7. Multiply the 3D vectors by the inverse of the model matrix (which may well be the identity matrix).
8. Subtract the 3D vectors to yield the ray.
9. Normalize the ray.
Yee-haw. Go back and justify each step with math, if desired, or save that review for later [if ever] and work frantically towards catching up on creating actual 3D graphics and interaction and whatnot.
Go back and refactor, if desired.
(* The framework I use allows multiplication of a 3D vector by a 4x4 matrix because it treats the 3D vector as a 4D vector. I can make this more clear later, if necessary, but I hope the point is reasonably clear.)
That worked for me. This set of steps also works for Ortho projections, though with Ortho you could cheat and write simpler code since the projection matrix isn't goofy.
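Condensed into code, the list above might look like this with GLM (an editor's sketch, not the poster's implementation; it folds the projection, view, and model inverses into a single matrix, which is equivalent for points):
#include <glm/glm.hpp>

// Sketch: build a world-space picking ray from a mouse position.
// 'viewport' is (x, y, width, height); 'view' should include the model
// transform if your objects aren't at the identity.
struct Ray { glm::vec3 origin, direction; };

Ray screenToRay(glm::vec2 mouse, glm::vec4 viewport,
                const glm::mat4& view, const glm::mat4& proj)
{
    // Screen -> NDC (flip Y: window origin is top-left, NDC is bottom-left).
    float nx = (mouse.x - viewport.x) / viewport.z * 2.0f - 1.0f;
    float ny = 1.0f - (mouse.y - viewport.y) / viewport.w * 2.0f;

    glm::mat4 invVP = glm::inverse(proj * view);

    // Near and far points on the ray, unprojected with the w divide.
    glm::vec4 nearP = invVP * glm::vec4(nx, ny, -1.0f, 1.0f);
    glm::vec4 farP  = invVP * glm::vec4(nx, ny,  1.0f, 1.0f);
    glm::vec3 n = glm::vec3(nearP) / nearP.w;
    glm::vec3 f = glm::vec3(farP)  / farP.w;

    return { n, glm::normalize(f - n) };
}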
It's late as I write this and I may have misinterpreted your problem. I may have also misinterpreted your code since I use a different UI framework. But I know how aggravating ray casting for OpenGL can be, so I'm posting in the hope that at least some of what I write is useful, and that I can thereby alleviate some human misery.
Postscript. Speaking of misery: I found numerous forum posts and blog pages that address ray casting for OpenGL, but most posts start with some variant of the following: "First, you have to know X" [where X is not necessary to know]; or "Go look at the unproject function [in library X in repository Y, for which you'll need client app Z ...]"; or a particular favorite of mine: "Review a textbook on linear algebra."
Having to slog through yet another description of the OpenGL rendering pipeline or the OpenGL transformation conga line when you just need to debug ray casting--a common problem--is like having to listen to a lecture on hydraulics when you discover your brake pedal isn't working.

OpenGL 2D Text in 3D Space [C++/GLM] Matrix Multiplication

I was trying to display 2D text using 3D coordinates.
I was following this tutorial (Solution #1: The 2D way). I did everything as shown in the tutorial, but something is probably wrong. Here is the code:
void Update()
{
    glm::mat4 projectionMatrix = glm::perspective(45.0, 640.0/480.0, 0.01, 500.0);
    glm::mat4 viewMatrix = glm::lookAt(glm::vec3(0.0, 0.0, 0.0), glm::vec3(0.0, 0.0, -5.0), glm::vec3(0.0, 1.0, 0.0));
    glm::vec4 worldSpace(0.f, 1.0, -5.f, 1.0);
    glm::vec4 screenSpace = projectionMatrix * viewMatrix * worldSpace;
    screenSpace /= screenSpace.w;
    ovlay.setPosition(screenSpace.x, screenSpace.y);
}
projectionMatrix is the perspective that I'm using.
viewMatrix is my camera position and direction.
worldSpace is the position in 3D that I want to use to calculate 2D coords.
screenSpace should give me the position in 2D space but I get some weird result:
x = 0, y = 0.358518
I think it should be something like x = 320, y = 100.
If someone knows what I did wrong I'd be thankful.
Well, what this code calculates are the normalized device coordinates of the worldSpace point. The viewing volume is [-1,1] along all axes in this space, so x=0 is exactly at the center and y=0.358518 is somewhere above the center.
If you want the window space position, you need to take the viewport into account. Assuming your viewport fills the whole window of size w * h pixels, you can get the window position as:
wx = (x + 1.0f) * 0.5f * w;
wy = (y + 1.0f) * 0.5f * h;
Assuming the 640x480 resolution suggested by your projection matrix, this would give (320, 326). I don't know why you'd expect y to be 100. Note that GL uses the bottom left corner as origin. In typical window systems, the origin is at the top, so y=326 in GL would match y'=153 in that other convention.
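Putting the whole chain together, a sketch of the conversion might look like this (illustrative only; worldToWindow is not from the question's code):
#include <glm/glm.hpp>

// Sketch: project a world-space point to window coordinates with a
// bottom-left origin, as GL uses; flip Y afterwards for top-left systems.
glm::vec2 worldToWindow(const glm::vec4& worldPos,
                        const glm::mat4& view, const glm::mat4& proj,
                        float w, float h)
{
    glm::vec4 clip = proj * view * worldPos;
    glm::vec3 ndc  = glm::vec3(clip) / clip.w;   // perspective divide -> [-1, 1]
    return glm::vec2((ndc.x + 1.0f) * 0.5f * w,
                     (ndc.y + 1.0f) * 0.5f * h);
}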

Cascaded Shadow maps not quite right

Ok. So, I've been messing around with shadows in my game engine for the last week. I've mostly implemented cascading shadow maps (CSM), but I'm having a bit of a problem with shadowing that I just can't seem to solve.
The only light in this scene is a directional light (sun), pointing {-0.1 -0.25 -0.65}. I calculate 4 sets of frustum bounds for the four splits of my CSMs with this code:
// each projection matrix calculated with same near plane, different far
Frustum make_worldFrustum(const glm::mat4& _invProjView) {
    Frustum fr; glm::vec4 temp;
    temp = _invProjView * glm::vec4(-1, -1, -1, 1);
    fr.xyz = glm::vec3(temp) / temp.w;
    temp = _invProjView * glm::vec4(-1, -1, 1, 1);
    fr.xyZ = glm::vec3(temp) / temp.w;
    // ...etc, 6 more times for the remaining NDC cube corners
    return fr;
}
For the light, I get a view matrix like this:
glm::mat4 viewMat = glm::lookAt(cam.pos, cam.pos + lightDir, {0,0,1});
I then create each ortho matrix from the bounds of each frustum:
lightMatVec.clear();
for (auto& frus : cam.frusVec) {
    glm::vec3 arr[8] {
        glm::vec3(viewMat * glm::vec4(frus.xyz, 1)),
        glm::vec3(viewMat * glm::vec4(frus.xyZ, 1)),
        // etc... for the remaining corners
    };
    glm::vec3 minO = {INFINITY, INFINITY, INFINITY};
    glm::vec3 maxO = {-INFINITY, -INFINITY, -INFINITY};
    for (auto& vec : arr) {
        minO = glm::min(minO, vec);
        maxO = glm::max(maxO, vec);
    }
    glm::mat4 projMat = glm::ortho(minO.x, maxO.x, minO.y, maxO.y, minO.z, maxO.z);
    lightMatVec.push_back(projMat * viewMat);
}
I have a 4 layer TEXTURE_2D_ARRAY bound to 4 framebuffers that I draw the scene into with a very simple vertex shader (frag disabled or punchthrough alpha).
I then draw the final scene. The vertex shader outputs four shadow texcoords:
out vec3 slShadcrd[4];
// stuff
for (int i = 0; i < 4; i++) {
vec4 sc = WorldBlock.skylMatArr[i] * vec4(world_pos, 1);
slShadcrd[i] = sc.xyz / sc.w * 0.5f + 0.5f;
}
And a fragment shader, which determines the split to use with:
int csmIndex = 0;
for (uint i = 0u; i < CameraBlock.csmCnt; i++) {
if (-view_pos.z > CameraBlock.csmSplits[i]) index++;
else break;
}
And samples the shadow map array with this function:
float sample_shadow(vec3 _sc, int _csmIndex, sampler2DArrayShadow _tex) {
return texture(_tex, vec4(_sc.xy, _csmIndex, _sc.z)).r;
}
And this is the scene I get (with each split slightly tinted and the 4 depth layers overlaid):
Great! Looks good.
But, if I turn the camera slightly to the right:
Then shadows start disappearing (and depending on the angle, appearing where they shouldn't be).
I have GL_DEPTH_CLAMP enabled, so that isn't the issue. I'm culling front faces, but turning that off doesn't make a difference to this issue.
What am I missing? I feel like it's an issue with one of my projections, but they all look right to me. Thanks!
EDIT:
All four of the light's frustums drawn. They are all there, but only z is changing relative to the camera (see comment below):
EDIT:
Probably more useful, this is how the frustums look when I only update them once, when the camera is at (0,0,0) and pointing forwards (0,1,0). Also I drew them with depth testing this time.
IMPORTANT EDIT:
It seems that this issue is directly related to the light's view matrix, currently:
glm::mat4 viewMat = glm::lookAt(cam.pos, cam.pos + lightDir, {0,0,1});
Changing the values for eye and target seems to affect the buggered shadows, but I don't know what I should actually set them to. Should be easy for someone with a better understanding than me :D
Solved it! It was indeed an issue with the light's view matrix! All I had to do was replace camPos with the centre point of each frustum! Meaning that each split's light matrix needed a different view matrix. So I just create each view matrix like this...
glm::mat4 viewMat = glm::lookAt(frusCentre, frusCentre+lightDir, {0,0,1});
And get frusCentre simply...
glm::vec3 calc_frusCentre(const Frustum& _frus) {
    glm::vec3 min(INFINITY, INFINITY, INFINITY);
    glm::vec3 max(-INFINITY, -INFINITY, -INFINITY);
    for (auto& vec : {_frus.xyz, _frus.xyZ, _frus.xYz, _frus.xYZ,
                      _frus.Xyz, _frus.XyZ, _frus.XYz, _frus.XYZ}) {
        min = glm::min(min, vec);
        max = glm::max(max, vec);
    }
    return (min + max) / 2.f;
}
And bam! Everything works spectacularly!
EDIT (Last one!):
What I had was not quite right. The view matrix should actually be:
glm::lookAt(frusCentre-lightDir, frusCentre, {0,0,1});
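Putting the final fix together, the per-split loop might be sketched like this (an editor's reconstruction from the snippets above, not the poster's exact code; Frustum, calc_frusCentre, and lightDir are as already defined):
#include <cmath>
#include <vector>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Sketch: one light view matrix per cascade split, looking at the
// split's frustum centre along the light direction.
void update_lightMats(const std::vector<Frustum>& frusVec, glm::vec3 lightDir,
                      std::vector<glm::mat4>& lightMatVec)
{
    lightMatVec.clear();
    for (const Frustum& frus : frusVec) {
        glm::vec3 centre = calc_frusCentre(frus);
        glm::mat4 viewMat = glm::lookAt(centre - lightDir, centre,
                                        glm::vec3(0, 0, 1));

        // Fit the ortho volume around the split's corners in light view space.
        glm::vec3 minO(INFINITY), maxO(-INFINITY);
        for (const glm::vec3& corner : {frus.xyz, frus.xyZ, frus.xYz, frus.xYZ,
                                        frus.Xyz, frus.XyZ, frus.XYz, frus.XYZ}) {
            glm::vec3 v = glm::vec3(viewMat * glm::vec4(corner, 1));
            minO = glm::min(minO, v);
            maxO = glm::max(maxO, v);
        }
        lightMatVec.push_back(glm::ortho(minO.x, maxO.x, minO.y, maxO.y,
                                         minO.z, maxO.z) * viewMat);
    }
}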

How do you set the Bounds of glm::ortho based on scene max and min coordinates?

I have a triangle with 3 vertices anywhere in space.
I attempted to get the max and min coordinates for it:
void findBoundingBox(glm::vec3& minBB, glm::vec3& maxBB)
{
    minBB.x = std::min(minBB.x, mCoordinate.x);
    minBB.y = std::min(minBB.y, mCoordinate.y);
    minBB.z = std::min(minBB.z, mCoordinate.z);
    maxBB.x = std::max(maxBB.x, mCoordinate.x);
    maxBB.y = std::max(maxBB.y, mCoordinate.y);
    maxBB.z = std::max(maxBB.z, mCoordinate.z);
}
Now I tried to set:
glm::vec3 InverseViewDirection(50.0f, 200, 200); //Inverse View Direction
glm::vec3 LookAtPosition(0.0,0,0); // I can make it anywhere with barycentric coord, but this is the simple case
glm::vec3 setupVector(0.0, 1, 0);
I tried to set the orthographic view to wrap the triangle by:
myCamera.setProjectionMatrix(min.x, max.x, max.y,min.y, 0.0001f, 10000.0f);
But it's not neatly bounding the triangle in my view.
I've been stumped on this for a day, any pointers?
Bad output (I want the view to neatly bound the triangle).
Edit:
Based on a comment, I have tried to update the bounds with the view matrix (the model matrix is identity, so I'm ignoring it for now), but still no luck :(
glm::vec4 minSS = myCamera.getViewMatrix() * glm::vec4(minWS, 0.0); // note: w = 0 drops the view translation; points need w = 1
glm::vec4 maxSS = myCamera.getViewMatrix() * glm::vec4(maxWS, 0.0);
myCamera.setProjectionMatrix(minSS.x, maxSS.x, maxSS.y, minSS.y, -200.0001f, 14900.0f);
You will need to apply all transformations that come before the perspective transformation to your input points when you calculate the bounding box.
In your code fragments, it looks like you're applying a viewing transform with an arbitrary viewpoint (50, 200, 200) as part of your rendering. You need to apply this same transformation to your input points before you feed them into your findBoundingBox() function.
In more mathematical terms, you typically have something like this in your vertex shader, with InputPosition being the original vertex coordinates:
gl_Position = ProjectionMatrix * ViewMatrix * ModelMatrix * InputPosition;
To determine a projection matrix that will map all your points to a given range, you need to look at all points that the projection matrix is applied to. With the notation above, those points are ViewMatrix * ModelMatrix * InputPosition. So when you calculate the bounding box, the model and view matrices (or the modelview matrix if you combine them) needs to be applied to the input points.
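For instance, a sketch under the question's setup (the function name and triangle array are illustrative, not from the question):
#include <cmath>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Sketch: bound the triangle in view space, then build the ortho volume
// from those bounds. 'verts' holds the triangle's world-space vertices.
glm::mat4 orthoAroundTriangle(const glm::vec3 (&verts)[3],
                              const glm::mat4& viewMatrix)
{
    glm::vec3 minBB(INFINITY), maxBB(-INFINITY);
    for (const glm::vec3& v : verts) {
        // w = 1: these are points, so the view translation must apply.
        glm::vec3 vs = glm::vec3(viewMatrix * glm::vec4(v, 1.0f));
        minBB = glm::min(minBB, vs);
        maxBB = glm::max(maxBB, vs);
    }
    // View space looks down -Z, so near/far come from the negated z range.
    return glm::ortho(minBB.x, maxBB.x, minBB.y, maxBB.y,
                      -maxBB.z, -minBB.z);
}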

converting 3d position to 2d screen position

I'd like to convert a 3D position into a 2D screen position. I had a look at a similar question, Projecting a 3D point to a 2D screen coordinate, but I don't understand it completely. I thought that in order to calculate the 2D position I would need the projection matrix, but I don't see how it is used, apart from converting a point into the location coordinate space. Besides, is cam.FieldOfView equal to farZ in OpenGL?
Could someone please help me complete this function? Are the parameters sufficient to calculate the 2D position? pos is already a vector relative to the camera position.
Vector2* convert(Vector3& pos, Matrix4& projectionMatrix, int screenWidth, int screenHeight)
{
    float ratio = screenWidth / screenHeight;
    ...
    screenX = screenWidth * (1.0f - screenX);
    screenY = screenHeight * (1.0f - screenY);
    return new Vector2(screenX, screenY);
}
Seems to me it would be something like this:
Vector2 Convert(Vector3 pos, const Matrix& viewMatrix, const Matrix& projectionMatrix, int screenWidth, int screenHeight)
{
    pos = Vector3::Transform(pos, viewMatrix);
    // Note: this assumes Transform applies the perspective divide (divide
    // by w); if your framework's Transform does not, keep the w component
    // and divide pos.X and pos.Y by it here.
    pos = Vector3::Transform(pos, projectionMatrix);
    pos.X = screenWidth * (pos.X + 1.0) / 2.0;
    pos.Y = screenHeight * (1.0 - ((pos.Y + 1.0) / 2.0));
    return Vector2(pos.X, pos.Y);
}
What we are doing here is just passing the vector through the two transformation matrices: the view, then the projection. After the projection you get a vector with X and Y between -1 and 1. We do the appropriate transformation to obtain real pixel coordinates and return a new Vector2. Note that the Z component of 'pos' also stores the depth of the point in screen space at the end of the function.
You need the 'view' matrix because it defines where the camera is located and how it is rotated. The projection only defines the way the 3D space is 'flattened' onto the 2D space.
A field of view is not the farZ. A projection matrix has some parameters, among them:
the field of view, FOV, that is the horizontal angle of view, in radians;
the far plane, or farZ : this defines the maximum distance a point can be from the camera;
the near plane, nearZ: the minimum distance a point can be from the camera.
Besides the math problem, you could return a Vector2 directly instead of a heap allocation (returning a pointer). Vector2 is a light structure, and pointers are very likely to cause headaches in this context (where are you going to delete it, and so on). Also note that I used 'const' references as we do not modify them, except for the vector. For that one we want a local copy, which is why it is not a reference at all.
The previous code only works if you do not do any rotations (e.g. GL.Rotate(rotation_x, 1.0, 0.0, 0.0)). But if you do, here is the code:
private Vector2 Convert(Vector3 pos, Matrix4 viewMatrix, Matrix4 projectionMatrix, int screenWidth, int screenHeight)
{
    pos = Vector3.Transform(pos, viewMatrix);
    pos = Vector3.Transform(pos, projectionMatrix);
    // Approximate perspective divide: strictly this should divide by the
    // clip-space w (transform a Vector4 to keep it); Vector3.Transform
    // discards w, so Z is used as a stand-in here.
    pos.X /= pos.Z;
    pos.Y /= pos.Z;
    pos.X = (pos.X + 1) * screenWidth / 2;
    pos.Y = (pos.Y + 1) * screenHeight / 2;
    return new Vector2(pos.X, pos.Y);
}
I think what you're looking for is a replacement for gluLookAt. Given a position and orientation it converts the scene geometry into screen coordinates for rendering. As the article says, it relies on a number of deprecated features of OpenGL, but it does provide a code sample you can implement using your vector / matrix library. More detailed information on the projection matrices is available from here.
Once you have the projection matrix you simply apply it to your vectors (post-multiply your scene's vectors by the projection matrix) and then just drop the Z component of the resulting vector ... that is, just use the X and Y components of the resultant vectors.
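If your math library is GLM, that whole pipeline can be sketched with glm::project, which performs the multiply, the perspective divide, and the viewport mapping in one call (an illustrative sketch, not code from the answer above):
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Sketch: world -> window coordinates. glm::project applies the model-view
// and projection matrices, divides by w, and maps to the given viewport.
glm::vec2 toScreen(const glm::vec3& worldPos, const glm::mat4& view,
                   const glm::mat4& proj, float width, float height)
{
    glm::vec3 win = glm::project(worldPos, view, proj,
                                 glm::vec4(0.0f, 0.0f, width, height));
    return glm::vec2(win.x, win.y);   // win.z holds the depth
}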