OpenGL screen-to-world coordinates conversion

So the default 2D clipping area of OpenGL goes from -1.0 (left) to 1.0 (right) and from -1.0 (bottom) to 1.0 (top).
The window I created for an OpenGL program is 640 pixels in width and 480 pixels in height. The top-left pixel is (0,0), the bottom-right pixel is (640, 480).
I also wrote a function to retrieve the coordinates when I click, drag, and release the mouse button (when I click it's (x1,y1), and when I release it's (x2,y2)).
So what should I do to convert (x1,y1) and (x2,y2) to the corresponding positions in the clipping area?

The answer given by @BDL might get you close enough for what you need, but the calculations are not really correct. The division needs to be by the number of pixels in each coordinate direction (640 and 480), because you do have that many pixels within the coordinate range.
One subtle detail to take into account is that the positions you get from your mouse input are the integer coordinates of pixels. If you simply apply the scaling based on the window size, the resulting OpenGL coordinate maps to the left/bottom edge of the pixel. But what you most likely want is the center of the pixel. To transform this precisely into OpenGL coordinate space, simply apply a 0.5 offset to your input value, moving it from the edge to the center of the pixel.
For example, the leftmost pixel has x-coordinate 0, the rightmost 639. The centers of these two, after applying the 0.5 offset, are 0.5 and 639.5. With this correction, both are now a distance of 0.5 from the corresponding edges of the area at 0 and 640, making the whole thing symmetrical.
So the correct calculation is:
float xClip = ((xPix + 0.5f) / 640.0f) * 2.0f - 1.0f;
float yClip = 1.0f - ((yPix + 0.5f) / 480.0f) * 2.0f;
Or slightly simplified:
float xClip = (xPix + 0.5f) / 320.0f - 1.0f;
float yClip = 1.0f - (yPix + 0.5f) / 240.0f;
This takes the y-inversion into account.
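Wrapped up as a reusable helper, the conversion might look like this (a minimal sketch; the struct and function names are made up, and the window size is passed in instead of hard-coded):

// Hypothetical helper: converts integer window/pixel coordinates to
// OpenGL clip-space coordinates, sampling at the pixel center (the +0.5).
struct ClipPos { float x, y; };

ClipPos pixelToClip(int xPix, int yPix, int winWidth, int winHeight) {
    ClipPos c;
    c.x = ((xPix + 0.5f) / winWidth) * 2.0f - 1.0f;
    c.y = 1.0f - ((yPix + 0.5f) / winHeight) * 2.0f;  // window y grows downward
    return c;
}

For the drag in the question, you would call it once for (x1,y1) and once for (x2,y2) with winWidth = 640 and winHeight = 480.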

I assume that the rightmost pixel is 639 (otherwise your window would be 641 pixels wide).
The transformation is quite simple; we just need a linear mapping. To transform a point P from pixel coordinates to clipping coordinates, one can use the following formula:
P_clip = (P_pixel / [319.5, 239.5]) - 1.0

(the division is component-wise: x by 319.5, y by 239.5)
Let's go over it step by step for the x coordinate. First we transform the [0, 639] range to a [0, 1] range by dividing by the window width:
P_01 = P_pixel_x / 639
Then we transform from [0, 1] to [-1, 1] by multiplying by 2 and subtracting 1
P_clip_x = P_01 * 2 - 1
When one combines these two calculations and extends them to the y coordinate, one gets the equation given above.
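As a quick sanity check, combining the two steps for x gives

P_clip_x = (P_pixel_x / 639) * 2 - 1 = P_pixel_x / 319.5 - 1

so pixel 0 maps to -1 and pixel 639 maps to +1; the y coordinate works the same way with 239.5.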

Related

X,Y position of semi-cylinder ray-triangle intersection to space [-1,1] [-1,1]

I am rendering a tile map to an FBO, then moving the resulting buffer to a texture and rendering it on a full-screen quad (FSQ). Then, from the mouse click events, I get the screen coordinates and move them to clip space [-1,1]:
glm::vec2 posMouseClipSpace((2.0f * myCursorPos.x) / myDeviceWidth - 1.0f,
                            1.0f - (2.0f * myCursorPos.y) / myDeviceHeight);
My program has logic that, based on those coordinates, selects a specific tile on the texture.
Now, moving to 3D, I am texturing a semi-cylinder with the FBO I used in the previous step.
In this case I am using a ray-triangle intersection point that hits the cylinder with radius r and height h. The idea is to move this intersection point to [-1,1] space so I can keep my program's tile-selection logic.
I use the Möller–Trumbore algorithm to check points on the cylinder hit by a ray. Let's say the intersected point is (x,y) (I'm not sure whether the point is in triangle, object, or world space; apparently it's world space).
I want to translate that point to the space x:[-1,1], y:[-1,1].
I know the height of my cylinder, which is a quarter of the cylinder's arc length:
cylinderHeight = myRadius * (PI/2);
so the point on the Y axis can be mapped into [-1,1] space:
vec2.y = (2.f * (intersectedPoint.y - myCylinder->position().y)) /
         myCylinder->height() - 1.f;
and that works perfectly.
However, how do I compute the horizontal axis, which depends on the two variables x and z?
Currently my cylinder's radius is 1, so by coincidence a semi-cylinder set at the origin goes from -1 to 1 on the X axis, which made me think it was [-1,1] space, but it turns out it's not.
My next approach was to use the arc length of a semicircle, s = r * PI, and plug that value into the equation:
vec2.x = (2.f * (intersectedPoint.x - myCylinder->position().x)) /
         myCylinder->arcLength() - 1.f;
but it clearly goes off by one unit in the negative direction.
I appreciate the help.
From your description, it seems that you want to convert the world space intersection coordinate to its corresponding normalized texture coordinate.
For this you need the Z coordinate as well, as there must be two "horizontal" coordinates. However, you don't need the arc length.
Using the relative X and Z coordinates of intersectedPoint, calculate the polar angle using atan2 and divide by PI (the angular range of the semicircle arc):
vec2.x = atan2(intersectedPoint.z - myCylinder->position().z,
myCylinder->position().x - intersectedPoint.x) / PI;
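Combined with the Y mapping that already works, the full conversion might be sketched like this (assuming the accessor names from the question and glm types; PI is a constant you would define, and the helper name is made up):

// Hypothetical helper: maps a world-space intersection point on the
// semi-cylinder to [-1,1] x [-1,1] for tile selection.
glm::vec2 cylinderPointToUnitSquare(const glm::vec3 & intersectedPoint) {
    glm::vec2 result;
    // Y: linear mapping along the cylinder's height (from the question).
    result.y = (2.f * (intersectedPoint.y - myCylinder->position().y)) /
               myCylinder->height() - 1.f;
    // X: polar angle around the cylinder axis, normalized by PI.
    result.x = atan2(intersectedPoint.z - myCylinder->position().z,
                     myCylinder->position().x - intersectedPoint.x) / PI;
    return result;
}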

Wrong aspect ratio calculations for camera (simple ray-caster)

I am working on a really simple ray tracer.
For now I am trying to make the perspective camera work properly.
I use the following loop to render the scene (with just two hard-coded spheres; I cast a ray for each pixel from its center, no AA applied):
Camera * camera = new PerspectiveCamera({ 0.0f, 0.0f, 0.0f } /*pos*/,
    { 0.0f, 0.0f, 1.0f } /*direction*/, { 0.0f, 1.0f, 0.0f } /*up*/,
    buffer->getSize() /*projectionPlaneSize*/);
Sphere * sphere1 = new Sphere({ 300.0f, 50.0f, 1000.0f }, 100.0f); // center, radius
Sphere * sphere2 = new Sphere({ 100.0f, 50.0f, 1000.0f }, 50.0f);

for (int i = 0; i < buffer->getSize().getX(); i++) {
    for (int j = 0; j < buffer->getSize().getY(); j++) {
        // for each pixel of the buffer (image)
        double centerX = i + 0.5;
        double centerY = j + 0.5;
        Geometries::Ray ray = camera->generateRay(centerX, centerY);
        Collision * collision = ray.testCollision(sphere1, sphere2);
        if (collision) {
            // output red
        } else {
            // output blue
        }
    }
}
The Camera::generateRay(float x, float y) is:
Geometries::Ray Camera::generateRay(float x, float y) {
    // position = camera position, direction = camera direction, etc.
    Point2D xy = fromImageToPlaneSpace({ x, y });
    Vector3D imagePoint = right * xy.getX() + up * xy.getY() + position + direction;
    Vector3D rayDirection = imagePoint - position;
    rayDirection.normalizeIt();
    return Geometries::Ray(position, rayDirection);
}
Point2D fromImageToPlaneSpace(Point2D uv) {
    float width = projectionPlaneSize.getX();
    float height = projectionPlaneSize.getY();
    float x = ((2 * uv.getX() - width) / width) * tan(fovX);
    float y = ((2 * uv.getY() - height) / height) * tan(fovY);
    return Point2D(x, y);
}
The fovs:
double fovX = 3.14159265359 / 4.0;
double fovY = projectionPlaneSize.getY() / projectionPlaneSize.getX() * fovX;
I get good results for a 1:1 width:height aspect ratio (e.g. 400x400).
But I get errors for e.g. 800x400, and it gets even slightly worse for bigger aspect ratios (like 1200x400).
What did I do wrong, or which step did I omit?
Could it be a precision problem, or rather something in fromImageToPlaneSpace(...)?
Caveat: I spent 5 years at a video company, but I'm a little rusty.
Note: after writing this, I realized that pixel aspect ratio may not be your problem as the screen aspect ratio also appears to be wrong, so you can skip down a bit.
But, in video we were concerned with two different video sources: standard definition with a screen aspect ratio of 4:3 and high definition with a screen aspect ratio of 16:9.
But, there's also another variable/parameter: pixel aspect ratio. In standard definition, pixels are square, and in hi-def, pixels are rectangular (or vice versa; I can't remember).
Assuming your current calculations are correct for screen ratio, you may have to account for the pixel aspect ratio being different, either from camera source or the display you're using.
Both screen aspect ratio and pixel aspect ratio can be stored in an .mp4, .jpeg, etc.
I downloaded your 1200x400 jpeg. I used ImageMagick on it to change only the pixel aspect ratio:
convert orig.jpg -resize 125x100%\! new.jpg
This says to change the pixel aspect ratio (increase the width by 125% and leave the height the same). The \! means pixel vs. screen ratio. The 125 is because I remember the rectangular pixel as 8x10. Anyway, you need to increase the horizontal width by 10/8, which is 1.25 or 125%.
Needless to say this gave me circles instead of ovals.
Actually, I was able to get the same effect with adjusting the screen aspect ratio.
So, somewhere in your calculations, you're introducing a distortion of that factor. Where are you applying the scaling? How are the function calls different?
Where do you set the screen size/ratio? I don't think that's shown (e.g. I don't see anything like 1200 or 400 anywhere).
If I had to hazard a guess, you must account for aspect ratio in fromImageToPlaneSpace. Either width/height needs to be prescaled or the x = and/or y = lines need scaling factors. AFAICT, what you've got will only work for square geometry at present. To test, using the 1200x400 case, multiply the x by 125% [a kludge] and I bet you get something.
From the images, it looks like you have incorrectly defined the mapping from pixel coordinates to world coordinates and are introducing some stretch in the Y axis.
Skimming your code it looks like you are defining the camera's view frustum from the dimensions of the frame buffer. Therefore if you have a non-1:1 aspect ratio frame buffer, you have a camera whose view frustum is not 1:1. You will want to separate the model of the camera's view frustum from the image space dimension of the final frame buffer.
In other words, the frame buffer is the portion of the plane projected by the camera that we are viewing. The camera defines how the 3D space of the world is projected onto the camera plane.
Any basic book on 3D graphics will discuss viewing and projection.
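As a concrete sketch of that separation (not the poster's code; it reuses the member names from the question): keep fovX fixed and scale the tangent by the frame buffer's aspect ratio, rather than scaling the angle itself, because tan() is not linear in the angle.

// Possible fix (a sketch): derive the vertical extent of the image plane
// from the horizontal one via the aspect ratio instead of using fovY.
Point2D fromImageToPlaneSpace(Point2D uv) {
    float width = projectionPlaneSize.getX();
    float height = projectionPlaneSize.getY();
    float aspect = height / width;  // e.g. 400/1200 in the worst case above
    float x = ((2 * uv.getX() - width) / width) * tan(fovX);
    float y = ((2 * uv.getY() - height) / height) * tan(fovX) * aspect;
    return Point2D(x, y);
}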

Computer graphics: how to draw this effect using computer programs?

I am wondering how to draw this effect using a computer program, either on the CPU or the GPU.
You have two lines there. What you want to do is pick the closer line for each pixel and calculate the distance to it; this will be your intensity at a given point. Furthermore, fade to black as you approach the bottom of the image (use your pixel's y position for this).
Your lines seem to be exactly at 25% and 75% on the x axis, so the pseudocode looks like this:
for each pixel p:  // p.x and p.y are normalized to the 0-1 range!
    intensity = (0.25 - min(abs(p.x - 0.25), abs(p.x - 0.75))) / 0.25;  // intensity normalized to 0-1 range
    intensity *= intensity;   // distance squared
    intensity *= (1.0 - p.y); // top of image is 0, bottom is 1
    display_intensity();
end
Depending on how you want to use this, you can create a texture on the CPU, or use a shader and calculate it in GLSL on the GPU.
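For the CPU route, a minimal self-contained sketch in C++ (the function name and the 8-bit grayscale buffer layout are assumptions):

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Fills a width x height 8-bit grayscale image: brightest on the two
// vertical lines at 25% and 75%, quadratic falloff with distance, and
// fading to black toward the bottom of the image.
std::vector<uint8_t> makeEffect(int width, int height) {
    std::vector<uint8_t> pixels(width * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float px = (x + 0.5f) / width;   // normalize to the 0-1 range
            float py = (y + 0.5f) / height;
            float d = std::min(std::fabs(px - 0.25f), std::fabs(px - 0.75f));
            float intensity = (0.25f - d) / 0.25f;  // 1 on a line, 0 at distance 0.25
            intensity *= intensity;                 // distance squared
            intensity *= (1.0f - py);               // top is 0, bottom is 1
            pixels[y * width + x] = static_cast<uint8_t>(intensity * 255.0f);
        }
    }
    return pixels;
}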

How to move a camera using in a ray-tracer?

I am currently working on ray-tracing techniques and I think I've done a pretty good job so far, but I haven't covered the camera yet.
Until now, I used a plane fragment as the view plane, located between (-width/2, height/2, 200) and (width/2, -height/2, 200) [200 is just a fixed z value; it can be changed].
In addition to that, I mostly keep the camera at e(0, 0, 1000), and I use a perspective projection.
I send rays from point e through the pixels, and write the result to the image's corresponding pixel after calculating the pixel color.
Here is an image I created. Hopefully you can guess where the eye and view plane are by looking at it.
My question starts here. It's time to move my camera around, but I don't know how to map 2D view plane coordinates to the canonical coordinates. Is there a transformation matrix for that?
The method I have in mind requires knowing the 3D coordinates of the pixels on the view plane. I am not sure it's the right method to use. So, what do you suggest?
There are a variety of ways to do it. Here's what I do:
Choose a point to represent the camera location (camera_position).
Choose a vector that indicates the direction the camera is looking (camera_direction). (If you know a point the camera is looking at, you can compute this direction vector by subtracting camera_position from that point.) You probably want to normalize camera_direction, in which case it's also the normal vector of the image plane.
Choose another normalized vector that's (approximately) "up" from the camera's point of view (camera_up).
camera_right = Cross(camera_direction, camera_up)
camera_up = Cross(camera_right, camera_direction) (This corrects for any slop in the choice of "up"; see the short sketch after this list.)
Visualize the "center" of the image plane at camera_position + camera_direction. The up and right vectors lie in the image plane.
You can choose a rectangular section of the image plane to correspond to your screen. The ratio of the width or height of this rectangular section to the length of camera_direction determines the field of view. To zoom in you can lengthen camera_direction or decrease the width and height. Do the opposite to zoom out.
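In code, that basis setup might look like this (a sketch; Vector3, Cross, and Normalize are assumed helpers, and look_at_point is hypothetical):

// Build the camera basis described in the steps above.
camera_direction = Normalize(look_at_point - camera_position);
Vector3 camera_right = Cross(camera_direction, camera_up);
camera_up = Cross(camera_right, camera_direction);  // fix slop in the chosen "up"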
So given a pixel position (i, j), you want the (x, y, z) of that pixel on the image plane. From that you can subtract camera_position to get a ray vector (which then needs to be normalized).
Ray ComputeCameraRay(int i, int j) {
    const float width = 512.0;  // pixels across
    const float height = 512.0; // pixels high

    double normalized_i = (i / width) - 0.5;
    double normalized_j = (j / height) - 0.5;

    Vector3 image_point = normalized_i * camera_right +
                          normalized_j * camera_up +
                          camera_position + camera_direction;

    Vector3 ray_direction = image_point - camera_position;
    return Ray(camera_position, ray_direction);
}
This is meant to be illustrative, so it is not optimized.
For rasterising renderers, you tend to need a transformation matrix because that's how you map directly from 3D coordinates to screen 2D coordinates.
For ray tracing, it's not necessary because you're typically starting from a known pixel coordinate in 2D space.
Given the eye position, a point in 3-space that's in the center of the screen, and vectors for "up" and "right", it's quite easy to calculate the 3D "ray" that goes from the eye position and through the specified pixel.
I've previously posted some sample code from my own ray tracer at https://stackoverflow.com/a/12892966/6782

Zooming into the mouse, factoring in a camera translation? (OpenGL)

Here is my issue: I have a scale point, which is the unprojected mouse position. I also have a "camera" which basically translates all objects by X and Y. What I want to do is zoom into the mouse position.
I've tried this:
1. Find the mouse's x and y coordinates
2. Translate by (x,y,0) to put the origin at those coordinates
3. Scale by your desired vector (i,j,k)
4. Translate by (-x,-y,0) to put the origin back at the top left
But this doesn't factor in a translation for the camera.
How can I properly do this? Thanks.
glTranslatef(controls.MainGlFrame.GetCameraX(),
controls.MainGlFrame.GetCameraY(),0);
glTranslatef(current.ScalePoint.x,current.ScalePoint.y,0);
glScalef(current.ScaleFactor,current.ScaleFactor,0);
glTranslatef(-current.ScalePoint.x,-current.ScalePoint.y,0);
Instead of using glTranslate to move all the objects, you should try glOrtho. It takes as parameters the wanted left coords, right coords, bottom coords, top coords, and min/max depth.
For example, if you call glOrtho(-5, 5, -2, 2, ...);, your screen will show all the points whose coordinates are inside a rectangle going from (-5, 2) at the top left to (5, -2) at the bottom right. The advantage is that you can easily adjust the zoom level.
If you don't multiply by any view/projection matrix (which I assume is the case), the default screen coords range from (-1,1) to (1,-1).
But in your project it can be very useful to control the camera. Call this before you draw any object instead of your glTranslate:
float left = cameraX - zoomLevel * 2;
float right = cameraX + zoomLevel * 2;
float top = cameraY + zoomLevel * 2;
float bottom = cameraY - zoomLevel * 2;
glOrtho(left, right, bottom, top, -1.f, 1.f);
Note that cameraX and cameraY now represent the center of the screen.
Now when you zoom on a point, you simply have to do something like this:
cameraX += (cameraX - screenX) * 0.5f;
cameraY += (cameraY - screenY) * 0.5f;
zoomLevel += 0.5f;
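Putting it together, each frame might look like this (a sketch assuming the legacy fixed-function pipeline and the variables above):

// Rebuild the projection from the camera state each frame instead of
// translating every object.
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(cameraX - zoomLevel * 2, cameraX + zoomLevel * 2,  // left, right
        cameraY - zoomLevel * 2, cameraY + zoomLevel * 2,  // bottom, top
        -1.f, 1.f);                                        // near, far
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
// ... draw the scene ...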