Ray picking with depth buffer: horribly inaccurate? - c++

I'm trying to implement a ray picking algorithm, for painting and selecting blocks (thus I need a fair amount of accuracy). Initially I went with a ray casting implementation, but I didn't feel it was accurate enough (although the fault may have been with my intersection testing). Regardless, I decided to try picking by using the depth buffer, and transforming the mouse coordinates to world coordinates. Implementation below:
glm::vec3 Renderer::getMouseLocation(glm::vec2 coordinates) {
    float depth = deferredFBO->getDepth(coordinates);
    // Calculate the width and height of the deferredFBO
    float viewPortWidth = deferredArea.z - deferredArea.x;
    float viewPortHeight = deferredArea.w - deferredArea.y;
    // Convert mouse x and y to normalized device coordinates
    float windowX = (2.0f * coordinates.x) / viewPortWidth - 1.0f;
    float windowY = 1.0f - (2.0f * coordinates.y) / viewPortHeight;
    // cameraToClipMatrix = projection matrix
    glm::vec4 cameraCoordinates = glm::inverse(cameraToClipMatrix)
                                  * glm::vec4(windowX, windowY, depth, 1.0f);
    // Perspective divide (normalize by w)
    cameraCoordinates /= cameraCoordinates.w;
    glm::vec4 worldCoordinates = glm::inverse(worldToCameraMatrix)
                                 * cameraCoordinates;
    return glm::vec3(worldCoordinates);
}
The problem is that the values are easily off by ±3 units (blocks are 1 unit wide), and they only become accurate enough very close to the near clipping plane.
Does the inaccuracy stem from using single-precision floats, or maybe some step in my calculations? Would it help if I used double-precision values, and does OpenGL even support that for depth buffers?
And lastly, if this method doesn't work, am I best off using colour IDs to accurately identify which polygon was picked?

Colors are the way to go. The depth buffer's accuracy depends on the near/far plane distances, on the resolution of the FBO texture, and also on the normal (the slope) of the surface. The same precision problem shows up in standard shadow mapping. (Using colors is a bit easier because, with the depth-based test, one object covers many "colors", i.e. many depth values; it's more accurate if one object has exactly one color.)
Also, maybe it's just me, but I like to avoid rather complex matrix calculations when they're not necessary. The poor CPU has enough other stuff to do.
As for double-precision values: they can drop performance badly. I've run into this kind of performance drop myself; using doubles rather than floats was about 3x slower for me.
My post: GLSL performance - function return value/type
An article about this: https://superuser.com/questions/386456/why-does-a-geforce-card-perform-4x-slower-in-double-precision-than-a-tesla-card
So yes, you can use 64-bit floats (doubles):
http://www.opengl.org/registry/specs...hader_fp64.txt
and http://www.opengl.org/registry/specs...trib_64bit.txt
but you should not.
All in all, use colored polys. I like colors, khmm...
EDIT: more about double-precision depth: http://www.opengl.org/discussion_boards/showthread.php/173450-Double-Precision, it's a pretty good discussion.
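To make the color-ID approach concrete, here is a minimal sketch (not code from this thread; pickingFBO, drawObjectWithID and objectCount are placeholder names): render every object with a unique flat color into an off-screen buffer, then read back the single pixel under the mouse.
// Sketch only: encode each object's index as a color, render to an FBO,
// then read back the pixel under the mouse to recover the index.
uint32_t Renderer::pickObjectID(glm::vec2 mouse, glm::vec2 viewportSize) {
    glBindFramebuffer(GL_FRAMEBUFFER, pickingFBO);
    glClearColor(1.0f, 1.0f, 1.0f, 1.0f); // 0xFFFFFF is reserved for "nothing picked"
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    for (uint32_t id = 0; id < objectCount; ++id) {
        // Pack the id into an RGB color (8 bits per channel, ~16M distinct ids)
        glm::vec3 idColor((id & 0x0000FFu) / 255.0f,
                          ((id & 0x00FF00u) >> 8) / 255.0f,
                          ((id & 0xFF0000u) >> 16) / 255.0f);
        drawObjectWithID(id, idColor); // draws the object with a flat-color shader
    }
    // Read the single pixel under the mouse (flip y: window origin is top-left)
    unsigned char pixel[3] = {0, 0, 0};
    glReadPixels(static_cast<GLint>(mouse.x),
                 static_cast<GLint>(viewportSize.y - mouse.y - 1.0f),
                 1, 1, GL_RGB, GL_UNSIGNED_BYTE, pixel);
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    return static_cast<uint32_t>(pixel[0]) |
           (static_cast<uint32_t>(pixel[1]) << 8) |
           (static_cast<uint32_t>(pixel[2]) << 16);
}
Reading the ID back this way keeps picking pixel-exact regardless of depth-buffer precision, at the cost of an extra render pass whenever the camera or the scene moves.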

Related

OpenGL Terrain System, small height difference between GPU and CPU

A quick summary:
I have a simple quadtree-based terrain rendering system that builds terrain patches, which then sample a heightmap in the vertex shader to determine the height of each vertex.
The exact same calculation is done on the CPU for object placement and so on.
Super straightforward, but after adding some systems to procedurally place objects I've discovered that they seem to be misplaced by just a small amount. To debug this I render a few crosses as single models over the terrain. The crosses (red, green, blue lines) represent the height read on the CPU, while the terrain mesh uses the shader to displace its vertices.
(I've also added a simple odd/even gap over each height value to rule out a simple offset issue, so those ugly cliffs are expected; the submerged crosses are the issue.)
I'm explicitly using GL_NEAREST to be able to display the "raw" height value.
As you can see, the crosses are sometimes submerged under the terrain instead of sitting at its exact height.
The heightmap is just a simple array of floats on the CPU and on the GPU.
How the data is stored
A simple std::vector<float>, which is uploaded into a GL_RGB32F / GL_FLOAT texture. The floats are not normalized, and my terrain usually contains values between -100 and 500.
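For reference, a minimal sketch of what such an upload can look like (this is not the project's actual code; the question mentions GL_RGB32F, but a one-float-per-texel heightmap is shown here with a single-channel GL_R32F layout for simplicity):
// Hypothetical upload of a heightmap stored as one float per texel,
// size x size texels, values not normalized.
GLuint uploadHeightmap(const std::vector<float>& heights, int size) {
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    // GL_NEAREST so the shader reads the raw, unfiltered height values
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    // Single-channel 32-bit float texture, no normalization applied
    glTexImage2D(GL_TEXTURE_2D, 0, GL_R32F, size, size, 0,
                 GL_RED, GL_FLOAT, heights.data());
    return tex;
}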
How is the data accessed in the shader
I've tried a few things to rule out errors; the initial version:
vec2 terrain_heightmap_uv(vec2 position, Heightmap heightmap)
{
    return (position + heightmap.world_offset) / heightmap.size;
}
float terrain_read_height(vec2 position, Heightmap heightmap)
{
    return textureLod(heightmap.heightmap, terrain_heightmap_uv(position, heightmap), 0).r;
}
Basics of the vertex shader (the full shader code is very long, so I've extracted the part that actually reads the height):
void main()
{
    vec4 world_position = a_model * vec4(a_position, 1.0);
    vec4 final_position = world_position;
    // snap vertex to grid
    final_position.x = floor(world_position.x / a_quad_grid) * a_quad_grid;
    final_position.z = floor(world_position.z / a_quad_grid) * a_quad_grid;
    final_position.y = terrain_read_height(final_position.xz, heightmap);
    gl_Position = projection * view * final_position;
}
To rule out the slightly different way the position is determined, I also tested with hardcoded values that are identical to how the C++ side reads the height:
return texelFetch(heightmap.heightmap, ivec2((position / 8) + vec2(1024, 1024)), 0).r;
Which gives the exact same result...
How is the data accessed in the application
In C++ the height is read like this:
inline float get_local_height_safe(uint32_t x, uint32_t y)
{
    // this macro simply clips x and y to the heightmap bounds;
    // it does not interfere with the result
    BB_TERRAIN_HEIGHTMAP_BOUND_XY_TO_SAFE;
    uint32_t i = (y * _size1d) + x;
    return buffer->data[i];
}
inline float get_height_raw(glm::vec2 position)
{
    position = position + world_offset;
    uint32_t x = static_cast<int>(position.x);
    uint32_t y = static_cast<int>(position.y);
    return get_local_height_safe(x, y);
}
float BB::Terrain::get_height(const glm::vec3 position)
{
    return heightmap->get_height_raw({position.x / heightmap_unit_scale,
                                      position.z / heightmap_unit_scale});
}
What have I tried:
Comparing the Buffers
I've dumped the first few hundred values from the vector and compared them with the floating-point buffer uploaded to the GPU using Nvidia Nsight; they are equal, so no rounding/precision errors there.
Sampling method
I've tried texture, textureLod and texelFetch to rule out an issue there; they all give me the same result.
Rounding
The super strange thing: when I round all the height values, they are perfectly aligned, which just screams floating-point precision issues.
Position snapping
I've tried rounding, flooring and ceiling the position to ensure it always maps to the same texel. I also tried adding an epsilon offset to rule out a positional precision error (probably stupid because the terrain is stable...).
Heightmap sizes
I've tried various heightmaps, also of different sizes.
Heightmap patterns
I've created a heightmap containing a pattern to ensure the position is not just offset.

How to prevent excessive SSAO at a distance

I am using SSAO very nearly as per John Chapman's tutorial here; in fact, I am using Sascha Willems' Vulkan example.
One difference is that the fragment position is saved directly to a G-Buffer along with linear depth (so there are x, y, z, and w coordinates, with w being the linear depth, calculated in the G-Buffer shader). Depth is linearized like this:
float linearDepth(float depth)
{
    return (2.0f * ubo.nearPlane * ubo.farPlane) /
           (ubo.farPlane + ubo.nearPlane - depth * (ubo.farPlane - ubo.nearPlane));
}
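As a side note (not part of the original post), this formula maps a depth of -1.0 to the near plane and +1.0 to the far plane, i.e. it expects depth in normalized device coordinates. A tiny host-side check of that, with assumed near/far values:
// Illustrative check only: reproduces the linearDepth() formula above on the CPU
// and verifies the near/far mapping; the clip-plane values are assumed.
#include <cstdio>
static float linearDepthCheck(float depth, float nearPlane, float farPlane)
{
    return (2.0f * nearPlane * farPlane) /
           (farPlane + nearPlane - depth * (farPlane - nearPlane));
}
int main()
{
    const float nearPlane = 0.1f, farPlane = 256.0f; // assumed clip planes
    std::printf("%f\n", linearDepthCheck(-1.0f, nearPlane, farPlane)); // 0.100000  = near plane
    std::printf("%f\n", linearDepthCheck( 1.0f, nearPlane, farPlane)); // 256.000000 = far plane
    return 0;
}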
My scene typically consists of a large, flat floor with a model in the centre. By large I mean a lot bigger than the far clip distance.
At high depth values (i.e. at the horizon in my example), the SSAO is generating occlusion where there should really be none - there's nothing out there except a completely flat surface.
Along with that occlusion, there comes some banding as well.
Any ideas for how to prevent these occlusions occurring?
I found this solution while I was writing the question, which works only because I have a flat floor.
I look up the normal value at each kernel sample position, and compare to the current normal, discarding any with a dot product that is close to 1. This means flat planes can't self-occlude.
Any comments on why I shouldn't do this, or better alternatives, would be very welcome!
It works for my current situation but if I happened to have non-flat geometry on the floor I'd be looking for a different solution.
vec3 normal = normalize(texture(samplerNormal, newUV).rgb * 2.0 - 1.0);
<snip>
for (int i = 0; i < SSAO_KERNEL_SIZE; i++)
{
    <snip>
    float sampleDepth = -texture(samplerPositionDepth, offset.xy).w;
    vec3 sampleNormal = normalize(texture(samplerNormal, offset.xy).rgb * 2.0 - 1.0);
    if (dot(sampleNormal, normal) > 0.99)
        continue;

OpenGL Raycasting with any object

I'm just wondering whether there is any way to perform mouse-picking detection on any object, whether it is a generated object or an imported one.
[Idea] -
The idea I have in mind is to iterate over every object in the scene, checking whether the mouse ray has intersected it. For the intersection check, the mouse-picking ray would be tested against the triangles that make up the object.
[Pros] -
I believe the benefit of this approach is that every object can be detected with mouse picking, since they all inherit the detection method.
[Cons] -
I believe the drawbacks are mainly speed, the method being very expensive, so it would need careful optimization.
[Situation] -
In the past I have read about mouse picking and I have implemented some basic forms of it, but that was all crappy work which I am not proud of. So today I re-read some of the material online. Nowadays I see a lot of mouse picking done with color IDs and shaders. I'm not too keen on that method; I'm more into the mathematical side.
So here is my mouse-picking ray thingamajig.
maths::Vector3 Camera::Raycast(s32 mouse_x, s32 mouse_y)
{
    // Normalized Device Coordinates
    maths::Vector2 window_size = Application::GetApplication().GetWindowSize();
    float x = (2.0f * mouse_x) / window_size.x - 1.0f;
    float y = 1.0f - (2.0f * mouse_y) / window_size.y;
    float z = 1.0f;
    maths::Vector3 normalized_device_coordinates_ray = maths::Vector3(x, y, z);
    // Homogeneous Clip Coordinates (ray points down -z)
    maths::Vector4 homogeneous_clip_coordinates_ray = maths::Vector4(normalized_device_coordinates_ray.x, normalized_device_coordinates_ray.y, -1.0f, 1.0f);
    // 4D Eye (Camera) Coordinates
    maths::Vector4 camera_ray = maths::Matrix4x4::Invert(projection_matrix_) * homogeneous_clip_coordinates_ray;
    camera_ray = maths::Vector4(camera_ray.x, camera_ray.y, -1.0f, 0.0f);
    // 4D World Coordinates
    maths::Vector3 world_coordinates_ray = maths::Matrix4x4::Invert(view_matrix_) * camera_ray;
    world_coordinates_ray = world_coordinates_ray.Normalize();
    return world_coordinates_ray;
}
I have this ray-plane intersection function, which calculates whether a certain ray has intersected a certain plane. DUH!
Here is the code for that.
bool Camera::RayPlaneIntersection(const maths::Vector3& ray_origin, const maths::Vector3& ray_direction, const maths::Vector3& plane_origin, const maths::Vector3& plane_normal, float& distance)
{
    float denominator = plane_normal.Dot(ray_direction);
    if (denominator >= 1e-6) // 1e-6 = 0.000001
    {
        // t = dot(plane_origin - ray_origin, plane_normal) / dot(plane_normal, ray_direction)
        maths::Vector3 vector_subtraction = plane_origin - ray_origin;
        distance = vector_subtraction.Dot(plane_normal) / denominator;
        return (distance >= 0);
    }
    return false;
}
There are many more out there, e.g. plane-sphere intersection, plane-disk intersection. These are all very specific, so it feels very hard to do mouse-picking intersections in a general way. I feel this way because, for this very RayPlaneIntersection function, what I'd have to do is retrieve the objects in the scene and retrieve all the normals for each object (which is a pain in the ass). So, to re-emphasize my question:
Is there already a method out there which I don't know about, that does mouse picking in one way for all objects? Or am I just being stupid and not knowing what to do when I have everything I need?
Thank you. Thank you.
Yes, it is possible to do mouse picking with OpenGL: you render all the geometry into a special buffer that stores a unique ID of the object instead of its shaded color, then you just look at the value at the pixel below the mouse and identify the object by the ID written there. However, although it might be simpler, it is not a particularly efficient solution if your camera or geometry moves constantly.
Instead, doing an analytical ray-object intersection is the way to go. However, you don't need to check every triangle of every object against the ray; that would indeed be inefficient. You should cull entire objects by their bounding boxes, or even whole portions of the scene. Game engines have their own spatial index data structures to speed up ray-object intersections. They need them not only for mouse picking, but also for collision detection, physics simulations, AI, and what-not.
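To illustrate the bounding-box culling step, here is a hedged sketch (not from the original answer; it uses glm rather than the asker's maths:: types) of the standard slab test for a ray against an axis-aligned bounding box:
#include <glm/glm.hpp>
#include <algorithm>
#include <limits>
// Slab test: returns true if the ray (origin, normalized direction) hits the
// axis-aligned box [boxMin, boxMax]; tNear receives the entry distance.
bool rayIntersectsAABB(const glm::vec3& origin, const glm::vec3& direction,
                       const glm::vec3& boxMin, const glm::vec3& boxMax,
                       float& tNear)
{
    float tMin = 0.0f; // do not report hits behind the ray origin
    float tMax = std::numeric_limits<float>::max();
    for (int axis = 0; axis < 3; ++axis) {
        // Intersect the ray with the two parallel planes of this slab
        float invD = 1.0f / direction[axis]; // +/-infinity for axis-parallel rays (IEEE floats assumed)
        float t0 = (boxMin[axis] - origin[axis]) * invD;
        float t1 = (boxMax[axis] - origin[axis]) * invD;
        if (invD < 0.0f) std::swap(t0, t1);
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMax < tMin)
            return false; // the slab intervals do not overlap: no hit
    }
    tNear = tMin;
    return true;
}
Only the objects whose boxes pass this test (found quickly through an octree, BVH or similar spatial index) would then go through the expensive per-triangle check.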
Also note that the geometry used for the picking might be different from the one used for rendering. One example that comes to mind is that of semi-transparent objects.

Physically based camera values too small

I am currently working on a physically based camera model and came across this blog: https://placeholderart.wordpress.com/2014/11/21/implementing-a-physically-based-camera-manual-exposure/
So I tried to implement it myself in OpenGL. I thought of calculating the exposure using the function getSaturationBasedExposure and passing that value to a shader, where I multiply the final color by it:
float getSaturationBasedExposure(float aperture,
                                 float shutterSpeed,
                                 float iso)
{
    float l_max = (7800.0f / 65.0f) * Sqr(aperture) / (iso * shutterSpeed);
    return 1.0f / l_max;
}
colorOut = color * exposure;
But the values I get from that function are way too small (around 0.00025, etc.), so I guess I am misunderstanding the returned value of that function.
In the blog a test scene is mentioned in which the scene luminance is around 4000, but I haven't seen a shader implementation working with a color range from 0 to 4000+ (not even HDR goes that high, right?).
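For a sense of the magnitudes (an illustrative calculation, not from the original post): the function returns 1 / l_max, so multiplying a luminance equal to l_max by the returned exposure gives exactly 1.0, and an exposure of 0.00025 corresponds to a saturation luminance of 4000. With some assumed example camera settings:
// Illustration only: the camera settings below are assumed, not from the post.
#include <cstdio>
static float Sqr(float x) { return x * x; }
static float getSaturationBasedExposure(float aperture, float shutterSpeed, float iso)
{
    float l_max = (7800.0f / 65.0f) * Sqr(aperture) / (iso * shutterSpeed);
    return 1.0f / l_max;
}
int main()
{
    // Assumed settings: f/16, 1/100 s, ISO 100
    float aperture = 16.0f, shutterSpeed = 1.0f / 100.0f, iso = 100.0f;
    float l_max    = (7800.0f / 65.0f) * Sqr(aperture) / (iso * shutterSpeed);
    float exposure = getSaturationBasedExposure(aperture, shutterSpeed, iso);
    std::printf("l_max    = %f\n", l_max);             // 30720 (saturation luminance)
    std::printf("exposure = %g\n", exposure);          // ~3.3e-05
    std::printf("scaled   = %f\n", l_max * exposure);  // 1.000000
    return 0;
}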
So could anyone explain to me how to apply these calculations correctly to an OpenGL scene, or help me understand the meaning behind them?

OpenGL: glReadPixels inaccurate for unprojection

I have my own unproject function for performing reverse projection of a screen point. The code is as follows (written in OpenTK):
public static Vector3 UnProject(Point screenLocation, float depth)
{
    int[] viewport = GetViewport();
    Vector4 pos = new Vector4();
    // Map x and y from window coordinates to the range -1 to 1
    pos.X = (screenLocation.X - viewport[0]) / (float)viewport[2] * 2.0f - 1.0f;
    pos.Y = 1 - (screenLocation.Y - viewport[1]) / (float)viewport[3] * 2.0f;
    pos.Z = depth * 2.0f - 1.0f;
    pos.W = 1.0f;
    Vector4 pos2 = Vector4.Transform(pos, Matrix4.Invert(GetModelViewMatrix() * GetProjectionMatrix()));
    Vector3 pos_out = new Vector3(pos2.X, pos2.Y, pos2.Z);
    return pos_out / pos2.W;
}
Basically, you provide the desired unprojection depth to my function, and it gives you the corresponding world coordinate of the screen point. Assuming that this works correctly (which I am 99% sure it does), I'm having problems converting screen points to world coordinates. The unprojection works fine for picking: I call my unproject function twice (once with depth = 0 and again with depth = 1) to convert the screen point to a ray, perform ray/triangle intersection to determine which object the ray hits, and pick based on that (which works very accurately).
For another operation (let's call it operation X), I only need to know the world coordinate of the screen point (assuming that the mouse cursor is over an object on the screen). For that, I obtain the depth under the cursor with glReadPixels. The problem is that the Z value read from the depth buffer seems to be a little bit off. If I calculate the intersection with ray casting I get accurate results, but that is not viable for operation X, as it needs to run every time MouseMoved is triggered.
To demonstrate the lack of accuracy, here are the two numbers I obtained:
glReadPixel + Unprojection yields (0.886105343709181, 0.12422376198582, 0.998496665566841) as the world coordinate under the cursor.
Ray casting + intersection yields  (0.885407337013061, 0.124174778008613, 1) as the world coordinate under the cursor.
This 0.0015 error in the Z value is too much for operation X (as it is very sensitive to small numbers).
Is there something wrong with glReadPixels that I should know about? Is this happening because glReadPixels is only capable of reading float values?
I don't think that glReadPixels is to blame here; I think the Z-buffer precision is the issue. By default you typically have a 24-bit fixed-point depth buffer. It might help to use a 32-bit floating-point depth buffer, but you probably need an FBO for that.
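For reference, a minimal sketch of what such an FBO could look like in C++ (the question's code is OpenTK/C#, but the GL calls map across directly); the resolution and mouse variables are assumed placeholders:
// Sketch: framebuffer with a 32-bit floating-point depth attachment, so the
// depth read back with glReadPixels has more precision than the default
// 24-bit fixed-point buffer. Resolution and cursor position are assumed.
const int width = 1280, height = 720;
GLuint depthTex = 0, colorTex = 0, fbo = 0;

glGenTextures(1, &depthTex);
glBindTexture(GL_TEXTURE_2D, depthTex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT32F, width, height, 0,
             GL_DEPTH_COMPONENT, GL_FLOAT, nullptr);

glGenTextures(1, &colorTex);
glBindTexture(GL_TEXTURE_2D, colorTex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, nullptr);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,  GL_TEXTURE_2D, depthTex, 0);

// ... render the scene into fbo, then read the depth under the cursor
// (mouseX/mouseY in window coordinates, origin at the top-left):
float cursorDepth = 0.0f;
glReadPixels(mouseX, height - mouseY - 1, 1, 1,
             GL_DEPTH_COMPONENT, GL_FLOAT, &cursorDepth);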