The output values of Vertex post processing is in Window space, then we have primitive assembling and Face culling stages.If so Face culling happens in window space?
Let me quote the OpenGL 4.6 core profile specification, section 1.2.3 "What is the OpenGL graphics system? | Our View" (emphasis mine):
We view OpenGL as a pipeline having some programmable stages and some
statedriven fixed-function stages that are invoked by a set of
specific drawing operations. This model should engender a
specification that satisfies the needs of both programmers and
implementors. It does not, however, necessarily provide a model for
implementation. An implementation must produce results conforming to
those produced by the specified methods, but there may be ways to
carry out a particular computation that are more efficient than the
one specified.
This means that implementations are conforming if they achieve the same results as an implementation which exactly follows the spec word for word would get.
So face culling has to behave the same way as if you would do it in window space. It obviously has the vertex shader stage, since before that, the final coordinates of the vertices are unknown.(Now there is an interesting side note here. It was iirc nvidia which has a patent on splitting the vertex shader in two parts: on which only calculates gl_Position and can optimize out all calculations which do not contribute yo that, and one for the rest of the outputs. This would allow them to do clipping and culling with only having parts of the VS executed. Not sure if they actually employ that strategy in their drivers, but the patent is out there somewhere. However, it is a nice example for what weird tricks real-world implementations might do.)
So does face culling happen in window space? Only your OpenGL implementor knows for sure. But putting the question the other way: Can face culling be implemented to happen in clip space?
First, let's look if it can happen in normalized device space:
Obviously, for face culling, only the winding order of the 2D projection of the triangle matters, so z does not matter. And this orientation doesn't change when you go from clip space to window space, or does it? A few observations here:
The NDC to window space conversion is just a scale and translation which works on each component individually, we do not get cross-influences of the form x_win = A * x_ndc + B *y_ndc + c * z_ndc + D, but B and C are guaranteed to be 0 here.
So the only way to flip the triangle orientation when going from clip space to window space would be if the scale factor is negative, and quoting the GL 4.6 spec again on the glViewport*() function family: "An INVALID_VALUE error is generated if either w or h is negative"
This means the face culling could be done in NDC, without having to take the actual viewport parameters into account.
Can the face culling happen in clip space?
Any point w<0 can never fulfill the clipping condition -w <= x,y,z <= w, so the division by w step to get from clip space to NDC will only happen with w>0 and so never flip the triangle around. But this requires that at least primitive clipping has already happened (consider a case where one vertex lies behind the camera, and two in front of it).
However, the w coordinate is still tricky even after clipping. Consider the case where you stand in a room with a sloped ceiling and look straight forward onto a side wall, but see part of the ceiling. A triangle from the ceiling could lead to the following clip space coordinates (-1,1,z_1,1), (1,1,z_2,1), (0,1.2,z_3,10) (with some z_is fulfilling the clipping condition, actual values don't matter here). If you just take the clip space x and y values (as you would do in window space or NDC), you would see a 2D projection with the third vertex above the line formed by the first two. But after the division by w, it will be below, and the triangle order is flipped (since w can vary per vertex).
So does this mean that the face culling cannot be done before the perspective divide by w? No, not necessarily. Maybe there is some clever way to carry out the culling in clip space even before the divide, but either way, the clip space w coordinate needs to be taken into account somehow.
So I can't really answer your question, but the general principle is that implementors try to do any sort of culling / discarding steps as early in the pipeline as possible (within all the gazillion engineering constraints they will have, of course).
Related
I am trying to understand open GL concepts . While reading this tutorial - http://www.arcsynthesis.org/gltut/Positioning/Tut04%20Perspective%20Projection.html,
I came accross this statement :
This is because camera space and NDC space have different viewing directions. In camera space, the camera looks down the -Z axis; more negative Z values are farther away. In NDC space, the camera looks down the +Z axis; more positive Z values are farther away. The diagram flips the axis so that the viewing direction can remain the same between the two images (up is away).
I am confused as to why the viewing direction has to change . Could some one please help me understand this with an example ?
This is mostly just a convention. OpenGL clip space (and NDC space and screen space) has always been defined as left-handed (with z pointing away into the screen) by the spec.
OpenGL eye space had been defined with camera at origin and looking at -z direction (so right-handed). However, this convention was just meaningful in the fixed-function pipeline, where together with the fixed function per vertex lighting which was carried out in eye space, the viewing direction did matter cases like whenGL_LOCAL_VIEWER was disabled (as was the default).
The classic GL projection matrix typically converts the handedness, and the perspecitve division is done with a divisior of -z_eye, typically, so the last row of the projection matrix is typically (0, 0, -1, 0). The old glFrustum(), glOrtho(), and gluPerspective() actually supported that convention by using the z_near and z_far clipping distances negated, so that you had to specify positive values for clip planes to lie before the camera at z<0.
However, with modern GL, this convention is more or less meaningless. There is no fixed-function unit left which does work in eye space, so the eye space (and anything before that) is totally under the user's control. You can use anything you like here. The clip space and all the later spaces are still used by fixed function units (clipping, rasterization, ...), so there most be some convention to define the interface, and it is still a left-handed system.
Even in modern GL, the old right-handed eye space convention is still in use. The popular glm library for example reimplements the old GL matrix functions the same way.
There is really no reason to prefer one of the possible conventions over the other, but at some point, you have to choose and stick to one.
I've been trying to utilize the techniques in Eric Penner's "Shader Amortization using
Pixel Quad Message Passing" from GPU Pro 2, Chapter VI.2. The basic idea is that modern GPU's process fragment shaders in 2x2 fragment quads, and you can use ddx() and ddy() to get the value of some_var at all four fragments as long as the following hold:
Your GPU supports high-quality derivatives
You know which fragment you're processing (top-left, top-right, bottom-left, bottom-right)
This opens up a lot of opportunities for fragment shader optimization (like distributing texture fetches over a 2x2 pixel quad) that you'd need Compute Shaders to beat.
My problem is this:
I can't deterministically detect which fragment I'm processing. Ideally, each fragment block would start at even-numbered output pixel coords like (0, 0), (2, 0), ... (1024, 1024), ..., so you'd just need to check whether the output pixel x and y coords are even or odd to know which fragment you're currently processing. The method Penner uses in the book assumes this works...but it seems to be going wrong for me.
Unfortunately, my 2x2 fragment quads appear to be starting in nondeterministic places: I've seen them start at (even, even), (even, odd), and (odd, even). I can't remember if I've seen (odd, odd) or not, but anyway, the arrangement seems to depend on a myriad of factors I don't understand, including the output resolution and shader specifics. (I'm testing on an 8800 GTS, in case anyone's wondering.)
Does anyone know what might be causing this nondeterminism or have any documentation on it? I understand there's virtually no official standardization in this area, but I'm more interested in how things work in practice on modern desktop-level GPU's, and I'm hoping there's a way to get this technique to work. If no one knows how to reason about the even/odd start behavior, does anyone know any other way of determining the current fragment's relative location in its 2x2 quad?
Thanks :)
As it turns out, the premise of my question was mostly wrong:
The 2x2 fragment quads DO almost always start on even pixel numbers...as long as the output resolution is even-numbered.
If the output resolution is odd-numbered (a possibility with the underlying program I'm working with), things can get more complicated, for obvious reasons. I don't expect there's any uniformity here across drivers/GPU's/etc. either, but my current tests (which themselves may still be buggy) appear to demonstrate 2x2 pixel quads starting at an odd pixel along the dimension with odd resolution, at least when the odd dimension is horizontal.
All of this weirdness helped obscure my bigger issue: The code I used to detect the fragment's location in the pixel quad was buggy. I tested by setting the texture coordinates equal within a pixel quad (set to the pixel quad center)...or so I thought. However, I calculated the screen coordinates based on a full-screen quad where the uv mapping has the +v axis pointing downward. The screenspace origin starts at the bottom-left, because it's based on the top-right quadrant of Cartesian coordinates, and I accidentally forgot to invert the v-coordinate of the uv offset I used to find the pixel quad center. Many of my nondeterministic observations came from failing to check my assumptions while debugging and misinterpreting things as a result, particularly in combination with odd resolutions.
This was an embarrassing mistake I should have caught a lot sooner, but I figured I'd detail it as a warning to others to always double-check the direction of your vertical axis when you're dealing with opposite-facing coordinate frames. ;)
UPDATE:
I ran across a situation where 2x2 pixel quads started on even pixel numbers even when the resolution was odd. Thanks to the nondeterminism under odd resolutions, I had to work out another solution:
If you're deriving your screen pixel numbers from the uv coords of a fullscreen quad (for post-processing), the fragment location derived from this is only useful for arranging/placing shared samples between fragments, etc., not for the quad-pixel communication itself. You'll need to have screen pixel numbers with respect to the screenspace origin for that. You can derive these from vertex positions, or you can use ddx().x and ddy().y on the uv-based pixel numbers to find out their screen direction and mirror the fragment position in the appropriate direction from there.
Calculate the fragment location based on your screen pixel numbers (with respect to the true screenspace origin) and the assumption 2x2 pixel quads start on even pixels. (If you used uv-based pixel numbers, now is the time to mirror things.)
Do a ddx().x and ddy().y on the fragment location, and if they're negative in either direction, you know the pixel quad starts at an odd pixel number in that direction...so mirror in that direction.
If you calculate two fragment positions, one based on a uv origin and one based on a screen origin, use the uv-based one for reasoning about uv-based sample placement, and use the screen-based one for actually obtaining the values of a variable at neighboring fragments.
Profit.
I'll post a link to my working MIT-licensed code once I release it on Github, along with usage examples (the speedup is unfortunately not what I expected, but whatever ;)). I'm just waiting to get done with a larger shader I'll be uploading along with it.
I've been learning OpenGL, and the one topic that continues to baffle me is the far clipping plane. While I can understand the reasoning behind the near clipping plane, and the side clipping planes (which never have any real effect because objects outside them would never be rendered anyway), the far clipping plane seems only to be an annoyance.
Since those behind OpenGL have obviously thought this through, I know there must be something I am missing. Why does OpenGL have a far clipping plane? More importantly, because you cannot turn it off, what are the recommended idioms and practices to use when drawing things at huge distances (for objects such as stars thousands of units away in a space game, a skybox, etc.)? Are you expected just to make the clipping plane very far away, or is there a more elegant solution? How is this done in production software?
The only reason is depth-precision. Since you only have a limited number of bits in the depth buffer, you can also just represent a finite amount of depth with it.
However, you can set the far plane to infinitely far away: See this. It just won't work very well with the depth buffer - you will see a lot of artifacts if you have occlusion far away.
So since this revolves around the depth buffer, you won't have a problem dealing with further-away stuff, as long as you don't use it. For example, a common technique is to render the scene in "slabs" that each only use the depth buffer internally (for all the stuff in one slab) but some form of painter's algorithm externally (for the slabs, so you draw the furthest one first)
Why does OpenGL have a far clipping plane?
Because computers are finite.
There are generally two ways to attempt to deal with this. One way is to construct the projection by taking the limit as z-far approaches infinity. This will converge on finite values, but it can play havoc with your depth precision for distant objects.
An alternative (if you're willing to have objects beyond a certain distance fail to depth-test correctly at all) is to turn on depth clamping with glEnable(GL_DEPTH_CLAMP). This will prevent clipping against the near and far planes; it's just that any fragments that would have normalized z coordinates outside of the [-1, 1] range will be clamped to that range. As previously indicated, it screws up depth tests between fragments that are being clamped, but usually those objects are far away.
It's just "the fact" that OpenGL depth test was performed in Window Space Coordinates (Normalized device coordinates in [-1,1]^3. With extra scaling glViewport and glDepthRange).
So from my point of view it's one of the design point of view of the OpenGL library.
One of approach to eliminate this OpenGL extension/OpenGL core functionality https://www.opengl.org/registry/specs/ARB/depth_clamp.txt if it is available in your OpenGL version.
I want to describe that in the perspective projection there is nothing about "far clipping plane".
3.1 For perspective projection you need to setup point \vec{c} as center of projection and plane on which projection will be performed. Let's call it
image plane T: (\vec{r}-\vec{r_0},\vec{n})
3.2 Let's assume that projected plane T split arbitary point \vec{r} and \vec{c} central of projection. In other case \vec{r} and \vec{c} are in one hafe-space and point \vec{r} should be discarded.
3.4 The idea of projection is to find intersection \vec{i} with plane T
\vec{i}=(1-t)\vec{c}+t\vec{r}
3.5 As it is
(\vec{i}-\vec{r_0},\vec{n})=0
=>
( (1-t)\vec{c}+t\vec{r}-\vec{r_0},\vec{n})=0
=>
( \vec{c}+t(\vec{r}-\vec{c})-\vec{r_0},\vec{n})=0
3.6. From "3.5" derived t can be subtitute into "3.4" and you will receive projection into plane T.
3.7. After projection you point will lie in the plane. But if assume that image plane is parallel to OXY plane, then I can suggest to use original "depth" for point after projection.
So from geometry point of view it is possible not to use far plane at all. As also not to use [-1,1]^3 model explicitly at all.
p.s. I don't know how to type latex formulas in correct way, s.t. they will be rendered.
I am new to openGL and I am reading the redbook. Now, as an exercise I want to manually draw a sphere. For that I am dividing the sphere into slices and stacks, and thus I get multiple rectangles, but near the poles of the sphere I get triangles. (hope this was clear what I am doing). Now I know that if you draw a polygon with GL_POLYGON and it happens to intersect itself, the behavior is undefined. My question is this: given three points v1, v2, v3 which are not on one line, is it undefined behavior to do this:
glBegin(GL_POLYGON)
vertex v1
vertex v1
vertex v2
vertex v3
glEnd();
This may be combining two unrelated questions into one, but I am wondering also this: if I choose to divide the rectangles in my sphere routine into triangles, does it matter how I do it, that is, by which diagonal I divide the rect into two triangles? I am guessing that for drawing a single-colored sphere it won't matter, but I don't know about textures, shaders, lighting etc.
When I was doing openGL stuff, I quickly stuck to using just triangles. They are special in that a triangle is not ambiguous in any way.
You example though, I would imagine this will work, though probably with artefacts.
how you split a rectangle shouldn't matter, just as long as you pay attention to which way your triangles are wound, which way you define the points as this is what dictates their front and back.
But definitely stick to triangles, images these four points of a square
(0,0,0) (1,0,0) (1,0,1) (0,0,1)
Fairly easy to see it is a flat square, but what if I change them to
(0,1,0) (1,0,0) (1,1,1) (0,0,1)
what do you have now? it could be drawn like a valley or like a hill. if I defined this shape with triangles, you know exactly what I am describing
(0,1,0) (1,0,0) (1,1,1)
(1,1,1) (1,0,1) (0,1,0)
A hill like shape
ok... so I side tracked a little here... my point is, I don't know what your code would do in practice, but I don't think you should use it any way. and how you split up rectangles shouldn't matter as long as your triangles are described the right way around.
This is no problem whatsoever. OpenGL always has to be able to deal with the possibility of rasterising geometry where multiple vertices fall in the same location since even different input points may end up as the same output point depending on your modelview and projection matrices (or your geometry and/or vertex shader if you're on the programmable pipeline). It is designed to deal in the mathematically correct way under a wide variety of edge cases.
OpenGL's primary test for whether to paint a pixel with geometry is whether its centre falls within the mathematical bounds of the primitive being drawn*. So OpenGL can render polygons that paint non-continuous sets of pixels (which generally happens when they become almost vanishingly thin) or that paint no pixels whatsoever (which tends to be when they end up really small, but they may technically be of arbitrarily large size as long as they squeeze between pixel centres).
The exact tests used at the hardware level may vary from vendor to vendor and are guaranteed to be correct only for geometry that is convex on screen — which is why most people say stick to triangles, since they're unavoidably convex.
(*) a separate screen-oriented test being applied to pixels exactly on a boundary to ensure they're attributed to only exactly one polygon where polygons meet along a common edge
As I have understood, it is recommended to use glTranslate / glRotate in favour of glutLootAt. I am not going to seek the reasons beyond the obvious HW vs SW computation mode, but just go with the wave. However, this is giving me some headaches as I do not exactly know how to efficiently stop the camera from breaking through walls. I am only interested in point-plane intersections, not AABB or anything else.
So, using glTranslates and glRotates means that the viewpoint stays still (at (0,0,0) for simplicity) while the world revolves around it. This means to me that in order to check for any intersection points, I now need to recompute the world's vertices coordinates (which was not needed with the glutLookAt approach) for every camera movement.
As there is no way in obtaining the needed new coordinates from GPU-land, they need to be calculated in CPU land by hand. For every camera movement ... :(
It seems there is the need to retain the current rotations aside each of the 3 axises and the same for translations. There is no scaling used in my program. My questions:
1 - is the above reasoning flawed ? How ?
2 - if not, there has to be a way to avoid such recalculations.
The way I see it (and by looking at http://www.glprogramming.com/red/appendixf.html) it needs one matrix multiplication for translations and another one for rotating (only aside the y axis needed). However, having to compute so many additions / multiplications and especially the sine / cosine will certainly be killing FPS. There are going to be thousands or even tens of thousands of vertices to compute on. Every frame... all the maths... After having computed the new coordinates of the world things seem to be very easy - just see if there is any plane that changed its 'd' sign (from the planes equation ax + by + cz + d = 0). If it did, use a lightweight cross products approach to test if the point is inside the space inside each 'moving' triangle of that plane.
Thanks
edit: I have found about glGet and I think it is the way to go but I do not know how to properly use it:
// Retains the current modelview matrix
//glPushMatrix();
glGetFloatv(GL_MODELVIEW_MATRIX, m_vt16CurrentMatrixVerts);
//glPopMatrix();
m_vt16CurrentMatrixVerts is a float[16] which gets filled with 0.f or 8.67453e-13 or something similar. Where am I screwing up ?
gluLookAt is a very handy function with absolutely no performance penalty. There is no reason not to use it, and, above all, no "HW vs SW" consideration about that. As Mk12 stated, glRotatef is also done on the CPU. The GPU part is : gl_Position = ProjectionMatrix x ViewMatrix x ModelMatrix x VertexPosition.
"using glTranslates and glRotates means that the viewpoint stays still" -> same thing for gluLookAt
"at (0,0,0) for simplicity" -> not for simplicity, it's a fact. However, this (0,0,0) is in the Camera coordinate system. It makes sense : relatively to the camera, the camera is at the origin...
Now, if you want to prevent the camera from going through the walls, the usual method is to trace a ray from the camera. I suspect this is what you're talking about ("to check for any intersection points"). But there is no need to do this in camera space. You can do this in world space. Here's a comparison :
Tracing rays in camera space : ray always starts from (0,0,0) and goes to (0,0,-1). Geometry must be transformed from Model space to World space, and then to Camera space, which is what annoys you
Tracing rays in world space : ray starts from camera position (in world space) and goes to (eyeCenter - eyePos).normalize(). Geometry must be transformed from Model space to World space.
Note that there is no third option (Tracing rays in Model space) which would avoid to transform the geometry from Model space to World space. However, you have a pair of workarounds :
First, your game's world is probably still : the Model matrix is probably always identity. So transforming its geometry from Model to World space is equivalent to doing nothing at all.
Secondly, for all other objets, you can take the opposite approach. Intead of transforming the entire geometry in one direction, transform only the ray the other way around : Take your Model matrix, inverse it, and you've got a matrix which goes from world space to model space. Multiply your ray's origin and direction by this matrix : your ray is now in model space. Intersect the normal way. Done.
Note that all I've said is standard techniques. No hacks or other weird stuff, just math :)