OpenGL restore z from depth

I'm trying to understand how to restore z from the depth buffer and to do the math, based on these two posts:
Getting the true z value from the depth buffer (stackoverflow)
http://ogldev.atspace.co.uk/www/tutorial46/tutorial46.html
The two posts use different projection matrices (specifically the lower-right 2x2 part is different, one difference being use of a -1 vs a +1 in [ 3 ][ 4 ]). Not really sure why that would be, afaik the one with -1 is "the correct OpenGL projection-matrix" (right?)
Now I tried to do the calculation for both and the weird thing is that in the SO-post it mentions -A-B/z (or in my calculation -S-T/z). Then the code shows
And solving this for A and B (or S and T) gives
Okay, now doing the calculation for both projection matrices from scratch (left = ogldev.atspace.co.uk, right = stackoverflow), it gets confusing: up to the -S-T/z part everything is fine, but the solved formula for what should have been the stackoverflow case (-1 projection matrix) matches the one from ogldev.atspace.co.uk (+1 projection matrix), colored in red...
This is confusing. Any clues what I'm doing wrong?!
Updated calculations, see comments from "derhass" below:

The two posts use different projection matrices (specifically the
lower-right 2x2 part is different, one difference being use of a -1 vs
a +1 in [ 3 ][ 4 ]). Not really sure why that would be, afaik the one
with -1 is "the correct OpenGL projection-matrix" (right?)
No. There is no "right" and "wrong" here. There are just conventions. The "correct" matrix is the one which does the right thing for the conventions you chose to use. Classic GL's glFrustum function did indeed use the matrix from that StackOverflow post. The convention here is that the projection center is at the origin, the view direction is -z, x is right and y is up. But you can use any convention, with an arbitrary principal point and an arbitrary projection direction. The other matrix just uses +z as the projection direction, which can be interpreted as a flipped handedness of the coordinate space. It can also be interpreted as just looking in the opposite direction while still keeping the left-handed coordinate system.
this is confusing, any clues what I'm doing wrong?!
I'm not sure what you're trying to prove here, besides the fact that introducing small sign errors will give bogus results...
Your derivations for the "+z" projection matrix seem OK. It maps depth=0 to z=n, and depth=1 to z=f, which is the standard way of mapping these - and just another convention. You could also use a Reversed-Z mapping, where the near plane is mapped to depth 1, and far plane mapped to depth 0.
UPDATE
For the second matrix, you flipped the sign again, even after the corrections from my comment. When you substituted S and T back into that final formula, you actually substituted in -S. If you did the correct substitutions, you would have gotten [Now, after you fixed the calculation again, you have got] a formula which is exactly the negation of the one in the +z matrix case: depth=0 is mapped to -n and depth=1 to -f, which is exactly how those parameters are defined in the classic GL convention, where n and f just describe the distances to those planes in the viewing direction (-z).
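The relationship between the two conventions can be checked numerically. The following is a minimal Python sketch, assuming the default depth range [0, 1] and the standard perspective matrices discussed above; the function names are made up for illustration and do not come from either post.

```python
def eye_z_from_depth_minus_z(depth, n, f):
    """Classic glFrustum convention: the camera looks along -z."""
    z_ndc = 2.0 * depth - 1.0  # window depth [0, 1] -> NDC depth [-1, 1]
    return 2.0 * f * n / ((f - n) * z_ndc - (f + n))


def eye_z_from_depth_plus_z(depth, n, f):
    """Flipped convention (+z view direction): exactly the negated result."""
    return -eye_z_from_depth_minus_z(depth, n, f)


def depth_from_eye_z_minus_z(z_eye, n, f):
    """Forward mapping for the classic convention, used as a round-trip check."""
    z_ndc = (f + n) / (f - n) + 2.0 * f * n / ((f - n) * z_eye)
    return 0.5 * z_ndc + 0.5


n, f = 0.1, 100.0
print(eye_z_from_depth_minus_z(0.0, n, f))  # near plane, approximately -n
print(eye_z_from_depth_minus_z(1.0, n, f))  # far plane, approximately -f
print(eye_z_from_depth_plus_z(0.0, n, f))   # near plane, approximately +n
```

With a Reversed-Z setup or a non-default depth range, the first line of the reconstruction (depth to NDC) would have to change accordingly.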

Related

Oblique View Frustum Depth Projection and Clipping (Eric Lengyel)

Trying to understand the oblique clipping method, I've got some problems with the theory. According to this article written by Eric Lengyel, at the end of chapter 2 we get the clip-space planes:
Near <0,0,1,1>
Far <0,0,-1,1>
...
And it is said that:
each camera-space plane is
expressed as a sum or difference of two rows of the projection matrix
This is the part I cannot understand. For example, if it's said that the Near plane value is "M4 + M3" (where M4 and M3 are the fourth and third rows of the projection matrix), and the other values are calculated similarly, then the conclusion follows that the projection matrix MUST be the identity (to get the <0,0,1,1> result from M4 + M3). But we know that it's different. So, can someone explain what matrix we use and what the connection with the projection matrix is?
This is the part I cannot understand. For example, if it's said that the Near plane value is "M4 + M3" (where M4 and M3 are the fourth and third rows of the projection matrix), and the other values are calculated similarly, then the conclusion follows that the projection matrix MUST be the identity (to get the <0,0,1,1> result from M4 + M3).
First of all, your logic is very flawed here. To get a vector c=(0,0,1,1) out of the sum of two vectors a+b, you can find an infinite number of vectors a and b fulfilling this, for example (7,-2pi,0,42) + (-7, 2pi, 1, -41) = (0,0,1,1).
However, this is completely beside the point, because you misunderstood crucial parts of that article. The clip planes you specified here are in clip space (for the special case that w = 1, as explained in the article). If we wanted to find the equations for the clip planes in clip space, there would be absolutely no need for any calculations at all, because the clip planes are defined in clip space as fixed equations. There is no point in calculating M4+M3 if we already know it would yield (0,0,1,1).
The whole article talks about efficiently calculating the clip planes in eye space. And table 1 of that paper makes this extremely clear:
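As a rough numerical check of the row-sum idea, here is a small Python sketch. It assumes the classic glFrustum matrix as the projection matrix (the article works with a general one), and the helper names are hypothetical. The near plane in eye space comes out as M4 + M3 and the far plane as M4 - M3, and neither equals <0,0,1,1>.

```python
def frustum_matrix(l, r, b, t, n, f):
    """Row-major 4x4 perspective matrix as documented for classic glFrustum."""
    return [
        [2.0 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


n, f = 1.0, 10.0
M = frustum_matrix(-1.0, 1.0, -1.0, 1.0, n, f)

near = add(M[3], M[2])  # eye-space near plane: M4 + M3
far = sub(M[3], M[2])   # eye-space far plane:  M4 - M3
left = add(M[3], M[0])  # eye-space left plane: M4 + M1

# A point on a plane gives 0; a point inside the frustum gives a positive value.
print(dot(near, [0.0, 0.0, -n, 1.0]))    # on the near plane
print(dot(far, [0.0, 0.0, -f, 1.0]))     # on the far plane
print(dot(near, [0.0, 0.0, -5.0, 1.0]))  # inside the frustum
```

The planes are unnormalized here; scaling a plane by a positive factor does not change which side a point lies on.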

Fragment shader - drawing a line?

I was interested in how to draw a line with a specific width (or multiple lines) using a fragment shader. I stumbled on this post, which seems to explain it.
The challenge I have is understanding the logic behind it.
A couple of questions:
Our coordinate space in this example is (0.0-1.0,0.0-1.0), correct?
If so, what is the purpose of the "uv" variable? Since thickness is 500, the "uv" variable will be very small. Therefore the distances from it to points 1 and 2 (stored in the a and b variables)?
Finally, what is the logic behind the h variable?
I will try to answer all of your questions one by one:
1) Yes, this is in fact correct.
2) It is common in 3D computer graphics to express coordinates (within certain boundaries) as floating-point values between 0 and 1 (or between -1 and 1). First of all, this makes it quite easy to decide whether a given value crosses said boundary or not, and it abstracts away from the concept of a "pixel" being a discrete image unit; furthermore, this common practice can be found pretty much everywhere else (think of device coordinates or texture coordinates).
Don't be afraid that the values you are working with are less than one; in fact, in computer graphics you usually deal with floating-point arithmetic, and float types are quite good at expressing real values lying around the "1" point.
3) The formula given for h consists of 2 parts: the square-root part and the 2/c coefficient. The square-root part should be well known from school math classes: it is Heron's formula for the area of a triangle (with sides a, b, c). Multiplying by 2/c extracts the height of said triangle, which is stored in h and is also the distance between the point uv and the "ground line" of the triangle. This distance is then used to decide where uv is in relation to the line p1-p2.
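To make point 3 concrete, here is a minimal sketch of the same computation in Python rather than GLSL. The names a, b, c and h follow the discussion above; the function name and the test points are made up for illustration, and the original shader is not reproduced here.

```python
import math


def distance_to_line(uv, p1, p2):
    """Height of the triangle (uv, p1, p2) over the side p1-p2, via Heron's formula."""
    a = math.hypot(uv[0] - p1[0], uv[1] - p1[1])  # distance uv -> p1
    b = math.hypot(uv[0] - p2[0], uv[1] - p2[1])  # distance uv -> p2
    c = math.hypot(p2[0] - p1[0], p2[1] - p1[1])  # length of the side p1-p2
    s = 0.5 * (a + b + c)                         # semi-perimeter
    # Clamp at 0 so rounding cannot produce a tiny negative value under the root.
    area = math.sqrt(max(0.0, s * (s - a) * (s - b) * (s - c)))
    return 2.0 / c * area                         # area = c * h / 2  =>  h = 2 * area / c


h = distance_to_line((0.5, 0.25), (0.0, 0.0), (1.0, 0.0))
print(h)  # approximately 0.25
```

A fragment would then be colored as part of the line when h is below some thickness threshold. Note that h alone measures the distance to the infinite line through p1 and p2, not to the segment.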

Mathematical Issue: Triangle, Pyramid, Rotation, Translation, Zoom

Another tricky question. What you can see here is my physical pyramid, built with 3 LEDs which form a triangle in one plane and another LED in the center, about 18mm above the other 3. The 4th one turns the triangle into a pyramid. (You may see it better if you look at the right triangle. This one is rotated about the horizontal axis, and you can see a diode on a stick very well.)
The second picture shows my running program. The left box shows the raw picture of the LEDs (photo with IR filter). The picture in the center shows that my program found the points and is also able to tell which point is which, based on some conditions (like C is always where the two lines with maximal distance between diodes intersect; and the two longest lengths are always a and b). But don't worry about this, I know the points are found 100% correctly.
Then in the right picture are some calculated values, like the height between C and c and so on. I would be able to calculate more, but I didn't bother for now, because I am stuck.
I want to calculate the pyramids rotation and translation in the 3 dimensional space.
The yellow points are the LEDs after rotation around an axis through the center of the triangle in the camera z-direction. So now I do not have to worry about this when calculating the other 2: the rotation around the horizontal axis and the rotation around the vertical axis. I could easily calculate these from the distance from the center of the triangle to the 4th diode (as you can see, the 4th diode moves on the image plane with rotation), or from the lengths of the two axes.
But my problem is the unknown depth.
It affects all lengths (a, b, c, and also the lengths from the center to the 4th diode, if we call these d and e). I know the measurements of the real pyramid, with a tolerance of +-5% or so, but they are also affected by the zoom. So how do I deal with this?
I thought of an equation with a ratio between something with the lengths of the horizontal axis, the length of the vertical axis, the angles alpha, beta and gamma, and the lengths d and e.
Alpha, beta and gamma are only affected by rotation around the axes (which is what I want to know; I want to know the rotation and the zoom), where a rotation around one axis has the opposite effect of a rotation around the other. So if you rotate around both axes by the same angle, the ratio between the lengths of the axes is the same as before.
The zoom (real: how close it is to the camera; what i want to know in 1st place: multiplication factor 2x, 3x,0.5, 0,4322344,.....) does not affect the angles, but all the lengths: a,b,c,d,e,hc (vertical length of axis), hx (i have not calculated it yet, but it would be easy. the name hx can vary, i just thought of something random right now; it is the length of the horizontal axis) in the same way (i guess).
You see i have thought of many, but i think i am too dumb.
So, is there any math genius out there who can give me the right equations, for either the rotation OR/AND the zoom factor?
(I also thought about using POSIT/Downhill-Simplex, and so on, but this would be the nicest, since I already know so much, like all points, and so on and so on.)
Please, please, I really need your help! I am writing this in C++ with the help of OpenCV, if you need to know, but I think it's more of a mathematical problem.
Thanks in advance!
Ah, and Alpha seems to be always the same as Beta!
Edit: Had to delete the second picture
Have a look at Boost Geometry or here also
Have a look at SolvePnP() in OpenCV. Even if you don't use it directly, the documentation has citations for the methods used.

OpenGL LookAt function: is the up vector arbitrary?

I am trying to understand the gluLookAt function.
It takes 3 triplets. The first is the eye position, the second is the point at which the eye stares. That point will appear in the center of my viewport, right? The third is the 'up' vector. I understand the meaning of the 'up' vector if it is perpendicular to the vector from the eye to the stare point. The question is, is it allowed to specify other vectors for up, and, if yes, what's the meaning then?
A link to a detailed graphical explanation of gluPerspective, gluLookAt and glFrustum would also be much appreciated. The official OpenGL documentation appears not to be intended for newbies.
Please note that I understand the meaning of up vector when it is perpendicular to eye->object vector. The question is what is the meaning (if any), if it is not. I can't figure that out with playing with parameters.
It works as long as it is "sufficiently perpendicular" to the look-at vector. What matters is the plane spanned by the up-vector and the look-at vector.
If these two become aligned, the up-direction will be more or less random (based on the very small bits in your values), as a small adjustment of it will leave it pointing above/left/right of the look-at vector.
If they have a sufficiently large separating angle (in 32-bit floating point math), it will work well. This angle usually need not be more than a degree or so, so they can be very close. But if the difference is down to a few bits, each changed bit will yield a huge directional change.
It comes down to numerical precision.
(I'm sure there are more mathematical terms & definitions for this, but it's been a few years since college.. :)
final word: If the vectors are parallel, then the up-direction is completely undefined and you'll get a degenerate view matrix.
The up vector lets OpenGL know which way up you are holding your camera.
Think of the real world: if you have two points in space, you can draw a line from one to the other. You can then align an object, such as a camera, so that it points from one to the other. But you have no way of knowing how your object should be rotated around the axis that this line makes. The up vector dictates which way up the camera should be standing.
Most of the time, your up vector will be (0,1,0), which means that the camera will be rotated just like you would normally hold a camera, or as if you held your head up straight. If you set your up vector to (1,0,0), it would be like holding your head on its side, so the direction from the base of your head to the top of your head is pointing to the right. You are still looking from the same point (more or less) to the same point, but your 'up' has changed. An up vector of (0,-1,0) would turn the camera upside down, as if you were doing a handstand.
One way you could think about this: your arm is a vector from the camera position (your shoulder) to the camera look-at point (your index finger). If you stick your thumb out, this is your up vector.
This picture may help you http://images.gamedev.net/features/programming/oglch3excerpt/03fig11.jpg
EDIT
Perpendicular or not.
I see what you are asking now. For example, if you are at (10,10,10) looking at (0,0,0), the resulting vector for your looking direction is (-10,-10,-10). The vector perpendicular to this does not matter for the purpose of your up vector in gluLookAt. If you want the view to be oriented so that you are like a normal person just looking down a bit, just set your up vector to (0,1,0). In fact, unless you want to be able to roll the camera, you don't need this to be anything else.
In this website you have a great tutorial
http://www.xmission.com/~nate/tutors.html
http://users.polytech.unice.fr/~buffa/cours/synthese_image/DOCS/www.xmission.com/Nate/tutors.html
Download the executables and you can change the values of the parameters to the glLookAt function and see what happens "in real-time".
The up vector does not need to be perpendicular to the looking direction. As long as it is not parallel (or very close to being parallel) to the looking direction, you should be fine.
Given that you have a view plane normal N (the looking direction, of unit length) and an up vector UV (which mustn't be parallel to N), you calculate the actual up vector used in the camera transform by removing the part of UV that lies along N: V = UV - (N . UV)N, where "." is the scalar product. V is perpendicular to N and, once normalized, is the actual up direction. The remaining camera axis is then the vector perpendicular to both N and V, U = N x V.
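The orthogonalization can be sketched in a few lines of Python. This follows the construction documented for gluLookAt (forward, side = forward x up, corrected up = side x forward), which yields the same corrected up direction as projecting the up vector onto the plane perpendicular to the looking direction. The function names and the tolerance are illustrative assumptions.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    if length < 1e-12:  # illustrative tolerance, not taken from any GL implementation
        raise ValueError("degenerate vector: up is parallel to the view direction")
    return tuple(x / length for x in v)


def look_at_basis(eye, center, up):
    """Return (side, corrected_up, forward) for the given camera parameters."""
    forward = normalize(tuple(c - e for c, e in zip(center, eye)))
    side = normalize(cross(forward, up))  # fails if up is parallel to forward
    corrected_up = cross(side, forward)   # perpendicular to both, unit length
    return side, corrected_up, forward


s, u, f = look_at_basis((10.0, 10.0, 10.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(u)          # tilted up vector, perpendicular to the view direction
print(dot(u, f))  # approximately 0
```

Any up vector in the same plane as the looking direction and (0,1,0), on the same side, gives the same basis, which is why the up vector does not have to be perpendicular.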
Yes. It is arbitrary, which lets you make the camera "roll", i.e. appear as if the scene is rotating around the eye axis.

Query points epsilon-close to a cut plane in point cloud using the GPU

I am trying to solve the following problem using GPU capabilities: "given a point cloud P and an oriented plane described by a point and a normal (Pp, Np), return the points in the cloud which lie at a distance equal to or less than EPSILON from the plane".
Talking with a colleague of mine I converged toward the following solution:
1) prepare a vertex buffer of the points with an attached texture coordinate such that every point has a different vertex coordinate
2) set projection status to orthogonal
3) rotate the mesh such that the normal of the plane is aligned with the -z axis and offset it such that x,y,z=0 corresponds to Pp
4) set the z-clipping plane such that z:[-EPSILON;+EPSILON]
5) render to a texture
6) retrieve the texture from the graphic card
7) read the texture from the graphic card and see what points were rendered (in terms of their indexes), which are the points within the desired distance range.
Now the problems are the following:
q1) Do I need to open a window-frame to be able to do such an operation? I am working within MATLAB and calling MEX-C++. From experience I know that as soon as you open a new frame the whole suite crashes miserably!
q2) what's the primitive to give a GLPoint a texture coordinate?
q3) I am not too clear how the render to a texture would be implemented? any reference, tutorial would be awesome...
q4) How would you retrieve this texture from the card? again, any reference, tutorial would be awesome...
I am on a tight schedule, so it would be nice if you could point me to the names of the techniques I should learn about, rather than to the GLSL specification document and the OpenGL API, as somebody has done. Those answers are a tiny bit too vague for my question.
Thanks a lot for any comment.
p.s.
Also notice that I would rather not use any resource like CUDA if possible, thus getting something which uses as many OpenGL elements as possible without requiring me to write a new shader.
Note: cross posted at
http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=245911#Post245911
It's simple:
Let n be the normal of the plane and x be the point.
n_u = n/norm(n) //this is the normal vector scaled to unit length
d = scalarprod(n_u,x) //this is the signed distance of the plane to the origin
for each point p_i
d_i = abs(scalarprod(p_i,n_u) - d) //this is the distance of the point to the plane
Obviously "scalarprod" means "scalar product" and "abs" means "absolute value".
If you wonder why just read the article on scalar products at wikipedia.
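As a CPU-side reference for the pseudocode above, here is a minimal Python sketch that returns the indices of the points within epsilon of the plane. The function name and the sample data are made up for illustration, and it uses the unit normal in both scalar products so that the result is a true distance.

```python
import math


def points_near_plane(points, plane_point, plane_normal, epsilon):
    """Indices of the points whose distance to the plane is <= epsilon."""
    length = math.sqrt(sum(c * c for c in plane_normal))
    n_u = [c / length for c in plane_normal]           # unit normal
    d = sum(a * b for a, b in zip(n_u, plane_point))   # signed plane offset from the origin
    return [
        i for i, p in enumerate(points)
        if abs(sum(a * b for a, b in zip(n_u, p)) - d) <= epsilon
    ]


cloud = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.05), (0.0, 0.0, 1.0), (3.0, 4.0, -0.09)]
print(points_near_plane(cloud, (0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 0.1))  # [0, 1, 3]
```

This is a single pass over the cloud with one scalar product per point, which gives a baseline to compare any GPU variant against.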
Ok first as a little disclaimer: I know nothing about 3D programming.
Now my purely mathematical idea:
Given a plane defined by a normal N (of unit length) and a distance L of the plane to the center (the point [0/0/0]), the distance of a point X to the plane is given by the scalar product of N and X minus L, the distance to the center. Hence you only have to check whether
|N . X - L| <= epsilon
with . being the scalar product and | | the absolute value.
Of course you have to intersect the plane with the normal first to get the distance L.
Maybe this helps.
I have one question for Andrea Tagliasacchi, Why?
Only if you are looking at 1000s of points and possibly 100s of planes would there be any benefit from using the method outlined, as opposed to taking the dot product of the point and the plane, as outlined by Corporal Touchy.
Also due to the finite nature of pixels you'll often find two or more points will project to the same pixel in the texture.
If you still want to do this, I could work up a sample glut program in C++, but how this would help with MATLAB I don't know, as I'm unfamiliar with it.
It seems to me you should be able to implement something similar to Corporal Touchy's method as a vertex program rather than in a for loop, right? Maybe use a C API for GPU programming, such as CUDA?