Taking a very large screenshot of a scene - OpenGL

I want to take very large screenshots of my application in OpenGL, like 20000x20000, for printing on a banner. First of all, I am not able to create such big framebuffers because of the maximum GPU texture size limitation. Can anyone help me capture the framebuffer in different chunks?

As you already noted, capturing the image in multiple passes is the way to go, with each pass rendering a part of the scene.
To get the proper image sub-region, all you need to do is apply another transformation to the clip-space positions of your vertices. This boils down to simple translations and scalings in x and y:
When considering the Euclidean interpretation of clip space - normalized device space - the viewing volume is represented by a cube spanning [-1,1] in all 3 dimensions.
To render only an axis-aligned sub-region of that cube, we have to scale it up so that only the sub-region fits into the [-1,1] range, and we have to translate it properly.
Assuming we want to divide the image into a uniform grid of m times n tiles, we could apply the following transformations when rendering tile i,j:
Move the bottom left corner of the tile to the origin. That corner is at (-1 + 2*i/m, -1 + 2*j/n), so we have to translate by the negated value:
x' = x + 1 - 2*i/m,
y' = y + 1 - 2*j/n
This is only a helper step to make the final translation easier.
Scale by factors m and n along x and y directions:
x'' = m * x' = x * m + m - 2*i,
y'' = y' * n = y * n + n - 2*j
The tile is now scaled such that its bottom left corner is (still) at the origin and its upper right corner is at (2,2), so just translate it by (-1, -1) so that we end up in the viewing volume again:
x''' = x'' - 1 = x * m + m - 2*i - 1,
y''' = y'' - 1 = y * n + n - 2*j - 1
This can of course be represented as a simple affine transformation matrix:
( m  0  0  m - 2*i - 1 )
( 0  n  0  n - 2*j - 1 )
( 0  0  1  0           )
( 0  0  0  1           )
In most cases, you can simply pre-multiply that matrix to the projection-matrix (or whatever matrices you use), and won't have to change anything else during rendering.
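To make this concrete, here is a minimal C++ sketch of that tile matrix in column-major layout; the capture loop is only outlined in comments, since setProjection, renderScene and saveTileToImage are hypothetical placeholders for your own rendering and readback code:

#include <array>

// Builds the tile transformation for tile (i, j) of an m-by-n grid, as a
// column-major 4x4 matrix (OpenGL convention): scale by (m, n) and
// translate by (m - 2*i - 1, n - 2*j - 1), exactly as derived above.
std::array<float, 16> tileMatrix(int i, int j, int m, int n)
{
    std::array<float, 16> t = {};   // all zeros
    t[0]  = (float)m;               // scale x by m
    t[5]  = (float)n;               // scale y by n
    t[10] = 1.0f;                   // leave z untouched
    t[15] = 1.0f;
    t[12] = (float)(m - 2 * i - 1); // translate x
    t[13] = (float)(n - 2 * j - 1); // translate y
    return t;
}

// Hypothetical capture loop (placeholders for your own code):
// for (int j = 0; j < n; ++j)
//     for (int i = 0; i < m; ++i) {
//         setProjection(tileMatrix(i, j, m, n) * projection); // pre-multiply
//         renderScene();
//         saveTileToImage(i, j); // e.g. glReadPixels into the final image
//     }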

How to determine the intersection between the camera direction and a plane?

I have a 3D scene with an infinite horizontal plane (parallel to the xz plane) at height H along the vertical Y axis.
I would like to know how to determine the intersection between the axis of my camera and this plane.
The camera is defined by a view-matrix and a projection-matrix.
There are two sub-problems here: 1) Extracting the position and view-direction from the camera matrix. 2) Calculating the intersection between the view-ray and the plane.
Extracting position and view-direction
The view matrix describes how points are transformed from world-space to view-space. View-space in OpenGL is usually defined such that the camera is at the origin and looks in the -z direction.
To get the position of the camera, we have to transform the origin [0,0,0] of the view-space back into world-space. Mathematically speaking, we have to calculate:
camera_pos_ws = inverse(view_matrix) * [0,0,0,1]
but when looking at the equation we'll see that we are only interested in the 4th column of the inverse matrix, which is given by [1]
camera_pos_ws = -transpose(R) * [view_matrix[12], view_matrix[13], view_matrix[14]]
where R is the upper-left 3x3 rotation part of the view matrix.
The orientation of the camera can be found by a similar calculation. We know that the camera looks in the -z direction in view-space, thus the world-space direction is given by
camera_dir_ws = inverse(view_matrix) * [0,0,-1,0];
Again, when looking at the equation, we'll see that this only takes the third column of the inverse matrix into account, which equals the third row of the view matrix [2]:
camera_dir_ws = [-view_matrix[2], -view_matrix[6], -view_matrix[10]]
Calculating the intersection
We now know the camera position P and the view direction D, thus we have to find the x and z values along the ray R(l) = P + l * D where y equals H. Since there is only one unknown, l, we can calculate it from
y = Py + l * Dy
H = Py + l * Dy
l = (H - Py) / Dy
The intersection point is then given by plugging l back into the ray equation.
Notes
[1] The indices assume that the matrix is stored in a column-major linear array.
[2] Note that the inverse of a matrix of the form
M = [ R  T ]
    [ 0  1 ]
where R is an orthogonal 3x3 matrix, is given by
inv(M) = [ transpose(R)  -transpose(R)*T ]
         [ 0              1              ]
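Putting both sub-problems together, here is a minimal, self-contained C++ sketch, assuming a column-major float[16] view matrix as described in note [1]:

struct Vec3 { float x, y, z; };

// view is a column-major 4x4 view matrix stored as float[16].
// Camera position in world space: -transpose(R) * T (see note [2]).
Vec3 cameraPositionWS(const float view[16])
{
    const float tx = view[12], ty = view[13], tz = view[14];
    return { -(view[0] * tx + view[1] * ty + view[2]  * tz),
             -(view[4] * tx + view[5] * ty + view[6]  * tz),
             -(view[8] * tx + view[9] * ty + view[10] * tz) };
}

// World-space view direction: the negated third row of the rotation part.
Vec3 cameraDirectionWS(const float view[16])
{
    return { -view[2], -view[6], -view[10] };
}

// Intersection of the ray P + l * D with the plane y = H.
// Returns false when the ray is parallel to the plane (Dy == 0).
bool intersectHorizontalPlane(Vec3 P, Vec3 D, float H, Vec3& hit)
{
    if (D.y == 0.0f) return false;
    const float l = (H - P.y) / D.y;
    hit = { P.x + l * D.x, H, P.z + l * D.z };
    return true;
}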
For a general line-plane intersection there are lots of answers and tutorials.
Your case is simple because the plane is horizontal.
Suppose the camera is at C(cx, cy, cz) and it looks at a target T(tx, ty, tz).
Then the camera-target line can be defined by:
 cx - x     cy - y     cz - z
------- =  ------- =  -------   /// These are two independent equations
tx - cx    ty - cy    tz - cz
For a horizontal plane, only one equation is needed: y = H.
Substitute this value in the line equations and you get
(cx-x)/(tx-cx) = (cy-H)/(ty-cy)
(cz-z)/(tz-cz) = (cy-H)/(ty-cy)
So
x = cx - (tx-cx)*(cy-H)/(ty-cy)
y = H
z = cz - (tz-cz)*(cy-H)/(ty-cy)
Of course, if your camera looks along a horizontal line, then ty = cy and there is no solution.
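As a quick C++ sketch of these formulas (Vec3 is just a plain struct assumed for illustration):

struct Vec3 { float x, y, z; };

// Intersection of the line through camera C and target T with the plane
// y = H, per the formulas above. Returns false when the line is horizontal
// (ty == cy), in which case there is no solution.
bool cameraTargetPlaneHit(Vec3 C, Vec3 T, float H, Vec3& hit)
{
    const float dy = T.y - C.y;
    if (dy == 0.0f) return false;
    const float s = (C.y - H) / dy;
    hit = { C.x - (T.x - C.x) * s, H, C.z - (T.z - C.z) * s };
    return true;
}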

OpenGL Sutherland-Hodgman polygon clipping algorithm in homogeneous coordinates (4D, CCS)

I have two questions. (I marked 1, 2 below)
In OpenGL, clipping is done by the Sutherland-Hodgman algorithm.
However, I wonder how the Sutherland-Hodgman algorithm works in a homogeneous system (4D).
I set up an example.
In VCS, there is a line with endpoints R = (0, 3, -2, 1) and S = (0, 0, 1, 1).
And the frustum is: right = 1, left = -1, near = 1, far = 3, top = 4, bottom = -4.
Therefore, the projection matrix P is
1 0 0 0
0 1/4 0 0
0 0 -2 -3
0 0 -1 0
If we transform the line with P, then the endpoints become
R' = (0, 3/4, 1, 2), S' = (0, 0, -5, -1)
I know that perspective division should not be done now, because if we do perspective division, the clipping result is not correct.
Here is what I am curious about.
1. What makes the clipping correct even though we have not done the perspective division? What mathematical properties are at work here?
2. How do we calculate the clipping result in the above situation?
(The fact that two intersections occur in the w-y coordinate system confuses me. I thought the result should be one line, not two separate parts.)
I'm not quite sure whether you understood the Sutherland-Hodgman algorithm correctly (or at least I didn't get your example). Thus I will prove here that it doesn't make any difference whether clipping happens before or after the perspective divide. The proof is only shown for one plane (clipping has to be done against all 6 planes), since applying multiple such clipping operations after each other makes no difference here.
Let's assume we have two points (as you described), R' and S', in clip space, and we have a clipping plane P given in Hessian normal form [n, p] (if we take the left plane, this is [1,0,0,1]).
If we were calculating in pure 3D space (R^3), then checking whether a line crosses this plane would be done by calculating the signed distance of both points to the plane and checking whether the signs differ. The signed distance for a point X = [x/w, y/w, z/w] is given by
D = dot(n, X) + p
Let's write down the actual equation we have (including the perspective divide):
d = n_x * x/w + n_y * y/w + n_z * z/w + p
In order to find the exact intersection point, we would, again in R^3 space, calculate the distance to the plane (da, db) for both points (A = R'/R'w, B = S'/S'w) and perform a linear interpolation (I will only write the equations for the x-coordinate here, since y and z work similarly):
x = A_x * (1 - da/(da - db)) + B_x * (da/(da-db))
x = R'x/R'w * (1 - da/(da - db)) + S'x/S'w * (da/(da-db))
And w = 1 (since we interpolate between two points both having w = 1)
Now we already know from the previous discussion that clipping has to happen before the perspective divide, thus we have to adapt this equation. This means that for each point, the clipping cube has a different scaling w. Let's see what happens when we try to perform the same operations in P^3 (before the perspective divide):
First, we "revert" the perspective divide to get to X=[x,y,z,w] for which the distance to the plane is given by
d = n_x * x/w + n_y * y/w + n_z * z/w + p
d = (n_x * x + n_y * y + n_z * z) / w + p
d * w = n_x * x + n_y * y + n_z * z + p * w
d * w = dot([n, p], [x,y,z,w])
d * w = dot(P, X)
Since we are only interested in the sign of the whole calculation, which our operations have not changed, we can compare the d*w values and get the same inside/outside result as in R^3.
For the two points R' and S', the calculated distances in P^3 are dr = da * R'w and ds = db * S'w. When we now use the same interpolation equation as above but for R' and S' we get for x:
x' = R'x * (1 - (da * R'w)/(da * R'w - db * S'w)) + S'x * (da * R'w)/(da * R'w - db * S'w)
At first sight this looks rather different from the result we got in R^3, but since we are still in P^3 (thus x'), we still have to do the perspective divide on the result (this is allowed here, since the interpolated point will always be on the border of the view-frustum, and thus dividing by w will not introduce any problems). The interpolated w component is given by:
w' = R'w * (1 - (da * R'w)/(da * R'w - db * S'w)) + S'w * (da * R'w)/(da * R'w - db * S'w)
And when calculating x/w we get
x = x' / w';
x = R'x/R'w * (1 - da/(da - db)) + S'x/S'w * (da/(da-db))
which is exactly the same result as when calculating everything in R^3.
Conclusion: The interpolation gives the same result, no matter if we perform the perspective divide first and interpolation afterwards or interpolating first and dividing then. But with the second variant we avoid the problem with points flipping from behind the viewer to the front since we are only dividing points that are guaranteed to be inside (or on the border) of the viewing frustum.
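As an illustration, here is a minimal C++ sketch of one such clipping step against a single plane in clip space, assuming the convention that a point is inside when dot(P, X) >= 0; a full clipper would apply this against all 6 planes in turn:

struct Vec4 { float x, y, z, w; };

float dot4(Vec4 a, Vec4 b) { return a.x*b.x + a.y*b.y + a.z*b.z + a.w*b.w; }

// Clips the segment R-S against one clip-space plane given as [n, p], e.g.
// the left plane {1, 0, 0, 1}. The distances dr, ds are the d*w values from
// the derivation, so no perspective divide is needed. Writes the surviving
// segment to 'out' and returns the number of output points (0 or 2).
int clipSegment(Vec4 plane, Vec4 R, Vec4 S, Vec4 out[2])
{
    const float dr = dot4(plane, R);  // = da * R'w
    const float ds = dot4(plane, S);  // = db * S'w
    const bool inR = dr >= 0.0f, inS = ds >= 0.0f;
    if (!inR && !inS) return 0;                            // fully outside
    if (inR && inS) { out[0] = R; out[1] = S; return 2; }  // fully inside
    const float t = dr / (dr - ds);                        // interpolate in P^3
    const Vec4 I = { R.x + t * (S.x - R.x), R.y + t * (S.y - R.y),
                     R.z + t * (S.z - R.z), R.w + t * (S.w - R.w) };
    out[0] = inR ? R : I;
    out[1] = inR ? I : S;
    return 2;
}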
You speak of polygon clipping in a homogeneous system (4D), but from your question I assume that you actually mean homogeneous coordinates, which makes a lot more sense. (There are many possible homogeneous systems.)
Ok, so you want to use "4D" coordinates, which are really "3D coordinates plus a w term". The w term represents (after projection transformations) the projective term that partially relates the screen-space coordinate to the original world-space position. Assuming that you are NOT interested in projective-space clipping, this term is not relevant.
I'm assuming this because the clipping box you describe is axis-aligned on planes in 3D. Even if it were rotated or scaled in 3D space, each of the planes would still be a 3D plane, the 4th coordinate always being '1'.
So how to clip:
clip the line segment L against each of the planes of the clipping box, i.e. 6 clipping planes in total (you describe the normals of each clipping plane aptly), and see if any intersection point v is shared by the line and the tested plane P such that
v lies on the line segment (i.e. a t between 0 and 1)
v lies within the bounds of the plane P (i.e. the coordinate should not lie beyond any of the adjacent planes. Since you are using axis-aligned clipping planes, this is easy to check.)
Any of these intersections between a (3D + w) line and one of the 3D planes occurs in 3D, and the intersection points have to be 3D coordinates. You can extend each of these coordinates with a 4th w coordinate into a "4D" coordinate so that you can further transform them using 4x4 matrices for view and projection processing.
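As a sketch of the intersection test described above, assuming points stored as plain float[3] arrays:

// Intersects the segment A + t * (B - A), t in [0, 1], with the axis-aligned
// plane {p : p[axis] == value}. Returns true and fills 'hit' only when the
// intersection actually lies on the segment.
bool segmentVsAxisPlane(const float A[3], const float B[3],
                        int axis, float value, float hit[3])
{
    const float a = A[axis], b = B[axis];
    if (a == b) return false;               // segment parallel to the plane
    const float t = (value - a) / (b - a);
    if (t < 0.0f || t > 1.0f) return false; // v not between the endpoints
    for (int k = 0; k < 3; ++k)
        hit[k] = A[k] + t * (B[k] - A[k]);
    return true;
}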

How to normalize a mesh into -1 to 1, then revert from normalized mesh to original one?

I have a mesh model in X, Y, Z format. Let's say:
Points *P;
In first step, I want to normalize this mesh into (-1, -1, -1) to (1, 1, 1).
Here normalize means to fit this mesh into a box of (-1, -1, -1) to (1, 1, 1).
Then I do some processing on the normalized mesh; finally, I want to revert the dimensions so they match the original mesh.
step-1:
P = Original Mesh dimensions;
step-2:
nP = Normalize(P); // from (-1, -1, -1) to (1, 1, 1)
step-3:
cnP = do something with (nP); the number of vertices may have increased or decreased.
step-4:
Original Mesh dimensions = Revert(cnP); // dimensions should match the original mesh
How can I do that?
I know how easy it can be to get lost in programming and completely miss the simplicity of the underlying math. But trust me, it really is simple.
The most intuitive way to go about your problem is probably this:
Determine the minimum and maximum values for all three coordinate axes (i.e., x, y and z). This information is contained in the eight corner vertices of the mesh's bounding box. Save these six values in six variables (e.g., min_x, max_x, etc.).
For all points p = (x,y,z) in the mesh, compute
q = ( 2.0*(x-min_x)/(max_x-min_x) - 1.0
2.0*(y-min_y)/(max_y-min_y) - 1.0
2.0*(z-min_z)/(max_z-min_z) - 1.0 )
Now q equals p mapped to the interval (-1,-1,-1) -- (+1,+1,+1).
Do whatever you need to do on this intermediate grid.
Convert all coordinates q = (xx, yy, zz) back to the original grid by doing the inverse operation:
p = ( (xx+1.0)*(max_x-min_x)/2.0 + min_x
(yy+1.0)*(max_y-min_y)/2.0 + min_y
(zz+1.0)*(max_z-min_z)/2.0 + min_z )
Clean up any mess you've made and continue with the rest of your program.
This is so easy, it's probably a lot more work to find out which library contains these functions than it is to write them yourself.
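For illustration, here is a minimal, self-contained C++ sketch of the four steps, assuming a simple Point struct; the bounds computed in step 1 must be kept around for step 4:

#include <algorithm>
#include <limits>
#include <vector>

struct Point { float x, y, z; };
struct Bounds { float min[3], max[3]; };

// Step 1: find the axis-aligned bounding box of the mesh.
Bounds computeBounds(const std::vector<Point>& pts)
{
    Bounds b;
    for (int k = 0; k < 3; ++k) {
        b.min[k] =  std::numeric_limits<float>::max();
        b.max[k] = -std::numeric_limits<float>::max();
    }
    for (const Point& p : pts) {
        const float c[3] = { p.x, p.y, p.z };
        for (int k = 0; k < 3; ++k) {
            b.min[k] = std::min(b.min[k], c[k]);
            b.max[k] = std::max(b.max[k], c[k]);
        }
    }
    return b;
}

// Step 2: map every coordinate into [-1, 1].
void normalizeMesh(std::vector<Point>& pts, const Bounds& b)
{
    for (Point& p : pts) {
        p.x = 2.0f * (p.x - b.min[0]) / (b.max[0] - b.min[0]) - 1.0f;
        p.y = 2.0f * (p.y - b.min[1]) / (b.max[1] - b.min[1]) - 1.0f;
        p.z = 2.0f * (p.z - b.min[2]) / (b.max[2] - b.min[2]) - 1.0f;
    }
}

// Step 4: the inverse map. This works even if the vertex count changed in
// step 3, because only the saved bounds are needed.
void denormalizeMesh(std::vector<Point>& pts, const Bounds& b)
{
    for (Point& p : pts) {
        p.x = (p.x + 1.0f) * (b.max[0] - b.min[0]) / 2.0f + b.min[0];
        p.y = (p.y + 1.0f) * (b.max[1] - b.min[1]) / 2.0f + b.min[1];
        p.z = (p.z + 1.0f) * (b.max[2] - b.min[2]) / 2.0f + b.min[2];
    }
}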
It's easy - use shape functions. Here's a 1D example for two points:
-1 <= u <= +1
x(u) = x1*(1-u)/2.0 + x2*(1+u)/2.0
x(-1) = x1
x(+1) = x2
You can transform between coordinate systems using the Jacobian.
Let's see what it looks like in 2D:
-1 <= u <= +1
-1 <= v <= +1
x(u, v) = x1*(1-u)*(1-v)/4.0 + x2*(1+u)*(1-v)/4.0 + x3*(1+u)*(1+v)/4.0 + x4*(1-u)*(1+v)/4.0
y(u, v) = y1*(1-u)*(1-v)/4.0 + y2*(1+u)*(1-v)/4.0 + y3*(1+u)*(1+v)/4.0 + y4*(1-u)*(1+v)/4.0
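Here is a small C++ sketch evaluating these 2D shape functions, assuming the four corners are given counter-clockwise as in the equations above:

// Maps (u, v) in [-1,1] x [-1,1] to the quad with corners
// (xc[0], yc[0]) .. (xc[3], yc[3]), numbered counter-clockwise.
void shapeMap2D(float u, float v,
                const float xc[4], const float yc[4],
                float& x, float& y)
{
    const float N[4] = {
        (1 - u) * (1 - v) / 4.0f,  // weight of corner 1
        (1 + u) * (1 - v) / 4.0f,  // weight of corner 2
        (1 + u) * (1 + v) / 4.0f,  // weight of corner 3
        (1 - u) * (1 + v) / 4.0f,  // weight of corner 4
    };
    x = y = 0.0f;
    for (int k = 0; k < 4; ++k) {
        x += N[k] * xc[k];
        y += N[k] * yc[k];
    }
}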

When drawing an ellipse or circle with OpenGL, how many vertices should we use?

Should we just blindly use 360 vertices? 720 seems to work better, but where do we stop?
It depends on how much error you can tolerate (i.e. the visual quality) and the size of the circle (ellipse). A bigger circle will need more points to achieve the same quality. You can work out exactly how many points you need for a given error with a bit of maths.
If you consider the circle represented by a series of line segments, the endpoints of the line segments lie exactly on the circle (ignoring the pixel grid). The biggest deviation between the real circle and our line-segment representation occurs at the midpoint of each line segment, and this error is the same for all of the line segments.
Looking at the first segment from the x-axis going anti-clockwise, its two endpoints are:
A = (r, 0)
B = (r . cos(th), r . sin(th))
where r is the radius of the circle and th is the angle covered by each line segment (e.g. if we have 720 points then each line segment covers 0.5 degree so th would be 0.5 degrees).
The midpoint of this line segment is at
M = A + (B - A) / 2
= (r, 0) + (r (cos(th) - 1) / 2, r . sin(th) / 2)
= (r / 2) . (1 + cos(th), sin(th))
and the distance from the origin to the point is
l = (r / 2) . sqrt((1 + cos(th))^2 + (sin(th))^2)
= (r / 2) . sqrt(2) . sqrt(1 + cos(th))
If our line-segment representation were perfect, this length would be equal to the radius (the midpoint of the line segment would fall on the circle). Normally there will be some error, and this length will be slightly less than the radius. The error is
e = r - l
= r . (1 - sqrt(2) . sqrt(1 + cos(th)) / 2)
Rearranging so we have th in terms of e and r
2 . e / r = 2 - sqrt(2) . sqrt(1 + cos(th))
sqrt(2) . sqrt(1 + cos(th)) = 2 . (1 - e / r)
1 + cos(th) = 2 . (1 - e / r)^2
th = arccos(2 . (1 - e / r)^2 - 1)
This lets us calculate the maximum angle we can have between each point to achieve a certain error. For example, say we're drawing a circle with a radius of 100 pixels and we want a maximum error of 0.5 pixels. We can calculate
th = arccos(2 . (1 - 0.5 / 100)^2 - 1)
= 11.46 degrees
This corresponds to ceil(360 / 11.46) = 32 points. So if we draw a circle of radius 100 using 32 points, our worst pixel will be off by less than half a pixel, which should mean every pixel we draw is in the correct place (ignoring aliasing).
This kind of analysis can be performed for ellipses too, but in the spirit of all good maths that is left as an exercise for the reader ;) (the only difference is determining where the maximum error occurs).
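As a quick C++ sketch of the final formula (the function name is just for illustration):

#include <cmath>

// Number of segments needed so that a circle of radius r (in pixels) drawn
// as a line loop deviates from the true circle by at most e pixels.
int circleSegmentCount(double r, double e)
{
    const double pi = 3.14159265358979323846;
    const double th = std::acos(2.0 * (1.0 - e / r) * (1.0 - e / r) - 1.0); // radians
    return (int)std::ceil(2.0 * pi / th);
}

// circleSegmentCount(100.0, 0.5) returns 32, matching the example above.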
Use as many as the resolution you are using requires, or as many as an accurate visual representation requires. It's difficult to say, and it mostly depends on what you want to achieve. In a CAD program, having a circle visually similar to an octagon could be annoying. On the other hand, if you are programming a game on the iPhone and the wheel of a car looks like an octagon, it's not a big deal.
A possible strategy is to evaluate the length of each segment with respect to the resolution of the current view, and if it is longer than, say, 3 pixels, increase the number of vertices you use, but only for the visible segments. This way you increase the resolution when zooming in, but you don't have to describe vertices you are not going to draw.

Normalizing from [0.5 - 1] to [0 - 1]

I'm kind of stuck here, I guess it's a bit of a brain teaser. If I have numbers in the range 0.5 to 1, how can I normalize them to be between 0 and 1?
Thanks for any help, maybe I'm just a bit slow since I've been working for the past 24 hours straight O_O
Others have provided you the formula, but not the work. Here's how you approach a problem like this. You might find this far more valuable than just knowing the answer.
To map [0.5, 1] to [0, 1] we will seek a linear map of the form x -> ax + b. We will require that endpoints are mapped to endpoints and that order is preserved.
Method one: The requirement that endpoints are mapped to endpoints and that order is preserved implies that 0.5 is mapped to 0 and 1 is mapped to 1
a * (0.5) + b = 0 (1)
a * 1 + b = 1 (2)
This is a simultaneous system of linear equations and can be solved by multiplying equation (1) by -2 and adding the result to equation (2). Upon doing this we obtain b = -1, and substituting this back into equation (2) we obtain a = 2. Thus the map x -> 2x - 1 will do the trick.
Method two: The slope of a line passing through two points (x1, y1) and (x2, y2) is
(y2 - y1) / (x2 - x1).
Here we will use the points (0.5, 0) and (1, 1) to meet the requirement that endpoints are mapped to endpoints and that the map is order-preserving. Therefore the slope is
m = (1 - 0) / (1 - 0.5) = 1 / 0.5 = 2.
We have that (1, 1) is a point on the line and therefore by the point-slope form of an equation of a line we have that
y - 1 = 2 * (x - 1) = 2x - 2
so that
y = 2x - 1.
Once again we see that x -> 2x - 1 is a map that will do the trick.
Subtract 0.5 (giving you a new range of 0 - 0.5) then multiply by 2.
double normalize( double x )
{
    // I'll leave range validation up to you
    return (x - 0.5) * 2;
}
To add another generic answer.
If you want to map the linear range [A..B] to [C..D], you can apply the following steps:
Shift the range so the lower bound is 0 (subtract A from both bounds):
[A..B] -> [0..B-A]
Scale the range so it is [0..1] (divide by the upper bound):
[0..B-A] -> [0..1]
Scale the range so it has the length of the new range, which is D-C (multiply by D-C):
[0..1] -> [0..D-C]
Shift the range so the lower bound is C (add C to both bounds):
[0..D-C] -> [C..D]
Combining this into a single formula, we get:
(D-C)*(X-A)
X' = ----------- + C
(B-A)
In your case, with A=0.5, B=1, C=0, D=1, you get:
(X-0.5)
X' = ------- = 2X-1
(0.5)
Note: if you have to convert a lot of X values to X', you can rewrite the formula as:
(D-C) C*B - A*D
X' = ----- * X + ---------
(B-A) (B-A)
It is also interesting to take a look at non-linear ranges. You can take the same steps, but you need an extra step to transform the linear range into a non-linear one.
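As a quick C++ sketch of the combined formula (remap is just an illustrative name):

// Maps x linearly from [A..B] to [C..D].
double remap(double x, double A, double B, double C, double D)
{
    return (D - C) * (x - A) / (B - A) + C;
}

// The question's case: remap(x, 0.5, 1.0, 0.0, 1.0) == 2.0 * x - 1.0.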
Lazyweb answer: To convert a value x from [minimum..maximum] to [floor..ceil]:
General case:
normalized_x = ((ceil - floor) * (x - minimum))/(maximum - minimum) + floor
To normalize to [0..255]:
normalized_x = (255 * (x - minimum))/(maximum - minimum)
To normalize to [0..1]:
normalized_x = (x - minimum)/(maximum - minimum)
x × 2 − 1
should do the trick
You could always use clamp or saturate within your math to make sure your final value is between 0 and 1. Some saturate at the end, but I've seen it done during a computation, too.