I have a matrix stored in
GLdouble m[16];
Now, using glMultMatrixd(m), I multiply this matrix with the 3D coordinates of a point. I want to know the new coordinates that result after multiplying by the matrix m. Is there any command in OpenGL that can do this?
No, there isn't any useful way.
In modern GL, the integrated matrix stack has been completely removed, for good reasons. Application programmers are required to write their own matrix functions or to use an existing library like glm (which implements all of the stuff that used to be in old OpenGL as a header-only C++ library). It is worth noting in this context that operations like glMultMatrix were never GPU-accelerated and were always carried out directly on the CPU by the GL implementation, so nothing is lost by removing this stuff from the GL.
I'm not saying it would be impossible to somehow let OpenGL do that matrix * vector multiplication for you and to read back the result. The most direct approach would be to use transform feedback to capture the results of the vertex shader in some buffer object. However, for transforming a single point, the overhead would be exorbitant.
Another, totally cumbersome, approach to get old GL with its built-in matrix functions to calculate that product for you would be to put your point into the first column of a matrix, multiply it onto the matrix you set via glMultMatrix, read back the current matrix, and find the transformed point in the first column of the result.
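Doing it yourself is only a handful of multiplications and additions. Here is a minimal CPU-side sketch, assuming the column-major layout that glMultMatrixd expects (m[0..3] is the first column):

#include <GL/gl.h>   // for GLdouble

// out = m * in, with m stored column-major as OpenGL does
void transformPoint(const GLdouble m[16], const GLdouble in[4], GLdouble out[4])
{
    for (int row = 0; row < 4; ++row)
        out[row] = m[0 + row]  * in[0]
                 + m[4 + row]  * in[1]
                 + m[8 + row]  * in[2]
                 + m[12 + row] * in[3];
}

// A 3D point (x, y, z) goes in as the homogeneous vector (x, y, z, 1);
// out[0..2] are the transformed coordinates (divide by out[3] if it isn't 1).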
Which way of rendering two graphic elements with the same shape but different coordinates (e.g. one object is upside down) is more efficient in OpenGL?
1. Generate two different sets of points on the CPU and then use only one shader in the while loop, or
2. Generate only one set of points on the CPU, create two different shaders, and then switch between them in the while loop?
Additional question: is there a possibility to create a set of points inside a shader? (These points would be calculated using the sin() and cos() functions.)
If in the first case you're switching buffers between OpenGL calls, then both solutions are probably equally bad. If you're holding both shapes in a single buffer and drawing everything in a single call, then it's going to be faster than solution 2, but it requires twice the memory, which is also bad.
(You should have a single buffer for the shape and another one for the transformations.)
In general, you should use as few OpenGL calls as possible.
Regarding the second question: yes. For example, you could use a hard-coded array or derive the point coordinates from gl_VertexID.
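For instance, here is a minimal sketch of a vertex shader (written as a C++ source string, as in typical loader code) that derives circle points entirely from gl_VertexID, so no vertex buffer is needed at all; the uniform names uPointCount and uRadius are placeholders:

const char* circleVS = R"(
#version 330 core
uniform int   uPointCount;   // how many points make up the circle
uniform float uRadius;       // circle radius
void main()
{
    float angle = 6.2831853 * float(gl_VertexID) / float(uPointCount);
    gl_Position = vec4(uRadius * cos(angle), uRadius * sin(angle), 0.0, 1.0);
}
)";
// Draw with e.g. glDrawArrays(GL_LINE_LOOP, 0, pointCount);
// a VAO must be bound, but it needs no attributes.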
For reference, I'm following this tutorial. Now suppose I have a little application with multiple types of model. If I understand correctly, I have to send my MVP matrix from the CPU to the GPU (in other words, to my vertex shader) for each model, because each model might have a different model matrix.
Now looking at the tutorial and this post, I understand that the call to send the matrix to my shader (glUniformMatrix4fv(myMatrixID, 1, GL_FALSE, &myModelMVP[0][0])) should be done for each frame and for each model since each time it overwrites the previous value of my MVP (the one for my last model). But, being concerned about the performance of my app, I don't want to send useless data through the bus and if I understand correctly, my model matrix is constant for each model.
I'm thinking about having a uniform for each model's MVP matrix, but I think that is not scalable, and I would also have to update all of them if my view or projection matrices changed... Is there a way to avoid sending my model matrices multiple times and to send my view and projection matrices only when they change?
These are essentially two questions: how to avoid sending data when only part of a transformation sequence changes, and how to efficiently supply per-model data which may or may not have changed since the last frame.
Transformation Sequence
For the first, you have a transformation sequence. Your positions are in model space. You then conceptually transform them into world space, then to camera/view space, then finally to clip space, where you write the position to gl_Position.
Most of these transformations are constant throughout a frame, but may change on a frame-to-frame basis. So you want to avoid changing data that doesn't strictly need to be changed.
If you want to do this, then clearly you cannot provide an "MVP" matrix. That is, you should not have a single matrix that contains the whole transformation. You should instead have a matrix that represents particular parts of the transformation.
However, you will need to do this decomposition for reasons other than performance. You cannot do many lighting operations in clip space; being a non-linear space, it messes up lots of lighting computations. Therefore, if you're going to do lighting at all, you need a transformation that stops before clip space.
Camera/view space is the most common stopping point for lighting computations.
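As an illustration, here is a sketch of a vertex shader that stops in camera/view space for lighting, using a model-to-camera matrix and a separate camera-to-clip matrix; the names (uModelView, uProjection, uNormalMatrix) are placeholders, not anything a particular tutorial prescribes:

const char* litVS = R"(
#version 330 core
layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec3 aNormal;
uniform mat4 uModelView;     // model-to-camera
uniform mat4 uProjection;    // camera-to-clip
uniform mat3 uNormalMatrix;  // inverse-transpose of the upper 3x3 of uModelView
out vec3 vViewPos;           // camera-space position, used for lighting
out vec3 vViewNormal;        // camera-space normal, used for lighting
void main()
{
    vec4 viewPos = uModelView * vec4(aPosition, 1.0);
    vViewPos     = viewPos.xyz;
    vViewNormal  = uNormalMatrix * aNormal;
    gl_Position  = uProjection * viewPos;
}
)";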
Now, if you use model-to-camera and camera-to-clip, then the model-to-camera matrix for every model will change when the camera changes, even if the model itself has not moved. And therefore, you may need to upload a bunch of matrices that don't strictly need to be changed.
To avoid that, you would need to use model-to-world and world-to-clip (in this case, you do your lighting in world space). The issue here is that you are exposed to the perils of world space: numerical precision may become problematic.
But is there a genuine performance issue here? Obviously it somewhat depends on the hardware. However, consider that many applications have hundreds if not thousands of objects, each with matrices that change every frame. An animated character usually has over a hundred matrices just for itself that change every frame.
So it seems unlikely that the performance cost of uploading a few matrices that could have been constant is a real-world problem.
Per-Object Storage
What you really want is to separate your storage of per-object data from the program object itself. This can be done with UBOs or SSBOs; in both cases, you're storing uniform data in buffer objects.
The former are typically smaller in size (64KB or so), while the latter are essentially unbounded in their storage (16MB minimum). Obviously, the former are typically faster to access, but SSBOs shouldn't be considered to be slow.
Each object would have a section of the buffer that gets used for per-object data. And thus, you could choose to change it or not as you see fit.
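As a rough sketch (assuming a current GL 3.3+ context, a loader like GLEW, and a shader block declared as layout(std140) uniform PerObject { mat4 modelToCamera; }), one large buffer can hold a properly aligned section per object, and glBindBufferRange selects the section for the object being drawn:

#include <GL/glew.h>

struct PerObjectData { float modelToCamera[16]; };   // matches the std140 block above

GLuint createPerObjectUBO(GLsizeiptr objectCount, GLintptr& sectionStride)
{
    GLint align = 256;
    glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &align);   // sections must start on this boundary
    sectionStride = ((sizeof(PerObjectData) + align - 1) / align) * align;

    GLuint ubo = 0;
    glGenBuffers(1, &ubo);
    glBindBuffer(GL_UNIFORM_BUFFER, ubo);
    glBufferData(GL_UNIFORM_BUFFER, sectionStride * objectCount, nullptr, GL_DYNAMIC_DRAW);
    return ubo;
}

// To change object i:  glBufferSubData(GL_UNIFORM_BUFFER, i * sectionStride,
//                                      sizeof(PerObjectData), &data);
// To draw object i:    glBindBufferRange(GL_UNIFORM_BUFFER, 0 /* binding point */,
//                                        ubo, i * sectionStride, sizeof(PerObjectData));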
Even so, such a system does not guarantee faster performance. For example, if the implementation is still reading from that buffer from last frame when you try to change it this frame, the implementation will have to either allocate new memory or just wait until the GPU is finished. This is not a hypothetical possibility; GPU rendering for complex scenes frequently lags a frame behind the CPU.
So to avoid that, you would need to double-buffer your per-object data. But when you do that, you will have to always upload their data, even if it doesn't change. Why? Because it might have changed two frames ago, and your double buffer has old data in it.
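A sketch of that double-buffered variant, reusing the per-object layout from the previous snippet: data goes into one of two buffers on alternating frames so the GPU can still read last frame's copy, which is exactly why every object has to be re-uploaded every frame:

#include <cstddef>
#include <vector>

void uploadPerObjectData(unsigned frameIndex, const GLuint ubos[2], GLintptr sectionStride,
                         const std::vector<PerObjectData>& objects)
{
    GLuint ubo = ubos[frameIndex & 1];                  // alternate between the two copies
    glBindBuffer(GL_UNIFORM_BUFFER, ubo);
    for (std::size_t i = 0; i < objects.size(); ++i)    // all objects, moved or not
        glBufferSubData(GL_UNIFORM_BUFFER, i * sectionStride,
                        sizeof(PerObjectData), &objects[i]);
}
// Draw calls for this frame then bind their ranges from ubos[frameIndex & 1].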
Basically, your goal of trying to avoid uploading of sometimes-static per-model data is just as likely to harm performance as to help it.
First of all, it's likely that at least something in your scene moves. If it is the objects, then the model matrix will change from frame to frame; if it is the camera, then the view or projection matrix will change. MVP is the composition of the three, so it will actually change anyway, and you can't get away from updating it one way or the other.
However, you may still benefit from employing some of these:
Use Uniform Buffer Objects. You can send the uniforms to the GPU only once, and then rebind the buffer that the program will read the uniforms from. So different models may use different UBOs for their parameters (like model matrix).
Use Instancing. Even if you render only one instance of every model, you can pass the model matrix as an instanced vertex attribute. It will be stored in the VAO, and so sent to the GPU only once (or when you have to update it). On the plus side you may now easily render multiple instances of the same model through instanced draw calls.
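A sketch of that setup, with the mat4 spread over four consecutive vec4 attribute slots (a mat4 attribute always occupies four locations); the base location 3 here is arbitrary:

#include <GL/glew.h>

void setupInstancedModelMatrix(GLuint vao, GLuint matrixVBO)
{
    const GLuint baseLocation = 3;                      // matches "layout(location = 3) in mat4 aModel;"
    glBindVertexArray(vao);
    glBindBuffer(GL_ARRAY_BUFFER, matrixVBO);           // one mat4 (64 bytes) per instance
    for (GLuint i = 0; i < 4; ++i)
    {
        glEnableVertexAttribArray(baseLocation + i);
        glVertexAttribPointer(baseLocation + i, 4, GL_FLOAT, GL_FALSE,
                              sizeof(float) * 16, (const void*)(sizeof(float) * 4 * i));
        glVertexAttribDivisor(baseLocation + i, 1);     // advance once per instance, not per vertex
    }
    glBindVertexArray(0);
}
// Rendering then goes through glDrawArraysInstanced / glDrawElementsInstanced,
// even if the instance count is 1.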
Note that it might be beneficial to separate the model, view and projection matrices. View and projection might be passed through a 'camera description' uniform buffer object updated only once per frame and then referenced by all programs, as sketched below. The model matrix, if it doesn't change, then stays constant within the VAO. To do proper lighting you have to separate model-view from projection anyway. It might look intimidating to work with three matrices on the GPU, but you actually don't have to, as you may switch to a quaternion-based pipeline instead, which in turn simplifies things like tangent-space interpolation.
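Here is a sketch of such a camera-description block, with placeholder names; it is filled once per frame and shared by every program that declares the same std140 block (layout(std140) uniform Camera { mat4 view; mat4 projection; }):

#include <GL/glew.h>

struct CameraBlock { float view[16]; float projection[16]; };   // two mat4s, std140-compatible

GLuint createCameraUBO(GLuint bindingPoint)
{
    GLuint ubo = 0;
    glGenBuffers(1, &ubo);
    glBindBuffer(GL_UNIFORM_BUFFER, ubo);
    glBufferData(GL_UNIFORM_BUFFER, sizeof(CameraBlock), nullptr, GL_DYNAMIC_DRAW);
    glBindBufferBase(GL_UNIFORM_BUFFER, bindingPoint, ubo);     // visible to every program
    return ubo;
}

// Once per frame, when the camera changed:
//   glBindBuffer(GL_UNIFORM_BUFFER, cameraUBO);
//   glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(CameraBlock), &cam);
// Each program is wired to the binding point once:
//   glUniformBlockBinding(prog, glGetUniformBlockIndex(prog, "Camera"), bindingPoint);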
Two words: Premature Optimization!
I don't want to send useless data through the bus and if I understand correctly, my model matrix is constant for each model.
The amount of data transmitted is insignificant. A 4×4 matrix of single precision floats takes up 64 bytes. For all intents and purposes this is practically nothing. Heck, it takes more data to issue the actual drawing commands to the GPU (and usually uniform value changes are packed into the same bus transaction as the drawing commands).
I'm thinking about having an uniform for each model's MVP matrix
Then you're going to run out of uniforms. There are only so many uniform locations a GPU is required to support. You could of course use a uniform buffer object, but that's hardly the right application for it.
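To put numbers on "only so many", you can query what a given implementation actually provides; a quick sketch:

#include <GL/glew.h>
#include <cstdio>

void printUniformLimits()
{
    GLint maxVertexUniformComponents = 0, maxUniformBlockSize = 0;
    glGetIntegerv(GL_MAX_VERTEX_UNIFORM_COMPONENTS, &maxVertexUniformComponents);
    glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, &maxUniformBlockSize);
    // A mat4 is 16 components, so this is roughly how many MVP-sized uniforms fit in one stage:
    std::printf("vertex uniform components: %d (~%d mat4s), max UBO block size: %d bytes\n",
                maxVertexUniformComponents, maxVertexUniformComponents / 16, maxUniformBlockSize);
}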
I'm currently working on 2D graphics, and as far as I can tell every vertex is ultimately processed as a 4D point in homogeneous space. So I say to myself: what a waste of resources! I gather that the hardware is essentially designed to handle 3D scenes, and as such may be hardcoded to do 4D linear algebra. Yet, is there a way to write shaders (or enable a bunch of options) so that only genuine 2D coordinates are stored and processed? I know one could embed two 2x2 matrices in a 4x4 matrix, but the gl_Position variable being a vec4 seems to be the end of the road there. I'm not looking for some kind of "workaround" hack like this, but rather for a canonical way to make OpenGL do it, like a specific mode/state.
I've not been able to find either sample code or even a simple mention of such a fact on the net, so I gather it should simply be impossible/not desirable for, say, performance reasons. Is that so?
Modern GPUs are actually scalar architectures. In GLSL you can also use shorter vectors: vec2 is a perfectly valid type, and you can create vertex arrays with just 2 scalar elements per vertex, as defined by the size parameter of glVertexAttribPointer.
As Andon M. Coleman commented, OpenGL will internally perform a vec4(v, [0, [0]], 1) construction for any data passed in as a vertex attribute of dimension < 4, i.e. missing components are filled with 0 and w with 1.
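A small sketch of the buffer setup, assuming attribute location 0; only two floats per vertex are ever stored or transferred, and GL fills in z = 0 and w = 1 when the shader reads the attribute:

#include <GL/glew.h>

void upload2DPositions(GLuint vao, GLuint vbo, const float* xy, GLsizei vertexCount)
{
    glBindVertexArray(vao);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, vertexCount * 2 * sizeof(float), xy, GL_STATIC_DRAW);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 2 * sizeof(float), nullptr);   // size = 2
    glBindVertexArray(0);
}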
In the vertex shader you must assign a vec4 to gl_Position. But you can trivially expand a vec2 to a vec4:
in vec2 v2;   // 2-component vertex attribute
gl_Position = vec4(v2, 0.0, 1.0);
Yes, the gl_Position output always must be a vec4, due to the fact OpenGL specifies operations in clip space. But this is not really a bottleneck at all.
All credit goes to Andon M. Coleman, who perfectly answered the question as a comment. I just quote it here for the sake of completeness:
«Absolutely not. Hardware itself is/was designed around 4-component data and instructions for many years. Modern GPUs are scalar friendly, and they have to be considering the push for GPGPU (but older NV GPUs pre-GeForce 8xxx have a purely vector ALU). Now, as for vertex attributes, you get 16 slots of size (float * 4) for storage. This means whether you use a vec2 or vec4 vertex attribute, it actually behaves like a vec4. This can be seen if you ever write vec4 and only give enough data for 2 of the components - GL automatically assigns Z = 0.0 and W = 1.0.
Furthermore, you could not implement clipping in 2D space with 2D coordinates. You need homogeneous coordinates to produce NDC coordinates. You would need to use window space coordinates, which you cannot do from a vertex shader. After the vertex shader finishes, GL will perform clipping, perspective divide and viewport mapping to arrive at window space. But window space coordinates are still 4D (the Z component may not contribute to a location in window space, but it does affect fragment tests). »
I'm trying to make a GLSL shader that multiplies a 90x10 matrix with a 10x1 one. The 90x1 result corresponds to the xyz values of 30 vertices. The first, large matrix is only loaded at startup. The other matrix, on the other hand, can change at each render.
How could this be done? I'm guessing the first matrix could be stored as a texture, but I have no idea what to do with the second.
Just pass the second matrix as a uniform array of floats.
uniform float vec10[10];
and perform the multiplication element by element.
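One possible sketch of that, assuming the 90x10 matrix was uploaded as a 10-wide, 90-tall GL_R32F texture and that the 90 results are the clip-space positions of the 30 vertices; each vertex-shader invocation computes its own xyz as three row-by-vector dot products:

const char* matmulVS = R"(
#version 330 core
uniform sampler2D uBigMatrix;   // 10 x 90 texels, one float each (GL_R32F)
uniform float vec10[10];        // the 10x1 matrix, updated per render
void main()
{
    vec3 p = vec3(0.0);
    for (int col = 0; col < 10; ++col)
    {
        // rows 3*i, 3*i+1 and 3*i+2 of the big matrix belong to vertex i
        p.x += texelFetch(uBigMatrix, ivec2(col, 3 * gl_VertexID + 0), 0).r * vec10[col];
        p.y += texelFetch(uBigMatrix, ivec2(col, 3 * gl_VertexID + 1), 0).r * vec10[col];
        p.z += texelFetch(uBigMatrix, ivec2(col, 3 * gl_VertexID + 2), 0).r * vec10[col];
    }
    gl_Position = vec4(p, 1.0);
}
)";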
Note that if that's too slow, you can try packing your large texture in such a way that you can read 4 elements with a single texelFetch.
If you want to see the syntax for binding uniform arrays, consult http://www.opengl.org/wiki/Uniform_(GLSL) .
Note that it's also completely legal to store this second matrix in a texture as well; I'm just not sure of the performance impact of doing so as opposed to sending it as a uniform. But get it working first; profile and optimize later.
I need to create ellipse with OpenGL. The simplest way I found to do this was to use a GLUquadric with gluDisk(...), which generates a circular disk, along with glScale(...) to turn the disk into an ellipse. The only apparent problem with this method is that the normal vectors generated by gluDisk will cease to be normalized once glScale is called.
Will that affect lighting in any way? What is OpenGL's defined behavior in this case? Are the vectors automatically renormalized, is the behavior undefined, or is it something else?
This will have (bad) effects on the lighting. The normal vector is expected to have a length of 1; if it does not, the lighting could be more or less intense than intended. Values that rely on the dot product with the normal will obviously be incorrect. In short, either be sure to normalize the normals manually or use GL_NORMALIZE. OpenGL will just assume that the normals you give it are already normalized unless GL_NORMALIZE is enabled; there will be no "error" if they are not. And if you decide to use GL_NORMALIZE, there will be an additional computational cost each frame, so be wary if performance is going to be an issue. Regardless, GL_NORMALIZE is probably exactly what you are looking for at this point.
You can have them automatically normalized by using glEnable(GL_NORMALIZE). Or normalize them in your shaders (if any).