Storing and loading matrices in OpenGL

Is it possible to tell OpenGL to store the current transformation matrix in a specific location (instead of pushing it on a stack) and to load a matrix from a specific location?
I prefer a solution that does not involve additional data transfer between the video device and main memory (i.e. it's better to store the matrix somewhere in video memory).

To answer the first part of your question:
The functions glLoadMatrix and glGet can do that:
// get current model view matrix
double matrix[16];
glGetDoublev(GL_MODELVIEW_MATRIX, matrix);
// manipulate matrix
glLoadMatrixd(matrix);
Note that these functions are no longer supported in modern OpenGL (they were removed from the core profile). Matrix operations have to be done on the application side anyway and provided as uniform variables to the shader programs.

In the old fixed-function pipeline, the matrices were loaded into special GPU registers on demand but never lived in VRAM; all matrix calculations happened on the CPU. In modern OpenGL, matrix calculations still happen on the CPU (and rightly so), and the results are again loaded on demand into registers called "uniforms". However, modern OpenGL also has a feature called "Uniform Buffer Objects" that allows uniform values to be sourced from VRAM: http://www.opengl.org/wiki/Uniform_Buffer_Object
But they are of little use for storing transform matrices. First, you'll change them constantly to do animation. Second, the overhead of managing a UBO for just a single matrix costs more performance than simply setting it from the CPU. A matrix is just 16 scalars, the equivalent of one single vertex with position, normal, texture coordinate and tangent attributes.
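For reference, a minimal sketch of that "just set it from the CPU" path, assuming GLM on the application side and a vertex shader that declares a hypothetical uniform mat4 u_mvp (the loader header and all names are assumptions, not part of the original answer):

#include <GL/glew.h>               // any OpenGL function loader works here
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>

// Upload one 4x4 matrix (16 floats) as a plain uniform; cheap enough to do every frame.
void uploadMvp(GLuint program, const glm::mat4& mvp)
{
    glUseProgram(program);
    GLint loc = glGetUniformLocation(program, "u_mvp"); // hypothetical uniform name
    glUniformMatrix4fv(loc, 1, GL_FALSE, glm::value_ptr(mvp));
}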

Related

Partially updating D3D11 constant buffer

In my spare time, I am working on a 3D engine using D3D11. To get the 3D effect, I use the typical model-view-projection matrix multiplication in my HLSL shaders. These matrices are uploaded to a D3D11 constant buffer. The projection matrix only changes when the viewport is resized, but the model and view matrices can change on a frame-by-frame basis (when the model or camera is moved). These changes must be uploaded to the same constant buffer in order to be used in the shaders. When uploading them, the projection matrix is (generally) not changed, so I do not want to re-upload it. In short, I need to partially update my constant buffer by only updating specific parts (offsets) of my buffer.
In OpenGL we have uniform buffers, and these work (I think) the same way a D3D11 constant buffer does. However, if you want to update a specific part of a uniform buffer, you can use the OpenGL function glBufferSubData. I tried looking for a similar way to do this in D3D11 but did not find anything. I did find someone with a similar issue, but he was using D3D11.1.
Link to the original post: How to partially update constant buffer in DirectX 11.1. Someone also said that in D3D11 you need to upload all the data to the constant buffer if you want to change a specific portion. But this could mean keeping an entire copy of my buffer on the CPU side (RAM). There must be a better way, right?
tldr; How can I update my d3d11 constant buffer on a specific offset?
You have several options to manage this.
The simplest one is to create one constant buffer per update rate (so you would have one buffer for world matrices, which change per object, one for the view matrix, which updates once per frame, and one for the projection matrix, which updates very sporadically).
That said, I never found splitting the view and projection matrices to be that beneficial (we are talking about an extra 64 bytes of data per frame, which is really nothing, especially considering constant buffers are aligned on a 256-byte boundary).
This is even more true once you start adding shadow maps (if you do frustum fitting, your shadow camera's projection matrix can change every frame).
Also, if you plan to add other post effects later on (like ambient occlusion, depth of field...), you will need extra camera data anyway (position, view inverse, view-projection inverse, projection inverse), so it's generally more beneficial to update all that camera data once per frame (I normally use a reserved cbuffer slot for the camera buffer, so I don't have to rebind it all the time).
Why not have multiple constant buffers? That would allow you to have one that rarely if ever changes while others can be changed with each frame. I'm pretty sure I've seen that used in some of the samples (though I didn't go looking).
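A minimal sketch of the "one constant buffer per update rate" idea, assuming dynamic buffers rewritten with Map and WRITE_DISCARD; the struct layout and names are illustrative, not taken from the original post:

#include <d3d11.h>
#include <DirectXMath.h>
#include <cstring>

struct PerFrameCB  { DirectX::XMFLOAT4X4 view; DirectX::XMFLOAT4X4 projection; }; // once per frame
struct PerObjectCB { DirectX::XMFLOAT4X4 world; };                                // once per draw

// Create a dynamic constant buffer; cbuffer sizes must be multiples of 16 bytes.
ID3D11Buffer* CreateDynamicCB(ID3D11Device* device, UINT byteWidth)
{
    D3D11_BUFFER_DESC desc = {};
    desc.ByteWidth      = (byteWidth + 15u) & ~15u;
    desc.Usage          = D3D11_USAGE_DYNAMIC;
    desc.BindFlags      = D3D11_BIND_CONSTANT_BUFFER;
    desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
    ID3D11Buffer* buffer = nullptr;
    device->CreateBuffer(&desc, nullptr, &buffer);
    return buffer;
}

// Rewrite the whole (small) buffer; with one buffer per update rate there is no
// need to patch a sub-range, which plain D3D11 does not offer anyway.
template <typename T>
void UpdateCB(ID3D11DeviceContext* ctx, ID3D11Buffer* cb, const T& data)
{
    D3D11_MAPPED_SUBRESOURCE mapped = {};
    ctx->Map(cb, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
    std::memcpy(mapped.pData, &data, sizeof(T));
    ctx->Unmap(cb, 0);
}

Each buffer would then be bound once to its own slot with VSSetConstantBuffers (for example the per-frame buffer in b0 and the per-object buffer in b1), so only the small per-object buffer is touched inside the draw loop.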

Does OpenGL change vertices in memory when you apply transformations?

So let's say I have a single vertex (to make things easy) in my program, (0, 0, 0), right at the origin. I render a single frame with a simple translation matrix, moving the vertex two units down the x-axis. The vertex is rendered accordingly. Does the same vertex now show up in VRAM as (2, 0, 0)? I've read that it's important to load all the respective identity matrices in OpenGL every time a frame is rendered, and I assume that's because otherwise everything would continually move, rotate, etc. further and further, implying that applying transformations DOES modify actual data, not just the appearance on screen.
Strictly speaking, OpenGL is just an API definition. An implementation can do whatever it wants as long as it meets the specifications.
That being said, the answer to your question is generally: NO. It's hard to picture how storing transformed vertices back into the memory that also contained the original vertices would ever make sense.
The original vertex positions are passed into the vertex shader, where they are processed, which can include transformations. Once they exit the vertex shader, the transformed positions will most likely be stored in some kind of cache or dedicated on-chip GPU memory until they are processed by the next steps of the pipeline, which includes perspective division, application of the viewport transform, and rasterization. Once those vertex processing steps are completed, the transformed vertices can be discarded. They may stay in a cache for a little longer, for possible reuse of the processed vertex in case the same original vertex is used again. But they are not stored in any persistent way.
The way I interpret it, what you heard about having to reset the matrices for each frame was probably a misunderstanding. If you want to apply the same matrices in the next frame, you don't have to do anything at all.
What they were most likely talking about is related to how the matrix stack in legacy OpenGL works. Most calls that modify the current matrix, like glTranslatef(), glRotatef(), etc., are applied incrementally to the current matrix. For example, if you call glRotatef(), the rotation is combined with the transformation that was already on the matrix stack. The result is that your newly specified rotation is applied to the vertices first, followed by the transformations that were already on the matrix stack.
Based on this, if you want to specify transformations from scratch at the start of each frame, you will call glLoadIdentity() to reset the current transformation on the matrix stack before you start specifying your new transformations. Or you can use glPushMatrix()/glPopMatrix() to save and restore the desired state of the matrix stack.
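A minimal legacy-OpenGL sketch of those two idioms (the specific transform values are arbitrary):

void drawFrame(void)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();                   /* start this frame's transformations from scratch */
    glTranslatef(2.0f, 0.0f, 0.0f);     /* this frame's transform only */
    /* ... draw an object ... */

    glPushMatrix();                     /* or: save the current matrix, */
    glRotatef(45.0f, 0.0f, 0.0f, 1.0f); /* apply an extra transform, */
    /* ... draw a child object ... */
    glPopMatrix();                      /* and restore the saved state afterwards */
}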
If you use what many people call "modern OpenGL", meaning that you don't use the legacy fixed pipeline functionality, you don't have to worry about any of that. The matrix stack is gone for good, and you get to calculate your own transformation matrices, and pass them to your shader code.
Here is a link to the Wikipedia article on the mathematics of transformation matrices: https://en.wikipedia.org/wiki/Transformation_matrix. It will give you an understanding of the math behind the scenes; another way to look at this is through linear or vector algebra.
What happens under the hood when you render a scene is that all of the vertex data is sent from the CPU to the GPU to be rasterized and drawn to the screen. This is your batch process or render call. You also have a frame function that runs x times per second, which gives you your frames per second. So if you are rendering at, say, 60 FPS, those pixels, vertices, triangles etc. are drawn 60 times each second.
When you apply a transformation to this set of vertices, a transformation matrix is multiplied with your model-view-projection matrix, MVP * T, and the result may be stored back into your existing MVP matrix if that is how you have set up your calculations. There are some differences between OpenGL versions as you go from OpenGL 1.0 (pure CPU calls) up to 4.5. As far as I know, after version 3.2 or 3.3 (I don't remember which off hand) you have to implement the MVP yourself, whereas in versions after 1.5, where shaders were first introduced, it was still handled for you. Here is the OpenGL documentation: https://www.opengl.org/ - on the main page there is a Documentation topic, from which you can select the OpenGL Registry or whichever specific version you want to look at and read about the API, since that site covers everything available in it.
So, to your question: yes, the actual coordinate data for these vertices does change; however, it will not continuously change unless you are incrementing some persistent variable with a factor of time, giving you a simulation of movement or animation. If you apply only a single transformation, then these pixels, vertices, triangles, etc. will either rotate, translate, scale, or shear, depending on which transformation you apply. I will tell you that the order of these operations does matter, but I will not tell you which order it is; that is for you to read up on and figure out. The reason it matters is that not every matrix multiplication has a valid inverse matrix. The identity matrix is used for reasons such as round-off errors and floating-point precision, so that if you happen to apply, say, 1,000 transformations in a matter of about 10 seconds, you do not accumulate astronomical errors. This should be enough to point you in the right direction and serve as a guide to how the OpenGL API works.
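Since the answer above stresses that the order of operations matters without spelling it out, here is a tiny GLM sketch (values arbitrary) showing that translation and rotation do not commute:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <cstdio>

int main()
{
    glm::mat4 T = glm::translate(glm::mat4(1.0f), glm::vec3(2.0f, 0.0f, 0.0f));
    glm::mat4 R = glm::rotate(glm::mat4(1.0f), glm::radians(90.0f), glm::vec3(0.0f, 0.0f, 1.0f));
    glm::vec4 p(1.0f, 0.0f, 0.0f, 1.0f);

    glm::vec4 a = T * R * p;  // rotate first, then translate -> roughly (2, 1, 0)
    glm::vec4 b = R * T * p;  // translate first, then rotate -> roughly (0, 3, 0)
    std::printf("T*R*p = (%.1f, %.1f, %.1f)\n", a.x, a.y, a.z);
    std::printf("R*T*p = (%.1f, %.1f, %.1f)\n", b.x, b.y, b.z);
}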

GPU particle metaball-surface rendering

I have a question about a very specific method on how to render surface particles. The method is explained very well in the Nvidia GPU Gems 3 chapter 7 "Point-Based Visualization of Metaballs on a GPU", link to this chapter.
The article is about rendering an implicit surface using points or splats that are evenly distributed over the surface. They say that the computation of these particles is done completely on the GPU. Only the data which defines the surface is sent from CPU to the GPU to keep the traffic as low as possible.
They also give some pseudo-code examples of fragment shader programs to compute the particle positions, velocities etc., and it looks to me like these programs should run once for every particle.
Now my question is, how do they store these particles? What kind of data structure is it?
It must be some kind of buffer or texture that can be accessed for reading as well as for writing operations on the GPU. But how do I render this buffer/texture again in the next rendering step?
My first idea was some kind of vertex-buffer-object which is sent to the GPU once at the beginning and continuously updated there at each rendering pass. Is that possible at all?
One requirement for me is that it must be implemented using OpenGL/GLSL, I hope that is possible.
Yes, you need some kind of VBO and repeated passes over the same data. The data structure can be SoA (Struct of Arrays) or AoS (Array of Structs), depending on how you prefer to code access to the different properties of the particles, i.e.:
SoA:
Positions Array
Speed Array
Normal Array
AoS:
Just one array containing [Position, Speed, Normal] per element.
AoS is the same as an interleaved array for rendering, where you keep all the properties of the mesh in a single array (a small sketch of both layouts follows).
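A minimal C++ sketch of the two layouts, assuming GLM vectors (the type and field names are illustrative):

#include <vector>
#include <glm/glm.hpp>

// SoA: one tightly packed array per attribute (maps naturally to one VBO per attribute).
struct ParticlesSoA {
    std::vector<glm::vec3> positions;
    std::vector<glm::vec3> speeds;
    std::vector<glm::vec3> normals;
};

// AoS: one array of structs, i.e. an interleaved layout
// (maps naturally to a single interleaved VBO with a stride).
struct Particle {
    glm::vec3 position;
    glm::vec3 speed;
    glm::vec3 normal;
};
using ParticlesAoS = std::vector<Particle>;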
You could use either a VBO or a texture; the only difference is the way the caching is done, since textures are optimized for 2D access.
The rendering is done in steps exactly as you picture it: you first "render" the physics step of the system using shaders that compute the properties you want, and then bind the same structures to the actual graphics rendering in a subsequent pass.
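As an illustration of that physics step, here is a hedged GLSL sketch, not the GPU Gems code: a fragment shader that reads the previous position and velocity textures and writes the integrated position into an FBO-attached "next positions" texture (the names and the simple Euler step are assumptions):

#version 330 core
uniform sampler2D u_positions;   // previous positions, one texel per particle
uniform sampler2D u_velocities;  // previous velocities, one texel per particle
uniform float u_dt;              // time step
in vec2 v_texCoord;              // which particle this fragment corresponds to
out vec4 newPosition;            // written into the "next positions" texture

void main()
{
    vec3 p = texture(u_positions,  v_texCoord).xyz;
    vec3 v = texture(u_velocities, v_texCoord).xyz;
    newPosition = vec4(p + v * u_dt, 1.0);   // simple Euler integration
}

Ping-ponging between two such textures each frame, and then fetching the current position texture in the vertex shader of the real render pass (or copying it into a VBO), keeps the whole loop on the GPU.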

Issues about shaders and transformations in OpenGL

If I'm not wrong, shaders are programs that run on the GPU, right?
Do we send data to these programs using glUniformMatrix*?
I don't know if it's right, but if I send an MVP matrix to the shader, the vertices of the object I want to render will use the position calculated by the shader right before calling the render function.
If I want to render a lot of objects, I must send the MVP matrix and then render the object right after, so I end up with code that sends data to the GPU -> renders, many times over. However, if I'm not wrong again, this is not good practice, because I lose performance: sending information to the GPU is very expensive. So a way to get better performance would be to send all the information to the GPU first and then render all the objects.
And the million dollar question is: how can the shader program identify that the MVP matrix is used by one single object and not another one?
If I'm not wrong, shaders are programs that run on the GPU, right?
Possibly. Many implementations of OpenGL have software renderers that they can fall back to if resources on the GPU are constrained. But usually, yes, they're run on the GPU.
Do we send data to these programs using glUniformMatrix*?
That's the usual way. You also set things like texture coordinates either via immediate mode methods like glTexCoord*() (in legacy OpenGL), or via buffer objects.
I don't know if it's right, but if I send an MVP matrix to the shader, the vertices of the object I want to render will use the position calculated by the shader right before calling the render function.
There are different types of shaders. A vertex shader is called once for each vertex. A fragment shader is called once per fragment (roughly once per output screen-space pixel that actually gets drawn). Generally you will probably want to send the model, view, and projection matrices separately to the vertex shader. (Or possibly in some combination that lifts some computations out of the shader.) Then you'll multiply each vertex by the appropriate matrix (or combo of matrices).
And there are other types of shaders beyond those, but those 2 are the most common.
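A minimal vertex-shader sketch of that per-vertex multiplication, with the matrices supplied separately as uniforms (the attribute and uniform names are assumptions):

#version 330 core
layout(location = 0) in vec3 a_position;
uniform mat4 u_model;
uniform mat4 u_view;
uniform mat4 u_projection;

void main()
{
    // Each vertex is transformed by the matrices supplied from the application.
    gl_Position = u_projection * u_view * u_model * vec4(a_position, 1.0);
}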
If I want to render a lot of objects, I must send the MVP matrix and then render the object right after, so I end up with code that sends data to the GPU -> renders, many times over. However, if I'm not wrong again, this is not good practice, because I lose performance: sending information to the GPU is very expensive. So a way to get better performance would be to send all the information to the GPU first and then render all the objects.
I wouldn't get overly worried about performance until you have shaders working properly. Performance can be dependent on a lot of different factors. One is how often you send or receive data to or from the GPU and how much data you're transferring. Another is how many passes you do for each shader, and another is the size of your textures, geometry, and other stuff.
And the million dollar question is: how can the shader program identify that the MVP matrix is used by one single object and not another one?
The way I've done that in the past is to set the current shader program and uniforms via glUseProgram() and glUniform*(), then submit my geometry for an object, and repeat as necessary for each object or set of objects.
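A hedged sketch of that pattern, assuming per-object data in a hypothetical Object struct and a shader that declares a uniform mat4 u_mvp (none of these names come from the answer):

#include <GL/glew.h>               // any OpenGL loader will do
#include <vector>
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>

// Hypothetical per-object data; assumes each VAO has an element buffer attached.
struct Object {
    GLuint    vao = 0;
    GLsizei   indexCount = 0;
    glm::mat4 modelMatrix{1.0f};
};

void drawObjects(GLuint program, const glm::mat4& projection, const glm::mat4& view,
                 const std::vector<Object>& objects)
{
    glUseProgram(program);
    GLint mvpLoc = glGetUniformLocation(program, "u_mvp"); // assumed uniform name
    for (const Object& obj : objects)
    {
        glm::mat4 mvp = projection * view * obj.modelMatrix;          // this object's MVP
        glUniformMatrix4fv(mvpLoc, 1, GL_FALSE, glm::value_ptr(mvp)); // re-set before each draw
        glBindVertexArray(obj.vao);
        glDrawElements(GL_TRIANGLES, obj.indexCount, GL_UNSIGNED_INT, nullptr);
    }
}

Each object gets its own MVP simply because the uniform is re-set right before that object's draw call; the shader never needs to know which object it belongs to.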

Create view matrices in GLSL shader

I have many positions and directions stored in 1D textures on the GPU. I want to use those as render sources in a GLSL geometry shader. To do this, I need to create corresponding view matrices from those textures.
My first thought is to take a detour through the CPU: read the textures into memory and create a bunch of view matrices from there, with something like glm::lookAt(), then send the matrices as uniform variables to the shader.
My question is whether it is possible to skip this detour and instead create the view matrices directly in the GLSL geometry shader. Also, is this feasible performance-wise?
Nobody says (or nobody should say) that your view matrix has to come from the CPU through a uniform. You can just generate the view matrix from the vectors in your texture right inside the shader. Maybe the implementation of the good old gluLookAt is of help to you there.
Whether this approach is a good idea performance-wise is another question, but if this texture is quite large or changes frequently, it might be better than reading the data back to the CPU.
But maybe you can pre-generate the matrices into another texture/buffer, using a simple GPGPU-like shader that does nothing more than generate a matrix for each position/direction in the textures and store it in another texture (using FBOs) or buffer (using transform feedback). This way you don't need a roundtrip to the CPU, and you don't need to regenerate the matrices for each vertex/primitive/whatever. On the other hand, this increases the required memory, as a 4x4 matrix is a bit heavier than a position and a direction.
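For the "generate it right inside the shader" route, here is a hedged GLSL sketch of a gluLookAt-style view matrix built from an eye position and a viewing direction (the helper name and the choice of up vector are assumptions):

// Build a view matrix from an eye position and a viewing direction.
mat4 lookAtView(vec3 eye, vec3 dir, vec3 up)
{
    vec3 f = normalize(dir);              // forward
    vec3 s = normalize(cross(f, up));     // side (right)
    vec3 u = cross(s, f);                 // recomputed up

    // GLSL mat4 constructor takes columns: rotation transposed, then translation.
    return mat4(
        vec4( s.x,  u.x, -f.x, 0.0),
        vec4( s.y,  u.y, -f.y, 0.0),
        vec4( s.z,  u.z, -f.z, 0.0),
        vec4(-dot(s, eye), -dot(u, eye), dot(f, eye), 1.0));
}

A geometry shader could call this with the position and direction fetched from the 1D textures (e.g. via texelFetch) instead of receiving the matrix as a uniform.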
Sure. Read the texture, and build the matrices from the values...
vec4 x = texture(YourSampler, WhateverCoords1);
vec4 y = texture(YourSampler, WhateverCoords2);
vec4 z = texture(YourSampler, WhateverCoords3);
vec4 w = texture(YourSampler, WhateverCoords4);
mat4 matrix = mat4(x,y,z,w);
Any problem with this? Or did I miss something?
The view matrix is a uniform, and uniforms don't change in the middle of a render batch, nor can they be written to from a shader (directly). Insofar I don't see how generating it could be possible, at least not directly.
Also note that the geometry shader runs after vertices have been transformed with the modelview matrix, so it does not make all too much sense (at least during the same pass) to re-generate that matrix or part of it.
You could of course probably still do some hack with transform feedback, writing some values to a buffer, and either copy/bind this as uniform buffer later or just read the values from within a shader and multiply as a matrix. That would at least avoid a roundtrip to the CPU -- the question is whether such an approach makes sense and whether you really want to do such an obscure thing. It is hard to tell what's best without knowing exactly what you want to achieve, but quite probably just transforming things in the vertex shader (read those textures, build a matrix, multiply) will work better and easier.