What's the proper way to draw a large model in OpenGL?

I'm trying to draw a large grid block using OpenGL (for example: 114x112x21 cells).
As far as I know, each cell should be drawn as 6 faces (12 triangles), each face containing 4 vertices. Each vertex has position, normal, and color vectors (each of these is 3*sizeof(GLfloat)).
These values will be passed to VRAM in VBO(s). I did some calculations for the example mentioned (268128 cells × 24 vertices × 9 GLfloats per vertex) and found out that it would cost ~200 MB to store this data. I'm not sure if this is right, but if it is, I think it's way too much VRAM for a single model.
I'm sure there are more efficient ways to do this; if anyone can point me in the right direction I would be very thankful.
EDIT: I may have been unclear about the nature of the cells. They do NOT have uniform dimensions that can be scaled/translated to produce other cells or other faces of the same cell. Almost every cell has different dimensions on each face (these are predefined).
Also, let me note that the colors are per cell and are based on an algorithmic scale of different values (depending on which one the user wants to visualize). So if the user chooses a value (one per cell) to visualize, colors are calculated from a scale and used to color the cells.
As @BDL suggested in his answer, I'll probably use a geometry shader to calculate per-face normals.

There are several things that can be done:
First of all, each vertex position (except the ones on the sides) is shared between 8 cells.
If you need per-face normals, in which case one position would require several normals, calculate them in a geometry shader instead of storing them in the VBO.
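A minimal sketch of what such a geometry shader could look like (the vPosition/gNormal names are illustrative, and the GLSL source is embedded here as a C++ raw string):

const char* faceNormalGeometryShader = R"GLSL(
#version 330 core
layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

in vec3 vPosition[];   // world-space positions from the vertex shader
out vec3 gNormal;      // flat per-face normal for the fragment shader

void main()
{
    // The cross product of two triangle edges gives the face normal.
    vec3 n = normalize(cross(vPosition[1] - vPosition[0],
                             vPosition[2] - vPosition[0]));
    for (int i = 0; i < 3; ++i) {
        gNormal = n;   // same normal for all three vertices of the face
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}
)GLSL";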
If each cell has a constant color, store it in a 3D texture and sample the texture in the fragment shader.
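The fragment-shader side of that could be as simple as this sketch (the sampler and varying names are assumptions, not from the answer):

const char* cellColorFragmentShader = R"GLSL(
#version 330 core
uniform sampler3D uCellColors;  // one texel per cell
in vec3 vCellCoord;             // cell index, normalized to [0,1]
out vec4 fragColor;

void main()
{
    // With GL_NEAREST filtering this yields one flat color per cell.
    fragColor = texture(uCellColors, vCellCoord);
}
)GLSL";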
For more hints you would have to provide more details on the cells and on what you want to achieve.

There are a few tricks you could use.
To start with, you could use instancing, one instance per cube. You then need per-vertex positions and normals for a single cell, plus a single position and color per cell.
You can actually eliminate the per-cell positions by deriving them from the instance id, reversing the formula id = z * width * height + y * width + x.
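In the vertex shader that could look roughly like this (uWidth/uHeight/uMvp are assumed uniform names, and the sketch assumes unit-sized cells, as this answer does):

const char* instancedCellVertexShader = R"GLSL(
#version 330 core
layout(location = 0) in vec3 aPosition;  // vertex of the template cell
uniform int uWidth;    // e.g. 114
uniform int uHeight;   // e.g. 112
uniform mat4 uMvp;

void main()
{
    // Reverse id = z * width * height + y * width + x:
    int id = gl_InstanceID;
    int x = id % uWidth;
    int y = (id / uWidth) % uHeight;
    int z = id / (uWidth * uHeight);
    gl_Position = uMvp * vec4(aPosition + vec3(x, y, z), 1.0);
}
)GLSL";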
Furthermore, using a float per component is probably overkill for your colors; you may want to use a smaller format such as GL_RGBA8.
Applying that to your example (268128 cells), we get a buffer size of approximately 1 MiB (268128 cells × 4 bytes ≈ 1.02 MiB, of which the 4-byte color per cell is the significant part; the positions and normals are stored only once, for the single template cell).
Note that this assumes you want a single color for your entire cell. If you want a color per vertex, or per vertex per face, you can do so by using a 1D texture and indexing it by instance and vertex id.
The biggest part of your data is going to be color, though, unless there is a constant pattern. If you still want a float per component and per-face per-vertex colors, it is going to take ~73 MiB on color alone.

You can use instanced rendering. It renders the same vertex data with the same shader multiple times in just one draw call. Here is a link to the wiki (external): https://en.wikipedia.org/wiki/Geometry_instancing
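A minimal sketch of such a draw call, using the 114x112x21 example and a hypothetical cubeVao that holds one cell's vertex and index data:

glBindVertexArray(cubeVao);            // one cube's vertices + indices
glDrawElementsInstanced(GL_TRIANGLES,
                        36,            // 12 triangles per cell
                        GL_UNSIGNED_SHORT,
                        nullptr,       // indices start at offset 0
                        114 * 112 * 21);  // one instance per cell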

Related

Draw particle trajectories of undefined length with OpenGL

I have to draw a physical simulation that displays the trajectories of particles moving around. 3D position data are read from a database in real time while drawing. Once a VBO is set up for each object, the drawing call is the standard glDrawArrays(GL_LINE_STRIP, 0, size). The problem is that the VBOs storing the trail points are updated every frame, since new points are added. This seems extremely inefficient to me! Furthermore, what if I want to draw the trajectories with a gradient color from the particle's current position to the older points? I would have to update the color of all vertices in the VBO at every draw call! What is the standard way to handle this kind of thing?
To summarize:
I want to draw lines of undefined - potentially infinite - length (the length increases with time).
I want the color of the points in a trajectory to shade based on their relative position along it (for example white at the start (the particle's current position), black at the end (the particle's first position), grey in the middle).
I have read many tutorials, but I haven't found anything about drawing ever-updating, indefinitely-growing lines... I will appreciate any suggestion! Thanks!
Use multiple VBOs with a fixed number of vertices per buffer. That way you only have to modify the last VBO in the sequence when you add new points, instead of completely updating one giant VBO.
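A rough sketch of that chunking idea (kPointsPerChunk and the helper function are illustrative; it assumes a GL loader header and <vector> are included):

const int kPointsPerChunk = 1024;
std::vector<GLuint> chunks;        // one fixed-capacity VBO per chunk
int pointsInLastChunk = 0;

void appendPoint(const float point[3])
{
    if (chunks.empty() || pointsInLastChunk == kPointsPerChunk) {
        GLuint vbo;
        glGenBuffers(1, &vbo);
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, kPointsPerChunk * 3 * sizeof(float),
                     nullptr, GL_DYNAMIC_DRAW);   // allocate the chunk once
        chunks.push_back(vbo);
        pointsInLastChunk = 0;
    }
    glBindBuffer(GL_ARRAY_BUFFER, chunks.back());
    glBufferSubData(GL_ARRAY_BUFFER,              // upload only the new point
                    pointsInLastChunk * 3 * sizeof(float),
                    3 * sizeof(float), point);
    ++pointsInLastChunk;
}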
Add a sequence-number vertex attribute or use gl_VertexID, and pass the total point count in as a uniform. Then you can divide a given vertex's sequence number by the total count and use that fraction to mix between your gradient colors.
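A sketch of the gradient part for a single-VBO draw (uTotalPoints is an assumed uniform name; with the chunked-VBO scheme above you would use the sequence-number attribute instead, since gl_VertexID restarts for each draw call):

const char* trailVertexShader = R"GLSL(
#version 330 core
layout(location = 0) in vec3 aPosition;
uniform mat4 uMvp;
uniform int uTotalPoints;   // points in the trajectory so far
out vec3 vColor;

void main()
{
    // 0 at the oldest point, 1 at the newest.
    float t = float(gl_VertexID) / float(max(uTotalPoints - 1, 1));
    vColor = mix(vec3(0.0), vec3(1.0), t);   // black tail, white head
    gl_Position = uMvp * vec4(aPosition, 1.0);
}
)GLSL";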

Changing the size of a pixel depending on its color with GLSL

I have an application that encodes bearing and intensity data in 32 bits. My fragment shader already decodes the values and then sets the color depending on bearing and intensity.
I'm wondering if it's also possible, via shader, to change the size (and possibly shape) of the drawn pixel.
As an example, let's say we have 4 possible values for intensity: 0 would cause a single pixel to be drawn, 1 would draw a 2x2 square, 2 a 4x4 square, and 3 a circle with a radius of 6 pixels.
In the past, we had to do all this on the CPU side and I was hoping to offload this job to the GPU.
No, fragment shaders cannot affect the "size" of the data they write. Once something has been rasterized into fragments, it doesn't really have a "size" anymore.
If you're rendering GL_POINTS primitives, you can change their size from the vertex shader. Even then, with point sizes it's rather difficult to ensure that a particular point covers an exact number of fragments.
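For the GL_POINTS route, a sketch of the vertex-shader side (aIntensity and the size table are assumptions, and glEnable(GL_PROGRAM_POINT_SIZE) must be set on the host for gl_PointSize to take effect):

const char* pointVertexShader = R"GLSL(
#version 330 core
layout(location = 0) in vec3 aPosition;
layout(location = 1) in float aIntensity;   // decoded value 0..3
uniform mat4 uMvp;

void main()
{
    // e.g. intensity 0 -> 1 px, 1 -> 2 px, 2 -> 4 px, 3 -> 12 px
    float sizes[4] = float[4](1.0, 2.0, 4.0, 12.0);
    gl_PointSize = sizes[int(aIntensity)];
    gl_Position = uMvp * vec4(aPosition, 1.0);
}
)GLSL";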
The first thing that comes to mind is doing something similar to a blur technique, but instead of blurring the texture, we use it to look at neighbouring texels within a range and check whether any has an intensity above 1.0f. If yes, set the current texel's color to, for example, red.
If you're using an FBO that is 1:1 with the window size, use 1/width and 1/height steps in texture coordinates to move approximately one pixel (not exactly a pixel but a texel, though nearly the same here).
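A sketch of that lookup as a fragment shader (uTexelSize would be vec2(1.0/width, 1.0/height) set from the host; it also assumes a float texture where the decoded intensity is stored in the alpha channel, which is an assumption, not from the question):

const char* neighbourhoodFragmentShader = R"GLSL(
#version 330 core
uniform sampler2D uScene;
uniform vec2 uTexelSize;   // one texel step in texture coordinates
uniform int uRadius;       // search range in texels
in vec2 vTexCoord;
out vec4 fragColor;

void main()
{
    fragColor = texture(uScene, vTexCoord);
    for (int dy = -uRadius; dy <= uRadius; ++dy)
        for (int dx = -uRadius; dx <= uRadius; ++dx) {
            vec4 c = texture(uScene, vTexCoord + vec2(dx, dy) * uTexelSize);
            if (c.a > 1.0)                        // intensity above 1.0f
                fragColor = vec4(1.0, 0.0, 0.0, fragColor.a);  // mark red
        }
}
)GLSL";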
Although this works just fine, the downside is that it is very expensive, since it has n^2 complexity per fragment and probably some branching.
Edit: after thinking a while, this might not work for even-numbered sizes.

Representation of mesh - one vertex (position in space) = one pos, tex coord, normal?

I'm writing a class for storing, loading and rendering a static mesh in my DirectX application.
I know that a box model can use 8 vertices, but usually it uses 24 or 36 (because one "vertex in space" is de facto 3 vertices with the same position and different texture coords/normals).
My question is: how about meshes (e.g. character mesh exported from 3ds max or similar application)?
I'm working on the exporter plug-in now for my application. Should I:
For each vertex in the mesh, export it (as position, normal, texture coordinates) - will that always be enough, or only in some cases?
or maybe:
For each face, export 3 vertices (pos, normal, tex coord)
?
I'm not sure whether the texture coordinates/normals can be different for the same "vertex" in 3ds Max (same position but different normals/tex coords?).
The second approach should work fine with D3DPT_TRIANGLELIST, but for a 5,000-vertex mesh from 3ds Max I would get 15,000 "real" vertices (roughly each 3 of them sharing a position but with different tex coords/normals).
Also, in the example below, the object consists of two parts - the top one has smoothing group #1 and the bottom one smoothing group #2. Will my problem look different for a vertex "inside" a smoothing group versus one between the two groups (e.g. at the connection of top/bottom)?
In other words: is the black-circled "vertex" one "real" vertex with the same pos/tex coords/normals, or is it 4 vertices sharing only their positions (do 4 vertices with everything identical even make sense?)?
And what about red one?
I'm using indexed buffers.
Exporting three vertices for each triangle is always sufficient, but may involve duplicating data unnecessarily. If adjacent triangles have disjoint texture coordinates or other properties, you must duplicate in order to produce correct rendering results. If you want, you can merge duplicate vertices in a post-processing step.
Regarding your reference images, it doesn't look like there is any texture applied to the surface, so the red-circled vertex must be two distinct vertices in the buffer in order to create the color discontinuity, as it should be. The black-circled vertex doesn't need to be duplicated, but it still may be. In general, you should share vertices within a smoothing group and not worry about deduplicating across groups. If vertex throughput ends up being an issue, you can look into optimizing it, but for most workloads you'll be pixel- or fillrate-bound.
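For the merge step mentioned above, here is a simple (unoptimized) sketch in C++: treat the full attribute tuple as the key and reuse an index whenever an identical vertex was seen before. The Vertex layout is illustrative, and the exact-bit memcmp comparison assumes a padding-free struct.

#include <cstring>
#include <map>
#include <vector>

struct Vertex { float pos[3], normal[3], uv[2]; };

struct VertexLess {
    bool operator()(const Vertex& a, const Vertex& b) const {
        return std::memcmp(&a, &b, sizeof(Vertex)) < 0;  // bitwise ordering
    }
};

void deduplicate(const std::vector<Vertex>& in,
                 std::vector<Vertex>& outVertices,
                 std::vector<unsigned>& outIndices)
{
    std::map<Vertex, unsigned, VertexLess> seen;
    for (const Vertex& v : in) {
        auto it = seen.find(v);
        if (it == seen.end()) {
            it = seen.emplace(v, unsigned(outVertices.size())).first;
            outVertices.push_back(v);   // first occurrence: emit the vertex
        }
        outIndices.push_back(it->second);
    }
}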

Using IBO for animation - good or bad?

I was just looking at my animated sprite code and got an idea.
The animation is made by altering tex coords. There is a buffer object which holds the current frame's texture coords; as a new frame is requested, new texture coords are fed into the buffer with glBufferData().
What if we pre-calculate the texture coords of all animation frames, put them all in the BO, and create an Index Buffer Object holding just the number of the frame we need to draw:
GLbyte cur_frames = 0; //1,2,3 etc
Now when we need to update the animation, all we have to update is 1 byte (instead of 4 /quad vertex count/ * 2 /s, t/ * sizeof(GLfloat) bytes for a quad drawn with TRIANGLE_STRIP) of our IBO with glBufferData; we don't need to hold any texture coords after the init of our BO.
Am I missing something? What are the cons?
Edit: of course your vertex data may not be GLfloat, that was just for the example.
As Tim correctly states, this depends on your application, so let us talk some numbers. You mention both IBOs and putting the texture coordinates for all frames into one VBO, so let us take a look at the impact of each.
Suppose a typical vertex looks like this:
struct vertex
{
    float x, y, z;   // position
    float tx, ty;    // texture coordinates
};
I added a z-component, but the calculations are similar if you don't use it, or if you have more attributes. So it is clear that each vertex takes 20 bytes.
Let's assume a simple sprite: a quad consisting of 2 triangles. In a very naive mode you just send 2x3 vertices, i.e. 6*20 = 120 bytes to the GPU.
In comes indexing: you know you actually have only four vertices, 1,2,3,4, and two triangles, 1,2,3 and 2,3,4. So we send two buffers to the GPU: one containing the 4 vertices (4*20 = 80 bytes) and one containing the list of indices for the triangles ([1,2,3,2,3,4]). Let's say we can do this with 2 bytes per index (65535 indices should be enough), so the index buffer comes down to 6*2 = 12 bytes. In total 92 bytes; we saved 28 bytes, or about 23%. Also, when rendering, the GPU is likely to process each vertex only once in the vertex shader, which saves us some processing power as well.
So, now you want to add all the texture coordinates for all animations at once. The first thing to note is that a vertex in indexed rendering is defined by all of its attributes; you can't split it into an index for positions and an index for texture coordinates. So if you want to add extra texture coordinates, you will have to repeat the positions. Each 'frame' you add will therefore add 80 bytes to the VBO and 12 bytes to the IBO. Suppose you have 64 frames: you end up with 64*(80+12) = 5888 bytes per sprite. With 1000 sprites this becomes about 6 MB. That does not seem too bad, but note that it scales quite rapidly: each frame adds to the size, and so does each attribute (because they all have to be repeated).
So, what does it gain you?
You don't have to send data to the GPU dynamically. Note that updating the whole VBO would require sending 80 bytes, or 640 bits, per sprite. Suppose you need to do this for 1000 sprites per frame at 30 frames per second: you get to 19200000 bps, or 19.2 Mbps (no overhead included). This is quite low (e.g. 16x PCI-e can handle 32 Gbps), but it could be worthwhile if you have other bandwidth issues (e.g. due to texturing). Also, if you construct your VBOs carefully (e.g. separate VBOs, or non-interleaved), you could reduce this to updating only the texture part, which is only 32 bytes per sprite in the above example; this could reduce bandwidth even more.
You don't have to waste time computing the next frame's texture coordinates. However, this is usually just a few additions and a few ifs to handle the edges of your texture. I doubt you will gain much CPU power here.
Finally, you also have the possibility of simply splitting the animation over a lot of separate textures. I have absolutely no idea how this scales, but in that case you don't even have to work with more complex vertex attributes; you just bind another texture for each frame of the animation.
Edit: another method could be to pass the frame number in a uniform and do the calculations in your fragment shader, before sampling. Setting a single integer uniform shouldn't be much of an overhead.
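Such a shader could look roughly like this, assuming the frames are packed side by side in a single atlas texture (uFrame/uFrameCount are assumed names):

const char* spriteFragmentShader = R"GLSL(
#version 330 core
uniform sampler2D uAtlas;
uniform int uFrame;        // current animation frame
uniform int uFrameCount;   // frames packed horizontally in the atlas
in vec2 vTexCoord;         // 0..1 within a single frame
out vec4 fragColor;

void main()
{
    // Shift and scale the per-frame coordinate into the selected frame.
    float u = (float(uFrame) + vTexCoord.x) / float(uFrameCount);
    fragColor = texture(uAtlas, vec2(u, vTexCoord.y));
}
)GLSL";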
For a modern GPU, accessing/unpacking single bytes is not necessarily faster than accessing integer types or even vectors (register sizes, load instructions, etc.). You just save memory, and therefore memory bandwidth, but I wouldn't expect this to make much of a difference relative to all the other vertex attribute array accesses.
I think the fastest way to supply a frame index for animated sprites is either a uniform or, if multiple sprites have to be rendered in one draw call, instanced vertex attrib arrays. With the latter, you can provide a single index for fixed-size subsequences of vertices.
For example, when drawing 'sprite-quads', you'd have one frame-index fetch per 4 vertices.
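The host-side setup for that could be sketched like this (attribute location 3 and the buffer name are illustrative; the shader would declare it as an int input):

glBindBuffer(GL_ARRAY_BUFFER, frameIndexVbo);  // one GLubyte per sprite
glVertexAttribIPointer(3, 1, GL_UNSIGNED_BYTE, 0, nullptr);
glVertexAttribDivisor(3, 1);   // advance once per instance, not per vertex
glEnableVertexAttribArray(3);
// One draw call for all sprites, 4 vertices per quad:
glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, spriteCount);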
A third approach would be a buffer texture, when using instanced rendering.
I recommend a global (shared) uniform for the time/frame index calculation, so you can compute the animation index on the fly within your shader; that way you don't need to update the index buffer (which then just represents the relative animation state among the sprites).

OpenGL/DirectX/XNA vertex uniqueness - when to use indices and when not?

This is what I get when reading an FBX file:
Normals Count = 6792
TextureCoords Count = 6792
Faces = 2264
Vertices = 3366
What I don't get is why I have fewer Vertices than Normals / TexCoords.
I need your help to understand when I should use an Index Buffer and when not.
Index buffers help reduce bandwidth to the graphics card, got it.
Index buffers mean you don't have to repeat vertices with the SAME data, got it.
Let's say I have a model with 1000 Vertices and 3000 Faces formed from those Vertices,
thus an Index Buffer of 9000 elements (3 indices per face)
I have an array of 1000 unique Positions, but arrays of 9000 unique TexCoords and Normals.
If the Vertices were only the Positions, that would be the best scenario for the Index Buffer: no redundant Vertices.
But it happens that I also have TextureCoords and Normals, and per face they can have different values for the same Position; in other words, the Position is shared between faces but with different attributes for each face.
So the uniqueness of the Vertex will be -Position AND TextureCoord AND Normal-
It is unlikely that I'd have repeated vertices with that full combination, so the indices are useless, right?
I will need to repeat the Position for each TextureCoord AND Normal.
In the end it seems I can't take advantage of having only 1000 indexed Positions.
So my point is: I don't need indices, right? Or am I misunderstanding the concepts?
It is unlikely that I'd have repeated vertices with that full combination, so the indices are useless, right?
In the event that you have a model where every face has its own entirely unique texture coordinates, yes, indices won't help you very much. Also, you don't have repeated vertices; you have repeated positions. Positions aren't vertices; they're part of a vertex's data, just like the normal, texcoord, etc.
However, I defy you to actually show me such a model for any reasonable object (ie: something not explicitly faceted for effect, or something not otherwise jigsawed together as far as its texture coordinates are concerned). And for a model with 3000 individual faces, so no cubes.
In the real world, most models will have plenty of vertex reuse. If yours don't, then either your modeller is terrible at his job, or it is a very special case.
Assuming you're storing your buffers on the GPU and not in client memory, an index buffer won't do anything to reduce bandwidth to the GPU after initialization. It instead saves VRAM space by not having to duplicate vertex data.
If your data is set up in such a way that no vertices ever repeat, using indices is redundant and won't save space.
In general you should think of a vertex as a unique combination of position, texture coordinates, and normal. If two vertices have the same position but different texture coordinates or normals, they are not the same vertex and should not be treated as such.
Typically, when dealing with 3D models that consist of thousands of vertices, there will be a lot of vertex overlap, and using indices will help a lot. It's a bit surprising that you don't have that many duplicated vertices.
Here are two examples of where indexing is useful and where it is not:
Example 1
You are drawing a square as two separate triangles.
v0---v3
|\ |
| \ |
| \|
v1---v2
Since this example is in 2D, we can only really use positions and texture coordinates. Without indexing, your vertex buffer will look like this if you interleave the positions and texture coordinates:
p0x, p0y, t0x, t0y,
p1x, p1y, t1x, t1y,
p2x, p2y, t2x, t2y,
p0x, p0y, t0x, t0y,
p2x, p2y, t2x, t2y,
p3x, p3y, t3x, t3y
When you use indices, your vertex buffer will look like this:
p0x, p0y, t0x, t0y,
p1x, p1y, t1x, t1y,
p2x, p2y, t2x, t2y,
p3x, p3y, t3x, t3y
and you'll have an index buffer that looks like this:
0, 1, 2,
0, 2, 3
Assuming your vertex data is all floats and your indices are bytes, the unindexed version takes 96 bytes and the indexed version 70 bytes.
It's not a lot, but this is just a single square. Granted, this isn't the most optimized way of drawing a square (you could avoid indices entirely by drawing it as a triangle strip instead of separate triangles, much like the fan in the second example), but it's the simplest example I can come up with.
Typically, with more complicated models, you'll be indexing vertices together into really long triangle strips or triangle lists, and the memory savings become huge.
Example 2
You're drawing a circle as a triangle fan. If you don't know how triangle fans work, here's a pretty good image that explains it. The vertices in this case are labeled alphabetically, A-F.
Image from the Wikipedia article for Triangle fan.
To draw a circle using this method, you start at any vertex and add all the other vertices in order, going either clockwise or counter-clockwise. (You might be able to visualize it better by imagining A moved down to be below F and B in the above diagram.) It works out so that the indices are sequential and never repeat.
In this case, adding an index buffer is redundant and takes up more space than the unindexed version. The index buffer would look something like:
0, 1, 2, 3, 4, ..., n
The unindexed version would be the exact same, just without the index buffer.
In 3D, you'll find it far less common for a drawing mode to match your indexing completely, as it does in the circle example. In 3D, indexing is almost always useful to some degree.