I've been reading up on how some OpenGL-based architectures manage their objects in an effort to create my own lightweight engine based on my application's specific needs (please no "why don't you just use this existing product" responses). One architecture I've been studying is Qt's Quick Scene Graph, and while it makes a lot of sense I'm confused about something.
According to their documentation, opaque primitives are ordered front-to-back and non-opaque primitives are ordered back-to-front. The ordering enables early z-killing, hopefully eliminating the need to process pixels that end up behind others. This seems to be a fairly common practice and I get it. It makes sense.
Their documentation also talks about how items that use the same Material can be batched together to reduce the number of state changes. That is, a shared shader program can be bound once and then multiple items rendered using the same shader. This also makes sense and I'm good with it.
What I don't get is how these two techniques work together. Say I have 3 different materials (let's just say they are all opaque for simplification) and 100 items that each use one of the 3 materials, then I could theoretically create 3 batches based off the materials. But what if my 100 items are at different depths in the scene? Would I then need to create more than 3 batches so that I can properly sort the items and render them front-to-back?
Based on what I've read of other engines, like Ogre 3D, both techniques seem to be used pretty regularly; I just don't understand how they are used together.
If you really have 3 materials, you can only batch together objects that end up next to each other in the sorted order. Sometimes the sort can be relaxed for objects that do not overlap each other, reordering them to minimize material switches.
The real "trick" behind all of this, however, is to combine the materials. If the engine can merge the 3 source materials into a single material and use the shaders to apply the right material settings to each object (mostly by transforming the texture coordinates), everything can be batched and depth-ordered at the same time. But if that is not possible, the engine can't optimize further and has to switch materials every now and then.
You don't have to merge every material in your scene. But if you can merge the materials that most often get switched between, that alone can improve performance a lot.
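As a rough illustration of how the two techniques interact, here is a minimal C++ sketch, assuming hypothetical Item and Material types: opaque items are sorted by depth (with material as a tie-breaker), and batches only form where neighbouring items in that order happen to share a material.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types, for illustration only.
struct Material { unsigned id; /* shader program, textures, ... */ };
struct Item     { float depth; const Material* material; /* geometry ... */ };

void renderOpaque(std::vector<const Item*>& items)
{
    // Primary key: depth, front-to-back for opaque items.
    // Secondary key: material, so items at equal depth still group together.
    std::sort(items.begin(), items.end(), [](const Item* a, const Item* b) {
        if (a->depth != b->depth) return a->depth < b->depth;
        return a->material->id < b->material->id;
    });

    // Walk the sorted list and only switch state when the material changes.
    const Material* bound = nullptr;
    for (const Item* item : items) {
        if (item->material != bound) {
            bound = item->material;
            // bindMaterial(*bound);  // program, textures, material uniforms...
        }
        // drawItem(*item);
    }
}
```

This is also why merging materials matters so much: the fewer distinct materials there are, the longer the runs of consecutive items that can share one state binding.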
I'm trying to implement transparency in OpenGL and from what I've read it's necessary to first render all opaque objects and then render the transparent objects in the correct order.
My issue is how do I go about separating opaque from transparent objects in my scene graph so that I can render the opaque objects first. My scene graph consists of a bunch of nodes which can have entities attached to them, and each entity can be composed of several meshes with different materials, each with one or more textures.
If I load a scene into my graph I need to know which materials are partially or completely transparent which means I need to check if the loaded textures have any alpha values smaller than 1. I'm currently using Assimp to handle loading the models/scenes and SOiL to read the textures and I haven't found any simple way to separate transparent materials from opaque ones.
This probably has a really simple solution because I haven't found anyone else with the same question, but I'm still starting out with OpenGL and I have been stuck on this matter for the past few hours.
How is transparency normally done, and how are opaque objects separated from partially or fully transparent ones so that they may be rendered first?
For most rendering method, you don't strictly have to separate the opaque from the transparent objects. If you think about it, transparency (or opacity) is a continuous quality. In OpenGL, the alpha component is typically used to define opacity. Opaque objects have an alpha value of 1.0, but this is just one value in a continuous spectrum. Methods that can correctly handle all alpha values will not suddenly fail just because the alpha value happens to be 1.0.
Putting it differently: Is an object with alpha value 0.9 opaque? What if the alpha value is 0.99, can you justify treating it differently from alpha value 1.0? It is really all continuous, and not a binary decision.
With that said, there are reasons why it's common to treat opaque objects differently. The main ones I can think of:
Since the non-opaque objects have to be sorted for common simple transparency rendering methods, you can save work by sorting only the non-opaque objects. Sorting is not cheap, and you reduce processing time this way. For most of these methods, you get perfectly fine results by sorting all objects, at the price of being less efficient.
Often, objects cannot be sorted perfectly, or doing so is at least not easy. An obvious problem is when objects overlap, but the challenges go beyond that case (see my answer here for a more in-depth illustration of problematic cases: Some questions about OpenGL transparency). In these cases, you get artifacts from incorrect sorting. By drawing opaque objects with depth testing enabled, you avoid the possibility of artifacts for those objects and reduce the overall occurrence of noticeable artifacts.
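For reference, here is a minimal sketch of the common two-pass approach described above, assuming a hypothetical Drawable type and draw() helper: opaque objects first with depth writes on, then the non-opaque objects sorted back-to-front with blending on and depth writes off.

```cpp
#include <algorithm>
#include <vector>
#include <GL/glew.h>   // any GL loader works here

struct Drawable { float distanceToCamera; bool opaque; /* mesh, material, ... */ };

void draw(const Drawable&) { /* issue the actual draw calls for the object here */ }

void renderScene(std::vector<Drawable>& drawables)
{
    // Pass 1: opaque objects, depth test and depth writes on, no blending.
    glEnable(GL_DEPTH_TEST);
    glDepthMask(GL_TRUE);
    glDisable(GL_BLEND);
    for (const Drawable& d : drawables)
        if (d.opaque) draw(d);

    // Pass 2: non-opaque objects, sorted back-to-front, blending enabled,
    // depth test still on but depth writes off.
    std::vector<const Drawable*> transparent;
    for (const Drawable& d : drawables)
        if (!d.opaque) transparent.push_back(&d);
    std::sort(transparent.begin(), transparent.end(),
              [](const Drawable* a, const Drawable* b) {
                  return a->distanceToCamera > b->distanceToCamera;
              });

    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glDepthMask(GL_FALSE);
    for (const Drawable* d : transparent)
        draw(*d);
    glDepthMask(GL_TRUE);
}
```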
Your situation of not knowing which objects contain transparency seems somewhat unusual. In most cases, you know which objects are opaque because you're in control of the rendering, and the content. So having an attribute that specifies if an object is opaque normally comes pretty much for free.
If you really have no way to define which objects are opaque, a couple of options come to mind. The first one is that you sort all objects, and render them in order. Based on the explanation above, you could encounter performance or quality degradation, but it's worth trying.
There are methods that can render with transparency without any need for sorting, or separating opaque and transparent objects. A simple one that comes to mind is alpha-to-coverage. Particularly if you render with MSAA anyway, it results in no overhead. The downside is that the quality can be mediocre depending on the nature of your scene. But again, it's worth trying.
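If you want to try alpha-to-coverage, the GL side of it is only a couple of state changes; the sketch below assumes you already have a multisampled framebuffer.

```cpp
#include <GL/glew.h>   // any GL loader works here

// Assumes the default framebuffer (or your FBO) was created with MSAA.
void drawSceneWithAlphaToCoverage()
{
    glEnable(GL_MULTISAMPLE);
    glEnable(GL_SAMPLE_ALPHA_TO_COVERAGE);  // fragment alpha becomes a coverage mask

    // Draw everything here, in any order, with depth test and depth writes on;
    // fragments with low alpha simply cover fewer samples.

    glDisable(GL_SAMPLE_ALPHA_TO_COVERAGE);
}
```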
You can find a basic explanation of alpha-to-coverage, as well as some other simple transparency rendering methods, in my answer to this question: OpenGL ES2 Alpha test problems.
There are more advanced transparency rendering methods that partly rely on recent hardware features. Covering them goes beyond the scope of a post here (and also largely beyond my knowledge...), but you should be able to find material by searching for "order independent transparency".
In short: What is the "preferred" way to wrap OpenGL's buffers, shaders and/or matrices required for a more high level "model" object?
I am trying to write this tiny graphics engine in C++ built on core OpenGL 3.3, and I would like to implement as clean a solution as possible for wrapping a higher-level "model" object, which would contain its vertex buffer, global position/rotation, textures (and maybe a shader?) and potentially other information.
I have looked into this open source engine, called GamePlay3D and don't quite agree with many aspects of its solution to this problem. Is there any good resource that discusses this topic for modern OpenGL? Or is there some simple and clean way to do this?
That depends a lot on what you want to be able to do with your engine. Also note that these concepts are the same in DirectX (or any other graphics API), so don't focus your search too much on OpenGL. Here are a few concepts that are very common in a 3D engine (names can differ):
Mesh:
A mesh contains submeshes, each submesh contains a vertex buffer and an index buffer. The idea being that each submesh will use a different material (for example, in the mesh of a character, there could be a submesh for the body and one for the clothes.)
Instance:
An instance (or mesh instance) references a mesh, a list of materials (one for each submesh in the mesh), and contains the "per instance" shader uniforms (world matrix etc.), usually grouped in a uniform buffer.
Material: (This part changes a lot depending on the complexity of the engine). A basic version would contain some textures, some render states (blend state, depth state), a shader program, and some shader uniforms that are common to all instances (for example a color, but that could also be in the instance depending on what you want to do.)
More complex versions usually separate the materials into passes (or sometimes techniques that contain passes) that contain everything in the previous paragraph. You can check the Ogre3D documentation for more info about that and to take a look at one possible implementation. There's also a very good article called Designing a Data-Driven Renderer in GPU PRO 3 that describes an even more flexible system based on the same idea (but also more complex).
Scene: (I call it a scene here, but it could really be called anything). It provides the shader parameters and textures from the environment (lighting values, environment maps, this kind of thing).
And I think that's it for the basics. With that in mind, you should be able to find your way around the code of any open-source 3D engine if you want the implementation details.
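To make the structure above concrete, here is a rough C++ sketch of how those pieces might relate. The names and fields are illustrative only (and glm is assumed for the math types), not taken from any particular engine.

```cpp
#include <vector>
#include <GL/glew.h>
#include <glm/glm.hpp>   // assumed for vector/matrix types

struct SubMesh {
    GLuint  vertexBuffer = 0;
    GLuint  indexBuffer  = 0;
    GLsizei indexCount   = 0;
};

struct Mesh {
    std::vector<SubMesh> subMeshes;   // one submesh per material slot
};

struct Material {
    GLuint program = 0;               // shader program
    std::vector<GLuint> textures;
    // render states (blend/depth) plus uniforms shared by all instances
    glm::vec4 color{1.0f};
};

struct Instance {
    const Mesh* mesh = nullptr;
    std::vector<const Material*> materials;  // one per submesh
    glm::mat4 worldMatrix{1.0f};             // per-instance uniforms (a uniform buffer in practice)
};

struct Scene {
    glm::vec3 lightDirection{0.0f, -1.0f, 0.0f};
    GLuint    environmentMap = 0;
    std::vector<Instance> instances;
};
```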
This is in addition to Jerem's excellent answer.
At a low level, there is no such thing as a "model", there is only buffer data and the code used to process it. At a high level, the concept of a "model" will differ from application to application. A chess game would have a static mesh for each chess piece, with shared textures and materials, but a first-person shooter could have complicated models with multiple parts, swappable skins, hit boxes, rigging, animations, et cetera.
Case study: chess
For chess, there are six pieces and two colors. Let's over-engineer the graphics engine to show how it could be done if you needed to draw, say, thousands of simultaneous chess games on the same screen, instead of just one game. Here is how you might do it.
Store all models in one big buffer. This buffer has all of the vertex and index data for all six models clumped together. This means that you never have to switch buffers / VAOs when you're drawing pieces. Also, this buffer never changes, except when the user goes into settings and chooses a different style for the chess pieces.
Create another buffer containing the current location of each piece in the game, the color of each piece, and a reference to the model for that piece. This buffer is updated every frame.
Load the necessary textures. Maybe the normals would be in one texture, and the diffuse map would be an array texture with one layer for white and another for black. The textures are designed so you don't have to change them while you're drawing chess pieces.
To draw all the pieces, you just have to update one buffer, and then call glMultiDrawElementsIndirect()... once per frame, and it draws all of the chess pieces. If that's not available, you can fall back to glDrawElements() or something else.
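As a hedged sketch of that last step (requires GL 4.3+ for glMultiDrawElementsIndirect; the VAO, instance buffer, and shader setup are assumed to exist), the indirect buffer holds one command per piece type, and a single call draws every piece of every game:

```cpp
#include <vector>
#include <GL/glew.h>

// Layout of one entry in the indirect buffer, as defined by OpenGL.
struct DrawElementsIndirectCommand {
    GLuint count;          // index count of this piece type's mesh
    GLuint instanceCount;  // how many pieces of this type exist across all boards
    GLuint firstIndex;     // where the mesh's indices start in the shared index buffer
    GLint  baseVertex;     // where its vertices start in the shared vertex buffer
    GLuint baseInstance;   // offset into the per-piece instance buffer
};

void drawAllPieces(GLuint vao, GLuint indirectBuffer,
                   const std::vector<DrawElementsIndirectCommand>& commands)
{
    glBindVertexArray(vao);                                 // shared VAO for all piece meshes
    glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuffer);
    glBufferData(GL_DRAW_INDIRECT_BUFFER,
                 commands.size() * sizeof(DrawElementsIndirectCommand),
                 commands.data(), GL_DYNAMIC_DRAW);

    // One call draws every piece of every type; per-piece position and color come
    // from instanced vertex attributes, which baseInstance offsets for each command.
    glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT,
                                nullptr, (GLsizei)commands.size(), 0);
}
```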
Analysis
You can see how this kind of design won't work for everything.
What if you have to stream new models into memory, and remove old ones?
What if the models have different size textures?
What if the models are more complex, with animations or forward kinematics?
What about translucent models?
What about hit boxes and physics data?
What about different LODs?
The problem here is that your solution, and even the very concept of what a "model" is, will be very different depending on what your needs are.
A mobile application that I'm working on is expanding in scope. The client would like to have actual 3D objects in a product viewer within the app that a potential customer/dealer could zoom and rotate. I'm concerned about bringing a model into an OpenGL environment within a mobile device.
My biggest concern is complexity. I've looked at some of the engineering models for the products and some of them contain more than 360K faces! Does anyone know of any guidelines which would discuss how complex of an object OpenGL is able to handle?
Does anyone know of any guidelines which would discuss how complex of an object OpenGL is able to handle?
OpenGL is just a specification and doesn't deal with geometrical complexity. BTW: OpenGL doesn't treat geometry as coherent objects. For OpenGL it's just a bunch of loose points, lines or triangles that it throws (i.e. renders) to a framebuffer, one at a time.
Any considerations regarding performance only make sense with respect to an actual implementation. For example, a low-end GPU may be able to process as little as 500k vertices per second, while high-end GPUs process several tens of millions of vertices per second with ease.
What is the best method to store 3D models in a game?
I store the data in vectors:
a vector of triangles (each triangle stores the indices of its texcoords, vertices, and normals);
a vector of points;
a vector of normals;
a vector of texcoords.
I'm not sure what constitutes "the best method" in this case, as that's going to be situation dependent and in your question, it's somewhat open to interpretation.
If you're talking about how to rapidly render static objects, you can go a long way using Display Lists. They can be used to memoize all of the OpenGL calls once and then recall those instructions to render the object whenever it is used in your game. All of the overhead you incurred to calculate vertex locations, normals, etc. is only paid once when you build each display list. The drawback is that you won't see much of a performance gain if your models change too often.
EDIT: SurvivalMachine below mentions that display lists are deprecated. In particular, they are deprecated in OpenGL Version 3.0 and completely removed from the standard in Version 3.1. After a little research, it appears that the Vertex Buffer Object (VBO) extension is the preferred alternative, though a number of sources I found claimed that performance wasn't as good as display lists.
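For reference, a minimal sketch of the VBO route, assuming an illustrative interleaved Vertex layout: the data is uploaded once and reused every frame, which is roughly what a display list provided for static geometry.

```cpp
#include <vector>
#include <GL/glew.h>

// Illustrative interleaved layout; adjust to whatever your model format provides.
struct Vertex { float position[3]; float normal[3]; float texCoord[2]; };

GLuint createStaticVbo(const std::vector<Vertex>& vertices)
{
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER,
                 vertices.size() * sizeof(Vertex),
                 vertices.data(),
                 GL_STATIC_DRAW);   // usage hint: the data will rarely change
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    return vbo;
}
```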
I chose to import models from the .ms3d format, and while I may refactor later, I think it provided a decent foundation for the data structure of my 3D models.
The spec (in C header format) is a pretty straightforward read; I am writing my game in Java so I simply ported over each data structure: vertex, triangle, group, material, and optionally the skeletal animation elements.
But really, a model is just triplets of vertices (or triangles), each with a material, right? Start by creating those basic structures, write a draw function that takes a model for an argument and draws it, and then add on any other features you might need as you need them. Iterative design, if you will.
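In that spirit, a bare-bones sketch of "triangles plus a material, and a draw function that takes a model" might look like this (all names are illustrative, and the VAO, texture, and program are assumed to be created elsewhere):

```cpp
#include <GL/glew.h>

// Illustrative types only.
struct SimpleMaterial { GLuint program; GLuint texture; };

struct SimpleModel {
    GLuint         vao;          // vertex array describing the triangle data
    GLsizei        vertexCount;
    SimpleMaterial material;
};

void drawModel(const SimpleModel& model)
{
    glUseProgram(model.material.program);
    glBindTexture(GL_TEXTURE_2D, model.material.texture);
    glBindVertexArray(model.vao);
    glDrawArrays(GL_TRIANGLES, 0, model.vertexCount);
}
```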
I am working on a simple CAD program which uses OpenGL to handle on-screen rendering. Every shape drawn on the screen is constructed entirely out of simple line segments, so even a simple drawing ends up processing thousands of individual lines.
What is the best way to communicate changes in this collection of lines between my application and OpenGL? Is there a way to update only a certain subset of the lines in the OpenGL buffers?
I'm looking for a conceptual answer here. No need to get into the actual source code, just some recommendations on data structure and communication.
One simple approach is to use a display list (glNewList/glEndList).
The other option, which is slightly more complicated, is to use Vertex Buffer Objects (VBOs - GL_ARB_vertex_buffer_object). They have the advantage that they can be changed dynamically whereas a display list can not.
These basically batch all your data/transformations up and then execute them on the GPU (assuming you are using hardware acceleration), resulting in higher performance.
Vertex Buffer Objects are probably what you want. Once you load the original data set in, you can make modifications to existing chunks with glBufferSubData().
If you add extra line segments and overflow the size of your buffer, you'll of course have to make a new buffer, but this is no different than having to allocate a new, larger memory chunk in C when something grows.
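As a minimal sketch of that idea (names are illustrative), glBufferSubData lets you overwrite just the range of vertices belonging to the lines that changed, without re-uploading the whole buffer:

```cpp
#include <vector>
#include <GL/glew.h>

struct LineVertex { float x, y; };   // illustrative 2D line vertex

// Overwrite only the vertices that changed, starting at firstChangedVertex.
void updateLines(GLuint vbo,
                 size_t firstChangedVertex,
                 const std::vector<LineVertex>& changedVertices)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferSubData(GL_ARRAY_BUFFER,
                    firstChangedVertex * sizeof(LineVertex),      // byte offset
                    changedVertices.size() * sizeof(LineVertex),  // byte size
                    changedVertices.data());
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}
```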
EDIT: A couple of notes on display lists, and why not to use them:
In OpenGL 3.0, display lists are deprecated, so using them isn't forward-compatible past 3.0 (2.1 implementations will be around for a while, of course, so depending on your target audience this might not be a problem)
Whenever you change anything, you have to rebuild the entire display list, which defeats the entire purpose of display lists if things are changed often.
Not sure if you're already doing this, but it's worth mentioning you should try to use GL_LINE_STRIP instead of individual GL_LINES if possible to reduce the amount of vertex data being sent to the card.
My suggestion is to try using a scene graph, some kind of hierarchical data structure for the lines/curves. If you have huge models, performance will suffer with a plain list of lines. With a graph/tree structure you can easily check which items are visible and which are not by using bounding volumes. A scene graph also lets you apply transformations easily and reuse geometry.