I want to draw multiple line strips of different length.
All vertices are in one common buffer.
The order looks, for example, as follows:
v_1_1,v_1_2,v_1_3,v_2_1,v_2_2,v_3_1,.. for each vertex v_i_j where i is the index of the strip and j the index of the vertex in the strip.
Is there a possibility to use an index buffer to specify the begin and end indices for each strip in that buffer?
Or any other way to solve that problem?
In OpenGL, draw call overhead is not that high, compared to some other APIs. The issue is the overhead of state changes between draw calls. So the primary goal in terms of optimization should be to reduce the number of state changes (particularly expensive ones) that you need between different draw calls.
But draw calls aren't completely without cost, and there's no sense in throwing away free performance, so use a primitive restart index. Basically, you designate one index value (typically the maximum value for the index type; 16-bit indices would use 0xFFFF) to mean not a vertex, but the intent to restart the primitive. So in your example, the index buffer would look like this:
v_1_1, v_1_2, v_1_3, 0xFFFF, v_2_1, v_2_2, 0xFFFF, v_3_1,..
You put the restart index between the strips.
There are two forms of primitive restart: user-defined indices and fixed indices. The user-defined index version allows you to specify what index represents "restart"; the fixed index always uses the maximum index.
Even though fixed-index restart requires a higher GL version (4.3 rather than 3.1), the fixed-index version is actually more widely supported across GPU hardware and APIs. OpenGL ES, for example, doesn't have the user-defined version, and neither does Vulkan. And there's no real downside to just using the max index. So even if the implementation doesn't support fixed restart indices, you should always use the maximum index as your user-defined restart index.
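As a rough sketch of the index buffer described above, the helper below builds 16-bit indices for strips stored back to back in one vertex buffer, with 0xFFFF between strips. The function name, its parameters, and the RESTART_INDEX macro are made up for illustration; only the restart value itself comes from the answer.

```c
#include <stddef.h>
#include <stdint.h>

#define RESTART_INDEX 0xFFFFu  /* maximum value of a 16-bit index */

/*
 * Hypothetical helper: writes indices for `strip_count` strips whose
 * vertices are stored consecutively in one vertex buffer, placing
 * RESTART_INDEX between strips. Returns the number of indices written,
 * or 0 if `out` is too small or a vertex index would collide with the
 * restart value.
 */
size_t build_strip_indices(const size_t *strip_lengths, size_t strip_count,
                           uint16_t *out, size_t out_capacity)
{
    size_t written = 0;
    size_t vertex = 0;  /* running position in the shared vertex buffer */

    for (size_t s = 0; s < strip_count; ++s) {
        if (s > 0) {
            if (written >= out_capacity) return 0;
            out[written++] = (uint16_t)RESTART_INDEX;
        }
        for (size_t j = 0; j < strip_lengths[s]; ++j) {
            if (written >= out_capacity || vertex >= RESTART_INDEX) return 0;
            out[written++] = (uint16_t)vertex++;
        }
    }
    return written;
}
```

With strips of length 3, 2 and 1 this produces 0, 1, 2, 0xFFFF, 3, 4, 0xFFFF, 5. At draw time you would enable restart, either with glEnable(GL_PRIMITIVE_RESTART_FIXED_INDEX) on GL 4.3, or with glEnable(GL_PRIMITIVE_RESTART) plus glPrimitiveRestartIndex(0xFFFF) on GL 3.1, and then issue a single glDrawElements(GL_LINE_STRIP, ...) with GL_UNSIGNED_SHORT indices.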
Related
If using alpha-to-coverage without explicitly setting the samples from the shader (a hardware 4.x feature?), is the coverage mask for alpha value 'a' then guaranteed to be the bit-flip of the coverage mask for alpha value '1.f-a'?
Or in other words: if I render two objects in the same location, and the pixel alphas of the two objects sum up to 1.0, is it then guaranteed that all samples of the pixel get written to (assuming both objects fully cover the pixel)?
The reason why I ask is that I want to crossfade two objects, and during the crossfade each object should still properly depth-sort with respect to itself (without interacting with the depth values of the other object and without becoming 'see-through').
If not, how can I realize such a 'perfect' crossfade in a single render pass?
The logic for alpha-to-coverage computation is required to have the same invariance and proportionality guarantees as GL_SAMPLE_COVERAGE (which allows you to specify a floating-point coverage value applied to all fragments in a given rendering command).
However, said guarantees are not exactly specific:
It is intended that the number of 1’s in this value be proportional to the sample coverage value, with all 1’s corresponding to a value of 1.0 and all 0’s corresponding to 0.0.
Note the use of the word "intended" rather than "required". The spec is deliberately super-fuzzy on all of this.
Even the invariance is really fuzzy:
The algorithm can and probably should be different at different pixel locations. If it does differ, it should be defined relative to window, not screen, coordinates, so that rendering results are invariant with respect to window position.
Again, note the word "should". There are no actual requirements here.
So basically, the answer to all of your questions is "the OpenGL specification provides no guarantees for that".
That being said, the general thrust of your question suggests that you're trying to (ab)use multisampling to do cross-fading between two overlapping things without having to do a render-to-texture operation. That's just not going to work well, even if the standard actually guaranteed something about the alpha-to-coverage behavior.
Basically, what you're trying to do is multisample-based dithered transparency. But as with standard dithering methods, the quality is based entirely on the number of samples. A 16x multisample buffer (which is a huge amount of multisampling) would only give you 17 distinct coverage levels (0 through 16 samples covered). This would make any kind of animated fading effect not smooth at all.
And the cost of doing 16x multisampling is going to be substantially greater than the cost of doing render-to-texture cross-fading. Both in terms of rendering time and memory overhead (16x multisample buffers are gigantic).
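To make the lack of a guarantee concrete, here is a small sketch using a made-up round-to-nearest rule for turning alpha into a sample count. This rule is purely an assumption for illustration; the spec does not mandate it, and real implementations are free to do something else (including per-pixel dithering).

```c
/*
 * Hypothetical conversion rule, NOT taken from the OpenGL spec:
 * round alpha * sample_count to the nearest integer.
 * Assumes 0.0 <= alpha <= 1.0, so truncation acts as floor.
 */
int covered_samples(float alpha, int sample_count)
{
    return (int)(alpha * (float)sample_count + 0.5f);
}
```

Under this rule with 4 samples, alpha 0.375 covers 2 samples and alpha 0.625 covers 3, so the two objects together would claim 5 of 4 samples: their masks cannot be exact complements. Even when the counts do add up (0.25 and 0.75 give 1 and 3), nothing says which samples get picked, so the two masks could still overlap and leave a sample unwritten.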
If not, how can I realize such a 'perfect' crossfade in a single render pass?
You can't; not in the general case. Rasterizers accumulate values, with new pixels doing math against the accumulated value of all of the prior values. You want to have an operation do math against a specific previous operation, then combine those results and blend against the rest of the previous operations.
That's simply not the kind of math a rasterizer does.
I was looking for ways to associate attributes with arbitrary groupings of vertices. At first, instancing appeared to be the only way to accomplish this, but then I stumbled upon this question, and this answer states:
However what is possible with newer versions of OpenGL is setting the rate at which a certain vertex attribute's buffer offset advances. Effectively this means that the data for a given vertex array gets duplicated to n vertices before the buffer offset for a attribute advances. The function to set this divisor is glVertexBindingDivisor.
(emphasis mine)
Which to me seems as if the answer is claiming I can divide on the number of vertices instead of the number of instances. However, when I look at glVertexBindingDivisor's documentation and compare it to glVertexAttribDivisor's they both appear to refer to the division taking place over instances and not vertices. For example in glVertexBindingDivisor's documentation it states:
glVertexBindingDivisor and glVertexArrayBindingDivisor modify the rate at which generic vertex attributes advance when rendering multiple instances of primitives in a single draw command. If divisor is zero, the attributes using the buffer bound to bindingindex advance once per vertex. If divisor is non-zero, the attributes advance once per divisor instances of the set(s) of vertices being rendered. An attribute is referred to as instanced if the corresponding divisor value is non-zero.
(emphasis mine)
So what is the actual difference between these two functions?
OK, first a little backstory.
As of OpenGL 4.3/ARB_vertex_attrib_binding (AKA: where glVertexBindingDivisor comes from, so this is relevant), VAOs are conceptually split into two parts: an array of vertex formats that describe a single attribute's worth of data, and an array of buffer binding points which describe how to fetch arrays of data (the buffer object, the offset, the stride, and the divisor). The vertex format specifies which buffer binding point its data comes from, so that multiple attributes can get data from the same array (ie: interleaving).
When VAOs were split into these two parts, the older APIs were re-defined in terms of the new system. So if you call glVertexAttribPointer with an attribute index, this function will set the vertex format data for the format at the given index, and it will set the buffer binding state (buffer object, byte offset, etc) for the same index. Now, these are two separate arrays of VAO state data (vertex format and buffer binding); this function is simply using the same index in both arrays.
But since the vertex format and buffer bindings are separate now, glVertexAttribPointer also does the equivalent of saying that the vertex format at index index gets its data from the buffer binding at index index. This is important because that's not automatic; the whole point of vertex_attrib_binding is that a vertex format at one index can use a buffer binding from a different index. So when you're using the old API, it's resetting itself to the old behavior by linking format index to binding index.
Now, what does all that have to do with the divisor? Well, because that thing I just said is literally the only difference between them.
glVertexAttribDivisor is the old-style API for setting the divisor. It takes an attribute index, but it acts on state which is part of the buffer binding point (instancing is a per-array construct, not a per-attribute construct now). This means that the function assumes (in the new system) that the attribute at index fetches its data from the buffer binding point at index.
And what I just said is a bit of a lie. It enforces this "assumption" by directly setting the vertex format to use that buffer binding point. That is, it does the same last step as glVertexAttribPointer did.
glVertexBindingDivisor is the modern function. It is not passed an attribute index; it is passed a buffer binding index. As such, it does not change the attribute's buffer binding index.
So glVertexAttribDivisor is exactly equivalent to this:
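void glVertexAttribDivisor(GLuint index, GLuint divisor)
{
    /* Set the divisor on the buffer binding point with the same index. */
    glVertexBindingDivisor(index, divisor);
    /* Re-link attribute `index` to buffer binding point `index`. */
    glVertexAttribBinding(index, index);
}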
Obviously, glVertexBindingDivisor doesn't do that last part.
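If it helps to see the consequence of that difference, here is a toy model of the two pieces of VAO state involved. It is not real OpenGL: the ToyVao struct and all toy_* functions are invented for this sketch, and it only tracks the attribute-to-binding link and the per-binding divisor.

```c
#define TOY_MAX_ATTRIBS 16

/* Toy stand-in for the relevant VAO state (hypothetical, not a GL type). */
typedef struct {
    unsigned attrib_binding[TOY_MAX_ATTRIBS];   /* attribute -> binding point */
    unsigned binding_divisor[TOY_MAX_ATTRIBS];  /* divisor per binding point  */
} ToyVao;

/* Initial state: attribute i reads from binding i, divisor 0. */
void toy_vao_init(ToyVao *vao)
{
    for (unsigned i = 0; i < TOY_MAX_ATTRIBS; ++i) {
        vao->attrib_binding[i] = i;
        vao->binding_divisor[i] = 0;
    }
}

/* Models glVertexAttribBinding. */
void toy_vertex_attrib_binding(ToyVao *vao, unsigned attrib, unsigned binding)
{
    vao->attrib_binding[attrib] = binding;
}

/* Models glVertexBindingDivisor: touches only the binding point. */
void toy_vertex_binding_divisor(ToyVao *vao, unsigned binding, unsigned divisor)
{
    vao->binding_divisor[binding] = divisor;
}

/* Models glVertexAttribDivisor: also re-links the attribute. */
void toy_vertex_attrib_divisor(ToyVao *vao, unsigned index, unsigned divisor)
{
    toy_vertex_binding_divisor(vao, index, divisor);
    toy_vertex_attrib_binding(vao, index, index);
}

/* Divisor that actually applies to an attribute's fetches. */
unsigned toy_effective_divisor(const ToyVao *vao, unsigned attrib)
{
    return vao->binding_divisor[vao->attrib_binding[attrib]];
}
```

In this model, if attributes 0 and 1 both read from binding 3, one toy_vertex_binding_divisor(&vao, 3, 1) call makes both instanced. Calling toy_vertex_attrib_divisor(&vao, 0, 2) afterwards silently moves attribute 0 back to binding 0, which is the side effect the old-style function has in the real API.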
So what is the actual difference between these two functions?
Modern OpenGL has two different APIs for specifying vertex attribute arrays and their properties. The traditional one is glVertexAttribPointer and friends, of which glVertexAttribDivisor is also a part.
With ARB_vertex_attrib_binding (in core since GL 4.3), a new API was introduced which separates the vertex format from the pointers. It is expected that switching the data pointers is fast, while switching the vertex format can be more expensive. The new API lets you explicitly control both aspects separately, while the old API always sets both at once.
For the new API, a new layer of indirection was introduced: the buffer binding points. (See the OpenGL wiki for more details.) glVertexBindingDivisor specifies the attribute instancing divisor for such a binding point, so it is the conceptual equivalent of the glVertexAttribDivisor function for the new API.
I'm drawing a bunch of primitives (in my case lines) using the glMultiDrawArrays command. Each of these arrays (lines) has some additional attribute(s) specific to that array.
I would essentially like to pass these "array attributes" as separate uniforms specific to each array.
The two ways I can think of now are:
Draw each array (line) in separate draw calls and specify the attribute as a uniform.
Pass these attributes as vertex attributes. This would require me to store as many copies of the same value as I have vertices (I can have up to 100k in the arrays). Not an option if I do have to store them!
Is there a smarter way of doing this in OpenGL?
Say I have n number of primitives to draw.
The glMultiDrawArrays command already requires me to pass along two arrays of size n: one array (lineStartIndex) of start indices and one array (lineCount) storing how many vertices each array contains.
To me it seems as if it should be possible to specify vertex array attributes in a similar manner, e.g. an arrayAttributes vector of size n that could also be passed along with the draw call.
So instead of vertexAttribArray I'd like something like vertexArrayAttribArray ;)
Btw, I am only using one VAO and one VBO.
To me it seems as it should be possible to specify vertex array attributes in a similar manner.
Why? Vertex attributes are for things that change per-vertex; why would you expect to be able to use them for things that don't change per-vertex?
Well, you can (as I will explain), but I don't know why you think it should be possible.
The instancing and base instance feature can be (ab)used to provide a uniform-like value to a shader in a draw call.
You need to set up the attribute you want to provide "per-line" as an instanced array attribute. So you would use glVertexAttribDivisor with a value of 1 for that attribute.
Then, for each line, you would call glDrawArraysInstancedBaseInstance. Each call represents a single line, so you would provide an instance count of 1. But the base instance represents the index in the attribute array for that line's "per-line" value.
Yes, you're not using multi-draw. But OpenGL's draw call overhead is minimal; it's combining draw calls with state changes that get you. And you're not changing state between calls, so there's no problem.
However, if you're concerned about this minor overhead, and have access to multi-draw indirect functionality, you can always use that. Each line is a separate draw in the multi-draw call.
Of course, base instance requires GL 4.2 or ARB_base_instance. If you don't have access to hardware of this nature, then you're going to have to stick with the uniform variable method.
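For the multi-draw indirect variant, a minimal sketch of filling the command buffer could look like the following. The struct layout is the one the GL specification defines for glMultiDrawArraysIndirect; build_line_commands and instanced_element are hypothetical helper names, and lineStartIndex / lineCount are the arrays from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout of one indirect draw command, as defined by the GL spec. */
typedef struct {
    uint32_t count;
    uint32_t instanceCount;
    uint32_t first;
    uint32_t baseInstance;
} DrawArraysIndirectCommand;

/*
 * Hypothetical helper: one command per line. Each line is drawn as a
 * single instance, and baseInstance selects that line's entry in the
 * instanced (divisor 1) "per-line" attribute array.
 */
void build_line_commands(const int32_t *lineStartIndex,
                         const int32_t *lineCount, size_t n,
                         DrawArraysIndirectCommand *out)
{
    for (size_t i = 0; i < n; ++i) {
        out[i].count = (uint32_t)lineCount[i];
        out[i].instanceCount = 1;
        out[i].first = (uint32_t)lineStartIndex[i];
        out[i].baseInstance = (uint32_t)i;
    }
}

/*
 * Element of an instanced array that a given instance reads, following
 * the spec's rule floor(instance / divisor) + baseInstance.
 * Assumes divisor != 0.
 */
uint32_t instanced_element(uint32_t instance, uint32_t divisor,
                           uint32_t baseInstance)
{
    return instance / divisor + baseInstance;
}
```

The resulting array would be uploaded to a buffer bound to GL_DRAW_INDIRECT_BUFFER and drawn with one glMultiDrawArraysIndirect call (GL 4.3 or ARB_multi_draw_indirect). Since every draw has exactly one instance (instance 0) and a divisor of 1, line i reads element i of the per-line attribute array.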
I'm working with OpenGL and am not totally happy with the standard method of passing values PER TRIANGLE (or in my case, quads) that need to make it to the fragment shader, i.e., assign them to each vertex of the primitive and pass them through the vertex shader to presumably be unnecessarily interpolated (unless using the "flat" directive) in the fragment shader (so in other words, non-varying per fragment).
Is there some way to store a value PER triangle (or quad) that needs to be accessed in the fragment shader in such a way that you don't need redundant copies of it per vertex? If so, is this way better than the likely overhead of moving 3x (or 4x) the data on the CPU side?
I am aware of using geometry shaders to spread the values out to new vertices, but I heard geometry shaders are terribly slow on older hardware. Is this the case?
The GLSL fragment shader supports the gl_PrimitiveID input variable, which holds the index of the primitive for the currently processed fragment (starting at 0 for each draw call). This can be used as an index into some data store which holds per-primitive data.
Depending on the amount of data that you will need per primitive, and the number of primitives in total, different options are available. For a small number of primitives, you could just set up a uniform array and index into that.
For a reasonably high number of primitives, I would suggest using a texture buffer object (TBO). This is basically an ordinary buffer object, which can be accessed read-only at random locations via the texelFetch GLSL operation. Note that TBOs are not really textures, they only reuse the existing texture object interface. Internally, it is still a data fetch from a buffer object, and it is very efficient with none of the overhead of the texture pipeline.
The only issue with this approach is that you cannot easily mix different data types. You have to define a base data type for your TBO, and every fetch will get you the data in that format. If you just need some floats/vectors per primitive, this is not a problem at all. If you e.g. need some ints and some floats per primitive, you could either use different TBOs, one for each type, or with modern GLSL (>=3.30), you could use an integer type for the TBO and reinterpret the integer bits as floating point with intBitsToFloat(), so you can get around that limitation, too.
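On the CPU side, the integer-TBO trick could be sketched like this: floats are stored by their bit pattern in a buffer of 32-bit integers, so that the shader can recover them with intBitsToFloat(). The function names and the two-value record layout are assumptions made for the example, and the code assumes float is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Bit pattern of a float as a 32-bit integer (assumes 32-bit IEEE 754). */
int32_t float_to_int_bits(float value)
{
    int32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* well-defined type punning */
    return bits;
}

/* Inverse of float_to_int_bits; mirrors GLSL's intBitsToFloat(). */
float int_bits_to_float(int32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/*
 * Hypothetical per-primitive record: one integer ID followed by one
 * float, both stored as 32-bit integers in the buffer.
 */
void pack_primitive_record(int32_t id, float weight, int32_t out[2])
{
    out[0] = id;
    out[1] = float_to_int_bits(weight);
}
```

The packed array would back a texture buffer with an integer format such as GL_R32I. In the fragment shader, primitive p would then read texel 2 * p for the ID and apply intBitsToFloat() to texel 2 * p + 1 for the float, with p taken from gl_PrimitiveID.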
You can use one element in the vertex array for rendering multiple vertices. It's called instanced vertex attributes.
I am having a hard time matching up the OpenGL specification (version 3.1, page 27) with common example usage all over the internet.
The OpenGL spec version 3.1 states for DrawElements:
The command
void DrawElements(enum mode, sizei count, enum type, void *indices);
constructs a sequence of geometric primitives by successively transferring the count elements whose indices are stored in the currently bound element array buffer (see section 2.9.5) at the offset defined by indices to the GL. The i-th element transferred by DrawElements will be taken from element indices[i] of each enabled array.
I tend to interpret this as follows:
The indices parameter holds at least count values of type type. Its elements serve as offsets into the actual element buffer. Since an element buffer must be bound for every use of DrawElements, we actually have 2 obligatory sets of indices here: one in the element buffer and another in the indices array.
This would seem somewhat wasteful in most situations, unless one has to draw a model which is defined with an element array buffer but needs to sort its elements back to front due to transparency or so. But how would we then render with the plain element array buffer (no sorting)?
Now, strangely enough, most examples and tutorials on the internet (here, here half a page down under 'Indexed drawing') give a single integer as the indices parameter, mostly 0, sometimes (void*)0. It is always only a single integer offset, clearly no array for the indices parameter!
I have used the last variant (giving a single integer cast to a pointer for indices) successfully with some NVIDIA graphics, but I get crashes on Intel onboard chips. And I am wondering who is wrong: me, the spec, or thousands of examples. What are the correct parameters and usage of DrawElements? If the single integer is allowed, how does this fit with the spec?
You're tripping over the legacy that glDrawElements has carried ever since OpenGL 1.1. Back then there were no VBOs, just client-side arrays, and the program would actually pass a pointer (an array, in C terms) of indices into the arrays set with the gl…Pointer functions.
Now with index buffers, the parameter is actually just an offset into the server side buffer. You might be very interested in this SO Question: What is the result of NULL + int?
Also I gave an exhaustive answer there, I strongly recommend reading https://stackoverflow.com/a/8284829/524368
What I wrote about function signatures and typecasts also applies to glDraw… calls.
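As a small sketch of the "offset passed as a pointer" convention, assuming a buffer is bound to GL_ELEMENT_ARRAY_BUFFER, the last argument can be computed like this. The helper name index_buffer_offset is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: turns "start at the first_index-th index" into
 * the byte offset that glDrawElements expects, dressed up as a pointer.
 * The result is never dereferenced by the application.
 */
const void *index_buffer_offset(size_t first_index, size_t index_size)
{
    return (const void *)(uintptr_t)(first_index * index_size);
}
```

A call such as glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_SHORT, index_buffer_offset(6, sizeof(GLushort))) would then start reading at byte 12 of the bound element array buffer, and index_buffer_offset(0, ...) corresponds to the familiar (void*)0.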