OpenGL Batching: Why does my draw call exceed array buffer bounds?

I'm trying to implement some relatively simple 2D sprite batching in OpenGL ES 2.0 using vertex buffer objects. However, my geometry is not drawing correctly, and some error I can't seem to locate is causing the GL ES analyzer in Instruments to report:
Draw Call Exceeded Array Buffer Bounds
A draw call accessed a vertex outside the range of an array buffer in use. This is a serious error, and may result in a crash.
I've tested my drawing with the same vertex layout by drawing single quads at a time instead of batching and it draws as expected.
// This technique doesn't necessarily result in correct layering,
// but for this game it is unlikely that the same texture will
// need to be drawn both in front of and behind other images.
while (!renderQueue.empty())
{
    vector<GLfloat> batchVertices;
    GLuint texture = renderQueue.front()->textureName;

    // find all the draw descriptors with the same texture as the first
    // item in the vector and batch them together, back to front
    for (int i = 0; i < renderQueue.size(); i++)
    {
        if (renderQueue[i]->textureName == texture)
        {
            for (int vertIndex = 0; vertIndex < 24; vertIndex++)
            {
                batchVertices.push_back(renderQueue[i]->vertexData[vertIndex]);
            }
            // Remove the item as it has been added to the batch to be drawn
            renderQueue.erase(renderQueue.begin() + i);
            i--;
        }
    }

    int elements = batchVertices.size();
    GLfloat *batchVertArray = new GLfloat[elements];
    memcpy(batchVertArray, &batchVertices[0], elements * sizeof(GLfloat));

    // Draw the batch
    bindTexture(texture);
    glBufferData(GL_ARRAY_BUFFER, elements, batchVertArray, GL_STREAM_DRAW);
    prepareToDraw();
    glDrawArrays(GL_TRIANGLES, 0, elements / BufferStride);

    delete [] batchVertArray;
}
Other info of plausible relevance: renderQueue is a vector of DrawDescriptors. BufferStride is 4, as my vertex buffer format is interleaved position2, texcoord2: X,Y,U,V...
Thank you.

glBufferData expects its second argument to be the size of the data in bytes. The correct way to copy your vertex data to the GPU would therefore be:
glBufferData(GL_ARRAY_BUFFER, elements * sizeof(GLfloat), batchVertArray, GL_STREAM_DRAW);
Also make sure that the correct vertex buffer is bound when calling glBufferData.
On a performance note, allocating a temporary array is absolutely unnecessary here. Just use the vector directly:
glBufferData(GL_ARRAY_BUFFER, batchVertices.size() * sizeof(GLfloat), &batchVertices[0], GL_STREAM_DRAW);
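For completeness, here is a minimal sketch of the corrected upload-and-draw sequence, assuming a single VBO created once at startup (the name batchVBO is mine, not from the question):

// One-time setup: create the batch VBO (batchVBO is a hypothetical name)
GLuint batchVBO;
glGenBuffers(1, &batchVBO);

// Per batch: bind the buffer, upload the data in bytes, then draw
glBindBuffer(GL_ARRAY_BUFFER, batchVBO);
glBufferData(GL_ARRAY_BUFFER,
             batchVertices.size() * sizeof(GLfloat), // size in BYTES
             batchVertices.data(),
             GL_STREAM_DRAW);
prepareToDraw(); // assumed to set up attribute pointers against the bound VBO
glDrawArrays(GL_TRIANGLES, 0, batchVertices.size() / BufferStride);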

Related

Inconsistent behavior in instanced rendering with glDrawElementsInstanced, sometimes no rendering and no errors

I've been working on a project using OpenGL. Particles are rendered using instanced draw calls.
The issue is that sometimes glDrawElementsInstanced will not render anything, and no errors are reported. Other models and effects render fine, but no particles in
my particle system will render. The draw call looks something like:
ec(glBindVertexArray(vao));
ec(glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo));
ec(glDrawElementsInstanced(GL_TRIANGLES, triangleElementIndices.size(), GL_UNSIGNED_INT, reinterpret_cast<void*>(0), instanceCount));
ec is a macro used to error-check OpenGL. It effectively does this:
while (GLenum error = glGetError()) {
    std::cerr << "OpenGLError:" << std::hex << error << " " << functionName << " " << file << " " << line << std::endl;
}
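For reference, a complete macro along those lines might look like this; it is a sketch under my own assumptions (stringizing the call with #call instead of a functionName variable), not the poster's exact code:

#include <iostream>

// Runs the GL call, then drains the GL error queue, reporting each error
// together with the stringized call, the file, and the line.
#define ec(call) \
    do { \
        call; \
        while (GLenum error = glGetError()) { \
            std::cerr << "OpenGLError:" << std::hex << error << std::dec \
                      << " " << #call << " " << __FILE__ << " " << __LINE__ << std::endl; \
        } \
    } while (0)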
The issue rendering particles is more prevalent in Release mode than in Debug mode, but it occurs in both: roughly 8 out of 10 runs in Release and 1 out of 10 in Debug.
Below is the rendering process for particles:
For each instanced draw call:
- bind a shared vertex buffer object (VBO)
- put data into that vertex buffer object (VBO)
- iterate over many vertex array objects (VAOs), associate the VBO with them, and set up vertex attributes
- render each VAO
All of the objects share the same VBO, but they are rendered sequentially. The entire application is currently single-threaded, so that shouldn't be an issue.
A given frame for particle effects A (two VAOs) and B (one VAO) would look like:
-buffer A's data into vertex buffer named VBO
-bind A_vao1
-set up A's instance vertex attributes
-bind A_vao2
-set up A's instance vertex attributes
-render A_vao1
-render A_vao2
-buffer B's data into vertex buffer named VBO (no glGenBuffers, this is the same buffer)
-bind B_vao1
-set up B's instance vertex attributes
-render B_vao1
Is there an obvious problem with that approach?
The source below has been simplified, but I left most of the relevant parts. Unlike the outline above, it actually uses two shared vertex buffer objects (VBOs): one for mat4s and one for vec4s.
GLuint instanceMat4VBO = ... //valid created vertex buffer object
GLuint instanceVec4VBO = ... //valid created vertex buffer object

//iterate over all the instances; data is stored in class EffectInstanceData
for (EffectInstanceData& eid : instancedEffectsData)
{
    if (eid.numInstancesThisFrame > 0)
    {
        // ---- BUFFER data ---- before binding it to all VAOs (models may have multiple meshes, each with their own VAO)
        ec(glBindBuffer(GL_ARRAY_BUFFER, instanceMat4VBO)); //BUFFER MAT4 INSTANCE DATA
        ec(glBufferData(GL_ARRAY_BUFFER, sizeof(glm::mat4) * eid.mat4Data.size(), &eid.mat4Data[0], GL_STATIC_DRAW));
        ec(glBindBuffer(GL_ARRAY_BUFFER, instanceVec4VBO)); //BUFFER VEC4 INSTANCE DATA
        ec(glBufferData(GL_ARRAY_BUFFER, sizeof(glm::vec4) * eid.vec4Data.size(), &eid.vec4Data[0], GL_STATIC_DRAW));

        //meshes may have multiple VAOs that need rendering; set up buffers with instance data for each VAO before instanced rendering is done
        for (GLuint effectVAO : eid.effectData->mesh->getVAOs())
        {
            ec(glBindVertexArray(effectVAO));
            { //set up mat4 buffer
                ec(glBindBuffer(GL_ARRAY_BUFFER, instanceMat4VBO));
                GLsizei numVec4AttribsInBuffer = 4 * eid.numMat4PerInstance;
                size_t packagedVec4Idx_matbuffer = 0;
                //pass built-in data into instanced array vertex attribute
                {
                    //mat4 (these take 4 separate vec4s)
                    {
                        //model matrix
                        ec(glEnableVertexAttribArray(8));
                        ec(glEnableVertexAttribArray(9));
                        ec(glEnableVertexAttribArray(10));
                        ec(glEnableVertexAttribArray(11));
                        ec(glVertexAttribPointer(8, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));
                        ec(glVertexAttribPointer(9, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));
                        ec(glVertexAttribPointer(10, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));
                        ec(glVertexAttribPointer(11, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));
                        ec(glVertexAttribDivisor(8, 1));
                        ec(glVertexAttribDivisor(9, 1));
                        ec(glVertexAttribDivisor(10, 1));
                        ec(glVertexAttribDivisor(11, 1));
                    }
                }
            }
            { //set up vec4 buffer
                ec(glBindBuffer(GL_ARRAY_BUFFER, instanceVec4VBO));
                GLsizei numVec4AttribsInBuffer = eid.numVec4PerInstance;
                size_t packagedVec4Idx_v4buffer = 0;
                {
                    //package built-in vec4s
                    ec(glEnableVertexAttribArray(7));
                    ec(glVertexAttribPointer(7, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_v4buffer++ * sizeof(glm::vec4))));
                    ec(glVertexAttribDivisor(7, 1));
                }
            }
        }

        //activate shader
        ... code setting uniforms on shaders, does not appear to be the issue ...

        //instanced render
        for (GLuint vao : eid.effectData->mesh->getVAOs()) //this actually results in function calls to mesh class instances, but effectively does this loop
        {
            ec(glBindVertexArray(vao));
            ec(glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo));
            ec(glDrawElementsInstanced(GL_TRIANGLES, triangleElementIndices.size(), GL_UNSIGNED_INT, reinterpret_cast<void*>(0), instanceCount));
        }

        //clear data for next frame
        eid.clearFrameData();
    }
}
ec(glBindVertexArray(0)); //unbind VAOs
ec(glBindVertexArray(0));//unbind VAO's
Is any of this visibly wrong? I've debugged with RenderDoc, and when the issue is not present, the draw call shows up in the event browser (screenshot omitted). But when the issue does happen, the draw call does not appear in RenderDoc at all (screenshot omitted).
This seems very strange to me. I've verified with the debugger that the draw call is being executed, but it seems to silently fail.
I've tried debugging with NVIDIA Nsight, but I cannot reproduce the issue when the application is launched through Nsight.
I've verified:
- the instance VBO's size doesn't change or grow too large; it is stable
- uniforms are correctly receiving their values
- VAO binding appears to happen in the correct order
System specs: Windows 10; OpenGL 3.3; 8 GB memory; i7-8700K; NVIDIA GeForce GTX TITAN X.
I also observed the issue on my laptop, which has an Intel graphics chip, with roughly the same reproduction rates.
GitHub link to the actual source: if anyone tries to compile it, let me know; you need to replace the hidden .suo with the copy I made to automatically fill out the linker settings. Function: ParticleSystem::handlePostRender
It turns out this isn't an issue with instancing; I implemented a non-instanced version and had the same issue. The real problem is in my rendering system: the buffer swap and the particle rendering listen to the same delegate (event), and occasionally the buffer swap comes first when the event broadcasts. So the ordering was:
- clear screen
- render scene
- swap buffers
- render particles
- clear screen
- render scene
- swap buffers
- render particles
So, the particles were never visible because they were immediately cleared at what was supposed to be the start of the next frame.
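In other words, the fix is to make the frame ordering explicit instead of relying on the broadcast order of a shared delegate. A sketch of the intended loop (function names are illustrative):

while (running)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    renderScene();     // opaque geometry first
    renderParticles(); // must happen BEFORE the swap, not after it
    swapBuffers();     // present the finished frame
}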

OpenGL draws weird lines on top of polygons

Let me introduce you to Fishtank: an aquarium simulator I am writing in OpenGL to learn the API before moving on to Vulkan.
I have drawn many fish (screenshot: Aquarium). Then I added the grid functionality (screenshot: Grid). But when I let it run for some time, these lines appear on top of the polygons (screenshot: Weird Lines).
I've seen the advice somewhere to clear the depth buffer, which I did, but that doesn't resolve the problem.
Here's the code of the function:
void Game::drawGrid()
{
    std::vector<glm::vec2> gridVertices;
    for (unsigned int x = 1; x < mGameMap.mColCount; x += 1) //Include the last one as the drawn line is the left side of the column
    {
        gridVertices.push_back(glm::vec2(transformToNDC(mWindow, x*mGameMap.mCellSize, mGameMap.mCellSize)));
        gridVertices.push_back(glm::vec2(transformToNDC(mWindow, x*mGameMap.mCellSize, (mGameMap.mRowCount-1)*mGameMap.mCellSize)));
    }
    for (unsigned int y = 1; y < mGameMap.mRowCount; y += 1) //Same here but special info needed:
    // Normally, the origin is at the top-left corner and the y-axis points to the bottom. However, OpenGL's y-axis is reversed.
    // That's why taking into account the mRowCount-1 actually draws the very first line.
    {
        gridVertices.push_back(glm::vec2(transformToNDC(mWindow, mGameMap.mCellSize, y*mGameMap.mCellSize)));
        gridVertices.push_back(glm::vec2(transformToNDC(mWindow, (mGameMap.mColCount - 1)*mGameMap.mCellSize, y*mGameMap.mCellSize)));
    }
    mShader.setVec3("color", glm::vec3(1.0f));
    glBufferData(GL_ARRAY_BUFFER, gridVertices.size()*sizeof(glm::vec2), gridVertices.data(), GL_STATIC_DRAW);
    glVertexAttribPointer(0, 2, GL_FLOAT_VEC2, GL_FALSE, sizeof(glm::vec2), (void*)0);
    glEnableVertexAttribArray(0);
    glDrawArrays(GL_LINES, 0, gridVertices.size()*sizeof(glm::vec2));
    glClear(GL_DEPTH_BUFFER_BIT);
}
I'd like to erase those lines and understand why OpenGL does this (or maybe it's me but I don't see where).
This is the problematic line:
glDrawArrays(GL_LINES, 0, gridVertices.size()*sizeof(glm::vec2));
If you look at the documentation for this function, you will find
void glDrawArrays( GLenum mode, GLint first, GLsizei count);
count: Specifies the number of indices to be rendered
But you are passing the byte size, so you are asking OpenGL to draw many more vertices than your vertex buffer contains. The specific OpenGL implementation you are using is probably reading past the end of the grid vertex buffer and finding vertices from the fish vertex buffer to draw (but this behavior is undefined).
So, just change it to
glDrawArrays(GL_LINES, 0, gridVertices.size());
A general comment: do not create vertex buffers every time you want to draw the same thing. Create them at the beginning of the application and re-use them. You can also change their content if needed, but be careful with that, since it comes at a performance cost. Creating vertex buffers is even costlier.
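As a sketch of that pattern (mGridVBO and maxGridBytes are illustrative names, not from the question): allocate the buffer once with glBufferData, refill it with glBufferSubData only when the grid actually changes, and just bind and draw every frame:

// At startup: allocate once, with enough room for the whole grid
glGenBuffers(1, &mGridVBO);
glBindBuffer(GL_ARRAY_BUFFER, mGridVBO);
glBufferData(GL_ARRAY_BUFFER, maxGridBytes, nullptr, GL_DYNAMIC_DRAW);

// Only when the grid changes: replace the contents in place
glBindBuffer(GL_ARRAY_BUFFER, mGridVBO);
glBufferSubData(GL_ARRAY_BUFFER, 0,
                gridVertices.size() * sizeof(glm::vec2), gridVertices.data());

// Every frame: bind and draw (attribute setup omitted for brevity)
glBindBuffer(GL_ARRAY_BUFFER, mGridVBO);
glDrawArrays(GL_LINES, 0, gridVertices.size());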

glDrawElements doesn't render all the points

To start with, I'm rendering a point cloud with OpenGL.
// The object pointCloud wraps some raw data in different buffers.
// At this point, everything has been allocated, filled and enabled.
glDrawArrays(GL_POINTS, 0, pointCloud->count());
This works just fine.
However, I need to render a mesh instead of just points. To achieve that, the most obvious way seems to be using GL_TRIANGLE_STRIP and glDrawElements with the good array of indices.
So I start by transforming my current code by something that should render the exact same thing.
// Creates a set of indices of all the points, in their natural order
std::vector<GLuint> indices;
indices.resize(pointCloud->count());
for (GLuint i = 0; i < pointCloud->count(); i++)
    indices[i] = i;

// Populates the element array buffer with the indices
GLuint ebo = -1;
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, indices.size(), indices.data(), GL_STATIC_DRAW);

// Should draw the exact same thing as the previous example
glDrawElements(GL_POINTS, indices.size(), GL_UNSIGNED_INT, 0);
But it doesn't work right. It's rendering something that seems to be only the first quarter of the points.
If I mess with the indices range by making it 2 or 4 times smaller, the same points are displayed. If it's 8 times smaller, only the first half of them is.
If I fill it with only even indices, one half of the same set of points is shown.
If I start it at the half of the set, nothing is shown.
There's obviously something I'm missing about how glDrawElements behaves in comparison to glDrawArrays.
Thanks in advance for your help.
The size passed as the second argument to glBufferData() is in bytes. The posted code passes the number of indices instead, so only a quarter of the index data (indices.size() bytes hold just indices.size() / sizeof(GLuint) indices) actually makes it into the buffer, which matches the observation that only the first quarter of the points is drawn. The call needs to be:
glBufferData(GL_ELEMENT_ARRAY_BUFFER,
             indices.size() * sizeof(GLuint), indices.data(), GL_STATIC_DRAW);

Techniques for drawing spritesheets in OpenGL with shaders

I'm learning OpenGL right now and I'd like to draw some sprites to the screen from a sprite sheet. I'm not sure if I'm doing this the right way, though.
What I want to do is to build a world out of tiles à la Terraria. That means that all tiles that build my world are 1x1, but I might want things later like entities that are 2x1, 1x2, 2x2 etc.
What I do right now is that I have a class named "Tile" which contains the tile's transform matrix and a pointer to its buffer. Very simple:
Tile::Tile(glm::vec2 position, GLuint* vbo)
{
    transformMatrix = glm::translate(transformMatrix, glm::vec3(position, 0.0f));
    buffer = vbo;
}
Then when I draw the tile I just bind the buffer and update the shader's UV-coords and vertex position. After that I pass the tile's transform matrix to the shader and draw it using glDrawElements:
glEnableVertexAttribArray(positionAttrib);
glEnableVertexAttribArray(textureAttrib);
for (int i = 0; i < 5; i++)
{
    glBindBuffer(GL_ARRAY_BUFFER, *tiles[i].buffer);
    glVertexAttribPointer(positionAttrib, 2, GL_FLOAT, GL_FALSE, 4 * sizeof(GLfloat), 0);
    glVertexAttribPointer(textureAttrib, 2, GL_FLOAT, GL_FALSE, 4 * sizeof(GLfloat), (void*)(2 * sizeof(GLfloat)));
    glUniformMatrix4fv(transformMatrixLoc, 1, GL_FALSE, value_ptr(tiles[i].transformMatrix));
    glDrawElements(GL_TRIANGLE_STRIP, 4, GL_UNSIGNED_INT, 0);
}
glDisableVertexAttribArray(positionAttrib);
glDisableVertexAttribArray(textureAttrib);
Could I do this more efficiently? I was thinking that I could have one buffer for 1x1 tiles, one buffer for 2x1 tiles, and so on, and then have the Tile class contain UVpos and UVsize and just send those to the shader, but I'm not sure how I'd do that.
I think what I described, with one buffer for 1x1 and one for 2x1, sounds like it would be a lot faster.
Could I do this more efficiently?
I don't think you could do it less efficiently. You are binding a whole buffer object for each quad. You are then uploading a matrix. For each quad.
The way tilemap drawing (and only the map, not the entities) normally works is that you build a buffer object that contains some portion of the visible screen's tiles. Empty space is rendered as a transparent tile. You then render all of the tiles for that region of the screen, all in one drawing call. You provide one matrix for all of the tiles in that region.
Normally, you'll have some number of such visible regions, to make it easy to update the tiles for that region when they change. Regions that go off-screen are re-used for regions that come on-screen, so you fill them with new tile data.
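A sketch of that idea, assuming 1x1 tiles and the interleaved x,y,u,v layout from the question (regionVBO, appendTileQuad, and the region accessors are illustrative names, not an existing API):

// Rebuild one vertex buffer for a whole region of tiles, then draw the
// region with a single call and a single matrix.
std::vector<GLfloat> regionVerts;
for (int ty = 0; ty < regionHeight; ++ty)
{
    for (int tx = 0; tx < regionWidth; ++tx)
    {
        const Tile& tile = region.at(tx, ty);
        // appends two triangles (6 vertices, 4 floats each) for this tile
        appendTileQuad(regionVerts, tx, ty, tile.uvPos, tile.uvSize);
    }
}
glBindBuffer(GL_ARRAY_BUFFER, regionVBO);
glBufferData(GL_ARRAY_BUFFER, regionVerts.size() * sizeof(GLfloat),
             regionVerts.data(), GL_DYNAMIC_DRAW);
glUniformMatrix4fv(transformMatrixLoc, 1, GL_FALSE, value_ptr(regionMatrix));
glDrawArrays(GL_TRIANGLES, 0, regionVerts.size() / 4); // 4 floats per vertex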

Use index as coordinate in OpenGL

I want to implement a timeseries viewer that allows a user to zoom and smoothly pan.
I've done some immediate mode opengl before, but that's now deprecated in favor of VBOs. All the examples of VBOs I can find store XYZ coordinates of each and every point.
I suspect that I need to keep all my data in VRAM in order to get a framerate during pan that can be called "smooth", but I have only Y data (the dependent variable). X is an independent variable which can be calculated from the index, and Z is constant. If I have to store X and Z then my memory requirements (both buffer size and CPU->GPU block transfer) are tripled. And I have tens of millions of data points through which the user can pan, so the memory usage will be non-trivial.
Is there some technique for either drawing a 1-D vertex array, where the index is used as the other coordinate, or storing a 1-D array (probably in a texture?) and using a shader program to generate the XYZ? I'm under the impression that I need a simple shader anyway, now that the fixed-function pipeline is gone, to implement scaling and translation, so if I could combine the generation of X and Z coordinates with the scaling/translation of Y, that would be ideal.
Is this even possible? Do you know of any sample code that does this? Or can you at least give me some pseudocode saying what GL functions to call in what order?
Thanks!
EDIT: To make sure this is clear, here's the equivalent immediate-mode code, and vertex array code:
// immediate
glBegin(GL_LINE_STRIP);
for (int i = 0; i < N; ++i)
    glVertex2f(i, y[i]);
glEnd();

// vertex array
struct { float x, y; } v[N];
for (int i = 0; i < N; ++i) {
    v[i].x = i;
    v[i].y = y[i];
}
glVertexPointer(2, GL_FLOAT, 0, v);
glDrawArrays(GL_LINE_STRIP, 0, N);
note that v[] is twice the size of y[].
That's perfectly fine for OpenGL.
Vertex Buffer Objects (VBOs) can store any information you want in any of the GL-supported formats. You can fill a VBO with just a single coordinate per vertex:
glGenBuffers( 1, &buf_id);
glBindBuffer( GL_ARRAY_BUFFER, buf_id );
glBufferData( GL_ARRAY_BUFFER, N*sizeof(float), data_ptr, GL_STATIC_DRAW );
And then bind the proper vertex attribute format for a draw:
glBindBuffer( GL_ARRAY_BUFFER, buf_id );
glEnableVertexAttribArray(0); // hard-coded here for the sake of example
glVertexAttribPointer(0, 1, GL_FLOAT, false, 0, NULL);
In order to use it you'll need a simple shader program. The vertex shader can look like:
#version 130
in float at_coord_Y;

void main() {
    float coord_X = float(gl_VertexID);
    gl_Position = vec4(coord_X, at_coord_Y, 0.0, 1.0);
}
Before linking the shader program, you should bind its at_coord_Y to the attribute index you'll use (= 0 in my code):
glBindAttribLocation(program_id,0,"at_coord_Y");
Alternatively, you can ask the program after linking for the index to which this attribute was automatically assigned and then use it:
const int attrib_pos = glGetAttribLocation(program_id,"at_coord_Y");
Good luck!
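Since you also mentioned needing scaling and translation for pan and zoom, the same vertex shader extends naturally with a pair of uniforms; a sketch (the uniform names scale and offset are mine):

#version 130
in float at_coord_Y;
uniform vec2 scale;  // zoom factors for X and Y
uniform vec2 offset; // pan offsets for X and Y

void main() {
    float coord_X = float(gl_VertexID);
    vec2 p = vec2(coord_X, at_coord_Y) * scale + offset;
    gl_Position = vec4(p, 0.0, 1.0);
}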
Would you store ten million XY coordinates in VRAM?
I would suggest computing those coordinates on the CPU and passing them to the shader pipeline as uniforms (since the coordinates are fixed relative to the panned image).
Keep it simple.