I'm currently trying to teach myself some OpenGL using some tutorials and LWJGL. So far I'm just rendering cubes.
What I've done up until now, and what works, is that for each cube I do:
glUniformMatrix4(RenderProgram.ModelMatrixID, false,
renderobject.getTransformationBuffer());
glDrawElements(GL_TRIANGLES, renderobject.Model.countIndices(),
GL_UNSIGNED_INT, renderobject.Model.indexOffset);
Since that only gives me about 50-55 FPS with about 70k cubes, I decided to try instanced rendering, like so:
glDrawElementsInstanced(GL_TRIANGLES, Model.countIndices(),
GL_UNSIGNED_INT, 0, instanceCount);
Of course I created another buffer for that beforehand, filled it with renderobject.getTransformationBuffer() of each cube, and I bind this buffer before I try to draw instanced.
I also added it to my vertex shader like so: layout(location = 12) in mat4 mModel, and I initialized the attrib pointers like so:
for (int i = 0; i < 4; i++) {
    glEnableVertexAttribArray(12 + i);
    glVertexAttribPointer(12 + i, 4, GL_FLOAT, false, Float.BYTES * 16,
            Float.BYTES * 4 * i);
    glVertexAttribDivisor(InstanceBufferID, 1);
}
I get no errors, and while I don't see anything on screen, it is rendering and I see an FPS increase of about 350%, so I think the shader isn't getting the right model matrix.
Unfortunately I can't debug variable contents within the shader :) So I'm a little bit stumped as to what I might be missing or how I could unravel this... Also, obviously, Google didn't help me much either and SO just comes up with glDrawElements not working for people.
Edit: The accepted answer was the one error that could be determined from the code provided. However, I had another error in the code, which needed fixing before finally something was visible on the screen, which I'd like to share as well: I unbound the VAO before populating the VBO with the matrix data. As soon as I pushed that unbinding after loading the data into the VBO it worked!
Edit2: Interestingly, the performance increase is even more immense now that something IS rendered. With my blank screen I got around 170 FPS for around 70k cubes. Now that it renders correctly I'm getting around 350-400 FPS for around 270k cubes! I didn't expect that.
The first argument to glVertexAttribDivisor should be the index of the vertex attribute that you want to use as an instanced array and not InstanceBufferID.
This should thus become:
for (int i = 0; i < 4; i++) {
    glEnableVertexAttribArray(12 + i);
    glVertexAttribPointer(12 + i, 4, GL_FLOAT, false, Float.BYTES * 16,
            Float.BYTES * 4 * i);
    glVertexAttribDivisor(12 + i, 1);
}
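To make the layout concrete, here is a minimal sketch in plain C++ (no OpenGL calls) of the values each of the four columns of the mat4 needs. MatrixColumnAttrib and matrixColumnAttrib are hypothetical names used only for illustration; the numbers are the ones that would be passed to glVertexAttribPointer and glVertexAttribDivisor, assuming a tightly packed buffer of 16 floats per instance as in the question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the arguments for one vec4 column of a mat4 attribute.
struct MatrixColumnAttrib {
    unsigned index;      // attribute index, used for the pointer AND the divisor call
    int components;      // floats per column
    std::size_t stride;  // bytes from one instance's matrix to the next
    std::size_t offset;  // byte offset of this column within one matrix
    unsigned divisor;    // 1 means "advance once per instance"
};

// A mat4 declared at `baseLocation` occupies locations baseLocation..baseLocation + 3.
MatrixColumnAttrib matrixColumnAttrib(unsigned baseLocation, unsigned column) {
    assert(column < 4);  // a mat4 has exactly four vec4 columns
    return MatrixColumnAttrib{
        baseLocation + column,
        4,
        sizeof(float) * 16,
        sizeof(float) * 4 * column,
        1,
    };
}
```

The point of the sketch is that the same index (12, 13, 14, 15) appears in every call for a given column; the buffer ID never does.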
Related
I'm trying to implement batch rendering for 3D objects in an engine I'm writing, and I can't manage to get the indices right.
So in a 3D Renderer class I have a Renderer3DData structure that looks like this:
static const uint MaxQuads = 20000;
static const uint MaxVertices = MaxQuads * 4;
static const uint MaxIndices = MaxQuads * 6;
uint IndicesDrawCount = 0; // Debug var
std::vector<uint> Indices;
Ref<IndexBuffer> IBuffer = nullptr;
// Other data like a VBuffer, VArray...
So the vector of Indices will store the indices to draw on each batch while the IBuffer is the Index Buffer class which handles all OpenGL operations ("Ref" is a typedef to make a shared pointer).
Then a static Renderer3DData* s_3DData; is initialized in the init function and the index buffer is initialized as follows:
uint* indices = new uint[s_3DData->MaxIndices];
s_3DData->IBuffer = IndexBuffer::Create(indices, s_3DData->MaxIndices);
It is then bound together with the Vertex Array and the Vertex Buffer. The initialization is done properly, since without batching this works.
So on each new batch the VArray gets bound and the Indices vector gets cleared and, on each mesh drawn, it gets modified like this:
uint offset = 0;
std::vector<uint> indices = mesh->m_Indices;
for (uint i = 0; i < indices.size(); i += 6)
{
    s_3DData->Indices.push_back(offset + 0 + indices[i]);
    s_3DData->Indices.push_back(offset + 1 + indices[i]);
    s_3DData->Indices.push_back(offset + 2 + indices[i]);
    s_3DData->Indices.push_back(offset + 3 + indices[i]);
    s_3DData->Indices.push_back(offset + 4 + indices[i]);
    s_3DData->Indices.push_back(offset + 5 + indices[i]);
    offset += 4;
    s_3DData->IndicesDrawCount += 6;
}
I don't know how I came up with this way of setting the index buffer; I was testing different things, and pushing only the indices, or the indices + offset, doesn't work either. Finally, on each draw, I do the following:
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, BufferID);
glBufferSubData(GL_ELEMENT_ARRAY_BUFFER, 0, s_3DData->Indices.size(), s_3DData->Indices.data());
// With the vArray bound:
glDrawElements(GL_TRIANGLES, s_3DData->IndicesDrawCount, GL_UNSIGNED_INT, nullptr);
As I mentioned, when I'm not batching, the drawing (which doesn't go through all this process) works, so the data in the mesh and the vertex/index buffers must be good. What I think is wrong is the way I fill the index buffer, since I'm not sure how to even set it up (unlike other rendering stuff).
The result is the following (should be a solid sphere):
The way that "sphere" is rendered makes me think that the indices are wrong. The objects in the center are drawn without batching, so I know that it's not the initial setup that's wrong. Does anybody see what I'm doing wrong?
I finally solved it (I'm crying, I've been at this for a long time).
So there were a couple of problems:
First: The function s_3DData->IBuffer = IndexBuffer::Create(indices, s_3DData->MaxIndices); that I posted was doing the following:
glCreateBuffers(1, &m_BufferID);
glBindBuffer(GL_ARRAY_BUFFER, m_BufferID);
glBufferData(GL_ARRAY_BUFFER, count * sizeof(uint), nullptr, GL_STATIC_DRAW);
So the first problem was that I was creating the index buffer with GL_STATIC_DRAW instead of GL_DYNAMIC_DRAW, the usage hint that fits batching since we update the buffer dynamically (it was my mistake not to post the whole function; I was pretty sleepy when I posted it).
Second: The call glBufferSubData(GL_ELEMENT_ARRAY_BUFFER, 0, s_3DData->Indices.size(), s_3DData->Indices.data()); had the wrong size parameter.
OpenGL expects the size argument of this function to be the size in bytes of the data we want to update, which is not the vector size but the vector size multiplied by sizeof(uint) (uint because the vector is a vector of uint).
Third: The final problem was the loop that modified the indices vector on each mesh draw. It was wrong because it was written from the point of view of drawing quads in 2D (I had previously been testing batching in 2D).
The correct loop is the following:
std::vector<uint> indices = mesh->m_Indices;
for (uint i = 0; i < indices.size(); ++i)
{
    s_3DData->Indices.push_back(s_3DData->IndicesCurrentOffset + indices[i]);
    ++s_3DData->IndicesDrawCount;
    ++s_3DData->RendererStats.IndicesCount; // Debug Purpose
}
s_3DData->IndicesCurrentOffset += mesh->m_MaxIndex;
So now each mesh stores the (max index + 1) that it has (for a quad with indices from 0 to 3, this would be 4).
This way, I can go through all mesh indices while updating the indices that we use to draw and then I can update the current offset value so that we properly store all the indices drawn in order.
Again, I'm not intending this to be fast or performant; I was just learning how to do this (and I did :) ).
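The second and third fixes can be sketched together in plain C++ without any OpenGL calls. IndexBatch, appendMesh and uploadSizeInBytes are hypothetical names chosen for illustration (they are not part of the engine in the question); std::uint32_t stands in for uint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the batch state kept in Renderer3DData.
struct IndexBatch {
    std::vector<std::uint32_t> indices;  // indices for the whole batch
    std::uint32_t vertexOffset = 0;      // number of vertices already in the batch
};

// Appends one mesh's indices, shifted past the vertices already batched.
// `meshVertexCount` is the mesh's (max index + 1).
void appendMesh(IndexBatch& batch,
                const std::vector<std::uint32_t>& meshIndices,
                std::uint32_t meshVertexCount) {
    for (std::uint32_t index : meshIndices) {
        assert(index < meshVertexCount);  // every index must refer to this mesh's vertices
        batch.indices.push_back(batch.vertexOffset + index);
    }
    batch.vertexOffset += meshVertexCount;
}

// Size in bytes of the index data, i.e. the `size` argument for glBufferSubData.
std::size_t uploadSizeInBytes(const IndexBatch& batch) {
    return batch.indices.size() * sizeof(std::uint32_t);
}
```

For example, appending the same quad (indices 0, 1, 2, 2, 3, 0 with 4 vertices) twice yields 0, 1, 2, 2, 3, 0, 4, 5, 6, 6, 7, 4, and the upload size is 12 * 4 = 48 bytes rather than 12.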
The result:
I am trying to draw this free airwing model from Starfox 64 in OpenGL. I converted the .fbx file to .obj in Blender and am using tinyobjloader to load it (all requirements for my university subject).
I pretty much slapped the example code (with the modern API) into my program, replaced the file name, and grabbed the attrib.vertices and attrib.normals vectors to draw the airwing.
I can view the vertices with GL_POINTS:
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, &vertices[0]);
glDrawArrays(GL_POINTS, 0, vertices.size() / 3);
glDisableClientState(GL_VERTEX_ARRAY);
Which looks correct (I ... think?):
But I'm not sure how to render a solid model. Simply replacing GL_POINTS with GL_TRIANGLES (shown) or GL_QUADS doesn't work:
I am using OpenGL 1.1 w/ GLUT (again, university). I think I just don't know what I'm doing, really. Help?
Edit: When I wrote this answer originally I had only worked with vertices and normals. I've figured out how to get materials and textures working, but don't have time to write that out at the moment. I will add that in when I have some time, but it's largely the same logic if you want to poke around the tinyobj header yourselves in the meantime. :-)
I've learned a lot about TinyOBJLoader in the last day so I hope this helps someone in the future. Credit goes to this GitHub repository which uses TinyOBJLoader very clearly and cleanly in fileloader.cpp.
To summarise what I learned studying that code:
Shapes are of type shape_t. For a single model OBJ, the size of shapes is 1. I'm assuming OBJ files can contain multiple objects but I haven't used the file format much to know.
shape_t's have a member mesh of type mesh_t. This member stores the information parsed from the face rows of the OBJ. You can figure out the number of faces your object has by checking the size of the material_ids member.
The vertex, texture coordinate and normal indices of each face are stored in the indices member of the mesh. This is of type std::vector<index_t>, a flattened vector with one index_t per face corner; each index_t holds a vertex_index, a normal_index and a texcoord_index. So for a model with triangulated faces, the first face occupies entries 0 to 2, the second face entries 3 to 5, and so on. Remember that these indices refer to a whole vertex, texture coordinate or normal, not to a single float in the attrib arrays. Personally I triangulated my model by importing it into Blender and exporting it with triangulation turned on. TinyOBJ has its own triangulation algorithm you can turn on by setting the reader_config.triangulate flag.
I've only worked with the vertices and normals so far. Here's how I access and store them to be used in OpenGL:
Convert the flat vertices and normal arrays into groups of 3, i.e. 3D vectors
for (size_t vec_start = 0; vec_start < attrib.vertices.size(); vec_start += 3) {
    vertices.emplace_back(
        attrib.vertices[vec_start],
        attrib.vertices[vec_start + 1],
        attrib.vertices[vec_start + 2]);
}
for (size_t norm_start = 0; norm_start < attrib.normals.size(); norm_start += 3) {
    normals.emplace_back(
        attrib.normals[norm_start],
        attrib.normals[norm_start + 1],
        attrib.normals[norm_start + 2]);
}
This way the index of the vertices and normals containers will correspond with the indices given by the face entries.
Loop over every face, and store the vertex and normal indices in a separate object
for (auto shape = shapes.begin(); shape != shapes.end(); ++shape) {
    const std::vector<tinyobj::index_t>& indices = shape->mesh.indices;
    const std::vector<int>& material_ids = shape->mesh.material_ids;
    for (size_t index = 0; index < material_ids.size(); ++index) {
        // stride of 3 because each triangulated face has three corners, one index_t per corner
        triangles.push_back(Triangle(
            { indices[3 * index].vertex_index, indices[3 * index + 1].vertex_index, indices[3 * index + 2].vertex_index },
            { indices[3 * index].normal_index, indices[3 * index + 1].normal_index, indices[3 * index + 2].normal_index })
        );
    }
}
Drawing is then quite easy:
glBegin(GL_TRIANGLES);
for (auto triangle = triangles.begin(); triangle != triangles.end(); ++triangle) {
    glNormal3f(normals[triangle->normals[0]].X, normals[triangle->normals[0]].Y, normals[triangle->normals[0]].Z);
    glVertex3f(vertices[triangle->vertices[0]].X, vertices[triangle->vertices[0]].Y, vertices[triangle->vertices[0]].Z);
    glNormal3f(normals[triangle->normals[1]].X, normals[triangle->normals[1]].Y, normals[triangle->normals[1]].Z);
    glVertex3f(vertices[triangle->vertices[1]].X, vertices[triangle->vertices[1]].Y, vertices[triangle->vertices[1]].Z);
    glNormal3f(normals[triangle->normals[2]].X, normals[triangle->normals[2]].Y, normals[triangle->normals[2]].Z);
    glVertex3f(vertices[triangle->vertices[2]].X, vertices[triangle->vertices[2]].Y, vertices[triangle->vertices[2]].Z);
}
glEnd();
I'm working for the first time on a 3D project (actually, I'm programming a Bullet Physics integration in a Quartz Composer plug-in), and as I tried to optimize my rendering method, I began to use glDrawElements instead of direct access to vertices via glVertex3d...
I'm very surprised by the result. I didn't check whether it is actually quicker, but I tried it on the very simple scene below. And, from my point of view, the rendering is really better in immediate mode.
The "draw elements" method keeps showing the edges of the triangles and a very ugly shadow on the cube.
I would really appreciate some information on this difference, and maybe a way to keep the quality with glDrawElements. I'm aware that it could well be a mistake of mine...
Immediate mode
DrawElements
The vertices, indices and normals are computed the same way in both methods. Here is the code for each.
Immediate mode
glBegin (GL_TRIANGLES);
int si=36;
for (int i=0;i<si;i+=3)
{
    const btVector3& v1 = verticesArray[indicesArray[i]];
    const btVector3& v2 = verticesArray[indicesArray[i+1]];
    const btVector3& v3 = verticesArray[indicesArray[i+2]];
    btVector3 normal = (v1-v3).cross(v1-v2);
    normal.normalize ();
    glNormal3f(-normal.getX(),-normal.getY(),-normal.getZ());
    glVertex3f (v1.x(), v1.y(), v1.z());
    glVertex3f (v2.x(), v2.y(), v2.z());
    glVertex3f (v3.x(), v3.y(), v3.z());
}
glEnd();
glDrawElements
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);
glNormalPointer(GL_FLOAT, sizeof(btVector3), &(normalsArray[0].getX()));
glVertexPointer(3, GL_FLOAT, sizeof(btVector3), &(verticesArray[0].getX()));
glDrawElements(GL_TRIANGLES, indicesCount, GL_UNSIGNED_BYTE, indicesArray);
glDisableClientState(GL_NORMAL_ARRAY);
glDisableClientState(GL_VERTEX_ARRAY);
Thank you.
EDIT
Here is the code for the vertices / indices / normals
GLubyte indicesArray[] = {
0,1,2,
3,2,1,
4,0,6,
6,0,2,
5,1,4,
4,1,0,
7,3,1,
7,1,5,
5,4,7,
7,4,6,
7,2,3,
7,6,2 };
btVector3 verticesArray[] = {
btVector3(halfExtent[0], halfExtent[1], halfExtent[2]),
btVector3(-halfExtent[0], halfExtent[1], halfExtent[2]),
btVector3(halfExtent[0], -halfExtent[1], halfExtent[2]),
btVector3(-halfExtent[0], -halfExtent[1], halfExtent[2]),
btVector3(halfExtent[0], halfExtent[1], -halfExtent[2]),
btVector3(-halfExtent[0], halfExtent[1], -halfExtent[2]),
btVector3(halfExtent[0], -halfExtent[1], -halfExtent[2]),
btVector3(-halfExtent[0], -halfExtent[1], -halfExtent[2])
};
indicesCount = sizeof(indicesArray);
verticesCount = sizeof(verticesArray);
btVector3 normalsArray[verticesCount];
int j = 0;
for (int i = 0; i < verticesCount * 3; i += 3)
{
    const btVector3& v1 = verticesArray[indicesArray[i]];
    const btVector3& v2 = verticesArray[indicesArray[i+1]];
    const btVector3& v3 = verticesArray[indicesArray[i+2]];
    btVector3 normal = (v1-v3).cross(v1-v2);
    normal.normalize ();
    normalsArray[j] = btVector3(-normal.getX(), -normal.getY(), -normal.getZ());
    j++;
}
You can (and will) achieve the exact same results with immediate mode and vertex array based rendering. Your images suggest that you got your normals wrong. As you did not include the code with which you create your arrays, I can only guess what might be wrong. One thing I could imagine: you are using one normal per triangle, so in the normal array, you have to repeat that normal for each vertex.
You should be aware that a vertex in the GL is not just the position (which you specify via glVertex in immediate mode), but the set of all attributes like position, normals, texcoords and so on. So if you have a mesh where an end point is part of different triangles, this is only one vertex if all attributes are shared, not just the position. In your case, the normals are per triangle, so you will need different vertices (sharing position with some other vertices, but using a different normal) per triangle.
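As a rough illustration of that last point, here is a minimal C++ sketch that un-shares the corners of an indexed triangle list so that each corner carries its triangle's face normal. Vec3, FlatMesh and expandWithFaceNormals are hypothetical names (Vec3 is a stand-in for btVector3), and the normal is computed the same way as in the question's immediate-mode loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for btVector3

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalized(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

struct FlatMesh {
    std::vector<Vec3> positions;  // one entry per triangle corner
    std::vector<Vec3> normals;    // same length as positions
};

// Un-shares the corners: every triangle gets three vertices of its own,
// each carrying that triangle's face normal.
FlatMesh expandWithFaceNormals(const std::vector<Vec3>& vertices,
                               const std::vector<unsigned char>& indices) {
    assert(indices.size() % 3 == 0);  // the input must be a triangle list
    FlatMesh out;
    for (std::size_t i = 0; i < indices.size(); i += 3) {
        const Vec3 v1 = vertices[indices[i]];
        const Vec3 v2 = vertices[indices[i + 1]];
        const Vec3 v3 = vertices[indices[i + 2]];
        // Same normal as the immediate-mode loop: -((v1 - v3) x (v1 - v2)).
        const Vec3 c = normalized(cross(sub(v1, v3), sub(v1, v2)));
        const Vec3 n = {-c.x, -c.y, -c.z};
        for (Vec3 p : {v1, v2, v3}) {
            out.positions.push_back(p);
            out.normals.push_back(n);
        }
    }
    return out;
}
```

For the cube, this turns 8 shared positions and 36 indices into 36 vertices with 36 normals. Such arrays could then be drawn without an index array (for instance with glDrawArrays), or indexed trivially with 0..35.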
"I began to use glDrawElements"
Good!
"instead of the direct access to vertices by glVertex3d..."
There's nothing "direct" about immediate mode. In fact it's as far away from the GPU as you can get (on modern GPU architectures).
"I'm very surprised by the result. I didn't check if it is actually quicker, but I tried on this very simple scene below. And, from my point of view, the rendering is really better with the direct access method."
Actually it's several orders of magnitude slower. Each and every glVertex call causes the overhead of a context switch. Also a GPU needs larger batches of data to work efficiently, so glVertex calls first fill a buffer created ad hoc.
Your immediate-mode code segment must actually be understood as follows:
glNormal3f(-normal.getX(),-normal.getY(),-normal.getZ());
glVertex3f (v1.x(), v1.y(), v1.z());
// implicit copy of the glNormal supplied above
glVertex3f (v2.x(), v2.y(), v2.z());
// implicit copy of the glNormal supplied above
glVertex3f (v3.x(), v3.y(), v3.z());
The reason is that a vertex is not just a position, but the whole combination of its attributes. When working with vertex arrays you must supply the full attribute vector to form a valid vertex.
This is a follow-up to my previous question. All of my questions were answered in my last thread, but this is a new error that I am having. When rendering in immediate mode, everything looks great.
In fact:
http://i.imgur.com/OFV6i.png
Now I am rendering with glDrawRangeElements(), and look what happens:
http://i.imgur.com/mEmH5.png
Has anyone seen something like this before? It looks as if some of the indices are simply in the middle, which makes no sense at all.
This is the function I am using to render my level.
void WLD::renderIntermediate(GLuint* textures, long curRegion, CFrustum cfrustum)
{
// Iterate through all of the regions in the PVS
for(int i = 0; i < regions[curRegion].visibility.size(); i++)
{
// Grab a visible region
int vis = regions[curRegion].visibility[i];
// Make sure it points to a mesh
if(regions[vis].meshptr == NULL)
continue;
// Make sure it is in our PVS
if(!cfrustum.BoxInFrustum(regions[vis].meshptr->minX, regions[vis].meshptr->minY, regions[vis].meshptr->minZ, regions[vis].meshptr->maxX, regions[vis].meshptr->maxY, regions[vis].meshptr->maxZ))
continue;
// Optional: Only render the region we are in (for testing)
//if(vis != curRegion)
// continue;
// Now find the ID of the zone mesh in the array
int id = regions[vis].meshptr->id;
// Figure out how many calls we will have to do to render it (different textures)
int calls = zmeshes[id].numPolyTex;
int count = 0;
// Render each call in batches
for(int j = 0; j < calls; j++)
{
// Bind the correct texture
glBindTexture(GL_TEXTURE_2D, textures[zmeshes[id].polyTexs[j].texIndex]);
// Set up rendering states
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
errorLog.writeSuccess("Drawing debug: ID: %i - Min: %i - Max: %i - Polys in this call %i - Count: %i - Location: %i", id, zmeshes[id].minmax[j].min, zmeshes[id].minmax[j].max, zmeshes[id].polyTexs[j].polyCount, zmeshes[id].polyTexs[j].polyCount * 3, count);
glVertexPointer(3, GL_FLOAT, sizeof(Vertex), &zmeshes[id].vertices[0].x);
glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), &zmeshes[id].vertices[0].u);
// Draw
glDrawRangeElements(GL_TRIANGLES, zmeshes[id].minmax[j].min, zmeshes[id].minmax[j].max, zmeshes[id].polyTexs[j].polyCount * 3, GL_UNSIGNED_SHORT, zmeshes[id].indices + count);
// End of rendering - disable states
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
// Add the number of indices rendered
count += zmeshes[id].polyTexs[j].polyCount * 3;
}
}
}
I've logged a ton of debug information indicating that my min/max values are set up correctly. At this point, I think it might be an error with the indices, so I am going to go ahead and look over/rewrite that function.
Managed to get it to work. Very thankful that someone on another forum labeled his question accurately where a third vertex was being drawn at the origin.
http://www.gamedev.net/topic/583558-gldrawelements-is-drawing-all-of-my-vertices-with-one-vertex-at-the-origin-solved/page_gopid_4867052#entry4867052
Seems I needed to change my indices datatype from int to short. Works like a charm and I am getting a 1500% increase in FPS.
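A small C++ sketch may help show why the mismatch produces stray vertices. It reinterprets the bytes of 32-bit indices as 16-bit values, which is effectively what happens when int index data is drawn with GL_UNSIGNED_SHORT. misreadAsShorts and toShortIndices are hypothetical helper names; the exact order of the halves depends on endianness, but for indices below 65536 every second 16-bit value is 0 either way.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Reads the raw bytes of 32-bit indices as if they were 16-bit indices.
std::vector<std::uint16_t> misreadAsShorts(const std::vector<std::uint32_t>& indices) {
    std::vector<std::uint16_t> seen(indices.size() * 2);
    if (!indices.empty()) {
        std::memcpy(seen.data(), indices.data(), indices.size() * sizeof(std::uint32_t));
    }
    return seen;
}

// Narrows the indices so that they really match GL_UNSIGNED_SHORT.
std::vector<std::uint16_t> toShortIndices(const std::vector<std::uint32_t>& indices) {
    std::vector<std::uint16_t> out;
    out.reserve(indices.size());
    for (std::uint32_t index : indices) {
        assert(index <= 0xFFFFu);  // the index must fit in 16 bits
        out.push_back(static_cast<std::uint16_t>(index));
    }
    return out;
}
```

With indices 5, 6, 7 the misread stream contains three zeros mixed in with 5, 6 and 7, so triangles end up referencing vertex 0 where they should not. The alternative fix, keeping int indices and passing GL_UNSIGNED_INT as the type, should work as well.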
I have written a simple application in Java using JOGL which draws a 3D geometry. The camera can be rotated by dragging the mouse. The application works fine, but drawing the geometry with glBegin(GL_TRIANGLES) ... calls is too slow.
So I started to use vertex buffers. This also works fine until the number of triangles gets larger than 1000000. If that happens, the display driver suddenly crashes and my monitor goes dark. Is there a limit on how many triangles fit in the buffer? I hoped to get 1000000 triangles rendered at a reasonable frame rate.
I have no idea how to debug this problem. The nasty thing is that I have to reboot Windows after each launch, since I have no other way to get my display working again. Could anyone give me some advice?
The vertices, triangles and normals are stored in arrays float[][] m_vertices, int[][] m_triangles, float[][] m_triangleNormals.
I initialized the buffer with:
// generate a VBO pointer / handle
if (m_vboHandle <= 0) {
int[] vboHandle = new int[1];
m_gl.glGenBuffers(1, vboHandle, 0);
m_vboHandle = vboHandle[0];
}
// interleave vertex / normal data
FloatBuffer data = Buffers.newDirectFloatBuffer(m_triangles.length * 3*3*2);
for (int t=0; t<m_triangles.length; t++)
for (int j=0; j<3; j++) {
int v = m_triangles[t][j];
data.put(m_vertices[v]);
data.put(m_triangleNormals[t]);
}
data.rewind();
// transfer data to VBO
int numBytes = data.capacity() * 4;
m_gl.glBindBuffer(GL.GL_ARRAY_BUFFER, m_vboHandle);
m_gl.glBufferData(GL.GL_ARRAY_BUFFER, numBytes, data, GL.GL_STATIC_DRAW);
m_gl.glBindBuffer(GL.GL_ARRAY_BUFFER, 0);
Then, the scene gets rendered with:
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, m_vboHandle);
gl.glEnableClientState(GL2.GL_VERTEX_ARRAY);
gl.glEnableClientState(GL2.GL_NORMAL_ARRAY);
gl.glVertexPointer(3, GL.GL_FLOAT, 6*4, 0);
gl.glNormalPointer(GL.GL_FLOAT, 6*4, 3*4);
gl.glDrawArrays(GL.GL_TRIANGLES, 0, 3*m_triangles.length);
gl.glDisableClientState(GL2.GL_VERTEX_ARRAY);
gl.glDisableClientState(GL2.GL_NORMAL_ARRAY);
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, 0);
Try calling glGetError right after glBufferData (glBufferData itself does not return a value). It reports GL_OUT_OF_MEMORY if the implementation cannot create a data store of numBytes.
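To get a feeling for how large numBytes is, the arithmetic for the interleaved layout in the question (3 corners per triangle, 3 position floats plus 3 normal floats per corner, 4 bytes per float) can be sketched as follows. interleavedBufferBytes is a hypothetical helper name, and the sketch says nothing about how much memory a particular driver can actually provide.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for the interleaved position + normal buffer:
// 3 corners per triangle, 6 floats per corner, 4 bytes per float.
std::int64_t interleavedBufferBytes(std::int64_t triangleCount) {
    assert(triangleCount >= 0);
    const std::int64_t cornersPerTriangle = 3;
    const std::int64_t floatsPerCorner = 3 + 3;  // position + normal
    const std::int64_t bytesPerFloat = 4;
    return triangleCount * cornersPerTriangle * floatsPerCorner * bytesPerFloat;
}
```

For 1000000 triangles this gives 72000000 bytes, i.e. roughly 69 MiB in a single buffer.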