Is it legal to update a buffer bound via glBindBufferRange from the CPU? - opengl

I'm testing the following code in OpenGL.
glGenBuffers(1, &UBO);
glBindBuffer(GL_UNIFORM_BUFFER, UBO);
glBufferData(GL_UNIFORM_BUFFER, 4 * sizeof(float), nullptr, GL_STREAM_DRAW);
glBindBufferRange(GL_UNIFORM_BUFFER, 1, UBO, 0, 4 * sizeof(float));
while (1) {
    glBindBuffer(GL_UNIFORM_BUFFER, UBO);
    GLbitfield access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT;
    auto* ptr = glMapBufferRange(GL_UNIFORM_BUFFER, 0, 4 * sizeof(float), access);
    float color[4] = {
        static_cast<float>(rand()) / static_cast<float>(RAND_MAX),
        static_cast<float>(rand()) / static_cast<float>(RAND_MAX),
        static_cast<float>(rand()) / static_cast<float>(RAND_MAX),
        1.0f
    };
    memcpy(ptr, &color[0], 4 * sizeof(float));
    glUnmapBuffer(GL_UNIFORM_BUFFER);
    glBindBuffer(GL_UNIFORM_BUFFER, 0);
    // <bind VAO>
    // <bind shader>
    // <submit draw>
}
// clean up stuff
I'm running this on NVIDIA & Windows and it's blazing fast (faster than per-shader uniform updates with glUniform4f(0, color[0], color[1], color[2], color[3]) or persistent UBO mapping).
However, I'm concerned that I'm updating, from the CPU, the contents of a buffer that was bound to an indexed binding point with glBindBufferRange() at the very beginning of my program. Perhaps my code should look like this instead:
glGenBuffers(1, &UBO);
glBindBuffer(GL_UNIFORM_BUFFER, UBO);
glBufferData(GL_UNIFORM_BUFFER, 4 * sizeof(float), nullptr, GL_STREAM_DRAW);
//glBindBufferRange(GL_UNIFORM_BUFFER, 1, UBO, 0, 4 * sizeof(float)); // no more one-time bindings, perhaps it's not legal???
while (1) {
    glBindBufferRange(GL_UNIFORM_BUFFER, 1, 0, 0, 4 * sizeof(float)); // clean up binding
    glBindBuffer(GL_UNIFORM_BUFFER, UBO);
    GLbitfield access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT;
    auto* ptr = glMapBufferRange(GL_UNIFORM_BUFFER, 0, 4 * sizeof(float), access);
    float color[4] = {
        static_cast<float>(rand()) / static_cast<float>(RAND_MAX),
        static_cast<float>(rand()) / static_cast<float>(RAND_MAX),
        static_cast<float>(rand()) / static_cast<float>(RAND_MAX),
        1.0f
    };
    memcpy(ptr, &color[0], 4 * sizeof(float));
    glUnmapBuffer(GL_UNIFORM_BUFFER);
    glBindBuffer(GL_UNIFORM_BUFFER, 0);
    glBindBufferRange(GL_UNIFORM_BUFFER, 1, UBO, 0, 4 * sizeof(float)); // re-establish binding again
    // <bind VAO>
    // <bind shader>
    // <submit draw>
}
// clean up stuff
I know NVIDIA's OpenGL drivers are very lenient toward programmer mistakes and tend to forgive a lot of things, so I wonder whether, per the OpenGL spec, it's legal to update the contents (on the CPU, as in my example) of a UBO that is bound to a certain index with glBindBufferRange().

Broadly speaking, OpenGL does not allow one function to undo state set by another function unless manipulating that particular piece of state is the goal of that function. The contents of GPU-accessible memory and the binding of that memory for a purpose are different state. If you have bound a buffer range to a location in the context for a purpose, the only thing that can affect that binding is attempting to bind to that same location in the context for that purpose (or deleting the buffer).
Now the buffer itself may become temporarily unusable, such as the period when the bound buffer is (non-persistently) mapped. But that doesn't cause the buffer to become unbound from the context; you simply cannot do anything to use it during that period.
So there would be no point in re-binding the same range of the same buffer simply because you modified its contents. Even the fact that you're invalidating the buffer in its entirety is something an implementation must deal with.
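To illustrate what this guarantees, here is a minimal sketch reusing the UBO from the question (glGetIntegeri_v with GL_UNIFORM_BUFFER_BINDING queries which buffer is attached to an indexed binding point):
// Sketch: mapping and rewriting the buffer does not disturb the indexed binding.
glBindBufferRange(GL_UNIFORM_BUFFER, 1, UBO, 0, 4 * sizeof(float));

glBindBuffer(GL_UNIFORM_BUFFER, UBO);
void* ptr = glMapBufferRange(GL_UNIFORM_BUFFER, 0, 4 * sizeof(float),
                             GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT);
// ... write new contents ...
glUnmapBuffer(GL_UNIFORM_BUFFER);
glBindBuffer(GL_UNIFORM_BUFFER, 0);

GLint bound = 0;
glGetIntegeri_v(GL_UNIFORM_BUFFER_BINDING, 1, &bound);
// bound still names UBO: the binding made by glBindBufferRange is intact,
// so nothing needs to be re-bound before the next draw.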

Related

Why is OpenGL immediate mode faster than core?

I am using the following library to render text in OpenGL: fontstash. I have another header file which adds support for OpenGL 3.0+. The question is: why is the core-profile render implementation much slower than the immediate-mode one?
Here is the render code with immediate mode:
static void glfons__renderDraw(void* userPtr, const float* verts, const float* tcoords, const unsigned int* colors, int nverts)
{
    GLFONScontext* gl = (GLFONScontext*)userPtr;
    if (gl->tex == 0) return;
    glBindTexture(GL_TEXTURE_2D, gl->tex);
    glEnable(GL_TEXTURE_2D);
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
    glVertexPointer(2, GL_FLOAT, sizeof(float)*2, verts);
    glTexCoordPointer(2, GL_FLOAT, sizeof(float)*2, tcoords);
    glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(unsigned int), colors);
    glDrawArrays(GL_TRIANGLES, 0, nverts);
    glDisable(GL_TEXTURE_2D);
    glDisableClientState(GL_VERTEX_ARRAY);
    glDisableClientState(GL_TEXTURE_COORD_ARRAY);
    glDisableClientState(GL_COLOR_ARRAY);
}
Here is the core profile render code:
static void gl3fons__renderDraw(void* userPtr, const float* verts, const float* tcoords, const unsigned int* colors, int nverts)
{
    GLFONScontext* gl = (GLFONScontext*)userPtr;
    if (gl->tex == 0) return;
    if (gl->shader == 0) return;
    if (gl->vao == 0) return;
    if (gl->vbo == 0) return;
    // init shader
    glUseProgram(gl->shader);
    // init texture
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, gl->tex);
    glUniform1i(gl->texture_uniform, 0);
    // init our projection matrix
    glUniformMatrix4fv(gl->projMat_uniform, 1, false, gl->projMat);
    // bind our vao
    glBindVertexArray(gl->vao);
    // setup our buffer
    glBindBuffer(GL_ARRAY_BUFFER, gl->vbo);
    glBufferData(GL_ARRAY_BUFFER, (2 * sizeof(float) * 2 * nverts) + (sizeof(int) * nverts), NULL, GL_DYNAMIC_DRAW);
    glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(float) * 2 * nverts, verts);
    glBufferSubData(GL_ARRAY_BUFFER, sizeof(float) * 2 * nverts, sizeof(float) * 2 * nverts, tcoords);
    glBufferSubData(GL_ARRAY_BUFFER, 2 * sizeof(float) * 2 * nverts, sizeof(int) * nverts, colors);
    // setup our attributes
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, sizeof(float) * 2, (void *) (sizeof(float) * 2 * nverts));
    glEnableVertexAttribArray(2);
    glVertexAttribPointer(2, 4, GL_UNSIGNED_BYTE, GL_TRUE, sizeof(int), (void *) (2 * sizeof(float) * 2 * nverts));
    glDrawArrays(GL_TRIANGLES, 0, nverts);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindVertexArray(0);
    glUseProgram(0);
}
I made a small test for each implementation and the results show that immediate mode is significantly faster than core.
Both tests fill the screen with AAA... and I log the time it took to do this for each frame. This is the loop:
// Create GL stash for 512x512 texture, our coordinate system has zero at top-left.
struct FONScontext* fs = glfonsCreate(512, 512, FONS_ZERO_TOPLEFT);
// Add font to stash.
int fontNormal = fonsAddFont(fs, "sans", "fontstash/example/DroidSerif-Regular.ttf");
// Render some text
float dx = 10, dy = 10;
unsigned int white = glfonsRGBA(255,255,255,255);
std::chrono::high_resolution_clock::time_point t1 = std::chrono::high_resolution_clock::now();
fonsSetFont(fs, fontNormal);
fonsSetSize(fs, 20.0f);
fonsSetColor(fs, white);
for (int i = 0; i < 90; i++) {
    for (int j = 0; j < 190; j++) {
        dx += 10;
        fonsDrawText(fs, dx, dy, "A", NULL);
    }
    dy += 10;
    dx = 10;
}
std::chrono::high_resolution_clock::time_point t2 = std::chrono::high_resolution_clock::now();
std::chrono::duration<double, std::milli> time_span = t2 - t1;
std::cout<<"Time to render: "<<time_span.count()<<"ms"<<std::endl;
And the results show more than a 400 ms difference between the two:
Core profile (left) vs Immediate mode (right)
What should be changed in order to speed up performance?
I don't know exactly what gl is in this program, but it's pretty clear that, every time you want to render a piece of text, you perform the following operations:
1. Allocate storage for a buffer, reallocating whatever storage had been created the last time this was called.
2. Perform three separate uploads of data to that buffer.
These are not good ways to stream vertex data to the GPU. There are specific techniques for doing this well, but this is not one of them. In particular, the fact that you are constantly reallocating the same buffer is going to kill performance.
The most effective way to deal with this is to have a single buffer with a fixed amount of storage. It gets allocated exactly once and never again. Ideally, whatever API you're getting vertex data from would provide it in an interleaved format, so that you would only need to perform one upload rather than three. But apparently, Fontstash is not so generous.
In any case, the main idea is to avoid reallocation and synchronization. The latter means never trying to write over data that has been written to recently. So your buffer needs to be sufficiently large to hold twice the number of font vertices you ever expect to render. Essentially, you double-buffer the vertex data: writing to one set of data while the other set is being read from.
So at the beginning of the frame, you figure out what the byte offset to where you want to render will be. This will either be the start of the buffer or half-way through it. Then, for each blob of text, you write vertex data to this offset and increment the offset accordingly.
And to avoid having to change VAO state, you should interleave the vertex data manually. Instead of uploading three arrays, you should interleave the vertices so that you're effectively making one gigantic array of vertices. So you never need to call glVertexAttribPointer in the middle of this function; you just use the parameters to glDraw* to draw the part of the array you want.
This also means you only need one glBufferSubData call. But if you have access to persistent mapped buffers, you don't even need that, since you can just write to the memory directly while using the other portion of it. Though if you use persistent mapping, you will need to use a fence sync object when you switch buffer regions to make sure that you're not writing to vertex data that is still being read by the GPU.
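A rough sketch of that double-buffered, persistently mapped scheme (this assumes GL 4.4 / ARB_buffer_storage; the interleaved Vertex layout, MAX_VERTS, and the draw call are placeholders, not part of the Fontstash API):
// Hypothetical interleaved vertex: position, texcoord, packed color.
struct Vertex { float x, y, u, v; unsigned int rgba; };

const size_t MAX_VERTS   = 100000;                    // assumed capacity per half
const size_t REGION_SIZE = MAX_VERTS * sizeof(Vertex);

GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
GLbitfield flags = GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT;
glBufferStorage(GL_ARRAY_BUFFER, 2 * REGION_SIZE, nullptr, flags);   // allocated exactly once
Vertex* mapped = (Vertex*)glMapBufferRange(GL_ARRAY_BUFFER, 0, 2 * REGION_SIZE, flags);

GLsync fence[2] = { 0, 0 };
int    region   = 0;

// Per frame:
if (fence[region]) {   // make sure the GPU finished reading this half before overwriting it
    glClientWaitSync(fence[region], GL_SYNC_FLUSH_COMMANDS_BIT, (GLuint64)-1);
    glDeleteSync(fence[region]);
    fence[region] = 0;
}
Vertex* dst = mapped + region * MAX_VERTS;   // write interleaved vertices here; no glBufferSubData
// ... fill dst, then draw with glDrawArrays(GL_TRIANGLES, region * MAX_VERTS, vertCount) ...
fence[region] = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
region = 1 - region;                         // switch halves for the next frame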

Updating SSBO Contents

I am trying to do a simple calculation in a compute shader, where after one round of computation I feed the result of that round back to the shader as input for the next round.
My compute shader looks like this:
#version 440 core
layout(std430, binding = 0) buffer Result {
    float out_picture[];
};
layout(std430, binding = 1) buffer In_p1 {
    float in_p1[];
};
layout(local_size_x = 1000) in;
void main() {
    in_p1[gl_GlobalInvocationID.x] = 1.0f;
    out_picture[gl_GlobalInvocationID.x] = out_picture[gl_GlobalInvocationID.x] + in_p1[gl_GlobalInvocationID.x];
}
Here is my OpenGL code:
int main(int argc, char* argv[]) {
    std::vector<float> result_container;
    std::vector<SSBO> ssbo_container;
    SSBO ssbo_result;
    ssbo_result.NUM_PIX = 1920*1080*4;
    ssbo_result.WORK_GROUP_SIZE = 1000;
    result_container.reserve(ssbo_result.NUM_PIX * sizeof(float));
    glGenBuffers(1, &ssbo_result.handle);
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo_result.handle);
    glBufferData(GL_SHADER_STORAGE_BUFFER, ssbo_result.NUM_PIX * sizeof(float), NULL, GL_DYNAMIC_DRAW);
    for (unsigned int i = 1; i < 2; i++) {
        SSBO ssbo;
        glGenBuffers(1, &ssbo.handle);
        glBindBufferBase(GL_SHADER_STORAGE_BUFFER, i, ssbo.handle);
        glBufferData(GL_SHADER_STORAGE_BUFFER, ssbo_result.NUM_PIX * sizeof(float), NULL, GL_DYNAMIC_DRAW);
        ssbo_container.push_back(ssbo);
    }
    while (!g_win.shouldClose()) {
        std::cout << "container:" << result_container[0] << std::endl;
        glUseProgram(g_avg_program);
        glDispatchCompute(ssbo_result.NUM_PIX / ssbo_result.WORK_GROUP_SIZE, 1, 1);
        glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
        glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, 0);
        glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo_result.handle);
        GLfloat *ptr = (GLfloat *) glMapBuffer(GL_SHADER_STORAGE_BUFFER, GL_READ_ONLY);
        memcpy(result_container.data(), ptr, ssbo_result.NUM_PIX * sizeof(float));
        glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);
        glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo_result.handle);
        ptr = (GLfloat *) glMapBuffer(GL_SHADER_STORAGE_BUFFER, GL_WRITE_ONLY);
        memcpy(ptr, result_container.data(), ssbo_result.NUM_PIX * sizeof(float));
        glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);
        g_win.update();
        if (glfwGetKey(g_win.getGLFWwindow(), GLFW_KEY_ESCAPE) == GLFW_PRESS) {
            g_win.stop();
        }
    }
    return 0;
}
After initializing the SSBOs and calling glDispatchCompute() and glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT), I read out the result and save it in std::vector<float> result_container. This fills the container with 1.0f as expected for the first run, ssbo_result.handle being my out_picture SSBO.
I then try feeding the result back in as an input into out_picture the same way I read the data out of it, by just flipping the memcpy destination and source; as I understand it, I should then have my previous result in the out_picture GPU memory.
As you can see, I am trying to add a value (here 1.0f) to out_picture and feed it back as an input into out_picture for the next computation. The idea is to just use each result as the input for the next run.
I tried doing the same with glCopyBufferSubData, with the same outcome: only 1.0 as output every time instead of 1, 2, 3, ...
In the while loop, buffer object 0 is bound to binding point 0 after the compute shader has been launched:
while (!g_win.shouldClose()) {
    ....
    glDispatchCompute(ssbo_result.NUM_PIX / ssbo_result.WORK_GROUP_SIZE, 1, 1);
    glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, 0);
    ....
}
but ssbo_result.handle is never bound to binding point 0 again. As a result, the computation only works in the first iteration of the loop, because the shader storage binding is broken for all subsequent iterations.
It is completely superfluous to break the buffer binding, since the binding point is not used for anything else.
Delete the line glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, 0); in the loop to solve the issue. (Of course, binding the buffer at the beginning of every loop iteration would do the job, too: add glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo_result.handle); before glDispatchCompute.)
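A minimal sketch of the corrected loop (keeping the question's names g_win, g_avg_program, and ssbo_result, and assuming the rest of the setup above is unchanged):
while (!g_win.shouldClose()) {
    // Keep (or re-establish) the indexed binding; do not unbind it after the dispatch.
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo_result.handle);

    glUseProgram(g_avg_program);
    glDispatchCompute(ssbo_result.NUM_PIX / ssbo_result.WORK_GROUP_SIZE, 1, 1);
    glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);

    // The buffer already holds the accumulated result, so there is no need to
    // copy it back in before the next dispatch; just read it out.
    glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo_result.handle);
    GLfloat* ptr = (GLfloat*)glMapBuffer(GL_SHADER_STORAGE_BUFFER, GL_READ_ONLY);
    std::cout << "out_picture[0] = " << ptr[0] << std::endl; // 1, 2, 3, ... per frame
    glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);

    g_win.update();
}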

OpenGL Array Buffer from a vector of pointers

Problem
I have a vector of pointers to particles:
struct Particle {
    vec3 pos; // just 3 floats, GLM vec3 struct
    // ...
};
std::vector<Particle *> particles;
I want to use this vector as the source of data for an array buffer in OpenGL.
Like this:
glGenBuffers(1, &particleBuffer);
glBindBuffer(GL_ARRAY_BUFFER, particleBuffer);
int bufferSize = sizeof(Particle) * particles.size();
glBufferData(GL_ARRAY_BUFFER, bufferSize, /* What goes here? */, GL_STATIC_DRAW);
glEnableVertexAttribArray(positionAttribLocation);
glVertexAttribPointer(positionAttribLocation, 3, GL_FLOAT, GL_FALSE, sizeof(Particle), (void *)0);
Where the interesting line is glBufferData( ... )
How do I get OpenGL to get that the data is pointers?
How do I get OpenGL to get that the data is pointers?
You don't.
The whole point of buffer objects is that the data lives in GPU-accessible memory. A pointer is an address, and a pointer to a CPU-accessible object is a pointer to a CPU address, and therefore not a pointer to GPU-accessible memory.
Furthermore, accessing indirect data structures like that is incredibly inefficient. Having to do two pointer indirections just to access a single value basically destroys all chance of cache coherency on memory accesses. And without that, every independent particle is a cache miss.
That's bad. Which is why OpenGL doesn't let you do that. Or at least, it doesn't let you do it directly.
The correct way to do this is to work with a flat vector<Particle>, and move them around as needed.
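A minimal sketch of that approach, assuming the Particle struct from the question held in a flat std::vector<Particle> (and a standard-layout Particle, so offsetof applies):
std::vector<Particle> particles; // flat, contiguous storage
// ... fill particles ...

GLuint particleBuffer;
glGenBuffers(1, &particleBuffer);
glBindBuffer(GL_ARRAY_BUFFER, particleBuffer);
// .data() points at one contiguous block of Particle structs, which is exactly
// the kind of pointer glBufferData expects.
glBufferData(GL_ARRAY_BUFFER, particles.size() * sizeof(Particle),
             particles.data(), GL_STATIC_DRAW);

glEnableVertexAttribArray(positionAttribLocation);
glVertexAttribPointer(positionAttribLocation, 3, GL_FLOAT, GL_FALSE,
                      sizeof(Particle), (void*)offsetof(Particle, pos));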
glBufferData requires a pointer to an array of the data you wish to use. In your case, a GLfloat[] would be used for the vertices. You could write a function which creates a GLfloat[] from the vector of particles. The code I use creates a GLfloat[] and then passes it as a pointer to a constructor which then sets the buffer data. Here is my code:
Creating the vertex array - GLfloat[]
GLfloat vertices[] = { 0, 0, 0,
                       0, 3, 0,
                       8, 3, 0,
                       8, 0, 0 };
After I have created the vertices, I then create a buffer object (which just creates a new buffer and sets its data):
Buffer* vbo = new Buffer(vertices, 4 * 3, 3);
The constructor for my buffer object is:
Buffer::Buffer(GLfloat* data, GLsizei count, GLuint componentCount) {
    m_componentCount = componentCount;
    glGenBuffers(1, &m_bufferID);
    glBindBuffer(GL_ARRAY_BUFFER, m_bufferID);
    glBufferData(GL_ARRAY_BUFFER, count * sizeof(GLfloat), data, GL_STATIC_DRAW); // set the buffer data to the GLfloat pointer which points to the array of vertices created earlier
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}
After this array has been passed to the buffer, you can delete the array to save memory; however, it is recommended to hold onto it in case it is reused later.
For more information and better OpenGL practices, I recommend you check out the following YouTube playlist by TheChernoProject: https://www.youtube.com/watch?v=qTGMXcFLk2E&list=PLlrATfBNZ98fqE45g3jZA_hLGUrD4bo6_&index=12
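As a rough illustration of the flattening step mentioned at the start of this answer (a sketch only; flattenPositions is a hypothetical helper that copies just the pos member of each pointed-to Particle):
// Hypothetical helper: copy the positions out of the pointer vector into one
// contiguous float array that glBufferData can consume.
std::vector<GLfloat> flattenPositions(const std::vector<Particle*>& particles) {
    std::vector<GLfloat> out;
    out.reserve(particles.size() * 3);
    for (const Particle* p : particles) {
        out.push_back(p->pos.x);
        out.push_back(p->pos.y);
        out.push_back(p->pos.z);
    }
    return out;
}

// Usage sketch:
std::vector<GLfloat> flat = flattenPositions(particles);
glBufferData(GL_ARRAY_BUFFER, flat.size() * sizeof(GLfloat), flat.data(), GL_STATIC_DRAW);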

Use of glDrawElements I Don't Understand

I am studying the source code of an open source project and they have a use of the function glDrawElements which I don't understand. While I am a programmer, I am quite new to the GL API, so I would appreciate it if someone could tell me how this works.
Let's start with the drawing part. The code looks like this:
for (int i = 0; i < numObjs; i++) {
    glDrawElements(GL_TRIANGLES, vboIndexSize(i), GL_UNSIGNED_INT, (void*)(UPTR)vboIndexOffset(i));
}
vboIndexSize(i) returns the number of indices for the current object, and vboIndexOffset returns the offset in bytes, in a flat memory array in which the vertex data AND the indices of the objects are stored.
The part I don't understand is the (void*)(UPTR)vboIndexOffset(i). I have looked at the code many times; the function vboIndexOffset returns an int32, and UPTR also casts the returned value to an int32. So how can you cast an int32 to a void* and expect this to work? But let's assume I made a mistake there and that it actually returns a pointer to this variable instead. The 4th argument of the glDrawElements call is an offset in bytes within a memory block. Here is how the data is actually stored on the GPU:
int ofs = m_vertices.getSize();
for (int i = 0; i < numObj; i++)
{
    obj[i].ofsInVBO = ofs;
    obj[i].sizeInVBO = obj[i].indices->getSize() * 3;
    ofs += obj[i].indices->getNumBytes();
}
vbo.resizeDiscard(ofs);
memcpy(vbo.getMutablePtr(), vertices.getPtr(), vertices.getSize());
for (int i = 0; i < numObj; i++)
{
    memcpy(
        m_vbo.getMutablePtr(obj[i].ofsInVBO),
        obj[i].indices->getPtr(),
        obj[i].indices->getNumBytes());
}
So all they do is calculate the number of bytes needed to store the vertex data, then add to it the number of bytes needed to store the indices of all the objects we want to draw. Then they allocate memory of that size and copy the data into it: first the vertex data, and then the indices. Once this is done, they push it to the GPU using:
glGenBuffers(1, &glBuffer);
glBindBuffer(GL_ARRAY_BUFFER, glBuffer);
checkSize(size, sizeof(GLsizeiptr) * 8 - 1, "glBufferData");
glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr)size, data, GL_STATIC_DRAW);
What's interesting is that they store everything in the GL_ARRAY_BUFFER. They never store the vertex data in a GL_ARRAY_BUFFER and then the indices using a GL_ELEMENT_ARRAY_BUFFER.
But to go back to the code where the drawing is done, they first do the usual stuff to declare the vertex attributes. For each attribute:
glBindBuffer(GL_ARRAY_BUFFER, glBuffer);
glEnableVertexAttribArray(loc);
glVertexAttribPointer(loc, size, type, GL_FALSE, stride, pointer);
This makes sense and is just standard. And then the code I already mentioned:
for (int i = 0; i < numObjs; i++) {
    glDrawElements(GL_TRIANGLES, vboIndexSize(i), GL_UNSIGNED_INT, (void*)(UPTR)vboIndexOffset(i));
}
So the question: even if (UPTR) actually returns a pointer to the variable (the code doesn't indicate this, but I may be mistaken, it's a large project), I didn't know it was possible to store all the vertex and index data in the same memory block using GL_ARRAY_BUFFER and then use glDrawElements with the 4th argument being the byte offset, within this memory block, to the first element of the index list for the current object. I thought you needed to use GL_ARRAY_BUFFER and GL_ELEMENT_ARRAY_BUFFER to declare the vertex data and the indices separately. I didn't think you could declare all the data in one go using GL_ARRAY_BUFFER, and I can't get it to work on my side anyway.
Has anyone seen this working before? I haven't gotten it to work yet, and I wonder if someone could tell me whether there's something specific I need to be aware of to make it work. I tested with a simple triangle with position, normal, and texture coordinate data, so I have 8 * 3 floats for the vertex data and an array of 3 integers for the indices (0, 1, 2). I then copy everything into a memory block, initialize the buffer with glBufferData, then try to draw the triangle with:
int n = 96; // offset in bytes into the memory block, first int in the index list
glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_INT, (void*)(&n));
It doesn't crash but I can't see the triangle.
EDIT:
Adding the code that doesn't seem to work for me (crashes).
float vertices[] = {
    0, 1, 0,   // Vertex 1 (X, Y)
    2, -1, 0,  // Vertex 2 (X, Y)
    -1, -1, 0, // Vertex 3 (X, Y)
    3, 1, 0,
};
U8 *ptr = (U8*)malloc(4 * 3 * sizeof(float) + 6 * sizeof(unsigned int));
memcpy(ptr, vertices, 4 * 3 * sizeof(float));
unsigned int indices[6] = { 0, 1, 2, 0, 3, 1 };
memcpy(ptr + 4 * 3 * sizeof(float), indices, 6 * sizeof(unsigned int));
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, 4 * 3 * sizeof(float) + 6 * sizeof(unsigned int), ptr, GL_STATIC_DRAW);
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, NULL);
glEnableVertexAttribArray(0);
free(ptr);
Then when it comes to draw:
glBindVertexArray(vao);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
// see stackoverflow.com/questions/8283714/what-is-the-result-of-null-int/
typedef void (*TFPTR_DrawElements)(GLenum, GLsizei, GLenum, uintptr_t);
TFPTR_DrawElements myGlDrawElements = (TFPTR_DrawElements)glDrawElements;
myGlDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, uintptr_t(4 * 3 * sizeof(float)));
This crashes the app.
see answer below for solution
This is due to OpenGL re-using fixed-function pipeline calls. When you bind a GL_ARRAY_BUFFER VBO, a subsequent call to glVertexAttribPointer expects an offset into the VBO (in bytes), which is then cast to a (void *). The GL_ARRAY_BUFFER binding remains in effect until another buffer is bound, just as the GL_ELEMENT_ARRAY_BUFFER binding remains in effect until another 'index' buffer is bound.
You can encapsulate the buffer binding and attribute pointer (offset) states using a Vertex Array Object.
The address in your example isn't valid: you are passing the address of n rather than its value. Cast offsets with: (void *) n
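For illustration, a sketch of that cast in context (96 is just the byte offset from the question's layout, and this assumes the index data is reachable through the bound element array binding):
// The last argument is a byte offset into the bound index storage, smuggled
// through the legacy void* parameter -- not a CPU address.
GLsizeiptr indexOffset = 96; // 8 * 3 * sizeof(float), i.e. right after the vertex data
glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_INT, (void*)(uintptr_t)indexOffset);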
Thanks for the answers. After doing some research on the web, though, I think that:
First, you should be using glGenVertexArrays. It seems that this is THE standard now for OpenGL 4.x, so rather than calling glVertexAttribPointer just before drawing the geometry, it seems like best practice to create a VAO when the data is pushed to the GPU buffers.
Second, I was (actually) able to combine the vertex data and the indices within the SAME buffer (a GL_ARRAY_BUFFER) and then draw the primitive using glDrawElements (see below). The standard way, though, is to push the vertex data to a GL_ARRAY_BUFFER and the indices to a GL_ELEMENT_ARRAY_BUFFER separately. So if that's the standard way of doing it, it's probably better not to try to be too smart and just use these functions.
Example:
glGenBuffers(1, &vbo);
// push the data using GL_ARRAY_BUFFER
glGenBuffers(1, &vio);
// push the indices using GL_ELEMENT_ARRAY_BUFFER
...
glGenVertexArrays(1, &vao);
// do calls to glVertexAttribPointer
...
Please correct me if I am wrong, but that seems the correct (and only) way to go.
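For reference, here is one way the skeleton above could be filled in for the question's data (4 vertices of 3 floats, 6 indices); this is a sketch of the standard separate-buffer path, not code from the project being studied:
// Create the VAO first; it records the attribute setup and the element array binding.
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);

// Vertex positions go into a GL_ARRAY_BUFFER...
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, 4 * 3 * sizeof(float), vertices, GL_STATIC_DRAW);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, NULL);
glEnableVertexAttribArray(0);

// ...and the indices into a separate GL_ELEMENT_ARRAY_BUFFER (recorded in the VAO).
glGenBuffers(1, &vio);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vio);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, 6 * sizeof(unsigned int), indices, GL_STATIC_DRAW);

glBindVertexArray(0);

// Drawing then only needs the VAO; the offset is 0 because the index buffer
// holds nothing but indices.
glBindVertexArray(vao);
glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, (void*)0);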
EDIT:
However, it is actually possible to "pack" the vertex data and the indices together into an ARRAY_BUFFER as long as a call to glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vbo) is done prior to calling glDrawElements.
Working code (compared with code in original post):
float vertices[] = {
    0, 1, 0,   // Vertex 1 (X, Y)
    2, -1, 0,  // Vertex 2 (X, Y)
    -1, -1, 0, // Vertex 3 (X, Y)
    3, 1, 0,
};
U8 *ptr = (U8*)malloc(4 * 3 * sizeof(float) + 6 * sizeof(unsigned int));
memcpy(ptr, vertices, 4 * 3 * sizeof(float));
unsigned int indices[6] = { 0, 1, 2, 0, 3, 1 };
memcpy(ptr + 4 * 3 * sizeof(float), indices, 6 * sizeof(unsigned int));
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, 4 * 3 * sizeof(float) + 6 * sizeof(unsigned int), ptr, GL_STATIC_DRAW);
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, NULL);
glEnableVertexAttribArray(0);
free(ptr);
Then when it comes to draw:
glBindVertexArray(vao);
glBindBuffer(GL_ARRAY_BUFFER, vbo); // << THIS IS ACTUALLY NOT NECESSARY
// VVVV THIS WILL MAKE IT WORK VVVV
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vbo);
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// see stackoverflow.com/questions/8283714/what-is-the-result-of-null-int/
typedef void (*TFPTR_DrawElements)(GLenum, GLsizei, GLenum, uintptr_t);
TFPTR_DrawElements myGlDrawElements = (TFPTR_DrawElements)glDrawElements;
myGlDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, uintptr_t(4 * 3 * sizeof(float)));

Mixed types in VBO?

Let's say I want to upload unsigned integer and float data to the graphics card in a single draw call. I use standard VBOs (not VAOs, I'm using OpenGL 2.0), with the various vertex attribute arrays combined into a single GL_ARRAY_BUFFER and pointed to individually using glVertexAttribPointer(...), so:
glBindBuffer(GL_ARRAY_BUFFER, vertexBufferId);
glEnableVertexAttribArray(positionAttributeId);
glEnableVertexAttribArray(myIntAttributeId);
glVertexAttribPointer(positionAttributeId, 4, GL_FLOAT, false, 0, 0);
glVertexAttribPointer(colorAttributeId, 4, GL_UNSIGNED_INT, false, 0, 128);
glClear(...);
glDraw*(...);
The problem I have here is that my buffer (referenced by vertexBufferId) has to be created as a FloatBuffer in LWJGL so that it can support the attribute of type GL_FLOAT, and this would seem to preclude the use of GL_INT here (or the other way around; it's either one or the other, since the buffer cannot be of two types).
Any ideas? How would this be handled in native C code?
This would be handled in C (in a safe way) by doing this:
GLfloat *positions = malloc(sizeof(GLfloat) * 4 * numVertices);
GLuint *colors = malloc(sizeof(GLuint) * 4 * numVertices);
//Fill in data here.
//Allocate buffer memory
glBufferData(..., (sizeof(GLfloat) + sizeof(GLuint)) * 4 * numVertices, NULL, ...);
//Upload arrays
glBufferSubData(..., 0, sizeof(GLfloat) * 4 * numVertices, positions);
glBufferSubData(..., sizeof(GLfloat) * 4 * numVertices, sizeof(GLuint) * 4 * numVertices, colors);
free(positions);
free(colors);
There are other ways of doing this as well, which involve a lot of casting and so forth. But this code emulates what you'll have to do in LWJGL.
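One rough sketch of such an alternative: pack both arrays into a single CPU-side byte block and upload it with one call (positionAttributeId and myIntAttributeId are the question's attribute locations; the float-then-int layout here is just an assumption):
// Sketch: copy both arrays into one byte block, then upload once.
size_t posBytes   = sizeof(GLfloat) * 4 * numVertices;
size_t colorBytes = sizeof(GLuint)  * 4 * numVertices;
unsigned char *block = malloc(posBytes + colorBytes);
memcpy(block, positions, posBytes);            // floats first...
memcpy(block + posBytes, colors, colorBytes);  // ...then the unsigned ints

glBufferData(GL_ARRAY_BUFFER, posBytes + colorBytes, block, GL_STATIC_DRAW);
free(block);

// The attribute pointers then use byte offsets into the same buffer:
glVertexAttribPointer(positionAttributeId, 4, GL_FLOAT, GL_FALSE, 0, (void *) 0);
glVertexAttribPointer(myIntAttributeId, 4, GL_UNSIGNED_INT, GL_FALSE, 0, (void *) posBytes);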