I'm making a voxel engine and I can render a chunk. I'm using instanced rendering, meaning that I can render the whole chunk with a single draw call. Every block of a chunk has a single int (from 0 to 4095) that defines its block type (0 for air, 1 for dirt, etc.). I want to be able to render each block by applying the right texture in my fragment shader. My chunk contains a three-dimensional array:
uint8_t blocks[16][16][16]
The problem is that I can't find a way to send my array of ints to the shader. I tried using a VBO, but that doesn't seem to make sense here (I didn't get any result). I also tried to send my array with glUniform1iv(), but that failed.
Is it possible to send an array of int to a shader with glUniformX() ?
In order to prevent storing big data, can I set a byte (uint8_t) instead of int with glUniformX() ?
Is there a good way to send that much data to my shader ?
Is instanced drawing a good way to draw the same model with different textures/types of blocks?
For all purposes and intents, data of this type should be treated like texture data. This doesn't mean literally uploading it as texture data, but rather that that's the frame of thinking you should be using when considering how to transfer it.
Or, in more basic terms: don't try to pass this data as uniform data.
If you have access to OpenGL 4.3+ (which is a reasonably safe bet for most hardware no older than 6-8 years), then Shader Storage Buffers are going to be the most concise solution:
//GLSL:
layout(std430, binding = 0) buffer terrainData
{
int data[16][16][16];
};
void main() {
int terrainType = data[voxel.x][voxel.y][voxel.z];
//Do whatever
}
//HOST:
struct terrain_data {
int data[16][16][16];
};
//....
terrain_data data = get_terrain_data();
GLuint ssbo;
GLuint binding = 0;//Should be equal to the binding specified in the shader code
glGenBuffers(1, &ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(data.data), data.data, GL_DYNAMIC_DRAW);//Usage hint is an assumption; pick the one that matches how often the chunk changes
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, binding, ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, 0);
Any point after this where you need to update the data, simply bind ssbo, call glBufferData (or your preferred method for updating buffer data), and then you're good to go.
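Note that the shader block above declares int data[16][16][16], while the chunk in the question stores uint8_t values. A minimal sketch of the widening step on the host, assuming the block ids should keep their [x][y][z] order; the names TerrainDataInt and widenBlocks are hypothetical and not part of the original answer:

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kChunkDim = 16;

// Host-side mirror of the shader block: one 32-bit int per voxel.
struct TerrainDataInt {
    std::int32_t data[kChunkDim][kChunkDim][kChunkDim];
};

// 16 * 16 * 16 voxels * 4 bytes each = 16384 bytes to upload.
static_assert(sizeof(TerrainDataInt) == 16 * 16 * 16 * 4,
              "expected a tightly packed array of 32-bit ints");

// Copy the 8-bit block ids into 32-bit ints, keeping the [x][y][z] order.
inline void widenBlocks(const std::uint8_t (&blocks)[kChunkDim][kChunkDim][kChunkDim],
                        TerrainDataInt& out) {
    for (std::size_t x = 0; x < kChunkDim; ++x)
        for (std::size_t y = 0; y < kChunkDim; ++y)
            for (std::size_t z = 0; z < kChunkDim; ++z)
                out.data[x][y][z] = blocks[x][y][z];
}
```

The resulting out.data is what would be handed to glBufferData. This costs four bytes per voxel instead of one.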
If you're limited to older hardware, you do have some options, but they quickly get clunky:
You can use Uniform Buffers, which behave very similarly to Shader Storage Buffers, but
Have limited storage space (64 KB on many implementations; the specification only guarantees 16 KB)
Have other restrictions that may or may not be relevant to your use case
You can use textures directly, where you convert the terrain data to floating point values (or use as integers, if the hardware supports integer formats internally), and then convert back inside the shader
Compatible with almost any hardware
But requires extra complexity and calculations in your shader code
I do second the approach laid out in @Xirema's answer, but come to a slightly different recommendation. Since your original data type is just uint8_t, using an SSBO or UBO directly requires either wasting 3 bytes per element or manually packing 4 elements into a single uint. From @Xirema's answer:
For all purposes and intents, data of this type should be treated like texture data. This doesn't mean literally uploading it as texture data, but rather that that's the frame of thinking you should be using when considering how to transfer it.
I totally agree with that. Hence I recommend using a Texture Buffer Object (TBO), a.k.a. "Buffer Texture".
Using glTexBuffer() you can basically re-interpret a buffer object as a texture. In your case, you can just pack the uint8_t[16][16][16] array into a buffer and interpret it as GL_R8UI "texture" format, like this:
//GLSL:
uniform usamplerBuffer terrainData;
void main() {
uint terrainType = texelFetch(terrainData, voxel.z * (16*16) + voxel.y * 16 + voxel.x).r;
//Do whatever
}
//HOST:
struct terrain_data {
uint8_t data[16][16][16];
};
//....
terrain_data data = get_terrain_data();
GLuint tbo;
GLuint tex;
glGenBuffers(1, &tbo);
glBindBuffer(GL_TEXTURE_BUFFER, tbo);
glBufferData(GL_TEXTURE_BUFFER, sizeof(terrain_data), data.data, GL_STATIC_DRAW);//Usage hint is an assumption; pick the one that matches how often the chunk changes
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_BUFFER, tex);
glTexBuffer(GL_TEXTURE_BUFFER, GL_R8UI, tbo);
Note that this will not copy the data to some texture object. Accessing the texture means directly accessing the memory of the buffer.
TBOs also have the advantage that they are available since OpenGL 3.1.
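One detail worth checking is which host array element the flat index in the shader actually addresses. The sketch below reproduces the same formula on the CPU, assuming the array is declared as in the host code above; flatIndex and blockAt are hypothetical helper names. It suggests that the shader's (x, y, z) lookup lands on data[z][y][x], so the host should fill the array in that order:

```cpp
#include <cstdint>

constexpr int kDim = 16;

// Same formula as the texelFetch index in the shader.
constexpr int flatIndex(int x, int y, int z) {
    return z * (kDim * kDim) + y * kDim + x;
}

// Read a block id from the host array the way the shader sees the buffer:
// as one flat run of 4096 bytes.
inline std::uint8_t blockAt(const std::uint8_t (&blocks)[kDim][kDim][kDim],
                            int x, int y, int z) {
    const std::uint8_t* flat = reinterpret_cast<const std::uint8_t*>(&blocks);
    return flat[flatIndex(x, y, z)];
}
```

If the array is instead filled as data[x][y][z], swapping the roles of voxel.x and voxel.z in the shader formula gives the matching index.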
I've been writing something using GL3.3 which takes a uniform buffer, and uses the information from it to select sprite tiles in a frag shader. It's working on my desktop, with a Nvidia GTX780, but my AMD based laptop (A6-4455M) has some issues with it. Both are on the latest (or very recent) drivers.
Back to the code: it first sets up a uniform buffer, which consists of two uints and a uint array. These then get filled and are accessed in the shader. At first I got a GL error on the laptop because I was not allocating enough, but a temporary change that takes padding into account has sorted that out, and now data is actually being buffered.
The first two uints are no problem. I can also read the array in the shader, more or less, but there is one problem: the data is multiplied by four! At the moment the array is just test data, with each element initialized to its index, so spriteArr[1] == 1, spriteArr[34] == 34, etc. However, when I access it in the shader, spriteArr[10] gives 40. This goes all the way up to spriteArr[143] == 572. Beyond this it's something else. I don't know exactly why this is, but it looks like an incorrect offset.
I am using the shared uniform layout, and getting the uniform offsets from GL itself, so they should be correct. I did notice that the offsets on the AMD card are much larger, as if it is adding more padding. They are always 0,4,8 on the desktop, but 0,16,32 on the laptop.
If it makes any difference, there is another UBO (binding point 0), which is used for the view and projection matrices. These work as intended. However it is not used in the fragment shader. It is also created before this UBO.
UBO initialisation code:
GLuint spriteUBO;
glGenBuffers(1, &spriteUBO);
glBindBuffer(GL_UNIFORM_BUFFER, spriteUBO);
unsigned maxsize = (2 + 576 + 24) * sizeof(GLuint);
/*Bad I know, but temporary. AMD's driver adds 24 bytes of padding. Nvidias has none.
Not the cause of this problem. At least ensures we have enough allocated. */
glBufferData(GL_UNIFORM_BUFFER, maxsize, NULL, GL_STATIC_DRAW);
glBindBuffer(GL_UNIFORM_BUFFER, 0);
//Set binding point
GLuint spriteUBOIndex = glGetUniformBlockIndex(programID, "SpriteMatchData");
glUniformBlockBinding(programID, spriteUBOIndex, 1);
static const GLchar *unames[] =
{
"width", "height",
//"size",
"spriteArr"
};
GLuint uindices[3];
GLint offsets[3];
glGetUniformIndices(programID,3,unames,uindices);
glGetActiveUniformsiv(programID, 3, uindices, GL_UNIFORM_OFFSET, offsets);
//buffer stuff
glBindBufferBase(GL_UNIFORM_BUFFER, 1, spriteUBO);
glBufferSubData(GL_UNIFORM_BUFFER,offsets[0], sizeof(GLuint), tm.getWidth());
glBufferSubData(GL_UNIFORM_BUFFER, offsets[1], sizeof(GLuint), tm.getHeight());
glBufferSubData(GL_UNIFORM_BUFFER, offsets[2], tm.getTileCount() * sizeof(GLuint), tm.getSpriteArray());
Fragment Shader:
layout (shared) uniform SpriteMatchData{
uint width, height;
uint spriteArr[576];};
Then later on I experiment with the array with something like this:
if(spriteArr[10] == uint(40))
{
debug_colour = vec4(0.0,1.0,0.0,0.0);//green
}
else
{
debug_colour = vec4(1.0,0.0,0.0,0.0); //red
}
With debug_colour turning green in this instance.
Is there any way to sort this out with something that works with both systems? Why is the AMD driver handling this so differently? Could it be a bug in the way it deals with uniform uint arrays?
Why is the AMD driver handling this so differently?
Because that's what you asked for:
layout (shared) uniform SpriteMatchData
You explicitly asked for shared layout. That layout is implementation defined. Therefore, two different implementations are allowed to give you two different layouts. As such, if you want to use SpriteMatchData in a platform-independent way, you must query its layout from the program after linking it.
While you did query the offsets for the values, you did not query the array stride: the byte offset from element to element within the array. There is nothing in the specification that requires that shared layouts tightly pack arrays.
Really though, there's pretty much no reason not to use std140 layout. You can avoid all of this querying of offsets and simply design C++ structs that can be directly consumed by GLSL.
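The factor of four reported in the question is what a 16-byte array stride would produce. Under std140, each element of a uint array is padded to 16 bytes, and the offsets of 0, 16, 32 seen on the AMD driver hint that its shared layout does something similar (an inference, not something the question confirms). The sketch below simulates this on the CPU; packWithStride and readWithStride are hypothetical helpers, not OpenGL calls:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Lay out `values` with `strideBytes` between the starts of consecutive elements.
std::vector<unsigned char> packWithStride(const std::vector<std::uint32_t>& values,
                                          std::size_t strideBytes) {
    assert(strideBytes >= sizeof(std::uint32_t));
    std::vector<unsigned char> out(values.size() * strideBytes, 0);
    for (std::size_t i = 0; i < values.size(); ++i)
        std::memcpy(out.data() + i * strideBytes, &values[i], sizeof(std::uint32_t));
    return out;
}

// What a reader that assumes `strideBytes` sees at element `index`.
std::uint32_t readWithStride(const std::vector<unsigned char>& buffer,
                             std::size_t strideBytes, std::size_t index) {
    assert(index * strideBytes + sizeof(std::uint32_t) <= buffer.size());
    std::uint32_t value = 0;
    std::memcpy(&value, buffer.data() + index * strideBytes, sizeof(std::uint32_t));
    return value;
}
```

Uploading a tightly packed array (stride 4) and reading it with stride 16 makes element 10 come from byte offset 160, which is uploaded element 40. In real code the stride can be queried with glGetActiveUniformsiv and GL_UNIFORM_ARRAY_STRIDE, or fixed at 16 by switching the block to std140 and padding the host data to match.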
I am trying to use a buffer in a compute shader like this:
layout (binding = 1, std430) writeonly buffer bl1
{
uint data[gl_WorkGroupSize.x * gl_NumWorkGroups.x * gl_NumWorkGroups.y];
};
but I get the following error (because of using gl_NumWorkGroups for the size):
Array size must be a constant integer expression
How can I work around this?
Stop putting in a length at all:
layout (binding = 1, std430) writeonly buffer bl1
{
uint data[];
};
This is a feature unique to SSBOs. And you can only have one unsized array in an SSBO, and it must be the last member in the interface block. The size of data will be computed based on the size of the buffer object range you bind to that binding point. So if you bind 32KB of buffer space, you will get 8K of items (the size of a uint is 4 bytes).
At runtime, your shader can use gl_WorkGroupSize.x * gl_NumWorkGroups.x * gl_NumWorkGroups.y to compute the length of data. Alternatively, just use data.length() to get the length of the buffer that the user gave you. Alternatively... you don't need to explicitly know the length, depending on how you use it.
As long as your OpenGL buffer binding code uses a buffer with enough memory for your dispatch count and work group size, you're fine.
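As a rough host-side sketch of that sizing rule, assuming one uint per invocation as in the original declaration; requiredBytes and elementCount are hypothetical helper names:

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed so that data[] has one uint for every index the shader can touch:
// local_size_x * num_groups_x * num_groups_y elements of 4 bytes each.
constexpr std::size_t requiredBytes(std::uint32_t localSizeX,
                                    std::uint32_t groupsX,
                                    std::uint32_t groupsY) {
    return static_cast<std::size_t>(localSizeX) * groupsX * groupsY * sizeof(std::uint32_t);
}

// Number of uint elements data.length() would report for a bound range of this size.
constexpr std::size_t elementCount(std::size_t boundBytes) {
    return boundBytes / sizeof(std::uint32_t);
}
```

The value from requiredBytes is what the buffer range bound to binding point 1 needs to cover before the dispatch.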
I'm trying to visualize a very large point cloud (700 million points), and on the glDrawArrays call the debugger reports an "access violation writing location" exception. I'm using the same code to render smaller clouds (100 million points) and everything works fine. I also have enough RAM (32 GB) to store the data.
To store point cloud I'm using std::vector<Point3D<float>> where Point3D is
template <class T>
union Point3D
{
T data[3];
struct{
T x;
T y;
T z;
};
};
Vertex array and buffer initialization:
glBindVertexArray(pxCloudHeader.uiVBA);
glBindBuffer(GL_ARRAY_BUFFER, pxCloudHeader.xVBOs.uiVBO_XYZ);
glBufferData(GL_ARRAY_BUFFER, pxCloudHeader.iPointsCount * sizeof(GLfloat) * 3, &p3DfXYZ->data[0], GL_STREAM_DRAW);
glVertexAttribPointer((GLuint)0, 3, GL_FLOAT, GL_FALSE, 0, 0);
glEnableVertexAttribArray(0);
glBindVertexArray(0);
Drawing call:
glBindVertexArray(pxCloudHeader.uiVBA);
glDrawArrays(GL_POINTS, 0, pxCloudHeader.iPointsCount); // here exception is thrown
glBindVertexArray(0);
I also checked for OpenGL errors, but I haven't found any.
I suspect your problem is due to the size of GLsizeiptr.
This is the data type used to represent sizes in OpenGL buffer objects. It is pointer-sized, so it is 32-bit in a 32-bit build (and 64-bit in a 64-bit build).
700 million vertices * 4-bytes per-component * 3-components = 8,400,000,000 bytes
There is a serious issue with trying to allocate that many bytes in GL if it is using 32-bit pointers:
8400000000 & 0xFFFFFFFF = 4,105,032,704 (half as many bytes as you actually need)
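The arithmetic above can be checked with a few lines of C++. This is only a sketch of the size calculation; the helper names are hypothetical. Since GLsizeiptr is a signed type, the last helper tests against the signed 32-bit maximum:

```cpp
#include <cstdint>

// Total bytes for a point cloud with the given number of components per point.
constexpr std::uint64_t bytesNeeded(std::uint64_t points,
                                    std::uint64_t componentsPerPoint,
                                    std::uint64_t bytesPerComponent) {
    return points * componentsPerPoint * bytesPerComponent;
}

// What survives if the size is squeezed into 32 bits.
constexpr std::uint32_t truncateTo32(std::uint64_t value) {
    return static_cast<std::uint32_t>(value & 0xFFFFFFFFull);
}

// True if the size can be represented by a signed 32-bit size type.
constexpr bool fitsInSigned32(std::uint64_t value) {
    return value <= 0x7FFFFFFFull;
}
```

With 100 million points the total is 1,200,000,000 bytes, which still fits in a signed 32-bit size, consistent with the smaller clouds rendering correctly.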
If sizeof (GLsizeiptr) on your implementation is 4 then you will have no choice but to split your array up. A 32-bit GLsizeiptr only allows you to store 4 contiguous GiB of memory, but you can work around this if you use 3 single-component arrays instead. Using a vertex shader you can reconstruct these 3 separate (small enough) arrays like so:
#version 330
layout (location = 0) in float x; // Vertex Attrib Ptr. 0
layout (location = 1) in float y; // Vertex Attrib Ptr. 1
layout (location = 2) in float z; // Vertex Attrib Ptr. 2
void main (void)
{
gl_Position = vec4 (x,y,z,1.0);
}
Performance is going to be awful, but that is one way to approach the problem with minimal effort.
By the way, the amount of system memory here (32 GiB) is not your biggest issue. You should be thinking in terms of the amount of VRAM on your GPU because ideally buffer objects are designed to be stored on the GPU. Any part of the buffer object that is too large to be stored in GPU memory will have to be transferred over the PCIe (these days) bus when it is used.
You could draw the data in smaller batches. While there is no predefined upper limit for the size of a buffer, storing 8 GBytes of data in a single buffer is a lot. I'm not very surprised that something would blow up.
I would probably start with storing something like 1 million, or at most a few million, points in each buffer. Then use a pool of buffers with this fixed size, enough to accommodate all your data points.
This might even be beneficial for your performance, because it allows you to start submitting draw calls before all of your data has been copied into buffers. This will give you better overlap between CPU and GPU work.
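A minimal sketch of the bookkeeping for such a pool, assuming fixed-size batches; Batch and splitIntoBatches are hypothetical names, and the actual buffer uploads and draw calls are left out:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One buffer's worth of points: index of the first point and how many follow.
struct Batch {
    std::size_t first;
    std::size_t count;
};

// Split `totalPoints` into consecutive batches of at most `maxPerBatch` points.
std::vector<Batch> splitIntoBatches(std::size_t totalPoints, std::size_t maxPerBatch) {
    std::vector<Batch> batches;
    if (maxPerBatch == 0) return batches;  // nothing sensible to do
    for (std::size_t first = 0; first < totalPoints; first += maxPerBatch)
        batches.push_back({first, std::min(maxPerBatch, totalPoints - first)});
    return batches;
}
```

Each batch would then get its own buffer upload starting at point `first`, followed by a glDrawArrays call with `count` vertices.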
With the amount of data you are shuffling around, you may also want to look into using glMapBuffer()/glUnmapBuffer() instead of glBufferData(). This generally avoids one copy operation for the data.
I need to access a big (~2 MB) one-dimensional buffer from a shader. However, I don't know which type of OpenGL buffer I should use. I'm going to store floats (16F) and unsigned integers (16UI). My data will be laid out like a struct:
struct{
float d; //16F
int a[7]; //or a1,a2,a3,a4,a5,a6,a7; //7x16UI
}
I read about buffer texture and other kind of buffers(Passing a list of values to fragment shader), but it only works for one type of data (float or int), not both. I could use two buffers, but I think this won't be cache friendly nor easy.
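One possible direction, offered only as a sketch: the struct is eight 16-bit fields, 16 bytes in total, which is the size of one four-component 32-bit unsigned integer texel. Both kinds of data could then live in a single integer-typed buffer and be separated with bit operations. The code below shows the host-side packing, assuming the half-float bits for d are produced elsewhere; Record, packRecord and unpackRecord are hypothetical names:

```cpp
#include <array>
#include <cstdint>

// One record: a 16-bit float (stored as raw bits) and seven 16-bit unsigned ints.
struct Record {
    std::uint16_t dHalfBits;  // assumption: already converted to IEEE half-float bits
    std::uint16_t a[7];
};

// Pack two 16-bit fields per 32-bit word, low half first.
std::array<std::uint32_t, 4> packRecord(const Record& r) {
    auto pair = [](std::uint16_t lo, std::uint16_t hi) {
        return static_cast<std::uint32_t>(lo) | (static_cast<std::uint32_t>(hi) << 16);
    };
    return {{pair(r.dHalfBits, r.a[0]), pair(r.a[1], r.a[2]),
             pair(r.a[3], r.a[4]), pair(r.a[5], r.a[6])}};
}

// Inverse of packRecord; the shader side would use the same masks and shifts.
Record unpackRecord(const std::array<std::uint32_t, 4>& w) {
    Record r{};
    r.dHalfBits = static_cast<std::uint16_t>(w[0] & 0xFFFFu);
    r.a[0] = static_cast<std::uint16_t>(w[0] >> 16);
    for (int i = 1; i < 4; ++i) {
        r.a[2 * i - 1] = static_cast<std::uint16_t>(w[i] & 0xFFFFu);
        r.a[2 * i] = static_cast<std::uint16_t>(w[i] >> 16);
    }
    return r;
}
```

On the GLSL side, the low 16 bits of the first word could be decoded with unpackHalf2x16 where that function is available; whether this layout suits the access pattern is something to measure rather than assume.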
Why are there mismatching types in OpenGL?
For example, if I have a vertex buffer object,
GLuint handle = 0;
glGenBuffers(1, &handle); // this function takes (GLsizei, GLuint*)
Now if I want to know the currently bound buffer
glGetIntegerv( GL_ARRAY_BUFFER_BINDING, reinterpret_cast<GLint *>(&handle ) ); // ouch, type mismatch
Why not have a glGetUnsignedIntegerv or
have glGenBuffers take an GLint * instead.
That is because the glGetIntegerv function is intended to return any integral type of information from OpenGL. That includes GLint values (which may be negative), and it also includes multi-component values like GL_VIEWPORT:
GLint viewport[4];
glGetIntegerv(GL_VIEWPORT, viewport);
From one point of view, it is simpler to have just one function for getting values back, instead of hundreds for each specific parameter.
From another point of view, it is of course a bit ugly to cast types.
But I have no idea why they didn't use GLint for buffer ids.
Anyway, you should avoid calling glGet... functions where you can. They are slow and often require waiting for the GPU to complete previous commands, meaning the CPU sits idle in the meantime.