I'm trying to implement shadow mapping for my simple engine, and I've figured out that I should combine omnidirectional shadow mapping (a cubemap for point lights) with 2D mapping (for directional and spot lights).
My uniform block looks like this:
#define MAX_LIGHTS 128
//...
struct Light
{
//light data...
};
//...
layout (std140) uniform Lights
{
int lightCount; //how many lights were passed into the shader (from 0 to MAX_LIGHTS)
Light lights[MAX_LIGHTS];
};
I have two questions for you.
Are sampler objects costly? Is the following code optimal for multiple lights?
uniform sampler2D shadowMaps2D[MAX_LIGHTS];
uniform samplerCube shadowCubemaps[MAX_LIGHTS];
//...
if (lights[index].type == POINT_LIGHT)
CalculateShadow(shadowCubemaps[lights[index].shadowMapNr]);
else
CalculateShadow(shadowMaps2D[lights[index].shadowMapNr]);
Only the first lightCount of these samplers would actually have a texture bound. We're left with a lot of undefined samplers, and I think that can cause problems.
If I understand correctly, I mustn't declare samplers in uniform blocks. So am I really forced to cycle through all of my shaders and update the sampler uniforms each time the shadow maps get updated? That's a waste of time!
Are sampler objects costly?
That question is a bit misleading, since the sampler data types in GLSL are only opaque handles which reference texture units. What is costly is the actual sampling operation. Also, the number of texture units available to a particular shader stage is limited; the spec only guarantees 16. Since you can't reuse the same unit for different sampler types, this would limit your MAX_LIGHTS to just 8.
However, one seldom needs arrays of samplers. Instead, you can use array textures, which will allow you to store all of your shadow maps (per texture type) in a single texture object, and you will need only one sampler for it.
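As a rough sketch of what the shader side might then look like (the names and the layer-index scheme are illustrative, not from the question; note that cubemap arrays require GL 4.0 or ARB_texture_cube_map_array):

```glsl
// One array texture per shadow-map type; each light stores its layer index.
uniform sampler2DArray shadowMaps2D;      // one layer per 2D shadow map
uniform samplerCubeArray shadowCubemaps;  // GL 4.0 / ARB_texture_cube_map_array

float SampleShadow2D(int layer, vec2 uv)
{
    // The third coordinate selects the layer within the array texture.
    return texture(shadowMaps2D, vec3(uv, float(layer))).r;
}
```

Each light would then carry its layer index (e.g. the existing shadowMapNr field) instead of indexing into an array of samplers.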
Having said all that, I still think that your light count is completely unrealistic. Applying 128 shadow maps in real time won't work even on the fastest GPUs out there...
If I understand correctly, I mustn't declare sampler in uniform blocks.
Correct.
So am I really forced to cycle through all of my shaders and update samplers each time the shadow maps get updated? It's a waste of time!
No. The sampler uniforms only need to be updated if the index of the texture unit you want to sample from changes (which is ideally never). Not when a different texture is bound, and not when some texture contents change.
I'm using a texture array to render Minecraft-style voxel terrain. It's working fantastically, but I noticed recently that GL_MAX_ARRAY_TEXTURE_LAYERS is a lot smaller than GL_MAX_TEXTURE_SIZE.
My textures are very small, 8x8, but I need to be able to support rendering from an array of hundreds to thousands of them; I just need GL_MAX_ARRAY_TEXTURE_LAYERS to be larger.
OpenGL 4.5 requires GL_MAX_ARRAY_TEXTURE_LAYERS to be at least 2048, which might suffice, but my application targets OpenGL 3.3, which only guarantees 256.
I'm drawing a blank trying to figure out a prudent workaround for this limitation; dividing up the rendering of terrain based on the maximum number of supported texture layers does not sound trivial at all to me.
I looked into whether ARB_sparse_texture could help, but GL_MAX_SPARSE_ARRAY_TEXTURE_LAYERS_ARB is the same as GL_MAX_ARRAY_TEXTURE_LAYERS; that extension is just a workaround for VRAM usage rather than layer usage.
Can I just have my GLSL shader access an array of sampler2DArray? GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS has to be at least 80, so 80 * 256 = 20480, and that would be enough layers for my purposes. So, in theory, could I do something like this?
const int MAXLAYERS = 256;
in vec3 texCoord;
out vec4 FragColor;
uniform sampler2DArray tex[];
void main()
{
    int arrayIdx = int(texCoord.z + 0.5) / MAXLAYERS;
    float arrayOffset = mod(texCoord.z, float(MAXLAYERS));
    FragColor = texture(tex[arrayIdx],
                        vec3(texCoord.x, texCoord.y, arrayOffset));
}
It would be better to ditch array textures and just use a texture atlas (or use an array texture with each layer containing lots of sub-textures, but as I will show, that's highly unnecessary). If you're using textures of such low resolution, you probably aren't using linear interpolation, so you can easily avoid bleed-over from neighboring texels. And even if you have trouble with bleed-over, it can easily be fixed by adding some space between the sub-textures.
Even if your sub-textures need to be 10x10 to avoid bleed-over, a 1024x1024 texture (the minimum size GL 3.3 requires) gives you 102x102 sub-textures, which is 10,404 textures. That ought to be plenty. And if it's not, make it an array texture with however many layers you need.
Arrays of samplers will not work for your purpose. First, you cannot declare an unsized uniform array of any kind. Well, you can, but you have to redeclare it with a size at some point in your shader, so there's not much point to the unsized declaration. The only unsized arrays you can have are in SSBOs, as the last member of the SSBO.
Second, even with a size, the index you use for arrays of opaque types must be dynamically uniform. And since you're trying to draw all of the faces of the cubes in one draw call, and each face can select a different layer, this expression's value is not going to be dynamically uniform.
Third, even if you did this with bindless texturing, you would run into the same problem: unless you're on NVIDIA hardware, the sampler you pick must be dynamically uniform, which requires the index into the array of samplers to be dynamically uniform. Which yours is not.
If I have a simple OpenGL shader that is applied to many cubes at once, then setting uniform values gets slow. I have noticed that calls like glColor3f don't slow it down as much (at least from what I have tried), so I am currently using glColor3f as a sort of hack: the shader reads gl_Color, and I use it similarly to a uniform for determining which side of the cube is being rendered, for face-independent flat lighting.
I am using display lists, so I used glColor3f because it gets baked into the list; I simply set a different color before every face while creating the list. Now I want to set more values (not in the display list this time) before rendering.
What OpenGL calls can I make that can be read in the shader? I need to send 6 ints from 0-8 into the shader before rendering, but I could probably manage to shrink that later on.
I recommend using uniform blocks, or uniforms with the datatypes ivec2, ivec3 or ivec4; with them you can write 2, 3, or 4 ints at once.
Apart from this, there are some built-in uniforms. Besides the matrices, these are gl_DepthRange, gl_Fog, gl_LightSource[gl_MaxLights], gl_LightModel, gl_FrontLightModelProduct, gl_BackLightModelProduct, gl_FrontLightProduct[gl_MaxLights], gl_BackLightProduct[gl_MaxLights], gl_FrontMaterial, gl_BackMaterial, gl_Point, gl_TextureEnvColor[gl_MaxTextureUnits], gl_ClipPlane[gl_MaxClipPlanes], gl_EyePlaneS[gl_MaxTextureCoords], gl_EyePlaneT[gl_MaxTextureCoords], gl_EyePlaneR[gl_MaxTextureCoords], gl_EyePlaneQ[gl_MaxTextureCoords], gl_ObjectPlaneS[gl_MaxTextureCoords], gl_ObjectPlaneT[gl_MaxTextureCoords], gl_ObjectPlaneR[gl_MaxTextureCoords] and gl_ObjectPlaneQ[gl_MaxTextureCoords].
But the intended way to read global data in shaders is uniform variables. Another way would be to encode the information in textures.
I'd like to add instanced rendering to my OpenGL engine, but I've just learned that the maximum number of vertex shader inputs supported by my GPU is only 16.
These are the following matrices I need to move to input:
uniform mat4 MVP;
uniform mat4 modelMatrix;
uniform mat3 normalMatrix;
uniform mat4 DepthBiasMVP;
If I understand correctly, I will need an attribute for every column of each matrix, so I'll need 4+4+3+4 = 15 attribute slots; 19 with the attributes that I already use (pos, color, texCoord, normal), and it will grow to 20+ if I add tangents and other stuff.
Is there a way to deal with this, or will I have to forget instanced drawing? Let's say I manage to get rid of one of these matrices (modelMatrix) and end up with about 15-16 attributes; will it work on different GPUs? The 16 limit is the minimum for all GPUs, right?
Note that 16 is the minimum number of vertex attributes an implementation must provide; most of the time more are available, which you can query via:
glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, &n).
Now, when trying to store all your data in instanced arrays, you should first work out which data actually needs to differ per instance. Do you have a lot of instanced objects that only differ in location and/or orientation? Then you probably only need to turn your modelMatrix uniform into an instanced array (which requires 4 vertex attributes, an acceptable cost). Do you really need a different view and projection matrix for each instance? Probably not. The same holds for DepthBiasMVP.
The normalMatrix is required if you perform non-uniform scaling, and if you plan to do that per instance you also need a normalMatrix per instance. You could calculate those on the CPU beforehand and send them as vertex attributes, costing you another 3 vertex attributes. Another option is to calculate the normalMatrix in the vertex shader, but this might slow your vertex shader down a little (perhaps an acceptable tradeoff?).
These steps should reduce the information you need per instance to just the modelMatrix and perhaps the normalMatrix, already cutting it in half. Maybe you only have a different position per instance? In that case even a simple vec4 will do.
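On the shader side, an instanced model matrix then becomes a single mat4 input occupying four consecutive attribute locations (a sketch; the locations and names are illustrative):

```glsl
layout (location = 0) in vec3 pos;
layout (location = 1) in vec3 normal;
layout (location = 2) in vec2 texCoord;
// A mat4 attribute consumes locations 3..6; on the CPU side, call
// glVertexAttribDivisor(loc, 1) on each of its four column locations.
layout (location = 3) in mat4 instanceModel;

uniform mat4 viewProjection; // shared by all instances

void main()
{
    gl_Position = viewProjection * instanceModel * vec4(pos, 1.0);
}
```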
Basically, try to think about what data you actually need to update per instance and you'll most likely be surprised as to how much data you actually need for each instance.
Alternatively, one can store the per-instance data in uniform arrays, uniform buffer objects or texture buffer objects and use the gl_InstanceID variable in GLSL to index into the data. Uniform arrays might seem the easiest, but they are the most limited in size and hence only applicable for a low number of instances. UBOs can be a bit bigger, but are also quite limited. TBOs, on the other hand, will allow you many megabytes of data, but you have to pack your data appropriately. In your case, it seems that you only need float types, so a base format with 32-bit floats should suffice.
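With a TBO, the per-instance matrix could be fetched roughly like this (a sketch, assuming four RGBA32F texels per instance, one per matrix column):

```glsl
uniform samplerBuffer instanceData; // 4 texels (columns) per instance

mat4 FetchModelMatrix(int instance)
{
    int base = instance * 4;
    return mat4(texelFetch(instanceData, base + 0),
                texelFetch(instanceData, base + 1),
                texelFetch(instanceData, base + 2),
                texelFetch(instanceData, base + 3));
}
// usage in main(): mat4 model = FetchModelMatrix(gl_InstanceID);
```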
I am starting to learn OpenGL (3.3+), and now I am trying to do an algorithm that draws 10000 points randomly in the screen.
The problem is that I don't know exactly where to run the algorithm. Since the points are random, I can't declare them in a VBO (or can I?), so I was thinking of passing a uniform value to the vertex shader with the varying position (I would loop, changing the uniform value each iteration), and doing the operation 10000 times. I would also pass a random color value to the shader.
Here is roughly my thought:
#version 330 core
uniform vec3 random_position;
uniform vec3 random_color;
out vec3 Color;
void main() {
gl_Position = vec4(random_position, 1.0);
Color = random_color;
}
This way I would do the calculations outside the shaders and just pass the results through uniforms, but I think a better way would be to do these calculations inside the vertex shader. Would that be right?
The vertex shader is called once for every vertex you pass to the vertex shader stage. The uniforms are the same for each of these calls. Hence you shouldn't pass the vertices, be they random or not, as uniforms. Uniforms are where global transformations (i.e. a camera rotation, a model matrix, etc.) would go.
Your vertices should be passed in a vertex buffer object. Just generate them randomly in your host application and draw them; they will automatically become the in variables of your shader.
You can change the array in every iteration; however, it might be a good idea to keep its size constant. For this it's sometimes useful to pass the 3D vector as a 4D one, with the fourth component set to 1 if the vertex is used and 0 otherwise. This way you can simply check whether a vertex should be drawn or not.
Then just clear the GL_COLOR_BUFFER_BIT and draw the arrays before updating the screen.
In your shader, just set gl_Position from your in variables (i.e. the vertices) and pass the color on to the fragment shader; the color is not applied in the vertex shader yet.
In the fragment shader, the color is the output. So just write the variable you passed from the vertex shader to the output, e.g. gl_FragColor.
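Put together, the two shaders might look like this minimal sketch (the attribute names and locations are illustrative):

```glsl
// --- vertex shader ---
#version 330 core
layout (location = 0) in vec3 position; // random point from the VBO
layout (location = 1) in vec3 color;    // random color from a second VBO
out vec3 Color;
void main() {
    gl_Position = vec4(position, 1.0);
    Color = color; // just pass the color through
}

// --- fragment shader ---
#version 330 core
in vec3 Color;
out vec4 FragColor;
void main() {
    FragColor = vec4(Color, 1.0);
}
```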
By the way, if you draw something as GL_POINTS, the points will come out as little squares. There are lots of tricks to make them actually round; probably the easiest is this simple if in the fragment shader. However, you should then configure them as point sprites (glEnable(GL_POINT_SPRITE)).
if(dot(gl_PointCoord - vec2(0.5,0.5), gl_PointCoord - vec2(0.5,0.5)) > 0.25)
discard;
I suggest you to read up a little on what the fragment and vertex shader do, what vertices and fragments are and what their respective in/out/uniform variables represent.
Since programs with full vertex buffer objects, shader programs etc. get quite large, you can also start out with glBegin() and glEnd() to draw vertices directly. However, this should only be a very early starting point for understanding what you are drawing where, and how the different shaders affect it.
The lighthouse3d tutorials (http://www.lighthouse3d.com/tutorials/) are usually a good start, though they might be a bit outdated. Also a good reference is the GLSL wiki (http://www.opengl.org/wiki/Vertex_Shader), which is up to date in most cases, though it might be a bit technical.
Whether you are working with C++, Java, or another language, the concepts of OpenGL are usually the same, so almost all tutorials will serve you well.
The number of per-vertex attributes that I need to calculate my vertex shader output is bigger than GL_MAX_VERTEX_ATTRIBS. Is there an efficient way to, e.g., point to a number of buffers using a uniform array of indices and access the per-vertex data that way?
This is a hardware limitation, so the short answer is no.
If you consider other workarounds, like using uniforms, those have limitations too, so that is not the way to go either.
One possible way I can think of, which is rather hackish, is to fetch the extra data from a texture. You can access textures from the vertex shader; texture filtering is not supported there, but you won't need it, so that doesn't matter for you.
With newer OpenGL versions it's possible to store a rather large amount of data in textures and access it without limitation even in the vertex shader, so this seems to be one way to go.
Although with this approach there is a problem you need to solve: how do you know the current index, i.e. which vertex you are processing?
Check out the gl_VertexID built-in for that.
You could bypass the input assembler and bind the extra attributes in an SSBO or texture. Then you can use gl_VertexID in the vertex shader to get the value of index buffer entry you are currently rendering (eg: the index in the vertex data you need to read from)
So, for example, in a vertex shader the following two snippets are essentially identical (they may however have different performance characteristics depending on your hardware):
in vec3 myAttr;
void main() {
vec3 vertexValue = myAttr;
//etc
}
vs.
buffer myAttrBuffer {
vec3 myAttr[];
};
void main() {
vec3 vertexValue = myAttr[gl_VertexID];
//etc
}
The CPU-side binding code is different, but generally that's the concept. myAttr counts towards GL_MAX_VERTEX_ATTRIBS, but myAttrBuffer does not since it is loaded explicitly by the shader.
You could even use the same buffer object in both cases by binding with a different target.
If you absolutely cannot limit yourself to GL_MAX_VERTEX_ATTRIBS attributes, I would advise using multi-pass shaders: redesign your code so that half of the attributes are processed in a first pass and the remaining ones in a second pass.