Strange performance behaviour with SSAO algorithm using OpenGL and GLSL

I'm working on the SSAO (Screen-Space Ambient Occlusion) algorithm using the oriented-hemisphere rendering technique.
I) The algorithm
This algorithm requires as inputs:
1 array containing precomputed samples (loaded before the main loop -> In my example I use 64 samples oriented according to the z axis).
1 noise texture containing normalized rotation vectors also oriented according to the z axis (this texture is generated once).
2 textures from the GBuffer: the 'PositionSampler' and the 'NormalSampler' containing the positions and normal vectors in view space.
Here's the fragment shader source code I use:
#version 400
/*
** Output color value.
*/
layout (location = 0) out vec4 FragColor;
/*
** Vertex inputs.
*/
in VertexData_VS
{
vec2 TexCoords;
} VertexData_IN;
/*
** Projection matrix (used below to project view-space samples).
*/
uniform mat4 ProjMatrix;
/*
** GBuffer samplers.
*/
uniform sampler2D PositionSampler;
uniform sampler2D NormalSampler;
/*
** Noise sampler.
*/
uniform sampler2D NoiseSampler;
/*
** Noise texture viewport.
*/
uniform vec2 NoiseTexOffset;
/*
** Ambient light intensity.
*/
uniform vec4 AmbientIntensity;
/*
** SSAO kernel + size.
*/
uniform vec3 SSAOKernel[64];
uniform uint SSAOKernelSize;
uniform float SSAORadius;
/*
** Computes Orientation matrix.
*/
mat3 GetOrientationMatrix(vec3 normal, vec3 rotation)
{
vec3 tangent = normalize(rotation - normal * dot(rotation, normal)); //Gram-Schmidt process
vec3 bitangent = cross(normal, tangent);
return (mat3(tangent, bitangent, normal)); //Orientation according to the normal
}
/*
** Fragment shader entry point.
*/
void main(void)
{
float OcclusionFactor = 0.0f;
vec3 gNormal_CS = normalize(texture(
NormalSampler, VertexData_IN.TexCoords).xyz * 2.0f - 1.0f); //Normal vector in view space from GBuffer
vec3 rotationVec = normalize(texture(NoiseSampler,
VertexData_IN.TexCoords * NoiseTexOffset).xyz * 2.0f - 1.0f); //Rotation vector required for Gram-Schmidt process
vec3 Origin_VS = texture(PositionSampler, VertexData_IN.TexCoords).xyz; //Origin vertex in view space from GBuffer
mat3 OrientMatrix = GetOrientationMatrix(gNormal_CS, rotationVec);
for (int idx = 0; idx < SSAOKernelSize; idx++) //For each sample (64 iterations)
{
vec4 Sample_VS = vec4(Origin_VS + OrientMatrix * SSAOKernel[idx], 1.0f); //Sample translated in view space
vec4 Sample_HS = ProjMatrix * Sample_VS; //Sample in homogeneous space
vec3 Sample_CS = Sample_HS.xyz /= Sample_HS.w; //Perspective dividing (clip space)
vec2 texOffset = Sample_CS.xy * 0.5f + 0.5f; //Recover sample texture coordinates
vec3 SampleDepth_VS = texture(PositionSampler, texOffset).xyz; //Sample depth in view space
if (Sample_VS.z < SampleDepth_VS.z)
if (length(Sample_VS.xyz - SampleDepth_VS) <= SSAORadius)
OcclusionFactor += 1.0f; //Occlusion accumulation
}
OcclusionFactor = 1.0f - (OcclusionFactor / float(SSAOKernelSize));
FragColor = vec4(OcclusionFactor);
FragColor *= AmbientIntensity;
}
And here's the result (without blur render pass):
Up to this point everything seems to be correct.
II) The performance
Using the NSight debugger, I noticed very strange behaviour concerning performance:
If I move my camera closer and closer to the dragon, performance drops drastically.
But to my mind this should not be the case, because the SSAO algorithm is applied in screen space and does not depend on, for example, the number of primitives in the dragon.
Here are 3 screenshots with 3 different camera positions (in all 3 cases, all 1024*768 pixel shader invocations run the same algorithm):
a) GPU idle : 40% (pixel impacted: 100%)
b) GPU idle : 25% (pixel impacted: 100%)
c) GPU idle : 2%! (pixel impacted: 100%)
In my example, my rendering engine uses exactly 2 render passes:
the Material Pass (filling the position and normal samplers)
the Ambient pass (filling the SSAO texture)
I thought the problem came from the combined cost of these two passes, but that's not the case: I added a condition in my client code that skips the material pass when the camera is stationary. So when I took the 3 pictures above, only the Ambient Pass was executed, which means the performance drop is not related to the material pass. Another argument: if I remove the dragon mesh (leaving a scene with just the plane), the result is the same. The closer my camera is to the plane, the bigger the performance drop!
To me this behaviour is not logical! Like I said above, in these 3 cases every pixel runs exactly the same pixel shader code!
Now I noticed another strange behaviour if I change a little piece of code directly within the fragment shader:
If I replace the line:
FragColor = vec4(OcclusionFactor);
By the line:
FragColor = vec4(1.0f, 1.0f, 1.0f, 1.0f);
The lack of performance disappears!
In other words, when the SSAO code still executes (I placed some breakpoints during execution to check this) but OcclusionFactor is not used at the end to fill the final output color, the performance drop disappears!
I think we can conclude that the problem does not come from the shader code before the line "FragColor = vec4(OcclusionFactor);"... I think.
How can you explain such behaviour?
I tried a lot of combination of code both in the client code and in the fragment shader code but I can't find the solution to this problem! I'm really lost.
Thank you very much in advance for your help!

The short answer is cache efficiency.
To understand this let's look at the following lines from the inner loop:
vec4 Sample_VS = vec4(Origin_VS + OrientMatrix * SSAOKernel[idx], 1.0f); //Sample translated in view space
vec4 Sample_HS = ProjMatrix * Sample_VS; //Sample in homogeneous space
vec3 Sample_CS = Sample_HS.xyz /= Sample_HS.w; //Perspective dividing (clip space)
vec2 texOffset = Sample_CS.xy * 0.5f + 0.5f; //Recover sample texture coordinates
vec3 SampleDepth_VS = texture(PositionSampler, texOffset).xyz; //Sample depth in view space
What you are doing here is:
Translate the original point in view space
Transform it to clip space
Sample the texture
So how does that correspond to cache efficiency?
Caches work well when accessing neighbouring pixels. For example if you are using a gaussian blur you are accessing only the neighbours, which have a high probability to be already loaded in the cache.
So let's say your object is very far away. Then the sample points, once projected to clip space, land very close to the original point -> high locality -> good cache performance.
If the camera is very close to your object, the generated sample points are spread further apart (in clip space) and you get a random memory access pattern. That decreases performance drastically even though you aren't actually doing more operations.
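To get a feel for the numbers, here is a rough Python sketch of how far apart the samples land on screen. It is only an illustration: the 60 degree vertical field of view and the 0.5 unit SSAO radius are assumptions (the question does not state them), and projected_radius_px is a hypothetical helper, not part of any API.

```python
import math

def projected_radius_px(radius, depth, fov_y_deg=60.0, height_px=768):
    """Approximate on-screen size, in pixels, of a view-space length
    `radius` seen at distance `depth` in front of the camera."""
    # Focal length in pixels for a symmetric perspective projection.
    focal_px = (height_px / 2.0) / math.tan(math.radians(fov_y_deg) / 2.0)
    return radius * focal_px / depth

# Same SSAO radius, camera moving closer to the surface.
for depth in (20.0, 5.0, 1.0):
    print(depth, round(projected_radius_px(0.5, depth), 1))
```

The footprint grows as 1/depth, so the same 64 samples that touch a small patch of the position texture from far away are scattered over a much larger area up close.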
Edit:
To improve performance you could reconstruct the view space position from the depth buffer of the previous pass.
If you're using a 32 bit depth buffer that decreases the amount of data required for one sample from 12 bytes to 4 bytes.
The position reconstruction looks like this:
vec4 reconstruct_vs_pos(vec2 tc){
float depth = texture(depthTexture,tc).x;
vec4 p = vec4(vec3(tc, depth) * 2.0f - 1.0f, 1.0f); //transformed to unit cube [-1,1]^3, w stays 1
vec4 p_cs = invProj * p; //invProj: inverse projection matrix (pass this by uniform)
return p_cs / p_cs.w;
}
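As a sanity check of the idea, here is a small CPU-side sketch in Python that projects a view-space point, stores only its texture coordinate and depth, and then recovers the point again. It assumes an OpenGL-style perspective matrix, the default [0,1] depth range and the usual [0,1] to [-1,1] mapping (multiply by 2, subtract 1). The helper names and the camera parameters are made up for the example.

```python
import math

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style projection matrix, stored as a list of rows."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    c = (far + near) / (near - far)
    d = 2.0 * far * near / (near - far)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, c, d],
            [0.0, 0.0, -1.0, 0.0]]

def inverse_perspective(m):
    """Closed-form inverse of a matrix built by perspective()."""
    a, b, c, d = m[0][0], m[1][1], m[2][2], m[2][3]
    return [[1.0 / a, 0.0, 0.0, 0.0],
            [0.0, 1.0 / b, 0.0, 0.0],
            [0.0, 0.0, 0.0, -1.0],
            [0.0, 0.0, 1.0 / d, c / d]]

def mul(m, v):
    """4x4 matrix times 4-component column vector."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

def to_depth_buffer(proj, pos_vs):
    """Project a view-space point; return (texcoord, depth), all in [0, 1]."""
    clip = mul(proj, [pos_vs[0], pos_vs[1], pos_vs[2], 1.0])
    ndc = [clip[i] / clip[3] for i in range(3)]
    return (ndc[0] * 0.5 + 0.5, ndc[1] * 0.5 + 0.5), ndc[2] * 0.5 + 0.5

def reconstruct_vs_pos(inv_proj, tc, depth):
    """Map [0, 1] to [-1, 1], unproject, then divide by w."""
    p = [tc[0] * 2.0 - 1.0, tc[1] * 2.0 - 1.0, depth * 2.0 - 1.0, 1.0]
    q = mul(inv_proj, p)
    return [q[i] / q[3] for i in range(3)]

proj = perspective(60.0, 1024.0 / 768.0, 0.1, 100.0)
original = (0.3, -0.2, -5.0)
tc, depth = to_depth_buffer(proj, original)
print(reconstruct_vs_pos(inverse_perspective(proj), tc, depth))
```

The printed point should match the original up to floating point error, which is the property the shader version relies on.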

While you're at it, another optimization you can make is to render the SSAO texture at a reduced size, preferably half the size of your main viewport. If you do this, be sure to copy your depth texture to another half-size texture (glBlitFramebuffer) and sample your positions from that. I'd expect this to increase performance by an order of magnitude, especially in the worst-case scenario you've given.

Related

Is it possible to use a shader to map 3d coordinates with Mercator-like projection?

The background:
I am writing some terrain visualiser and I am trying to decouple the rendering from the terrain generation.
At the moment, the generator returns some array of triangles and colours, and these are bound in OpenGL by the rendering code (using OpenTK).
So far I have a very simple shader which handles the rotation of the sphere.
The problem:
I would like the application to be able to display the results either as a 3D object, or as a 2D projection of the sphere (let's assume Mercator for simplicity).
I had thought, this would be simple — I should compile an alternative shader for such cases. So, I have a vertex shader which almost works:
precision highp float;
uniform mat4 projection_matrix;
uniform mat4 modelview_matrix;
in vec3 in_position;
in vec3 in_normal;
in vec3 base_colour;
out vec3 normal;
out vec3 colour2;
vec3 fromSphere(in vec3 cart)
{
vec3 spherical;
spherical.x = atan(cart.x, cart.y) / 6;
float xy = sqrt(cart.x * cart.x + cart.y * cart.y);
spherical.y = atan(xy, cart.z) / 4;
spherical.z = -1.0 + (spherical.x * spherical.x) * 0.1;
return spherical;
}
void main(void)
{
normal = vec3(0,0,1);
normal = (modelview_matrix * vec4(in_normal, 0)).xyz;
colour2 = base_colour;
//gl_Position = projection_matrix * modelview_matrix * vec4(fromSphere(in_position), 1);
gl_Position = vec4(fromSphere(in_position), 1);
}
However, it has a couple of obvious issues (see images below)
Saw-tooth pattern where triangle crosses the cut meridian
Polar region is not well defined
3D case (Typical shader):
2D case (above shader)
Both of these seem to reduce to the statement "A triangle in 3-dimensional space is not always even a single polygon on the projection". (... and this is before any discussion about whether great circle segments from the sphere are expected to be lines after projection ...).
(The 1+x^2 term in z is already a hack to make it a little better: it ensures the projection is not flat, so that any stray edges (i.e. ones that straddle the cut meridian) are safely behind the image.)
The question: Is what I want to achieve possible with a VertexShader / FragmentShader approach? If not, what's the alternative? I think I can re-write the application side to pre-transform the points (and cull / add extra polygons where needed) but it will need to know where the cut line for the projection is — and I feel that this information is analogous to the modelViewMatrix in the 3D case... which means taking this logic out of the shader seems a step backwards.
Thanks!

OpenGL shader to shade each face similar to MeshLab's visualizer

I have very basic OpenGL knowledge, but I'm trying to replicate the shading effect that MeshLab's visualizer has.
If you load up a mesh in MeshLab, you'll realize that if a face is facing the camera, it is completely lit and as you rotate the model, the lighting changes as the face that faces the camera changes. I loaded a simple unit cube with 12 faces in MeshLab and captured these screenshots to make my point clear:
Model loaded up (notice how the face is completely gray):
Model slightly rotated (notice how the faces are a bit darker):
More rotation (notice how all faces are now darker):
Off the top of my head, I think the way it works is that it is somehow assigning colors per face in the shader. If the angle between the face normal and camera is zero, then the face is fully lit (according to the color of the face), otherwise it is lit proportional to the dot product between the normal vector and the camera vector.
I already have the code to draw meshes with shaders/VBO's. I can even assign per-vertex colors. However, I don't know how I can achieve a similar effect. As far as I know, fragment shaders work on vertices. A quick search revealed questions like this. But I got confused when the answers talked about duplicate vertices.
If it makes any difference, in my application I load *.ply files which contain vertex position, triangle indices and per-vertex colors.
Results after the answer by @DietrichEpp
I created the duplicate vertices array and used the following shaders to achieve the desired lighting effect. As can be seen in the posted screenshot, the similarity is uncanny :)
The vertex shader:
#version 330 core
uniform mat4 projection_matrix;
uniform mat4 model_matrix;
uniform mat4 view_matrix;
in vec3 in_position; // The vertex position
in vec3 in_normal; // The computed vertex normal
in vec4 in_color; // The vertex color
out vec4 color; // The vertex color (pass-through)
void main(void)
{
gl_Position = projection_matrix * view_matrix * model_matrix * vec4(in_position, 1);
// Compute the vertex's normal in camera space
vec3 normal_cameraspace = normalize(( view_matrix * model_matrix * vec4(in_normal,0)).xyz);
// Vector from the vertex (in camera space) to the camera (which is at the origin)
vec3 cameraVector = normalize(vec3(0, 0, 0) - (view_matrix * model_matrix * vec4(in_position, 1)).xyz);
// Compute the angle between the two vectors
float cosTheta = clamp( dot( normal_cameraspace, cameraVector ), 0,1 );
// The coefficient will create a nice looking shining effect.
// Also, we shouldn't modify the alpha channel value.
color = vec4(0.3 * in_color.rgb + cosTheta * in_color.rgb, in_color.a);
}
The fragment shader:
#version 330 core
in vec4 color;
out vec4 out_frag_color;
void main(void)
{
out_frag_color = color;
}
The uncanny results with the unit cube:

It looks like the effect is a simple lighting effect with per-face normals. There are a few different ways you can achieve per-face normals:
You can create a VBO with a normal attribute, and then duplicate vertex position data for faces which don't have the same normal. For example, a cube would have 24 vertexes instead of 8, because the "duplicates" would have different normals.
You can use a geometry shader which calculates a per-face normal.
You can use dFdx() and dFdy() in the fragment shader to approximate the normal.
I recommend the first approach, because it is simple. You can simply calculate the normals ahead of time in your program, and then use them to calculate the face colors in your vertex shader.
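A minimal sketch of the first approach, written in Python for brevity (the same loop is easy to port to the host language). It assumes an indexed triangle list with counter-clockwise winding and no degenerate triangles; flat_shaded_arrays is a hypothetical helper name. This simplest variant duplicates per triangle, so a 12-triangle cube ends up with 36 vertices rather than the 24 you get when the two triangles of each flat face share vertices.

```python
def flat_shaded_arrays(positions, triangles):
    """Expand an indexed triangle mesh so that every triangle owns three
    vertices, each carrying that triangle's face normal."""
    out_positions, out_normals = [], []
    for i0, i1, i2 in triangles:
        p0, p1, p2 = positions[i0], positions[i1], positions[i2]
        e1 = [p1[k] - p0[k] for k in range(3)]
        e2 = [p2[k] - p0[k] for k in range(3)]
        # Face normal = normalized cross product of two edges.
        n = [e1[1] * e2[2] - e1[2] * e2[1],
             e1[2] * e2[0] - e1[0] * e2[2],
             e1[0] * e2[1] - e1[1] * e2[0]]
        length = sum(c * c for c in n) ** 0.5  # zero for degenerate triangles
        n = [c / length for c in n]
        for p in (p0, p1, p2):
            out_positions.append(tuple(p))
            out_normals.append(tuple(n))
    return out_positions, out_normals

# Two triangles from different faces of a unit cube sharing vertex (1, 1, 1).
positions = [(1, 1, 1), (0, 1, 1), (0, 0, 1), (1, 0, 1), (1, 0, 0)]
triangles = [(0, 1, 2), (0, 3, 4)]
flat_positions, flat_normals = flat_shaded_arrays(positions, triangles)
print(flat_normals[0], flat_normals[3])
```

The shared corner now appears twice, once with the +z face normal and once with the +x face normal, which is what lets the vertex shader produce one constant color per face.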

This is simple flat shading. Instead of using per-vertex normals, you can evaluate a per-face normal with this GLSL snippet:
vec3 x = dFdx(FragPos);
vec3 y = dFdy(FragPos);
vec3 normal = cross(x, y);
vec3 norm = normalize(normal);
then apply some diffuse lighting using norm:
// diffuse light 1
vec3 lightDir1 = normalize(lightPos1 - FragPos);
float diff1 = max(dot(norm, lightDir1), 0.0);
vec3 diffuse = diff1 * diffColor1;

Uniform point arrays and managing fragment shader coordinates systems

My aim is to pass an array of points to the shader, calculate their distance to the fragment and paint them with a circle colored with a gradient depending on that computation.
For example:
(From a working example I set up on shader toy)
Unfortunately it isn't clear to me how I should calculate and convert the coordinates passed for processing inside the shader.
What I'm currently trying is to pass two arrays of floats - one for x positions and one for y positions of each point - to the shader through a uniform. Then inside the shader iterate through each point like so:
#ifdef GL_ES
precision mediump float;
precision mediump int;
#endif
uniform float sourceX[100];
uniform float sourceY[100];
uniform vec2 resolution;
in vec4 gl_FragCoord;
varying vec4 vertColor;
varying vec2 center;
varying vec2 pos;
void main()
{
float intensity = 0.0;
for(int i=0; i<100; i++)
{
vec2 source = vec2(sourceX[i],sourceY[i]);
vec2 position = ( gl_FragCoord.xy / resolution.xy );
float d = distance(position, source);
intensity += exp(-0.5*d*d);
}
intensity=3.0*pow(intensity,0.02);
if (intensity<=1.0)
gl_FragColor=vec4(0.0,intensity*0.5,0.0,1.0);
else if (intensity<=2.0)
gl_FragColor=vec4(intensity-1.0, 0.5+(intensity-1.0)*0.5,0.0,1.0);
else
gl_FragColor=vec4(1.0,3.0-intensity,0.0,1.0);
}
But that doesn't work - and I believe it may be because I'm trying to work with the pixel coordinates without properly translating them. Could anyone explain to me how to make this work?
Update:
The current result is:
The sketch's code is:
PShader pointShader;
float[] sourceX;
float[] sourceY;
void setup()
{
size(1024, 1024, P3D);
background(255);
sourceX = new float[100];
sourceY = new float[100];
for (int i = 0; i<100; i++)
{
sourceX[i] = random(0, 1023);
sourceY[i] = random(0, 1023);
}
pointShader = loadShader("pointfrag.glsl", "pointvert.glsl");
shader(pointShader, POINTS);
pointShader.set("sourceX", sourceX);
pointShader.set("sourceY", sourceY);
pointShader.set("resolution", float(width), float(height));
}
void draw()
{
for (int i = 0; i<100; i++) {
strokeWeight(60);
point(sourceX[i], sourceY[i]);
}
}
while the vertex shader is:
#define PROCESSING_POINT_SHADER
uniform mat4 projection;
uniform mat4 transform;
attribute vec4 vertex;
attribute vec4 color;
attribute vec2 offset;
varying vec4 vertColor;
varying vec2 center;
varying vec2 pos;
void main() {
vec4 clip = transform * vertex;
gl_Position = clip + projection * vec4(offset, 0, 0);
vertColor = color;
center = clip.xy;
pos = offset;
}

Update:
Based on the comments it seems you have confused two different approaches:
Draw a single full screen polygon, pass in the points and calculate the final value once per fragment using a loop in the shader.
Draw bounding geometry for each point, calculate the density for just one point in the fragment shader and use additive blending to sum the densities of all points.
The other issue is your points are given in pixels but the code expects a 0 to 1 range, so d is large and the points are black. Fixing this issue as @RetoKoradi describes should address the points being black, but I suspect you'll find ramp clipping issues when many are in close proximity. Passing points into the shader limits scalability and is inefficient unless the points cover the whole viewport.
As below, I think sticking with approach 2 is better. To restructure your code for it, remove the loop, don't pass in the array of points and use center as the point coordinate instead:
//calc center in pixel coordinates
vec2 centerPixels = (center * 0.5 + 0.5) * resolution.xy;
//find the distance in pixels (avoiding aspect ratio issues)
float dPixels = distance(gl_FragCoord.xy, centerPixels);
//scale down to the 0 to 1 range
float d = dPixels / resolution.y;
//write out the intensity
gl_FragColor = vec4(exp(-0.5*d*d));
Draw this to a texture (from comments: opengl-tutorial.org code and this question) with additive blending:
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);
Now that texture will contain intensity as it was after your original loop. In another fragment shader during a full screen pass (draw a single triangle that covers the whole viewport), continue with:
uniform sampler2D intensityTex;
...
float intensity = texture2D(intensityTex, gl_FragCoord.xy/resolution.xy).r;
intensity = 3.0*pow(intensity, 0.02);
...
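To see why the two approaches give the same intensity before the colour ramp is applied, here is a tiny CPU-side Python sketch. It is not how the GPU does it: both functions touch every pixel, whereas the real second approach only rasterizes each point's bounding geometry. The function names and the 8x8 buffer are made up for the illustration.

```python
import math

def density_loop(points, width, height):
    """Approach 1: for every pixel, loop over all points."""
    img = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for px, py in points:
                d = math.hypot(x - px, y - py) / height
                img[y][x] += math.exp(-0.5 * d * d)
    return img

def density_additive(points, width, height):
    """Approach 2: draw one point at a time and add it to the buffer,
    which is what blending with glBlendFunc(GL_ONE, GL_ONE) does."""
    img = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for y in range(height):
            for x in range(width):
                d = math.hypot(x - px, y - py) / height
                img[y][x] += math.exp(-0.5 * d * d)
    return img

points = [(1.0, 2.0), (6.0, 5.0), (3.0, 3.0)]
a = density_loop(points, 8, 8)
b = density_additive(points, 8, 8)
print(max(abs(a[y][x] - b[y][x]) for y in range(8) for x in range(8)))
```

Only the order of the loops changes, so the accumulated buffer is the same and the ramp can be applied afterwards in a separate pass.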
The code you have shown is fine, assuming you're drawing a full screen polygon so the fragment shader runs once for each pixel. Potential issues are:
resolution isn't set correctly
The point coordinates aren't in the range 0 to 1 on the screen.
Although minor, d will be stretched by the aspect ratio, so you might be better off scaling the points up to pixel coordinates and dividing the distance by resolution.y.
This looks pretty similar to creating a density field for 2D metaballs. For performance you're best off limiting the density function for each point so it doesn't go on forever, then splatting discs into a texture using additive blending. This saves processing those pixels a point doesn't affect (just like in deferred shading). The result is the density field, or in your case per-pixel intensity.
These are a little related:
2D OpenGL ES Metaballs on android (no answers yet)
calculate light volume radius from intensity
gl_PointSize Corresponding to World Space Size

It looks like the point center and fragment position are in different coordinate spaces when you subtract them:
vec2 source = vec2(sourceX[i],sourceY[i]);
vec2 position = ( gl_FragCoord.xy / resolution.xy );
float d = distance(position, source);
Based on your explanation and code, sourceX and sourceY are in window coordinates, meaning that they are in units of pixels. gl_FragCoord is in the same coordinate space. And even though you don't show that directly, I assume that resolution is the size of the window in pixels.
This means that:
vec2 position = ( gl_FragCoord.xy / resolution.xy );
calculates the normalized position of the fragment within the window, in the range [0.0, 1.0] for both x and y. But then on the next line:
float d = distance(position, source);
you subtract source, which is still in window coordinates, from this position in normalized coordinates.
Since it looks like you wanted the distance in normalized coordinates, which makes sense, you'll also need to normalize source:
vec2 source = vec2(sourceX[i],sourceY[i]) / resolution.xy;
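The effect of the mismatch is easy to reproduce on the CPU. The Python sketch below uses the 1024x1024 window from the sketch code and a fragment sitting exactly on a source point; the two helper functions are hypothetical names for the corrected and the original computation.

```python
import math

def distance_normalized(frag_px, source_px, resolution):
    """Distance between a fragment and a source point, both given in
    pixels, measured in normalized [0, 1] window coordinates."""
    fx, fy = frag_px[0] / resolution[0], frag_px[1] / resolution[1]
    sx, sy = source_px[0] / resolution[0], source_px[1] / resolution[1]
    return math.hypot(fx - sx, fy - sy)

def distance_mixed(frag_px, source_px, resolution):
    """The original mistake: fragment normalized, source left in pixels."""
    fx, fy = frag_px[0] / resolution[0], frag_px[1] / resolution[1]
    return math.hypot(fx - source_px[0], fy - source_px[1])

res = (1024.0, 1024.0)
frag, src = (512.0, 512.0), (512.0, 512.0)
# Contribution exp(-0.5 * d * d) of this source to this fragment.
print(math.exp(-0.5 * distance_normalized(frag, src, res) ** 2))
print(math.exp(-0.5 * distance_mixed(frag, src, res) ** 2))
```

With matching units the fragment on top of the point gets the full contribution, while with mixed units the distance is in the hundreds and the exponential underflows to zero, which is consistent with the points coming out black.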

SSAO not displaying correct results, mostly no visible occlusion

I'm following the tutorial by John Chapman (http://john-chapman-graphics.blogspot.nl/2013/01/ssao-tutorial.html) to implement SSAO in a deferred renderer. The input buffers to the SSAO shaders are:
World-space positions with linearized depth as w-component.
World-space normal vectors
Noise 4x4 texture
I'll first list the complete shader and then briefly walk through the steps:
#version 330 core
in VS_OUT {
vec2 TexCoords;
} fs_in;
uniform sampler2D texPosDepth;
uniform sampler2D texNormalSpec;
uniform sampler2D texNoise;
uniform vec3 samples[64];
uniform mat4 projection;
uniform mat4 view;
uniform mat3 viewNormal; // transpose(inverse(mat3(view)))
const vec2 noiseScale = vec2(800.0f/4.0f, 600.0f/4.0f);
const float radius = 5.0;
void main( void )
{
float linearDepth = texture(texPosDepth, fs_in.TexCoords).w;
// Fragment's view space position and normal
vec3 fragPos_World = texture(texPosDepth, fs_in.TexCoords).xyz;
vec3 origin = vec3(view * vec4(fragPos_World, 1.0));
vec3 normal = texture(texNormalSpec, fs_in.TexCoords).xyz;
normal = normalize(normal * 2.0 - 1.0);
normal = normalize(viewNormal * normal); // Normal from world to view-space
// Use change-of-basis matrix to reorient sample kernel around origin's normal
vec3 rvec = texture(texNoise, fs_in.TexCoords * noiseScale).xyz;
vec3 tangent = normalize(rvec - normal * dot(rvec, normal));
vec3 bitangent = cross(normal, tangent);
mat3 tbn = mat3(tangent, bitangent, normal);
// Loop through the sample kernel
float occlusion = 0.0;
for(int i = 0; i < 64; ++i)
{
// get sample position
vec3 sample = tbn * samples[i]; // From tangent to view-space
sample = sample * radius + origin;
// project sample position (to sample texture) (to get position on screen/texture)
vec4 offset = vec4(sample, 1.0);
offset = projection * offset;
offset.xy /= offset.w;
offset.xy = offset.xy * 0.5 + 0.5;
// get sample depth
float sampleDepth = texture(texPosDepth, offset.xy).w;
// range check & accumulate
// float rangeCheck = abs(origin.z - sampleDepth) < radius ? 1.0 : 0.0;
occlusion += (sampleDepth <= sample.z ? 1.0 : 0.0);
}
occlusion = 1.0 - (occlusion / 64.0f);
gl_FragColor = vec4(vec3(occlusion), 1.0);
}
The result is however not pleasing. The occlusion buffer is mostly all white and doesn't show any occlusion. However, if I move really close to an object I can see some weird noise-like results as you can see below:
This is obviously not correct. I've done a fair share of debugging and believe all the relevant variables are correctly passed around (they all visualize as colors). I do the calculations in view-space.
I'll briefly walk through the steps (and choices) I've taken in case any of you figure something goes wrong in one of the steps.
view-space positions/normals
John Chapman retrieves the view-space position using a view ray and a linearized depth value. Since I use a deferred renderer that already has the world-space positions per fragment I simply take those and multiply them with the view matrix to get them to view-space.
I take a similar approach for the normal vectors. I take the world-space normal vectors from a buffer texture, transform them to [-1,1] range and multiply them with transpose(inverse(mat3(..))) of view matrix.
The view-space position and normals are visualized as below:
This looks correct to me.
Orient hemisphere around normal
The steps to create the tbn matrix are the same as described in John Chapman's tutorial. I create the noise texture as follows:
std::vector<glm::vec3> ssaoNoise;
for (GLuint i = 0; i < noise_size; i++)
{
glm::vec3 noise(randomFloats(generator) * 2.0 - 1.0, randomFloats(generator) * 2.0 - 1.0, 0.0f);
noise = glm::normalize(noise);
ssaoNoise.push_back(noise);
}
...
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB16F, 4, 4, 0, GL_RGB, GL_FLOAT, &ssaoNoise[0]);
I can visualize the noise in the fragment shader so that seems to work.
sample depths
I transform all samples from tangent to view-space (samples are random between [-1,1] on the x and y axes and [0,1] on the z-axis) and translate them to the fragment's current view-space position (origin).
I then sample from linearized depth buffer (which I visualize below when looking close to an object):
and finally compare sampled depth values to current fragment's depth value and add occlusion values. Note that I do not perform a range-check since I don't believe that is the cause of this behavior and I'd rather keep it as minimal as possible for now.
I don't know what is causing this behavior. I believe it is somewhere in sampling the depth values. As far as I can tell I am working in the right coordinate system, linearized depth values are in view-space as well and all variables are set somewhat properly.

Simple curiosity about relation between texture mapping and shader program using Opengl/GLSL

I'm working on a small homemade 3D engine, more precisely on rendering optimization. So far I have developed a sorting algorithm whose goal is to gather as much geometry (meshes) as possible that shares the same material properties and the same shader program into batches. This way I minimize state changes (glBindXXX) and draw calls (glDrawXXX). So, if I have a scene composed of 10 boxes that all share the same texture and need to be rendered with the same shader program (for example including ADS lighting), all the vertices of these meshes will be merged into a single VBO, the texture will be bound just once and only one draw call will be needed.
Scene description:
- 10 meshes (boxes) mapped with 'texture_1'
Pseudo-code (render):
shaderProgram_1->Bind()
{
glActiveTexture(texture_1)
DrawCall(render 10 meshes with 'texture_1')
}
But now I want to be sure of one thing: let's assume our scene is still composed of the same 10 boxes, but this time 5 of them are mapped with a different texture (not multi-texturing, just simple texture mapping).
Scene description:
- 5 boxes with 'texture_1'
- 5 boxes with 'texture_2'
Pseudo-code (render):
shaderProgram_1->Bind()
{
glActiveTexture(texture_1)
DrawCall(render 5 meshes with 'texture_1')
}
shaderProgram_2->Bind()
{
glActiveTexture(texture_2)
DrawCall(render 5 meshes with 'texture_2')
}
And my fragment shader has a single sampler2D declaration (the goal of my shader program is to render geometry with simple texture mapping and ADS lighting):
uniform sampler2D ColorSampler;
I want to be sure it's not possible to draw this scene with a single draw call (as it was in my previous example, where 1 batch was enough). It was possible there because I used the same texture for the whole geometry. I think this time I will need 2 batches, hence 2 draw calls, and of course I will bind 'texture_1' or 'texture_2' before the corresponding draw call (one for the first 5 boxes and the other for the 5 others).
To sum up, if all the meshes are mapped with a simple texture (simple texture mapping):
5 with a red texture (texture_red)
5 with a blue texture (texture_blue)
Is it possible to render the scene with a single draw call? I don't think so because my pseudo code will look like this:
Pseudo-code:
shaderProgram->Bind()
{
glActiveTexture(texture_blue)
glActiveTexture(texture_red)
DrawCall(render 10 meshes)
}
I think it's impossible to differentiate the 2 textures when my fragment shader has to compute the pixel color using a single sampler2D uniform variable (simple texture mapping).
Here's my fragment shader:
#version 440
#define MAX_LIGHT_COUNT 1
/*
** Output color value.
*/
layout (location = 0) out vec4 FragColor;
/*
** Inputs.
*/
in vec3 Position;
in vec2 TexCoords;
in vec3 Normal;
/*
** Material uniforms.
*/
uniform MaterialBlock
{
vec3 Ka, Kd, Ks;
float Shininess;
} MaterialInfos;
uniform sampler2D ColorSampler;
struct Light
{
vec4 Position;
vec3 La, Ld, Ls;
float Kc, Kl, Kq;
};
uniform struct Light LightInfos[MAX_LIGHT_COUNT];
uniform unsigned int LightCount;
/*
** Light attenuation factor.
*/
float getLightAttenuationFactor(vec3 lightDir, Light light)
{
float lightAtt = 0.0f;
float dist = 0.0f;
dist = length(lightDir);
lightAtt = 1.0f / (light.Kc + (light.Kl * dist) + (light.Kq * pow(dist, 2)));
return (lightAtt);
}
/*
** Basic phong shading.
*/
vec3 Basic_Phong_Shading(vec3 normalDir, vec3 lightDir, vec3 viewDir, int idx)
{
vec3 Specular = vec3(0.0f);
float lambertTerm = max(dot(lightDir, normalDir), 0.0f);
vec3 Ambient = LightInfos[idx].La * MaterialInfos.Ka;
vec3 Diffuse = LightInfos[idx].Ld * MaterialInfos.Kd * lambertTerm;
if (lambertTerm > 0.0f)
{
vec3 reflectDir = reflect(-lightDir, normalDir);
Specular = LightInfos[idx].Ls * MaterialInfos.Ks * pow(max(dot(reflectDir, viewDir), 0.0f), MaterialInfos.Shininess);
}
return (Ambient + Diffuse + Specular);
}
/*
** Fragment shader entry point.
*/
void main(void)
{
vec3 LightIntensity = vec3(0.0f);
vec4 texDiffuseColor = texture2D(ColorSampler, TexCoords);
vec3 normalDir = (gl_FrontFacing ? -Normal : Normal);
for (int idx = 0; idx < LightCount; idx++)
{
vec3 lightDir = vec3(LightInfos[idx].Position) - Position.xyz;
vec3 viewDir = -Position.xyz;
float lightAttenuationFactor = getLightAttenuationFactor(lightDir, LightInfos[idx]);
LightIntensity += Basic_Phong_Shading(
-normalize(normalDir), normalize(lightDir), normalize(viewDir), idx
) * lightAttenuationFactor;
}
FragColor = vec4(LightIntensity, 1.0f) * texDiffuseColor;
}
Do you agree with me?

It's possible if you either: (i) consider it to be a multitexturing problem where the function per fragment just picks between the two incoming fragments (ideally using mix with a coefficient of 0.0 or 1.0, not genuine branching); or (ii) composite your two textures into one texture (subject to your ability to wrap and clamp texture coordinates efficiently — watch out for those dependent reads — and maximum texture size constraints).
It's an open question as to whether either of these things would improve performance. Definitely go with (ii) if you can.
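For option (ii), the texture coordinates of each mesh have to be remapped into its region of the combined texture when the batch is built. A minimal Python sketch, assuming the two textures are packed side by side in a horizontal atlas and that the original coordinates stay inside [0, 1] (no wrapping); atlas_uv is a hypothetical helper name.

```python
def atlas_uv(u, v, slot, slots_across=2):
    """Map a mesh's own (u, v) in [0, 1] into the sub-rectangle of a
    horizontal atlas that holds texture number `slot`."""
    width = 1.0 / slots_across
    return (slot + u) * width, v

# texture_red in the left half (slot 0), texture_blue in the right half (slot 1).
print(atlas_uv(0.5, 0.25, 0))
print(atlas_uv(0.5, 0.25, 1))
```

With the coordinates baked into the vertex data this way, all 10 boxes can share one texture binding and one draw call. Coordinates that rely on repeat wrapping would need extra handling, which is the wrap and clamp caveat mentioned above.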
