Shader compiler on Alderlake GT1: "SIMD32 shader inefficient" (GLSL)

When I compile and link my GLSL shader on an Alderlake GT1 integrated GPU, I get the warning:
SIMD32 shader inefficient
The warning is reported via the glDebugMessageCallbackARB mechanism.
I would like to investigate whether I can avoid this inefficiency, but I am not sure how to get more information about this warning.
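For reference, the warnings arrive through a debug callback registered roughly like this (a minimal sketch, not my exact code; I'm assuming GLEW for function loading, and onGlDebugMessage is just an illustrative handler name):

/* Minimal sketch of how the callback is hooked up. Assumes GLEW and a context
 * that exposes ARB_debug_output; onGlDebugMessage is an illustrative name. */
#include <stdio.h>
#include <GL/glew.h>

static void GLAPIENTRY onGlDebugMessage(GLenum source, GLenum type, GLuint id,
                                        GLenum severity, GLsizei length,
                                        const GLchar *message, const void *userParam)
{
    (void)id; (void)length; (void)userParam;
    /* The "SIMD32 shader inefficient" line arrives here as an API/performance
     * notification, alongside the shader-compiler statistics messages. */
    fprintf(stderr, "GL debug: source=0x%x type=0x%x severity=0x%x: %s\n",
            source, type, severity, message);
}

void installDebugCallback(void)
{
    glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB);   /* deliver messages on the offending call */
    glDebugMessageCallbackARB(onGlDebugMessage, NULL);
}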
The full driver output for this shader:
WRN [Shader Compiler][Other]{Notification}: VS SIMD8 shader: 11 inst, 0 loops, 40 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 176 to 112 bytes.
WRN [API][Performance]{Notification}: SIMD32 shader inefficient
WRN [Shader Compiler][Other]{Notification}: FS SIMD8 shader: 5 inst, 0 loops, 20 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
WRN [Shader Compiler][Other]{Notification}: FS SIMD16 shader: 5 inst, 0 loops, 28 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
The messages are emitted while the fragment shader is being compiled, by the way.
My vertex shader:
#version 150
in mediump vec2 position;
out lowp vec4 clr;
uniform mediump vec2 rotx;
uniform mediump vec2 roty;
uniform mediump vec2 translation;
uniform lowp vec4 colour;
void main()
{
    gl_Position.x = dot( position, rotx ) + translation.x;
    gl_Position.y = dot( position, roty ) + translation.y;
    gl_Position.z = 1.0;
    gl_Position.w = 1.0;
    clr = colour;
}
My fragment shader:
#version 150
in lowp vec4 clr;
out lowp vec4 fragColor;
void main()
{
    fragColor = clr;
}
That said, I doubt it is shader-specific, because the driver seems to report this for every shader I use on this platform.
GL RENDERER: Mesa Intel(R) Graphics (ADL-S GT1)
OS: Ubuntu 22.04
GPU: AlderLake-S GT1
API: OpenGL 3.2 Core Profile
GLSL Version: 150

This warning comes from the Intel fragment shader compiler that is part of Mesa:
brw_fs.cpp
Looking at this code, the compiler has three options: SIMD8, SIMD16 or SIMD32. These are SIMD widths, not bit widths, so SIMD32 means 32-wide SIMD.
The compiler uses a heuristic to estimate whether the SIMD32 version will be efficient, and if not, it skips that option.
Of course, the heuristic can get it wrong, so there is an option to force the BRW compiler to try SIMD32 regardless:
setting the environment variable INTEL_DEBUG=do32 tells the compiler to compile the SIMD32 variant as well.
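The variable can simply be exported in the launching shell; as a minimal sketch, it can also be set programmatically, assuming a POSIX system (setenv is POSIX), as long as that happens before the GL context is created:

/* Sketch: make the brw compiler also attempt SIMD32, equivalent to launching
 * the program with INTEL_DEBUG=do32 in the shell. Must run before the GL
 * context is created / any shader is compiled. */
#include <stdlib.h>

int main(int argc, char **argv)
{
    (void)argc; (void)argv;
    setenv("INTEL_DEBUG", "do32", 1 /* overwrite if already set */);
    /* ... create the GL context and compile shaders as usual ... */
    return 0;
}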
When I tested this on my system, I indeed observed that the driver now reports three different results:
WRN [Shader Compiler][Other]{Notification}: FS SIMD8 shader: 5 inst, 0 loops, 20 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
WRN [Shader Compiler][Other]{Notification}: FS SIMD16 shader: 5 inst, 0 loops, 28 cycles, 0:0 spills:fills, 1 sends, scheduled with mode top-down, Promoted 0 constants, compacted 80 to 48 bytes.
WRN [Shader Compiler][Other]{Notification}: FS SIMD32 shader: 10 inst, 0 loops, 928 cycles, 0:0 spills:fills, 2 sends, scheduled with mode top-down, Promoted 0 constants, compacted 160 to 96 bytes.
Observe that in this case the heuristic definitely got it right: the SIMD32 variant needs almost 50 times more cycles than SIMD8 (928 vs. 20).
Fun fact: BRW stands for Broadwater, gen4 graphics. But gen12 Intel GPUs still use this compiler.

Related

OpenGL procedural texture antialiasing

I've made a grid using a simple GLSL shader, passing texture coordinates to the fragment shader, and applied it to a large, scaled plane.
Fragment shader:
#version 330 core
out vec4 fragColor;
smooth in vec2 f_TexCoord;
vec4 gridColor;
void main()
{
    if(fract(f_TexCoord.x / 0.0005f) < 0.025f || fract(f_TexCoord.y / 0.0005f) < 0.025f)
        gridColor = vec4(0.75, 0.75, 0.75, 1.0);
    else
        gridColor = vec4(0);
    // Check for alpha transparency
    if(gridColor.a != 1)
        discard;
    fragColor = gridColor;
}
As you can see, the lines are not smooth and they start to flicker toward the horizon.
Is it possible to apply some sort of filtering/antialiasing to this? I've tried increasing the number of samples (up to 4, because higher values give me a Qt error), but it has no effect on the shader.
Switch to GLSL version 4.20 (at least), activate multisampling, and use the auxiliary storage qualifier sample on the vertex shader output (and the corresponding fragment shader input):
#version 420 core
sample smooth in vec2 f_TexCoord;
The qualifier causes per-sample interpolation.
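On the application side, the multisampled framebuffer has to be requested from the window system (Qt in this case) and multisample rasterization enabled; a rough sketch of the GL side, with the toolkit part left out:

/* Sketch: check how many samples the current framebuffer actually has and
 * enable multisample rasterization. Requesting the multisampled framebuffer
 * itself (e.g. 4 samples) is done through the windowing toolkit and is not
 * shown here. */
#include <stdio.h>
#include <GL/glew.h>

void enableMultisampling(void)
{
    GLint samples = 0;
    glGetIntegerv(GL_SAMPLES, &samples);   /* samples in the current framebuffer */
    printf("framebuffer samples: %d\n", samples);
    glEnable(GL_MULTISAMPLE);              /* rasterize using all of them */
}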

OpenGL: Performing TexCoord calculations within shader, bad practice?

I’m currently working on a 2D graphics engine for a game. My shader takes in 2 UV offset floats and calculates the TexCoord by applying the offset.
Here is a sample of my vertex shader:
#shader vertex
#version 330 core
layout(location = 0) in vec4 position;
layout(location = 1) in vec2 texCoord;
out vec2 v_TexCoord;
uniform float u_Offset;
uniform float v_Offset;
void main()
{
    gl_Position = position;
    v_TexCoord = vec2(texCoord.x + u_Offset, texCoord.y + v_Offset);
};
Should I worry about this causing performance issues in the long run? How big a difference would it make if I performed the calculations CPU-side before passing the final UV in, and is it worth optimizing?
Modifying a mesh's attribute data is the main purpose of the programmable pipeline; moving it to the CPU will translate into a performance downgrade. Also, as pointed out by @BDL, you would need to re-send the data to the GPU, which is the worst part of the whole process.
A different case is a calculation that is the same for all shader invocations; such an operation is more appropriately performed on the CPU and uploaded as a uniform.
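As a rough sketch of that uniform route, reusing the u_Offset/v_Offset names from the question (the program handle and the offset values are placeholders):

/* Sketch: values that stay constant for a whole draw call are computed once
 * on the CPU and uploaded as uniforms; the cheap per-vertex addition stays in
 * the vertex shader. 'program' and the offsets are placeholders. */
#include <GL/glew.h>

void setUvOffsets(GLuint program, float uOffset, float vOffset)
{
    glUseProgram(program);
    glUniform1f(glGetUniformLocation(program, "u_Offset"), uOffset);
    glUniform1f(glGetUniformLocation(program, "v_Offset"), vOffset);
}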

GLSL alpha test optimized out on NVIDIA

I have a small GLSL fragment shader that is used when I want to render a shadow texture. To get the desired shadows from textures with an alpha channel (like leaves), an alpha test is used.
#version 130
in lowp vec2 UV; //comes from vertex shader, works fine
uniform sampler2D DiffuseTextureSampler; //ok, alpha channel IS present in passed texture
void main(){
    lowp vec4 MaterialDiffuseColor = texture( DiffuseTextureSampler, UV ).rgba;
    if(MaterialDiffuseColor.a < 0.5)
    {
        discard;
    }
    //no color output is required at this stage
}
The above shader works fine on Intel cards (HD 520 and HD Graphics 3000), but on NVIDIA (420M and GTX 660, on Windows 7 and Linux Mint 17.3 respectively, using the latest drivers, 37x.xx something...) the alpha test does not work. The shader compiles without any errors, so it seems the NVIDIA optimizer is doing something weird.
I copy-pasted parts of the 'fullbright' shader into the 'shadow' shader and arrived at this odd result, which does work as intended, though it does a lot of work that is useless for shadow rendering.
#version 130
in lowp vec2 UV;
in lowp float fade;
out lowp vec4 color;
uniform sampler2D DiffuseTextureSampler;
uniform bool overlayBool;
uniform lowp float translucencyAlpha;
uniform lowp float useImageAlpha;
void main(){
    lowp vec4 MaterialDiffuseColor = texture( DiffuseTextureSampler, UV ).rgba;
    //notice the added OR :/
    if(MaterialDiffuseColor.a < 0.5 || overlayBool)
    {
        discard;
    }
    //next line is meaningless
    color = vec4(0.0, 0.0, 1.0, min(translucencyAlpha * useImageAlpha, fade));
}
If I remove any uniform or change anything (for example, replace the min function with some arithmetic, or keep a uniform's declaration but not use it), the alpha test breaks again.
Just outputting the color does not work either (i.e. color = vec4(1.0, 1.0, 1.0, 1.0); has no effect).
I tried using the
#pragma optimize (off)
but it did not help.
By the way, when the alpha test is broken, the expression "MaterialDiffuseColor.a == 0.0" evaluates to true.
Sorry if it is a dumb question, but what causes this behaviour on NVIDIA cards, and what can I do to avoid it? Thank you.

OpenGL shader version error

I am using Visual Studio 2013 but building with the Visual Studio 2010 compiler.
I am running Windows 8 in Boot Camp on a MacBook Pro with Intel Iris Pro 5200 graphics.
I have a very simple vertex and fragment shader; I am just displaying simple primitives in a window, but I am getting warnings in the console stating:
OpenGL Debug Output: Source(Shader Comiler), type(Other), Priority(Medium), GLSL compile warning(s) for shader 3, "": WARNING: -1:65535: #version : version number deprecated in OGL 3.0 forward compatible context driver
Does anyone have any idea how to get rid of these annoying warnings?
Vertex Shader Code:
#version 330 core
uniform mat4 modelMatrix;
uniform mat4 viewMatrix;
uniform mat4 projMatrix;
in vec3 position;
in vec2 texCoord;
in vec4 colour;
out Vertex {
    vec2 texCoord;
    vec4 colour;
} OUT;
void main(void) {
    gl_Position = (projMatrix * viewMatrix * modelMatrix) * vec4(position, 1.0);
    OUT.texCoord = texCoord;
    OUT.colour = colour;
}
Fragment Shader Code:
#version 330 core
in Vertex {
    vec2 texCoord;
    vec4 colour;
} IN;
out vec4 color;
void main() {
    color = IN.colour;
    //color = vec4(1,1,1,1);
}
I always knew Intel drivers were bad, but this is ridiculous. The #version directive is NOT deprecated in GL 3.0. In fact it is more important than ever beginning with GL 3.2, because in addition to the number you can also specify core (default) or compatibility.
Nevertheless, that is not an actual error. It is an invalid warning, and having OpenGL debug output set up is why you keep seeing it. You can ignore it. AMD seems to be the only vendor that uses debug output in a useful way; NV almost never outputs anything, opting instead to crash... and Intel appears to be spouting nonsense.
It is possible that what the driver is really trying to tell you is that you have an OpenGL 3.0 context and you are using a GLSL 3.30 shader. If that is the case, this has got to be the stupidest way I have ever seen of doing that.
Have you tried #version 130 instead? If you do this, the interface blocks (e.g. in Vertex { ...) should generate parse errors, but it would at least rule out the only interpretation of this warning that makes any sense.
There is another possibility that makes a lot more sense in the end. The debug output mentions that this is related to shader object #3. While there is no guarantee that shader names are assigned sequentially beginning with 0, this is usually the case. You have only shown a total of 2 shaders here; #3 would imply the 4th shader your software loaded.
Are you certain that these are the shaders causing the problem?
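One way to check is to dump the source of the shader object the warning names; a hedged sketch, assuming object 3 still exists at the time of the query:

/* Sketch: print the source of the shader object the warning names (object 3
 * in this case), to see which of the loaded shaders it actually is. */
#include <stdio.h>
#include <stdlib.h>
#include <GL/glew.h>

void dumpShaderSource(GLuint name)   /* e.g. dumpShaderSource(3); */
{
    GLint len = 0;
    GLchar *src;
    if (!glIsShader(name)) {
        printf("object %u is not a shader\n", name);
        return;
    }
    glGetShaderiv(name, GL_SHADER_SOURCE_LENGTH, &len);
    if (len <= 0)
        return;
    src = malloc((size_t)len);
    glGetShaderSource(name, len, NULL, src);
    printf("shader %u source:\n%s\n", name, src);
    free(src);
}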

converting GLSL #130 segment to #330

I have the following piece of shader code that works perfectly with GLSL #130, but I would like to convert it to code that works with version #330 (somehow the #130 version doesn't work on my Ubuntu machine with a GeForce 210; the shader does nothing). After several failed attempts (I keep getting undescriptive link errors) I've decided to ask for some help. The code below dynamically changes the contrast and brightness of a texture using the uniform variables Brightness and Contrast. I have implemented it in Python using PyOpenGL:
def createShader():
    """
    Compile a shader that adjusts contrast and brightness of active texture
    Returns
        OpenGL.shader - reference to shader
        dict - reference to variables that can be passed to the shader
    """
    fragmentShader = shaders.compileShader("""#version 130
        uniform sampler2D Texture;
        uniform float Brightness;
        uniform float Contrast;
        uniform vec4 AverageLuminance;
        void main(void)
        {
            vec4 texColour = texture2D(Texture, gl_TexCoord[0].st);
            gl_FragColor = mix(texColour * Brightness,
                mix(AverageLuminance, texColour, Contrast), 0.5);
        }
        """, GL_FRAGMENT_SHADER)
    shader = shaders.compileProgram(fragmentShader)
    uniform_locations = {
        'Brightness': glGetUniformLocation( shader, 'Brightness' ),
        'Contrast': glGetUniformLocation( shader, 'Contrast' ),
        'AverageLuminance': glGetUniformLocation( shader, 'AverageLuminance' ),
        'Texture': glGetUniformLocation( shader, 'Texture' )
    }
    return shader, uniform_locations
I've looked up the changes that need to be made for the new GLSL version and tried changing the fragment shader code to the following, but then I only get non-descriptive link errors:
fragmentShader = shaders.compileShader("""#version 330
    uniform sampler2D Texture;
    uniform float Brightness;
    uniform float Contrast;
    uniform vec4 AverageLuminance;
    in vec2 TexCoord;
    out vec4 FragColor;
    void main(void)
    {
        vec4 texColour = texture2D(Texture, TexCoord);
        FragColor = mix(texColour * Brightness,
            mix(AverageLuminance, texColour, Contrast), 0.5);
    }
    """, GL_FRAGMENT_SHADER)
Is there anyone that can help me with this conversion?
I doubt that raising the shader version profile will solve any issue. #version 330 is OpenGL 3.3, and according to the NVIDIA product website the maximum OpenGL version supported by the GeForce 210 is OpenGL 3.1, i.e. #version 140.
I created no vertex shader because I didn't think I'd need one (I wouldn't know what I should make it do). It worked before without any vertex shader as well.
Probably only as long as you didn't use a fragment shader, or before you were attempting to use a texture. The fragment shader needs input variables, coming from a vertex shader, to have something it can use as texture coordinates. TexCoord is not a built-in variable (and in higher GLSL versions any built-in variables suitable for the job have been removed), so you need to fill it with a value (and meaning) in a vertex shader.
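For illustration, a matching vertex shader could look roughly like this (sketched as a C string literal for brevity, though it can be passed to shaders.compileShader(..., GL_VERTEX_SHADER) just the same; the attribute names and the pass-through transform are my assumptions, only the TexCoord output has to match your fragment shader):

/* Sketch of a #version 330 vertex shader that feeds the TexCoord input of the
 * fragment shader above. Attribute names are illustrative only. */
static const char *vertexShaderSource330 =
    "#version 330\n"
    "in vec2 in_Position;\n"
    "in vec2 in_TexCoord;\n"
    "out vec2 TexCoord;\n"        /* must match 'in vec2 TexCoord' in the fragment shader */
    "void main(void)\n"
    "{\n"
    "    TexCoord = in_TexCoord;\n"
    "    gl_Position = vec4(in_Position, 0.0, 1.0);\n"
    "}\n";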
glGetString(GL_VERSION) on the NVIDIA machine reads out OpenGL version 3.3.0. This is Ubuntu, so might it be possible that it differs from the Windows specifications?
Do you have the NVIDIA proprietary drivers installed? And are they actually used? Check with glxinfo or glGetString(GL_RENDERER). OpenGL 3.3 is not too far from OpenGL 3.1, and in theory OpenGL major versions map to hardware capabilities.
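A quick way to check from inside the program (a small sketch; it only reports whatever driver the context actually ended up on):

/* Sketch: print which renderer/driver the context is using, so a Mesa
 * software fallback is easy to tell apart from the proprietary driver. */
#include <stdio.h>
#include <GL/glew.h>

void printGlInfo(void)   /* call with a current GL context */
{
    printf("GL_RENDERER : %s\n", (const char *)glGetString(GL_RENDERER));
    printf("GL_VERSION  : %s\n", (const char *)glGetString(GL_VERSION));
    printf("GLSL version: %s\n", (const char *)glGetString(GL_SHADING_LANGUAGE_VERSION));
}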