Detecting if a gl_LightSource is enabled in GLSL compatibility profile - OpenGL

I am writing a GLSL program as part of a plugin running inside of Maya, a closed-source 3D application. Maya uses the fixed-function pipeline to define its lights, so my program has to get its light information from the gl_LightSource array using the compatibility profile. My light evaluation is working fine (thanks Nicol Bolas) except for one thing: I cannot figure out how to determine if a particular light in the array is enabled or disabled. Here is what I have so far:
#version 410 compatibility

vec3 incidentLight (in gl_LightSourceParameters light, in vec3 position)
{
    if (light.position.w == 0) {
        return normalize (-light.position.xyz);
    } else {
        vec3 offset = position - light.position.xyz;
        float distance = length (offset);
        vec3 direction = normalize (offset);
        float intensity;
        if (light.spotCutoff <= 90.) {
            float spotCos = dot (direction, normalize (light.spotDirection));
            intensity = pow (spotCos, light.spotExponent) *
                        step (light.spotCosCutoff, spotCos);
        } else {
            intensity = 1.;
        }
        intensity /= light.constantAttenuation +
                     light.linearAttenuation * distance +
                     light.quadraticAttenuation * distance * distance;
        return intensity * direction;
    }
}
void main ()
{
    for (int i = 0; i < gl_MaxLights; ++i) {
        if (/* ??? gl_LightSource[i] is enabled ??? */ true) {
            vec3 incident = incidentLight (gl_LightSource[i], position);
            <snip>
        }
    }
    <snip>
}
When Maya enables new lights my program works as expected, but when Maya disables a previously enabled light, presumably using glDisable (GL_LIGHTi), its parameters are not reset in the gl_LightSource array, and gl_MaxLights obviously does not change, so my program continues to use that stale light information in its shading computation. Although I am not showing it above, the light colors, for example gl_LightSource[i].diffuse, also continue to hold stale non-zero values after the light is disabled.
Maya draws all other geometry using the fixed-function pipeline (no GLSL), and those objects correctly ignore disabled lights. How can I mimic this behavior in GLSL?

const vec4 AMBIENT_BLACK = vec4(0.0, 0.0, 0.0, 1.0);
const vec4 DEFAULT_BLACK = vec4(0.0, 0.0, 0.0, 0.0);

bool isLightEnabled(in int i)
{
    // A separate variable is used to get
    // rid of a linker error.
    bool enabled = true;

    // If all the colors of the light are set
    // to black then we know we don't need to bother
    // doing a lighting calculation on it.
    if ((gl_LightSource[i].ambient  == AMBIENT_BLACK) &&
        (gl_LightSource[i].diffuse  == DEFAULT_BLACK) &&
        (gl_LightSource[i].specular == DEFAULT_BLACK))
        enabled = false;

    return enabled;
}
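With this helper, the placeholder test in main() becomes a plain call, as in the sketch below. Note, however, that as the question observes, Maya leaves stale non-zero colors behind after disabling a light, so this all-black heuristic may not fire in the Maya case:

for (int i = 0; i < gl_MaxLights; ++i) {
    if (isLightEnabled(i)) {
        vec3 incident = incidentLight (gl_LightSource[i], position);
        // shading computation as before
    }
}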

Unfortunately I looked at the GLSL spec and I don't see anything that provides this information. I also saw another thread which seemed to come to the same conclusion.
Is there any way you can modify the light values in your plugin, or add an extra uniform that can be used as an enable/disable flag?
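A minimal sketch of that second suggestion, assuming the plugin can upload its own uniforms on each draw (the lightEnabled name is illustrative):

// GLSL side: per-light enable flags kept in sync by the plugin
uniform bool lightEnabled[gl_MaxLights];

// in main(), the placeholder test becomes: if (lightEnabled[i]) { ... }

On the C++ side, the plugin can mirror the fixed-function state once per draw by querying glIsEnabled(GL_LIGHT0 + i) for each light and uploading the result with glUniform1i.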


Stuck trying to optimize complex GLSL fragment shader

So first off, let me say that while the code works perfectly well from a visual point of view, it runs into very steep performance issues that get progressively worse as you add more lights. In its current form it's good as a proof of concept, or a tech demo, but is otherwise unusable.
Long story short, I'm writing a RimWorld-style game with real-time top-down 2D lighting. The way I implemented rendering is with a 3-layered technique, as follows:
First I render occlusions to a single-channel R8 occlusion texture mapped to a framebuffer. This part is lightning fast and doesn't slow down with more lights, so it's not part of the problem.
Then I invoke my lighting shader by drawing a huge rectangle over my lightmap texture mapped to another framebuffer. The light data is stored in an array in a UBO, and the shader uses the occlusion map in its calculations. This is where the slowdown happens.
And lastly, the lightmap texture is multiplied and added to the regular world render; this also isn't affected by the number of lights, so it's not part of the problem.
The problem is thus in the lightmap shader. The first iteration had many branches which froze my graphics driver right away when I first tried it, but after removing most of them I get a solid 144 fps at 1440p with 3 lights, and ~58 fps at 1440p with 20 lights. An improvement, but it scales very poorly. The shader code is as follows, with additional annotations:
#version 460 core

// per-light data
struct Light
{
    vec4 location;
    vec4 rangeAndstartColor;
};

const int MaxLightsCount = 16; // I've also tried 8 and 32, there was no real difference

layout(std140) uniform ubo_lights
{
    Light lights[MaxLightsCount];
};

uniform sampler2D occlusionSampler; // the occlusion texture sampler

in vec2 fs_tex0;        // the uv position in the large rectangle
in vec2 fs_window_size; // the window size to transform world coords to view coords and back

out vec4 color;

void main()
{
    vec3 resultColor = vec3(0.0);
    const vec2 size = fs_window_size;
    const vec2 pos = (size - vec2(1.0)) * fs_tex0;

    // process every light individually and add the resulting colors together
    // this should be branchless, is there any way to check?
    for (int idx = 0; idx < MaxLightsCount; ++idx)
    {
        const float range = lights[idx].rangeAndstartColor.x;
        const vec2 lightPosition = lights[idx].location.xy;
        const float dist = length(lightPosition - pos); // distance from current fragment to current light

        // early abort, the next part is expensive
        // this branch HAS to be important, right? otherwise it will check crazy long lines against occlusions
        if (dist > range)
            continue;

        const vec3 startColor = lights[idx].rangeAndstartColor.yzw;

        // walk between pos and lightPosition to find occlusions
        // standard line DDA algorithm
        vec2 tempPos = pos;
        int lineSteps = int(ceil(max(abs(lightPosition.x - pos.x), abs(lightPosition.y - pos.y))));
        const vec2 lineInc = (lightPosition - pos) / lineSteps;

        // can I get rid of this loop somehow? I need to check each position between
        // my fragment and the light position for occlusions, and this is the best I
        // came up with
        float lightStrength = 1.0;
        while (lineSteps-- > 0)
        {
            const vec2 nextPos = tempPos + lineInc;
            const vec2 occlusionSamplerUV = tempPos / size;
            lightStrength *= 1.0 - texture(occlusionSampler, vec2(occlusionSamplerUV.x, 1 - occlusionSamplerUV.y)).x;
            tempPos = nextPos;
        }

        // the contribution of this light to the fragment color is based on
        // its square distance from the light, and the occlusions between them
        // implemented as multiplications
        const float strength = max(0, range - dist) / range * lightStrength;
        resultColor += startColor * strength * strength;
    }

    color = vec4(resultColor, 1.0);
}
I call this shader as many times as I need, since the results are additive. It works with large batches of lights or one by one. Performance-wise, I didn't notice any real change trying different batch numbers, which is perhaps a bit odd.
So my question is: is there a better way to look for any (boolean) occlusions between my fragment position and light position in the occlusion texture, without iterating over every pixel by hand? Could renderbuffers perhaps help here (from what I've read they're for reading data back to system memory; I need it in another shader though)?
And perhaps, is there a better algorithm for what I'm doing here?
I can think of a couple routes for optimization:
Exact: apply a distance transform on the occlusion map: this will give you the distance to the nearest occluder at each pixel. After that you can safely step by that distance within the loop, instead of doing baby steps. This will drastically reduce the number of steps in open regions.
There is a very simple CPU-side algorithm to compute a DT, and it may suit you if your occluders are static. If your scene changes every frame, however, you'll need to search the literature for GPU side algorithms, which seem to be more complicated.
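A sketch of that stepping scheme against the shader above, assuming a hypothetical distanceSampler that stores, per texel, the distance in pixels to the nearest occluder (this is 2D sphere tracing):

uniform sampler2D distanceSampler; // hypothetical: per-texel distance to the nearest occluder

// inside the light loop, replacing the DDA walk:
float lightStrength = 1.0;
float t = 0.0;
const float len = length(lightPosition - pos);
const vec2 dir = (lightPosition - pos) / len;
while (t < len)
{
    const vec2 uv = (pos + dir * t) / size;
    const float d = texture(distanceSampler, vec2(uv.x, 1.0 - uv.y)).x;
    if (d < 1.0) { lightStrength = 0.0; break; } // reached an occluder
    t += d; // safe to jump: no occluder is closer than d in any direction
}

The loop now takes a handful of iterations across open space instead of one per pixel.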
Inexact: resort to soft shadows -- it might be a compromise you are willing to make, and even seen as an artistic choice. If you are OK with that, you can create a mipmap from your occlusion map, and then progressively increase the step and sample coarser mip levels as you go farther from the point you are shading.
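A sketch of that mipmap variant, assuming mips have been generated for the occlusion texture (e.g. via glGenerateMipmap) and that the growth factor is a tunable, not a prescribed value:

// inside the light loop, replacing the DDA walk:
float lightStrength = 1.0;
float t = 1.0;
const float len = length(lightPosition - pos);
const vec2 dir = (lightPosition - pos) / len;
float stepSize = 1.0;
while (t < len)
{
    const vec2 uv = (pos + dir * t) / size;
    const float lod = log2(stepSize); // coarser mip level as the step grows
    lightStrength *= 1.0 - textureLod(occlusionSampler, vec2(uv.x, 1.0 - uv.y), lod).x;
    t += stepSize;
    stepSize *= 1.5; // hypothetical growth factor; tune for softness vs. cost
}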
You can go further and build an emitters map (into the same 4-channel map as the occlusion). Then your entire shading pass will be independent of the number of lights. This is essentially voxel cone tracing GI applied to 2D.

Why does fwidth behave differently?

I'm working on a WebGL project that draws isolines on a 3D surface, developed on macOS with an AMD GPU. My idea is to colour the pixels based on elevation in the fragment shader. With some optimizations I can achieve a relatively consistent line width, and I am happy about that. However, when I tested it on Windows it behaves differently.
Then I figured out it's because of fwidth(). I use fwidth() to prevent the fragment shader from colouring the whole horizontal plane when it happens to lie exactly at an isolevel. Please see the screenshot:
I solved this issue by adding the follow glsl line:
if (fwidth(vPositionZ) < 0.001) { /* then do not colour isoline on these pixels */ }
It works very well on macOS since I got this:
However, on Windows with an NVIDIA GPU all isolines are gone, because fwidth(vPositionZ) always evaluates to 0.0, which doesn't make sense to me.
What am I doing wrong? Is there any better way to solve the issue presented in the first screenshot? Thank you all!
EDIT:
Here I attach my fragment shader. It's simplified, but I think that's all that's relevant. I know looping is slow, but for now I'm not worried about it.
uniform float zmin;       // min elevation
uniform vec3 lineColor;
varying float vPositionZ; // elevation value for each vertex

float interval;

// inside main(); finalColor, lineWidth and COUNT come from the elided parts
vec3 originColor = finalColor.rgb; // original surface color
for ( int i = 0; i < COUNT; i ++ ) {
    float elevation = zmin + float( i + 1 ) * interval;
    // note: a uniform cannot be assigned in GLSL, so mix into a local instead
    vec3 isoColor = mix( originColor, lineColor, step( 0.001, fwidth( vPositionZ ) ) );
    if ( vPositionZ <= elevation + lineWidth && vPositionZ >= elevation - lineWidth ) {
        finalColor.rgb = isoColor;
    }
    // same thing but without the condition:
    // finalColor.rgb = mix( mix( originColor, isoColor, step( elevation - lineWidth, vPositionZ ) ),
    //                       originColor,
    //                       step( elevation + lineWidth, vPositionZ ) );
}
gl_FragColor = finalColor;
Environment: WebGL 2.0, GLSL ES 300, Chrome browser.
Putting fwidth(vPositionZ) before the loop makes it work. Otherwise, fwidth() evaluates everything to 0 when it's called inside the loop.
I suspect this is a bug in the NVIDIA driver; since the loop bounds here are constant, the control flow should still count as uniform, so the derivative ought to be well defined inside it. Hoisting the call out of the loop is a safe workaround either way.
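A minimal sketch of that workaround applied to the shader above (the mix can be hoisted too, since it doesn't depend on the loop counter):

// evaluate the derivative once, in clearly uniform control flow
float zWidth = fwidth( vPositionZ );
vec3 isoColor = mix( originColor, lineColor, step( 0.001, zWidth ) );
for ( int i = 0; i < COUNT; i ++ ) {
    float elevation = zmin + float( i + 1 ) * interval;
    if ( vPositionZ <= elevation + lineWidth && vPositionZ >= elevation - lineWidth ) {
        finalColor.rgb = isoColor;
    }
}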

Custom shader - Unity - Diffuse+Lightmap+normal+cubemap

I've been working on a custom shader for a project using Unity 4.6, because Unity's shaders offer a great variety of options but not the one I'm looking for.
I've looked on Stack Overflow for my shader's issue, but every question is about tricky and highly technical issues with shaders. I think mine is quite simple (for an experienced developer) but hasn't been posted yet.
Here is the problem :
I want to merge 2 shaders to get a "Diffuse+normal+cubemap+lighmap" shader.
So, on one side I have a "Diffuse + NormalMap + LightMap" shader which looks like this (it's the Legacy/Lightmapped BumpedSpec with a little tweaking to get the specular shininess):
Shader "Legacy Shaders/Lightmapped/Custom/BumpedSpec" {
Properties {
_Color ("Main Color", Color) = (1,1,1,1)
_SpecColor ("Specular Color", Color) = (0.5, 0.5, 0.5, 1)
_Shininess ("Shininess", Range (0.03, 1)) = 0.078125
_MainTex ("Base (RGB)", 2D) = "white" {}
_BumpMap ("Normalmap", 2D) = "bump" {}
_LightMap ("Lightmap (RGB)", 2D) = "black" {}
}
SubShader {
LOD 200
Tags { "RenderType" = "Opaque" }
CGPROGRAM
#pragma surface surf BlinnPhong
struct Input {
float2 uv_MainTex;
float2 uv_BumpMap;
float2 uv2_LightMap;
};
sampler2D _MainTex;
sampler2D _LightMap;
sampler2D _BumpMap;
float4 _Color;
float _Shininess;
void surf (Input IN, inout SurfaceOutput o)
{
half4 tex = tex2D (_MainTex, IN.uv_MainTex);
o.Albedo = tex.rgb * _Color;
half4 lm = tex2D (_LightMap, IN.uv2_LightMap);
o.Emission = lm.rgb*o.Albedo.rgb;
o.Gloss = tex.a;
o.Alpha = lm.a * _Color.a;
o.Specular = _Shininess;
o.Normal = UnpackNormal(tex2D(_BumpMap, IN.uv_BumpMap));
}
ENDCG
}
FallBack "Legacy Shaders/Lightmapped/VertexLit"
}
And on the other side, I've got a "Diffuse + Cubemap + Lightmap" shader which looks like this:
Shader "Custom/CubeLightmap" {
Properties {
_Color ("Main Color", Color) = (1,1,1,1)
_ReflectColor ("Reflection Color", Color) = (1,1,1,0.5)
_MainTex ("Base (RGB) RefStrength (A)", 2D) = "white" {}
_Cube ("Reflection Cubemap", Cube) = "_Skybox" { TexGen CubeReflect }
_LightMap ("Lightmap (RGB)", 2D) = "lightmap" { LightmapMode }
}
SubShader {
LOD 200
Tags { "RenderType"="Opaque" }
CGPROGRAM
#pragma surface surf Lambert
sampler2D _MainTex;
samplerCUBE _Cube;
sampler2D _LightMap;
fixed4 _Color;
fixed4 _ReflectColor;
struct Input {
float2 uv_MainTex;
float3 worldRefl;
float2 uv2_LightMap;
};
void surf (Input IN, inout SurfaceOutput o) {
fixed4 tex = tex2D(_MainTex, IN.uv_MainTex);
fixed4 c = tex * _Color;
o.Albedo = c.rgb;
half4 lm = tex2D(_LightMap,IN.uv2_LightMap);
fixed4 reflcol = texCUBE (_Cube, IN.worldRefl);
reflcol *= tex.a;
o.Emission = lm.rgb * reflcol.rgb * _ReflectColor.rgb;
o.Alpha = reflcol.a * _ReflectColor.a * lm.a;
}
ENDCG
}
FallBack "Reflective/VertexLit"
}
So I want to merge both of them (i.e. include the cubemap in the first one, or the normal map in the second one) and I can't figure it out for the moment.
So I'm in need of some advice or help to achieve it.
Thanks in advance,
Regards
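For what it's worth, here is an untested sketch of a direct merge of the two surf functions above. The assumptions: BlinnPhong lighting is kept from the first shader, and because worldRefl is used together with a written o.Normal, Unity requires INTERNAL_DATA in the Input struct and WorldReflectionVector to reflect against the bumped normal:

Shader "Custom/LightmappedBumpedSpecCube" {
    Properties {
        _Color ("Main Color", Color) = (1,1,1,1)
        _SpecColor ("Specular Color", Color) = (0.5, 0.5, 0.5, 1)
        _Shininess ("Shininess", Range (0.03, 1)) = 0.078125
        _ReflectColor ("Reflection Color", Color) = (1,1,1,0.5)
        _MainTex ("Base (RGB) RefStrength (A)", 2D) = "white" {}
        _BumpMap ("Normalmap", 2D) = "bump" {}
        _Cube ("Reflection Cubemap", Cube) = "_Skybox" {}
        _LightMap ("Lightmap (RGB)", 2D) = "black" {}
    }
    SubShader {
        LOD 200
        Tags { "RenderType" = "Opaque" }
        CGPROGRAM
        #pragma surface surf BlinnPhong
        sampler2D _MainTex;
        sampler2D _BumpMap;
        sampler2D _LightMap;
        samplerCUBE _Cube;
        fixed4 _Color;
        fixed4 _ReflectColor;
        float _Shininess;
        struct Input {
            float2 uv_MainTex;
            float2 uv_BumpMap;
            float2 uv2_LightMap;
            float3 worldRefl;
            INTERNAL_DATA // needed to recompute worldRefl from the bumped normal
        };
        void surf (Input IN, inout SurfaceOutput o) {
            half4 tex = tex2D (_MainTex, IN.uv_MainTex);
            o.Albedo = tex.rgb * _Color.rgb;
            o.Gloss = tex.a;
            o.Specular = _Shininess;
            o.Normal = UnpackNormal (tex2D (_BumpMap, IN.uv_BumpMap));
            half4 lm = tex2D (_LightMap, IN.uv2_LightMap);
            fixed4 reflcol = texCUBE (_Cube, WorldReflectionVector (IN, o.Normal));
            reflcol *= tex.a;
            // lightmap modulates the albedo; the reflection is added on top
            o.Emission = lm.rgb * o.Albedo.rgb + lm.rgb * reflcol.rgb * _ReflectColor.rgb;
            o.Alpha = lm.a * _Color.a;
        }
        ENDCG
    }
    FallBack "Legacy Shaders/Lightmapped/VertexLit"
}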
Sounds to me like you are looking to create a radiosity normal mapping shader. This will require you to at least know the basics of C++ and HLSL.
The biggest problem you will encounter will be how to compute radiosity normal maps, as this requires a specially crafted lightmapper. The only software I know of that does this is Beast by Autodesk.
After that you will need some simple shaders. There are well-documented explanations and relevant code at:
Half Life 2 shading
Valve Software

Schlick geometric attenuation function in shader producing incorrect results

I have been searching online for a while now for why the geometric attenuation term of my physically based shader (which I posted a question about not too long ago) produces incorrect results, and I cannot seem to find an answer. The function I'm trying to implement can be found here: http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf
This is my current iteration of the function.
vec3 Gsub(vec3 v) // Sub function of G
{
    float k = ((roughness + 1) * (roughness + 1)) / 8;
    float fdotv = dot(fNormal, v);
    return vec3(fdotv / (fdotv * (1.0 - k) + k));
}

vec3 G(vec3 l, vec3 v, vec3 h) // Geometric attenuation term - Schlick modified (k = a/2)
{
    return Gsub(l) * Gsub(v);
}
This is the current result of the above in my application:
You can clearly see the strange artifacts on the left side, which should not be present.
One of the things I thought was an issue was my normals. I believe this is the issue, because whenever I put the same function into the Disney BRDF editor (http://www.disneyanimation.com/technology/brdf.html) I get correct results. I believe it is the normals because whenever I view the normals in Disney's application, I get this.
These normals differ from my normals, which -should- be correct:
I use the same model in both applications, and the normals are stored inside the model file. Can anyone give any insight into this?
Additionally I'd like to mention that these are the operations done on my normals:
Vertex Shader
mat3 normalMatrix = mat3(transpose(inverse(ModelView)));
inputNormal = normalize(normalMatrix * vNormal);
Fragment Shader
fNormal = normalize(inputNormal);
P.S. Please excuse my rushed code, I've been trying to get this to work for a while.
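One thing worth ruling out, offered as a guess rather than a confirmed fix: the Karis notes assume non-negative cosines, and interpolated normals can yield a negative dot(fNormal, v) at grazing angles, which tends to show up as exactly this kind of edge artifact. A clamped variant of the function above:

vec3 Gsub(vec3 v) // Schlick-GGX G1 term with clamped cosine
{
    float k = ((roughness + 1.0) * (roughness + 1.0)) / 8.0;
    float fdotv = max(dot(fNormal, v), 0.0); // guard against backfacing interpolated normals
    return vec3(fdotv / (fdotv * (1.0 - k) + k));
}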

Troubles with Marching Cubes and Texture coordinates

I'm implementing the MC algorithm in OpenGL.
Everything went fine, until I reached the point with texture coordinates.
I can't figure out how to implement them!
My progress:
Edit:
What I want to achieve is to put some textures on my generated MC triangles.
As far as I understand, I need to tell OpenGL the uv coordinates, but I have no idea how to calculate them.
A typical texture coordinate generation algorithm for marching cubes meshes is environment mapping.
In short: you calculate the vertex normal at each vertex by averaging the face normals of all adjacent faces, then discard the z-coordinate of the normal and use (x/2+0.5, y/2+0.5) as the (u,v) texture coordinates.
Set up a texture with a nice white spot in the middle and some structure filling the rest of the texture, and you get the Terminator 2 silver-robot kind of look.
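A minimal GLSL sketch of that recipe, assuming the averaged vertex normal arrives as inNormal and that names like inPosition and mvp are illustrative:

#version 330 core
in vec3 inPosition;
in vec3 inNormal;   // averaged vertex normal (view space)
uniform mat4 mvp;
out vec2 uv;

void main()
{
    vec3 n = normalize(inNormal);
    uv = n.xy * 0.5 + 0.5; // discard z, remap [-1,1] to [0,1]
    gl_Position = mvp * vec4(inPosition, 1.0);
}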
I need to tell OpenGL uv coordinates, but no idea how to calculate them.
You're facing some big problem there: The topology of what comes out of MC can be anything. The topology of a texture in OpenGL is either a (hyper)torus (GL_TEXTURE_1D, GL_TEXTURE_2D, GL_TEXTURE_3D), or a sphere (GL_TEXTURE_CUBE_MAP).
So inevitably you have to cut your surface into so-called maps. This is a nontrivial task, but a good strategy is cutting along regions with high curvature. See the paper
“Least Squares Conformal Maps for Automatic Texture Atlas Generation”
Bruno Lévy, Sylvain Petitjean, Nicolas Ray and Jérome Maillot
http://alice.loria.fr/index.php/publications.html?Paper=lscm#2002
for the dirty details.
The first answer given is partly correct, except you also need to check which plane is best to project from, instead of always projecting from the z plane, as in this C# Unity example:
Vector2[] getUVs(Vector3 a, Vector3 b, Vector3 c)
{
    Vector3 s1 = b - a;
    Vector3 s2 = c - a;
    // note: Vector3.Normalize() returns void in Unity; use the normalized property
    Vector3 norm = Vector3.Cross(s1, s2).normalized; // the face normal
    norm.x = Mathf.Abs(norm.x);
    norm.y = Mathf.Abs(norm.y);
    norm.z = Mathf.Abs(norm.z);
    Vector2[] uvs = new Vector2[3];
    if (norm.x >= norm.z && norm.x >= norm.y) // x plane
    {
        uvs[0] = new Vector2(a.z, a.y);
        uvs[1] = new Vector2(b.z, b.y);
        uvs[2] = new Vector2(c.z, c.y);
    }
    else if (norm.z >= norm.x && norm.z >= norm.y) // z plane
    {
        uvs[0] = new Vector2(a.x, a.y);
        uvs[1] = new Vector2(b.x, b.y);
        uvs[2] = new Vector2(c.x, c.y);
    }
    else if (norm.y >= norm.x && norm.y >= norm.z) // y plane
    {
        uvs[0] = new Vector2(a.x, a.z);
        uvs[1] = new Vector2(b.x, b.z);
        uvs[2] = new Vector2(c.x, c.z);
    }
    return uvs;
}
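A hypothetical usage sketch when building the MC mesh, assuming flat-shaded triangles with unshared vertices; vertices and mesh stand in for your own mesh-building variables:

// three unshared vertices per triangle, one UV per vertex
Vector2[] uvs = new Vector2[vertices.Count];
for (int i = 0; i < vertices.Count; i += 3)
{
    Vector2[] triUVs = getUVs(vertices[i], vertices[i + 1], vertices[i + 2]);
    uvs[i]     = triUVs[0];
    uvs[i + 1] = triUVs[1];
    uvs[i + 2] = triUVs[2];
}
mesh.uv = uvs; // assign alongside mesh.vertices and mesh.triangles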
Though it is better to do this on the GPU in a shader, especially if you are planning on having very dynamic voxels, such as an infinitely generated world that's constantly generating around the player, or a game with lots of digging and building involved. You wouldn't have to calculate the UVs each time, and it's also less data you have to send to the GPU, so it is definitely faster than this. I modified a basic triplanar shader I found on the internet a while ago; unfortunately I wasn't able to find it again, but my modified version is basically a triplanar mapping shader with no blending that only samples once per pass, so it should be pretty much as fast as a basic unlit shader, and it looks exactly the same as the image above. I did this because the normal triplanar shader blending doesn't look good with textures like brick walls at 45-degree angles.
Shader "Triplanar (no blending)"
{
Properties
{
_DiffuseMap("Diffuse Map ", 2D) = "white" {}
_TextureScale("Texture Scale",float) = 1
}
SubShader
{
Tags { "RenderType" = "Opaque" }
LOD 200
CGPROGRAM
#pragma target 3.0
#pragma surface surf Lambert
sampler2D _DiffuseMap;
float _TextureScale;
struct Input
{
float3 worldPos;
float3 worldNormal;
};
void surf(Input IN, inout SurfaceOutput o)
{
IN.worldNormal.x = abs(IN.worldNormal.x);
IN.worldNormal.y = abs(IN.worldNormal.y);
IN.worldNormal.z = abs(IN.worldNormal.z);
if (IN.worldNormal.x >= IN.worldNormal.z && IN.worldNormal.x >= IN.worldNormal.y) // x plane
{
o.Albedo = tex2D(_DiffuseMap, IN.worldPos.zy / _TextureScale);
}
else if (IN.worldNormal.y >= IN.worldNormal.x && IN.worldNormal.y >= IN.worldNormal.z) // y plane
{
o.Albedo = tex2D(_DiffuseMap, IN.worldPos.xz / _TextureScale);
}
else if (IN.worldNormal.z >= IN.worldNormal.x && IN.worldNormal.z >= IN.worldNormal.y) // z plane
{
o.Albedo = tex2D(_DiffuseMap, IN.worldPos.xy / _TextureScale);
}
}
ENDCG
}
}
It ends up looking a lot like a cubemap, though I don't think this is technically a cubemap as we only use three faces, not six.
EDIT: I later realized that you may want it in the fragment shader like that, but for my purposes it works exactly the same and would theoretically be faster in the vertex shader:
Shader "NewUnlitShader"
{
Properties
{
_MainTex ("Texture", 2D) = "white" {}
}
SubShader
{
Tags { "RenderType"="Opaque" }
LOD 100
Pass
{
CGPROGRAM
#pragma vertex vert
#pragma fragment frag
// make fog work
#pragma multi_compile_fog
#include "UnityCG.cginc"
struct appdata
{
float4 vertex : POSITION;
float3 normal : NORMAL;
};
struct v2f
{
float2 uv : TEXCOORD0;
UNITY_FOG_COORDS(1)
float4 vertex : SV_POSITION;
};
sampler2D _MainTex;
float4 _MainTex_ST;
v2f vert (appdata v)
{
v2f o;
o.vertex = UnityObjectToClipPos(v.vertex);
v.normal.x = abs(v.normal.x);
v.normal.y = abs(v.normal.y);
v.normal.z = abs(v.normal.z);
if (v.normal.x >= v.normal.z && v.normal.x >= v.normal.y) // x plane
{
o.uv = v.vertex.zy;
}
else if (v.normal.y >= v.normal.x && v.normal.y >= v.normal.z) // y plane
{
o.uv = v.vertex.xz;
}
else if (v.normal.z >= v.normal.x && v.normal.z >= v.normal.y) // z plane
{
o.uv = v.vertex.xy;
}
UNITY_TRANSFER_FOG(o, o.vertex);
return o;
}
fixed4 frag (v2f i) : SV_Target
{
// sample the texture
fixed4 col = tex2D(_MainTex, i.uv);
// apply fog
UNITY_APPLY_FOG(i.fogCoord, col);
return col;
}
ENDCG
}
}
}