So I want to have multiple light sources in my scene. The basic idea is to simply have an array of a (uniform) struct that has all the properties of light you care about such as positions, color, direction, cutoff and w/e you want. My problem is how to represent which lights are on/off? I will list out all the ways I can think of. Pl
Have a uniform int per light structure to indicate if it's on/off.
Have the number of light struct match multiples of 2, 3, or 4 such that I can use that many bool vectors to indicate their status. For example, 16 lights = 4x4 bvec4.
Instead of using many flags and branch, always go through every single light but with the off ones set to (0,0,0,0) for color
I'm leaning towards the last options as it won't have branching ... but I already read that modern graphics card are more okay with branching now.
None of your ideas is really good because all of them require the shader to evaluate which lightsources are active and which aren't. You should also differentiate between scene information (which lights are present in the scene) and data necessary for rendering (which lights are on and illuminate the scene). Scene information shouldn't be stored in a shader since it is unnecessary and will only slow down the shader.
A better way for handling multiple light sources in a scene and render with only the active ones could be as follows:
In every frame:
Evaluate on CPU side which lights are on. Pass only the lights which are on to the shader uniforms together with the total count of the active lights. In pseudocode:
Shader:
uniform lightsources active_lightsources[MAX_LIGHTS];
uniform int light_count;
CPU:
i = 0
foreach (light in lightsources)
{
if (light.state == ON)
{
set_uniform("active_lightsources[i]", light);
i++
}
}
set_uniform("light_count", i);
When illuminating a fragment do the following in the shader
Shader:
for (int i = 0; i < light_count; i++)
{
//Calculate illumination from active_lightsources[i]
}
The major advantage of this approach is that you store less lights inside the shader and that the loop inside the shader only loops over lightsources that are relevant for the current frame. The evaluation which lights are relevant is done once per frame on the CPU instead of once per vertex/fragment in the shader.
Related
I am currently programming a graphics renderer in OpenGL by following several online tutorials. I've ended up with an engine which has a rendering pipeline which basically consists of rendering an object using a simple Phong Shader. My Phong Shader has a basic vertex shader which modifies the vertex based on a transformation and a fragment shader which looks something like this:
// PhongFragment.glsl
uniform DirectionalLight dirLight;
...
vec3 calculateDirLight() { /* Calculates Directional Light using the uniform */ }
...
void main() {
gl_FragColor = calculateDirLight();
The actual drawing of my object looks something like this:
// Render a Mesh
bindPhongShader();
setPhongShaderUniform(transform);
setPhongShaderUniform(directionalLight1);
mesh->draw(); // glDrawElements using the Phong Shader
This technique works well, but has the obvious downside that I can only have one directional light, unless I use uniform arrays. I could do that but instead I wanted to see what other solutions were available (mostly since I don't want to make an array of some large amount of lights in the shader and have most of them be empty), and I stumbled on this one, which seems really inefficient but I am not sure. It basically involves redrawing the mesh every single time with a new light, like so:
// New Render
bindBasicShader(); // just transforms vertices, and sets the frag color to white.
setBasicShaderUniform(transform); // Set transformation uniform
mesh->draw();
// Enable Blending so that all light contributions are added up...
bindDirectionalShader();
setDirectionalShaderUniform(transform); // Set transformation uniform
setDirectionalShaderUniform(directionalLight1);
mesh->draw(); // Draw the mesh using the directionalLight1
setDirectionalShaderUniform(directionalLight2);
mesh->draw(); // Draw the mesh using the directionalLight2
setDirectionalShaderUniform(directionalLight3);
mesh->draw(); // Draw the mesh using the directionalLight3
This seems terribly inefficient to me, though. Aren't I redrawing all the mesh geometry over and over again? I have implemented this and it does give me the result I was looking for, multiple directional lights, but the frame rate has dropped considerably. Is this a stupid way of rendering multiple lights, or is it on par with using shader uniform arrays?
For forward rendering engines where lighting is handled in the same shader as the main geometry processing, the only really efficient way of doing this is to generate lots of shaders which can cope with the various combinations of light source, light count, and material under illumination.
In your case you would have one shader for 1 light, one for 2 lights, one for 3 lights, etc. It's a combinatorial nightmare in terms of number of shaders, but you really don't want to send all of your meshes multiple times (especially if you are writing games for mobile devices - geometry is very bandwidth heavy and sucks power out of the battery).
The other common approach is a deferred lighting scheme. These schemes store albedo, normals, material properties, etc into a "Geometry Buffer" (e.g. a set of multiple-render-target FBO attachments), and then apply lighting after the fact as a set of post-processing operations. The complex geometry is sent once, with the resulting data stored in the MRT+depth render targets as a set of texture data. The lighting is then applied as a set of basic geometry (typically spheres or 2D quads), using the depth texture as a means to clip and cull light sources, and the other MRT attachments to compute the lighting intensity and color. It's a bit of a long topic for a SO post - but there are lots of good presentations around on the web from GDC and Sigraph.
Basic idea outlined here:
https://en.wikipedia.org/wiki/Deferred_shading
The first issue, is how to get from a single light source, to using multiple light sources, without using more than one fragment shader.
My instinct is that each run through of the shader calculations needs light source coordinates, and maybe some color information, and we can just run through the calculations in a loop for n light sources.
How do I pass the multiple lights into the shader program? Do I use an array of uniforms? My guess would be do pass in an array of uniforms with the coordinates of each light source, and then specify how many light sources there are, and then set a maximum value.
Can I call getter or setter methods for a shader program? Instead of just manipulating the globals?
I'm using this tutorial and the libGDX implementation to learn how to do this:
https://gist.github.com/mattdesl/4653464
There are many methods to have multiple light sources. I'll point 3 most commonly used.
1) Specify each light source in array of uniform structures. Light calculations are made in shader loop over all active lights and accumulating result into single vertex-color or fragment-color depending if shading is done per vertex or per fragment. (this is how fixed-function OpenGL was calculating multiple lights)
2) Multipass rendering with single light source enabled per pass, in simplest form passes could be composited by additive blending (srcFactor=ONE dstFactor=ONE). Don't forget to change depth func after first pass from GL_LESS to GL_EQUAL or simply use GL_LEQUAL for all passes.
3) Many environment lighting algorithms mimic multiple light sources, assuming they are at infinte distance from your scene. Simplest renderer should store light intensities into environment texture (prefferably a cubemap), shader job then would be to sample this texture several times in direction around the surface normal with some random angular offsets.
I am starting to learn OpenGL (3.3+), and now I am trying to do an algorithm that draws 10000 points randomly in the screen.
The problem is that I don't know exactly where to do the algorithm. Since they are random, I can't declare them on a VBO (or can I?), so I was thinking in passing a uniform value to the vertex shader with the varying position (I would do a loop changing the uniform value). Then I would do the operation 10000 times. I would also pass a random color value to the shader.
Here is kind of my though:
#version 330 core
uniform vec3 random_position;
uniform vec3 random_color;
out vec3 Color;
void main() {
gl_Position = random_position;
Color = random_color;
}
In this way I would do the calculations outside the shaders, and just pass them through the uniforms, but I think a better way would be doing this calculations inside the vertex shader. Would that be right?
The vertex shader will be called for every vertex you pass to the vertex shader stage. The uniforms are the same for each of these calls. Hence you shouldn't pass the vertices - be they random or not - as uniforms. If you would have global transformations (i.e. a camera rotation, a model matrix, etc.), those would go into the uniforms.
Your vertices should be passed as a vertex buffer object. Just generate them randomly in your host application and draw them. The will be automatically the in variables of your shader.
You can change the array in every iteration, however it might be a good idea to keep the size constant. For this it's sometimes useful to pass a 3D-vector with 4 dimensions, one being 1 if the vertex is used and 0 otherwise. This way you can simply check if a vertex should be drawn or not.
Then just clear the GL_COLOR_BUFFER_BIT and draw the arrays before updating the screen.
In your shader just set gl_Position with your in variables (i.e. the vertices) and pass the color on to the fragment shader - it will not be applied in the vertex shader yet.
In the fragment shader the last set variable will be the color. So just use the variable you passed from the vertex shader and e.g. gl_FragColor.
By the way, if you draw something as GL_POINTS it will result in little squares. There are lots of tricks to make them actually round, the easiest to use is probably to use this simple if in the fragment shader. However you should configure them as Point Sprites (glEnable(GL_POINT_SPRITE)) then.
if(dot(gl_PointCoord - vec2(0.5,0.5), gl_PointCoord - vec2(0.5,0.5)) > 0.25)
discard;
I suggest you to read up a little on what the fragment and vertex shader do, what vertices and fragments are and what their respective in/out/uniform variables represent.
Since programs with full vertex buffer objects, shader programs etc. get quite huge, you can also start out with glBegin() and glEnd() to draw vertices directly. However this should only be a very early starting point to understand what you are drawing where and how the different shaders affect it.
The lighthouse3d tutorials (http://www.lighthouse3d.com/tutorials/) usually are a good start, though they might be a bit outdated. Also a good reference is the glsl wiki (http://www.opengl.org/wiki/Vertex_Shader) which is up to date in most cases - but it might be a bit technical.
Whether or not you are working with C++, Java, or other languages - the concepts for OpenGL are usually the same, so almost all tutorials will do well.
Is it possible for only front facing triangles to be sent to the geometry shader? I believe that culling only happens to emitted triangles after the geometry shader by default.
Yes it is, back in the ancient days of Quake, face culling was done on the CPU using a simple dot product per-triangle. Any triangle that failed the test was not included in the list of indices drawn.
This is not a viable optimization on most hardware these days, but you still see it employed from time to time in specialized applications. One such application I have seen a lot of is using the PS3's Cell SPEs to cull out triangles during batching to save vertex transform workload on the PS3's RSX GPU - keep in mind, the PS3 still uses a basic shader architecture where there are a fixed number of specialized vertex shader units and fragment shader units. Balancing shader workload is important on that GPU.
I may be missing the point of your question though; what benefit do you expect/want to get out of culling the primitives early?
Update:
What I was trying to say is that on modern hardware and software, vertex transform / primitive assembly is usually not a bottleneck. Fragment processing is much more expensive these days, so having primitives culled during rasterization is usually the extent to which you have to worry about things for performance. The PS3's RSX is a special case, it has very poor vertex performance and a CPU architecture that is hard to keep busy, so it makes sense to offload primitive culling to the CPU.
You can still cull triangles before the vertex shader/tessellation/geometry shader on the CPU, but storing normals per-triangle somewhere and transferring a new set of indices to draw each frame hardly makes this a wise use of resources. You may spend more time and memory setting up the reduced list of triangles than you would of if you processed them on the GPU and let GL throw the backward facing primitives out during rasterization.
There is at least one use-case that comes to mind where this actually could still be a useful thing to do. I am referring to tessellated patches. If you can determine on the CPU before tessellation occurs that the entire patch faces the wrong way, you can skip having to tessellate them on the GPU. Ordinarily rendering will not be vertex-bound these days, but tessellation is one case where it may be.
It is actually possible. Like the other answer mentioned you can do it on the software side. But there are stages in-between the vertex shader and geometry shader. Namely, the hull (programmable), primitive generator (fixed), and domain (programmable). In the hull shader you can specify tessellation levels for your patch.
If you set any of these levels to 0.0, then the patch will be discarded and it will not enter the geometry shader!
Hope this helps :)
Although it doesn't technically satisfy the question because the triangle reaches the geometry shader, you can cull triangles within the geometry shader itself. This might be the solution people reading this question are looking for, it was for me.
I used the shader below to implement wireframe drawing of quads with culling, where each quad is drawn using two triangles. The idea of using
glPolygonMode(GL_FRONT_AND_BACK, GL_LINES)
with culling includes the diagonal of each quad, which isn't desired, so I use a geometry shader to convert each triangle to just the two lines. However now culling doesn't work because only lines are emitted by the geometry shader. Instead culling is included at the beginning of the geometry shader, just set the uniform culling to -1, 0 or 1 for your desired behaviour (Note, this culling test assumes that the w coord is positive for each vertex).
#version 430 core
layout (triangles) in;
layout (line_strip, max_vertices=3) out;
uniform int culling;
void main()
{
// Perform culling here with an early return
mat3 M = mat3(gl_in[0].gl_Position.xyz, gl_in[1].gl_Position.xyz, gl_in[2].gl_Position.xyz);
if(culling * determinant(M) < 0)
{
EndPrimitive(); // Necessary?
return;
}
gl_Position = gl_in[0].gl_Position;
EmitVertex();
gl_Position = gl_in[1].gl_Position;
EmitVertex();
gl_Position = gl_in[2].gl_Position;
EmitVertex();
EndPrimitive();
}
I draw lots of quadratic Bézier curves in my OpenGL program. Right now, the curves are one-pixel thin and software-generated, because I'm at a rather early stage, and it is enough to see what works.
Simply enough, given 3 control points (P0 to P2), I evaluate the following equation with t varying from 0 to 1 (with steps of 1/8) in software and use GL_LINE_STRIP to link them together:
B(t) = (1 - t)2P0 + 2(1 - t)tP1 + t2P2
Where B, obviously enough, results in a 2-dimensional vector.
This approach worked 'well enough', since even my largest curves don't need much more than 8 steps to look curved. Still, one pixel thin curves are ugly.
I wanted to write a GLSL shader that would accept control points and a uniform thickness variable to, well, make the curves thicker. At first I thought about making a pixel shader only, that would color only pixels within a thickness / 2 distance of the curve, but doing so requires solving a third degree polynomial, and choosing between three solutions inside a shader doesn't look like the best idea ever.
I then tried to look up if other people already did it. I stumbled upon a white paper by Loop and Blinn from Microsoft Research where the guys show an easy way of filling the area under a curve. While it works well to that extent, I'm having trouble adapting the idea to drawing between two bouding curves.
Finding bounding curves that match a single curve is rather easy with a geometry shader. The problems come with the fragment shader that should fill the whole thing. Their approach uses the interpolated texture coordinates to determine if a fragment falls over or under the curve; but I couldn't figure a way to do it with two curves (I'm pretty new to shaders and not a maths expert, so the fact I didn't figure out how to do it certainly doesn't mean it's impossible).
My next idea was to separate the filled curve into triangles and only use the Bézier fragment shader on the outer parts. But for that I need to split the inner and outer curves at variable spots, and that means again that I have to solve the equation, which isn't really an option.
Are there viable algorithms for stroking quadratic Bézier curves with a shader?
This partly continues my previous answer, but is actually quite different since I got a couple of central things wrong in that answer.
To allow the fragment shader to only shade between two curves, two sets of "texture" coordinates are supplied as varying variables, to which the technique of Loop-Blinn is applied.
varying vec2 texCoord1,texCoord2;
varying float insideOutside;
varying vec4 col;
void main()
{
float f1 = texCoord1[0] * texCoord1[0] - texCoord1[1];
float f2 = texCoord2[0] * texCoord2[0] - texCoord2[1];
float alpha = (sign(insideOutside*f1) + 1) * (sign(-insideOutside*f2) + 1) * 0.25;
gl_FragColor = vec4(col.rgb, col.a * alpha);
}
So far, easy. The hard part is setting up the texture coordinates in the geometry shader. Loop-Blinn specifies them for the three vertices of the control triangle, and they are interpolated appropriately across the triangle. But, here we need to have the same interpolated values available while actually rendering a different triangle.
The solution to this is to find the linear function mapping from (x,y) coordinates to the interpolated/extrapolated values. Then, these values can be set for each vertex while rendering a triangle. Here's the key part of my code for this part.
vec2[3] tex = vec2[3]( vec2(0,0), vec2(0.5,0), vec2(1,1) );
mat3 uvmat;
uvmat[0] = vec3(pos2[0].x, pos2[1].x, pos2[2].x);
uvmat[1] = vec3(pos2[0].y, pos2[1].y, pos2[2].y);
uvmat[2] = vec3(1, 1, 1);
mat3 uvInv = inverse(transpose(uvmat));
vec3 uCoeffs = vec3(tex[0][0],tex[1][0],tex[2][0]) * uvInv;
vec3 vCoeffs = vec3(tex[0][1],tex[1][1],tex[2][1]) * uvInv;
float[3] uOther, vOther;
for(i=0; i<3; i++) {
uOther[i] = dot(uCoeffs,vec3(pos1[i].xy,1));
vOther[i] = dot(vCoeffs,vec3(pos1[i].xy,1));
}
insideOutside = 1;
for(i=0; i< gl_VerticesIn; i++){
gl_Position = gl_ModelViewProjectionMatrix * pos1[i];
texCoord1 = tex[i];
texCoord2 = vec2(uOther[i], vOther[i]);
EmitVertex();
}
EndPrimitive();
Here pos1 and pos2 contain the coordinates of the two control triangles. This part renders the triangle defined by pos1, but with texCoord2 set to the translated values from the pos2 triangle. Then the pos2 triangle needs to be rendered, similarly. Then the gap between these two triangles at each end needs to filled, with both sets of coordinates translated appropriately.
The calculation of the matrix inverse requires either GLSL 1.50 or it needs to be coded manually. It would be better to solve the equation for the translation without calculating the inverse. Either way, I don't expect this part to be particularly fast in the geometry shader.
You should be able to use technique of Loop and Blinn in the paper you mentioned.
Basically you'll need to offset each control point in the normal direction, both ways, to get the control points for two curves (inner and outer). Then follow the technique in Section 3.1 of Loop and Blinn - this breaks up sections of the curve to avoid triangle overlaps, and then triangulates the main part of the interior (note that this part requires the CPU). Finally, these triangles are filled, and the small curved parts outside of them are rendered on the GPU using Loop and Blinn's technique (at the start and end of Section 3).
An alternative technique that may work for you is described here:
Thick Bezier Curves in OpenGL
EDIT:
Ah, you want to avoid even the CPU triangulation - I should have read more closely.
One issue you have is the interface between the geometry shader and the fragment shader - the geometry shader will need to generate primitives (most likely triangles) that are then individually rasterized and filled via the fragment program.
In your case with constant thickness I think quite a simple triangulation will work - using Loop and Bling for all the "curved bits". When the two control triangles don't intersect it's easy. When they do, the part outside the intersection is easy. So the only hard part is within the intersection (which should be a triangle).
Within the intersection you want to shade a pixel only if both control triangles lead to it being shaded via Loop and Bling. So the fragment shader needs to be able to do texture lookups for both triangles. One can be as standard, and you'll need to add a vec2 varying variable for the second set of texture coordinates, which you'll need to set appropriately for each vertex of the triangle. As well you'll need a uniform "sampler2D" variable for the texture which you can then sample via texture2D. Then you just shade fragments that satisfy the checks for both control triangles (within the intersection).
I think this works in every case, but it's possible I've missed something.
I don't know how to exactly solve this, but it's very interesting. I think you need every different processing unit in the GPU:
Vertex shader
Throw a normal line of points to your vertex shader. Let the vertex shader displace the points to the bezier.
Geometry shader
Let your geometry shader create an extra point per vertex.
foreach (point p in bezierCurve)
new point(p+(0,thickness,0)) // in tangent with p1-p2
Fragment shader
To stroke your bezier with a special stroke, you can use a texture with an alpha channel. You can check the alpha channel on its value. If it's zero, clip the pixel. This way, you can still make the system think it is a solid line, instead of a half-transparent one. You could apply some patterns in your alpha channel.
I hope this will help you on your way. You will have to figure out things yourself a lot, but I think that the Geometry shading will speed your bezier up.
Still for the stroking I keep with my choice of creating a GL_QUAD_STRIP and an alpha-channel texture.