OpenGL separate program stages - opengl

I am exploring the relatively new GL_ARB_separate_shader_objects feature. What I understand is that I have to create a pipeline object, which should contain the programs for the stages that are mapped to it via glUseProgramStages.
This makes me think of two possibilities for using multiple shaders:
1. Creating multiple pipelines with different vertex/fragment shader pairs (not using other shader types for now), mapped once to each pipeline.
2. Creating a single pipeline and, at runtime, switching the mapping to different shaders using glUseProgramStages.
I am mostly concerned with performance. Which option is better performance-wise?

Your question cannot really be answered, as it would vary with driver implementations and such. However, the facts and history of the functionality should be informative.
EXT_separate_shader_objects was the first incarnation of this functionality. The biggest difference between them was this: you could not use user-defined varyings with the EXT version. You had to use the old compatibility input/outputs like gl_TexCoord.
Issue #2 in the EXT_separate_shader_objects specification attempts to justify this incomprehensible oversight as follows:
It is undesirable from a performance standpoint to attempt to support "rendezvous by name" for arbitrary separate shaders because the separate shaders won't be naturally compiled to match their varying inputs and outputs of the same name without a special link step. Such a special link would introduce an extra validation overhead to binding separate shaders. The link itself would have to be deferred until glBegin time since separate shaders won't match when transitioning from one set of consistent shaders to another. This special link would still create errors or undefined behavior when the names of input and output varyings matched but their types did not match.
This suggests that the reason not to rely on name matching, besides incompetence, was performance related (if you can't tell, I don't think very highly of EXT_SSO). The performance cost of "rendezvous by name" comes from having to do it at every draw call, rather than being able to do it once.
ARB_separate_shader_objects encapsulates the collection of programs in an object. Therefore, the object can store all of the "rendezvous" data. The first draw call may be slower, but subsequent uses of the same PPO will be fast, as long as you don't attach new programs to it.
So I would take that as evidence that PPOs should have programs set on them and then left alone. In general, modifying an object's attachments should be avoided whenever possible. That's why you're encouraged not to go adding or removing textures/renderbuffers from FBOs.
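As a rough sketch of that "set once and leave alone" approach (assuming vsSource and fsSource are GLSL source strings you already have), the ARB version lets you do:

    // Build separable programs once, attach them to a pipeline once.
    GLuint vs = glCreateShaderProgramv(GL_VERTEX_SHADER, 1, &vsSource);
    GLuint fs = glCreateShaderProgramv(GL_FRAGMENT_SHADER, 1, &fsSource);

    GLuint pipeline;
    glGenProgramPipelines(1, &pipeline);
    glUseProgramStages(pipeline, GL_VERTEX_SHADER_BIT, vs);
    glUseProgramStages(pipeline, GL_FRAGMENT_SHADER_BIT, fs);

    // Per draw: just bind the pre-built pipeline; no re-attachment,
    // so the stored "rendezvous" data can be reused.
    glBindProgramPipeline(pipeline);

Option 2 would instead call glUseProgramStages on the same pipeline between draws, which is exactly the re-validation work the PPO is meant to amortize.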

Related

Handling of C code in C++ (Vulkan)

I am trying to write a rendering engine in C++ based on Vulkan. Vulkan is written in C, as a result it has some interesting conventions.
A recurring pattern I see in tutorials/code snippets from Vulkan apps is that most code is in 1 very big class (right now my Vulkan class is already about 2000 lines, too). But to make a proper rendering engine, I will need to compartmentalize my code to some degree.
One of the aforementioned interesting bits is that it has something called a Logical Device, which is an abstract reference to the graphics card.
It is used everywhere, to create and allocate things in the following way:
1. Create a struct with the creation info.
2. Create the variable that the call will output into.
3. Call the actual vkCreateSomething or vkAllocateSomething function, passing in the logical device, the creation info and a pointer to the output variable, then check whether it succeeded.
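For illustration, a minimal sketch of that pattern using vkCreateBuffer (error handling elided to whatever your engine prefers):

    #include <vulkan/vulkan.h>

    VkBuffer createVertexBuffer(VkDevice device, VkDeviceSize size) {
        // 1. Struct with the creation info
        VkBufferCreateInfo info{};
        info.sType       = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
        info.size        = size;
        info.usage       = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT;
        info.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

        // 2. Variable the call outputs into
        VkBuffer buffer = VK_NULL_HANDLE;

        // 3. Call vkCreate*, passing the logical device, the info and the output pointer
        if (vkCreateBuffer(device, &info, nullptr, &buffer) != VK_SUCCESS) {
            // handle the error
        }
        return buffer;
    }

Note how the VkDevice has to be threaded through to every such helper, which is exactly the dependency problem in question.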
On its own, there is nothing wrong with this style, I'd say. It's just that it's not really handy at all in OOP, because it relies on the logical device being available everywhere.
How would I deal with this problem? Service locators and singletons are considered to be horrible solutions by many (which I can understand), so that seems like something I'd rather avoid.
Are there design patterns that deal with this?
The logical device is an actual dependency.
It has state, and its state needs to be available to work with the hardware.
You can use it as an argument to your operations, a value stored in pretty much every class, a global, or a monadic-esque "final" argument where every operation just returns something still needing the device to run on. You can replace a (pointer/reference to) it with a function returning a (pointer/reference to) it.
Consider whether pure OOP is what you want to do; Vulkan and rendering are more about operations than about the things being operated on. I would want to mix some functional programming patterns in, which makes the monad-like choice more reasonable.
Compose operations on buffers/data. These return operations, which also take buffers and data. The composition operation specifies which arguments are new inputs, and which are consumed by the next step. Doing this you can (at compile time) set up a type-safe graph of work to do, all without running anything.
The resulting composed operation would then have a setup (where you bind the logical device and anything you can do "early" before you need to have the expensive buffers ready), and an execute phase (where you feed it the expensive buffers and it generates output).
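A very rough sketch of that idea in C++ (all names here are made up, and the real vkCmd* work is elided):

    #include <functional>

    struct Device { /* stand-in for VkDevice and friends */ };
    struct Buffer { /* stand-in for VkBuffer */ };

    // An Op is "work that still needs a device to run on".
    using Op = std::function<void(Device&)>;

    // Composition: run a, then b, against the same device.
    Op then(Op a, Op b) {
        return [a = std::move(a), b = std::move(b)](Device& dev) { a(dev); b(dev); };
    }

    Op upload(Buffer staging) { return [staging](Device&) { /* copy commands */ }; }
    Op draw()                 { return [](Device&)        { /* draw commands */ }; }

    int main() {
        Op frame = then(upload(Buffer{}), draw()); // composed without any device
        Device dev;                                // "setup": bind the device
        frame(dev);                                // "execute": feed it and run
    }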
Or, as another approach, find a compiler with C++2a coroutine support and write it asynchronously yet procedurally.
Vulkan is an OOP API. It is not class-based, because it is C99, not C++. That can easily be fixed by using the official Vulkan-Hpp. You can consume it as vulkan.hpp, which is part of the semi-official LunarG Vulkan SDK.
The usage would not be that different from vulkan.h, though: you would probably have a member pointer/reference to a Device instance, or a VkDevice handle member in each object that needs it. Some higher-level object would handle the lifetime of the logical device (e.g. your RenderingEngine class or such). The difference would be almost purely aesthetic: you would use device->command(...) instead of vkCommand(device, ...). vulkan.hpp does not seem to use proper RAII through constructors/destructors, which is a shame.
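To get a feel for the difference, here is a hedged sketch of the same buffer creation through vulkan.hpp (the helper name is my own invention):

    #include <vulkan/vulkan.hpp>

    vk::Buffer makeVertexBuffer(vk::Device device, vk::DeviceSize size) {
        vk::BufferCreateInfo info{};
        info.setSize(size)
            .setUsage(vk::BufferUsageFlagBits::eVertexBuffer)
            .setSharingMode(vk::SharingMode::eExclusive);
        // Equivalent to vkCreateBuffer(device, &info, nullptr, &buffer) plus the
        // success check; throws vk::SystemError on failure when exceptions are enabled.
        return device.createBuffer(info);
    }

You still pass the device around; the C++ bindings merely hang the free functions off the handle types.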
Alternatively, the user of your engine can manage the device, though unlike OpenGL there is not much use for this. The user can create their own VkInstance and VkDevice if they also wish to use Vulkan for something.
A recurring pattern I see in tutorials/code snippets from Vulkan apps is that most code is in 1 very big class.
That's not really specific to Vulkan. If you think about it, pretty much all C++ applications are one big class doing everything (the only difference being how much the programmer bothers to delegate from it to other class instances).

Is there an advantage to generating multiple indexes (names) at once in OpenGL?

Functions such as glGenBuffers and glGenVertexArrays have an n parameter that allows multiple names to be generated.
My question is: what is the advantage of generating multiple names at once versus generating them one at a time? Is there one?
Background: I'm working on constructing an object-oriented framework, but I don't know whether I should add functionality for name wizardry (at the cost of more complex interfaces) or just auto-generate individual names as needed per object.
In practice it makes no difference. If you look at the part of the API that deals with shaders, you'll see that those functions return only one name at a time. In fact, a lot of experienced OpenGL coders prefer to wrap the glGen… functions in myglCreate… functions that return exactly one name of the desired object class.
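A minimal sketch of such a wrapper (the name simply follows the myglCreate… convention above):

    GLuint myglCreateBuffer() {
        GLuint name = 0;
        glGenBuffers(1, &name); // generate exactly one buffer name
        return name;
    }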
I completely agree with @datenwolf's recommendation, but wanted to provide some more reasoning. I would only call glGen*() for multiple names if it's the most convenient thing to do. If you already have an array of names anyway, and want to create them all at once, then go ahead and use a single call. But it's definitely not worth jumping through any hoops, and I can't imagine a realistic scenario where it would help performance.
The glGen*() calls only create object names (which is the official OpenGL terminology, even though many people call them "ids"), not objects. The actual object is only created the first time you bind a name. So there's no possible gain from creating multiple objects at once; the only (insignificant) gain is from generating multiple names at once.
I can think of two related reasons why generating the names isn't much more than a drop in the ocean:
1. It doesn't happen at a high frequency. What you should typically worry about are calls that are made very frequently, like state-update calls that are made between draw calls, and the draw calls themselves. Name generation typically happens at a moderate frequency during setup, and rarely later. Maybe you'll create a new buffer or texture every once in a while, but this is mostly insignificant compared to how often you bind textures or buffers.
2. It is tied to much more expensive operations. You only generate as many names as you need for creating the actual objects, and creating the object is much more expensive than generating the name. If you picture everything that needs to happen when creating a typical object (various memory allocations, state setup, etc.), generating the name is an insignificant part.

Why do glBindRenderbuffer and glRenderbufferStorage each take a "target" parameter?

Each takes a target parameter, but the only viable target is GL_RENDERBUFFER.
http://www.opengl.org/wiki/Renderbuffer_Object
https://www.khronos.org/opengles/sdk/docs/man/xhtml/glBindRenderbuffer.xml
http://www.opengl.org/wiki/GlRenderbufferStorage
(I'm just learning OpenGL, and already found these two today; maybe I can expect this seemingly-useless target parameter to be common in many functions?)
There is a bit of rationale behind the target parameter in issue 30 of the original EXT_framebuffer_object extension specification. (I generally recommend reading the relevant extension specs even for features that have become core GL features, since those specs often have more details, and sometimes contain bits of the ARB's (or vendors') reasoning for doing things one way or the other, especially in the "issues" section.)
(30) Do the calls to deal with renderbuffers need a target
parameter? It seems unlikely this will be used for anything.
RESOLUTION: resolved, yes
Whether we call it a "target" or not, there is some piece
of state in the context to hold the current renderbuffer
binding. This is required so that we can call routines like
RenderbufferStorage and {Get}RenderbufferParameter() without
passing in an object name. It is also possible we may
decide to use the renderbuffer target parameter to
distinguish between multisample and non multisample buffers.
Given those reasons, the precedent of texture objects, and
the possibility we may come up with some other renderbuffer
target types in the future, it seems prudent and not all
that costly to just include the target type now.
It's frequently the case that core OpenGL only defines one legal value for certain parameters, but extensions add others. Whether or not there are any more values defined in extensions today, clearly the architects wanted to leave that door open to future extensions.
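For reference, typical usage looks like this today, with GL_RENDERBUFFER as the only core target (width and height are assumed to be your attachment size, and an FBO is assumed to be bound to GL_FRAMEBUFFER for the last call):

    GLuint rbo;
    glGenRenderbuffers(1, &rbo);
    glBindRenderbuffer(GL_RENDERBUFFER, rbo);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, width, height);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, rbo);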

Multiple shaders vs multiple techniques in DirectX

I'm going through all the Rastertek DirectX tutorials, which by the way are very good, and the author tends to use multiple shaders for different things. In one of the later tutorials he even introduces a shader manager class.
Based on some other sources, though, I believe it would be more efficient to use a single shader with multiple techniques instead. Are multiple shaders used in the tutorials for simplicity, or are there scenarios where using multiple shaders would be better than a single big one?
I guess in the tutorials they use them for simplicity.
Grouping them in techniques or separately is a design decision. There are scenarios where having multiple shaders is beneficial as you can combine them as you like.
As of DirectX 11 on Windows 8, the D3DX library is deprecated, so you will find that this changes. You can see an example of this in the source code of the DirectX Tool Kit (http://directxtk.codeplex.com/) and how they handled their effects.
Normally you will have different vertex shaders, pixel shaders, etc. in memory; techniques tend to join them as one, so that when you compile the shader file, a specific vertex and pixel shader is compiled for each technique. Your effect object handles which vertex/pixel shaders are set on the device when technique X with pass Y is chosen.
You could do this manually; for example, only compile the pixel shader and set it to the device.
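As a hedged sketch of that manual route (device, context, psSource and psSize are assumed to exist already; blob cleanup omitted):

    #include <d3d11.h>
    #include <d3dcompiler.h>

    ID3DBlob* blob = nullptr;
    ID3DBlob* errors = nullptr;
    HRESULT hr = D3DCompile(psSource, psSize, nullptr, nullptr, nullptr,
                            "main", "ps_5_0", 0, 0, &blob, &errors);

    ID3D11PixelShader* ps = nullptr;
    if (SUCCEEDED(hr)) {
        device->CreatePixelShader(blob->GetBufferPointer(), blob->GetBufferSize(),
                                  nullptr, &ps);
        context->PSSetShader(ps, nullptr, 0); // bound directly, no technique or pass involved
    }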
Mostly, the answer would be: it depends.
The effects framework gives you the big advantage that you can set your whole pipeline in one go using Pass->Apply, which can make things really easy, but it can lead to pretty slow code if not used properly, which is probably why Microsoft decided to deprecate it. You can do as badly or even worse using multiple shaders, though, DirectXTK being a pretty good example of that, actually (it's OK only for phone development).
In most cases the effect framework will incur a few extra API calls that you could avoid with separate shaders (which, I agree, can be significant if you're draw-call bound, but then you should look at optimizing that part with culling/instancing techniques). Using separate shaders, you have to handle all the state/constant-buffer management yourself, and you can probably do it in a more efficient way if you know what you are doing.
What I really like about the fx framework is the very nice reflection, and the use of semantics, which can be really useful at the design stage (for example, if you write float4x4 tP : PROJECTION, your engine can automatically bind the camera projection to the shader).
Also, the fx framework's compile-time layout validation between shader stages is really handy for authoring.
One big advantage of separate shaders is that you can easily swap only the stages you need, so you can save a decent number of permutations without touching the rest of the pipeline. A small example of this follows.
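For example (the shader and draw-helper names here are hypothetical), you might keep the vertex stage bound and only swap the pixel stage between draws:

    context->PSSetShader(psLit, nullptr, 0);
    drawOpaqueGeometry();
    context->PSSetShader(psUnlit, nullptr, 0); // only the pixel stage changes
    drawDebugGeometry();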
It is never a good idea to have multiple fx files loaded. Combine your fx files if you can, and use globals where you can if something doesn't need to be in your VInput struct.
This way you can get the effects you need and pass them whatever you set up in your own shader class, which handles the rest, including the technique passes.
Make yourself an abstract ShaderClass and an abstract ModelClass.
More precisely, have your shaders initialized within your Graphics class, separate from your models.
If you create a TextureShader class from your texture.fx file, then there is no need to initialize another instance of it;
rather, share the TextureShader object with the appropriate model(s), then
create a Renderer struct/class to hold both the shader pointer and the (whatever) model pointer, using virtual functions when you need to.

Strategies for managing the OpenGL state machine

I'm currently getting to grips with OpenGL. I started out with GLUT but decided to "graduate" to the SFML libraries. SFML actually provides even fewer GL utilities than GLUT, but it is portable and provides some other functionality. So it's really just me, GL and GLU. Yes, I'm a sucker for punishment.
I wanted to ask about strategies that people have for managing things such as matrix changes, colour changes, material changes etc.
Currently I am rendering from a single thread following a "Naked Objects" design philosophy, i.e. every graphical object has a Render() function which does the work of drawing itself. These objects may themselves be aggregates of further objects, or aggregates of graphical primitives. When a particular Render() is called it has no information about what transformations/material changes have been called before it (a good thing, surely).
As things have developed I have settled on certain strategies, such as making every function promise to push and then pop the matrices if it performs any transformations. With other settings, I explicitly set anything that needs setting before calling glBegin() and take nothing for granted. Problems creep in when one render function makes some changes to less common state variables, and I am starting to consider using some RAII to enforce the reversal of all state changes made in a scope. Using OpenGL sometimes reminds me a lot of assembly programming.
To keep this all manageable, and to help with debugging, I find that I am practically developing my own openGL wrapper, so I figured it would be good to hear about strategies that others have used, or thoughts and considerations on the subject. Or maybe it's just time to switch to something like a scene graph library?
Update: 13/5/11
Having now looked into rendering with vertex/normal/colour arrays and VBOs, I have decided to consolidate all actual OpenGL communication into a separate module. The rendering process will consist of getting a load of GL-independent spatial/material data out of my objects and then conveying all this information to OpenGL in an interpretable format. This means all raw array handling and state manipulation will be consolidated into one area. It adds an extra indirection, and a little computational overhead, to the rendering process, but it means that I can use a single VBO/array for all my data and then pass it all at once, once per frame, to OpenGL.
So it's really just me, GL and GLU
I see nothing bad in that. I'd even get rid of GLU, if possible.
With other settings, I explicitly set anything that needs setting before calling glBegin() and take nothing for granted.
This is also a good strategy, but of course you should keep expensive state switches to a minimum. Instead of immediate mode (glBegin / glEnd) you should migrate to using vertex arrays and, if available, vertex buffer objects.
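A minimal sketch of that migration, using the legacy client-state path so it fits the plain GL/GLU setting described above:

    GLfloat verts[] = { -1.f, -1.f, 0.f,   1.f, -1.f, 0.f,   0.f, 1.f, 0.f };

    GLuint vbo;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, nullptr); // data now sourced from the bound VBO
    glDrawArrays(GL_TRIANGLES, 0, 3);
    glDisableClientState(GL_VERTEX_ARRAY);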
Problems creep in when one render function makes some changes to less common state variables, and I am starting to consider using some RAII to enforce the reversal of all state changes made in a scope.
Older versions of OpenGL provide you with the attribute stack, with the accessor functions glPushAttrib / glPopAttrib and, for the client states, glPushClientAttrib / glPopClientAttrib.
But yes, the huge state space of older OpenGL versions was one of the major reasons for slimming down OpenGL 3; much of what used to be covered by fixed-function pipeline state is now configured and accessed through shaders, where each shader encapsulates what would have been dozens of OpenGL state variable values.
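For the RAII idea mentioned in the question, a minimal sketch over the attribute stack (compatibility profile only; the class name is my own):

    struct ScopedAttrib {
        explicit ScopedAttrib(GLbitfield mask) { glPushAttrib(mask); }
        ~ScopedAttrib()                        { glPopAttrib(); }
        ScopedAttrib(const ScopedAttrib&) = delete;
        ScopedAttrib& operator=(const ScopedAttrib&) = delete;
    };

    // Inside a Render() function:
    // ScopedAttrib guard(GL_ENABLE_BIT | GL_CURRENT_BIT);
    // ...change state freely; it is restored when guard leaves scope.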
Using OpenGL sometimes reminds me a lot of assembly programming.
This is not a surprise at all, as the very first incarnation of OpenGL was designed assuming some abstract machine (the implementation) on which OpenGL calls are kind of the operation codes of that machine.
First, try not to use glBegin/glEnd calls for new development. They are deprecated in OpenGL 3 and simply don't work in OpenGL ES (iOS, WebOS, Android). Instead, use vertex arrays and VBOs to consolidate your drawing.
Second, instead of writing your own wrapper, take a look at some recent open source ones to see how they do things. For example, check out the Visualization Library (http://www.visualizationlibrary.com/jetcms/). It's a fairly thin wrapper around OpenGL, so it's worth a look.