I am currently experimenting with various ways of displaying 2D sprites in DirectX 10. I began by using the ID3DX10Sprite interface to batch-draw my sprites in a single call. Eventually, however, I wanted a little more control over how my sprites were rendered, so I decided to look into quad-based sprite rendering (i.e. each sprite represented by a quad with a texture applied).
I started out simple: I created a single vertex buffer of 4 vertices that was bound once before the sprites were drawn. I then looped through my sprites, setting the appropriate properties to be passed into the shader and making a draw call for each sprite, like so: d3dDevice->Draw(4, 0);. Though it worked, the draw call for every sprite bugged me, so I looked for a more efficient method.
After searching around, I learned about object instancing and decided to try it out. Everything went well until I tried implementing the most important part of sprites: textures. In short, though I had a texture array (declared at the top of my shader like so: Texture2D textures[10];) that could be successfully sampled within my pixel shader using literals/constants as indexes, I could not figure out how to control which textures were applied to which instances via a texture index.
The idea would be to pass in a texture index per instance, which could then be used to sample the appropriate texture in the array within the pixel shader. However, after searching around more, I could not find an example of how this could be done (and found many things suggesting that it could not be done without moving to DirectX 11).
Is that to say that the only way to successfully render sprites via object instancing in DirectX 10 is to render them in batches based on texture? So, for example, if my scene consists of 100 sprites with 20 different textures (each texture referenced by 5 sprites), then it would take 20 separate draw calls to display the scene, and I would only be sending 5 sprites at a time.
In the end, I am rather at a loss. I have done a lot of searching and seem to be coming up with conflicting information. For example, in this article, in paragraph 6, it states:
Using DirectX 10, it is possible to apply different textures in the array to different instances of the same object, thus making them look different
In addition, on page 3 of this whitepaper, it mentions the option to:
Read a custom texture per instance from a texture array
However, I cannot seem to find a concrete example of how the shader can be setup to access a texture array using a per instance texture index.
In the end, the central question is: What is the most efficient method of rendering sprites using DirectX 10?
If the answer is instancing, then is it possible to control which texture is applied to each specific instance within the shader, thereby making it possible to send in much larger batches of sprites along with their appropriate texture indexes in only a single draw call? Or must I be content with instancing only sprites that share the same texture?
If the answer is returning to the use of the provided DX10 Sprite interface, then is there a way for me to have more control over how it is rendered?
As a side note, I have also looked into using a Geometry Shader to create the actual quad, so I would only have to pass in a series of points instead of managing a vertex and instance buffer. Again, though, unless there is a way to control which textures are applied to the generated quads, I'm back to only batching sprites by texture.
There are a few ways (as usual) to do what you describe.
Please note that using
Texture2D textures[10];
will not allow you to use a variable index for lookup in the pixel shader (since technically this declaration allocates one register slot per texture).
So what you need is to create a Texture2DArray instead. This is a bit like a volume texture, but the z component is an integer slice index and there is no filtering between slices.
You will need to generate this texture array, though. An easy way is, at startup, to do one full-screen quad draw call per texture to draw each texture into a slice of the array (you can create a RenderTargetView for a specific slice). The shader is a simple passthrough here.
To create a texture array (code is in SlimDX, but the options are similar in C++):
var texBufferDesc = new Texture2DDescription
{
    ArraySize = TextureCount,
    BindFlags = BindFlags.RenderTarget | BindFlags.ShaderResource,
    CpuAccessFlags = CpuAccessFlags.None,
    Format = format,
    Height = h,
    Width = w,
    OptionFlags = ResourceOptionFlags.None,
    SampleDescription = new SampleDescription(1, 0),
    Usage = ResourceUsage.Default,
};
Then the shader resource view looks like this:
ShaderResourceViewDescription srvd = new ShaderResourceViewDescription()
{
    ArraySize = TextureCount,
    FirstArraySlice = 0,
    Dimension = ShaderResourceViewDimension.Texture2DArray,
    Format = format,
    MipLevels = 1,
    MostDetailedMip = 0
};
Finally, to get a render target for a specific slice:
RenderTargetViewDescription rtd = new RenderTargetViewDescription()
{
    ArraySize = 1,
    FirstArraySlice = SliceIndex,
    Dimension = RenderTargetViewDimension.Texture2DArray,
    Format = this.Format
};
Bind that to your passthrough shader, set the desired texture as input and the slice as output, and draw a full-screen quad (or full-screen triangle).
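A minimal passthrough shader for that copy step might look like the sketch below (InputTexture, LinearSampler, and the entry-point names are illustrative assumptions, not from the original answer):

struct VSOut
{
    float4 Position : SV_POSITION;
    float2 TexCoord : TEXCOORD0;
};

Texture2D InputTexture;      // source texture for the current slice
SamplerState LinearSampler;

VSOut VS(float4 pos : POSITION, float2 uv : TEXCOORD0)
{
    VSOut o;
    o.Position = pos;        // full-screen quad vertices already in clip space
    o.TexCoord = uv;
    return o;
}

float4 PS(VSOut input) : SV_Target
{
    return InputTexture.Sample(LinearSampler, input.TexCoord);
}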
Please note that this texture array can also be saved in DDS format (which saves you from regenerating it every time you start your program).
Looking up your Texture is like:
Texture2DArray myarray;
In the pixel shader:
myarray.Sample(mySampler, float3(uv, SliceIndex));
Now, about rendering sprites: you also have the option of GS expansion.
So you create a vertex buffer containing only the position/size/texture index/whatever else you need, with one vertex per sprite.
Send a draw call with n sprites (Topology needs to be set to point list).
Pass the data through from the vertex shader to the geometry shader.
Expand your point into a quad in the geometry shader. You can find an example of this in the ParticlesGS sample in the Microsoft SDK; it's a bit overkill for your case, since you only need the rendering part of it, not the animation. If you need some cleaned-up code, let me know and I'll quickly make a DX10-compatible sample (in my case I use StructuredBuffers instead of a VertexBuffer).
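A sketch of such an expansion shader, assuming each per-sprite vertex carries a center position (already in clip space here), a half-size, and a texture index (the field names and semantics are illustrative, not from the SDK sample):

struct GSIn
{
    float3 Position : POSITION;   // sprite center, assumed already in clip space
    float2 Size     : SIZE;       // half-width / half-height in clip units
    float  TexIndex : TEXINDEX;   // slice in the Texture2DArray
};

struct GSOut
{
    float4 Position : SV_POSITION;
    float3 TexCoord : TEXCOORD0;  // uv in xy, array slice in z
};

[maxvertexcount(4)]
void GS(point GSIn input[1], inout TriangleStream<GSOut> stream)
{
    // Corner offsets and uvs in triangle-strip order: TL, TR, BL, BR.
    const float2 offsets[4] = { float2(-1, 1), float2(1, 1), float2(-1, -1), float2(1, -1) };
    const float2 uvs[4]     = { float2(0, 0),  float2(1, 0), float2(0, 1),   float2(1, 1) };

    for (int i = 0; i < 4; i++)
    {
        GSOut o;
        o.Position = float4(input[0].Position.xy + offsets[i] * input[0].Size,
                            input[0].Position.z, 1.0);
        o.TexCoord = float3(uvs[i], input[0].TexIndex);
        stream.Append(o);
    }
}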
Using a pre-made quad and passing the above data in a per-instance vertex buffer is also possible, but with a very high number of sprites it will easily strain your graphics card (by high I mean something like over 3 million particles, which is not much by today's standards; if you're under half a million sprites you'll be totally fine ;)
Include the texture index in the instance buffer and use it to select the correct texture from the texture array per instance:
struct VS
{
    float3 Position : POSITION;
    float2 TexCoord : TEXCOORD0;
    float  TexIndex : TEXINDEX; // From the instance buffer, not the vertex buffer
};
Then pass this value on through to the pixel shader:
struct PS
{
    float4 Position : SV_POSITION;
    float3 TexCoord : TEXCOORD0;
};
...
vout.TexCoord = float3(vin.TexCoord, vin.TexIndex);
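In the pixel shader, the z component of that coordinate then selects the array slice, assuming the textures have been combined into a Texture2DArray (the names sprites, spriteSampler, and PSMain are illustrative):

Texture2DArray sprites;
SamplerState spriteSampler;

float4 PSMain(PS input) : SV_Target
{
    // TexCoord.xy holds the uvs, TexCoord.z the per-instance texture index.
    return sprites.Sample(spriteSampler, input.TexCoord);
}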
Related
I'm making a game in libGDX.
The only way I have ever known how to use shaders is to have the batch affect the given textures one after another. This is what I normally do in my code:
shader = new ShaderProgram(Gdx.files.internal("shaders/shader.vert"), Gdx.files.internal("shaders/shader.frag"));
batch.setShader(shader);
And that's about all of the needed code.
Anyway, I do not want this separation between textures. However, I can't find any way to affect the whole screen at once with a shader, as if the whole screen were just one big texture. To me, that seems like the most logical way to use a shader.
So, does anyone know how to do something like this?
Draw all textures (players, actors, landscape, ...) with the same batch, and if you want the shader to also affect the background, draw a still texture the size of the screen in the background with the same batch.
This is quite easy with FBO objects; you can get "the whole screen as just one big texture", like you said in your question:
First of all, before any rendering, create your FBO object and begin it:
FrameBuffer fbo = new FrameBuffer(Format.RGBA8888, Width, Height, false);
fbo.begin();
Then do all of your normal rendering:
Gdx.gl.glClearColor(0.2f, 0.2f, 0.2f, 1);
Gdx.gl.glClear(GL20.GL_COLOR_BUFFER_BIT);
...
Batch b = new SpriteBatch(...
//Whatever rendering code you have
Finally, save that FBO into a texture or sprite, do any transformations needed on it, and prepare and use your shader on it.
fbo.end();
SpriteBatch b = new SpriteBatch();
Sprite s = new Sprite(fbo.getColorBufferTexture());
s.flip(false,true); //Coord systems in buffer differs from screen
b.setShader(your_shader);
b.begin();
your_shader.setUniformMatrix("u_projTrans",camera.combined); //if you have camera
viewport.apply(); //if you have viewport
b.draw(s,0,0,viewportWidth,viewportHeight);
b.end();
b.setShader(null);
And this is all!
Essentially what you are doing is rendering all your assets, game scene, and stages into a buffer, then saving that buffer image into a texture, and finally rendering that texture with the shader effect you want.
As you may notice, this is highly inefficient, since you are copying your whole screen into a buffer. Also note that some older drivers only support power-of-2 sizes for the FBO, so you may have to keep that in mind; check here for more information on the topic.
I'm looking for MRT where I can write to my buffers at different positions.
Example
Buffer 0 :
gl_Position[0] = vec4(uv,0.,1.);
gl_FragData[0] = vec4(1.);
Buffer 1 :
gl_Position[1] = MVP * pos;
gl_FragData[1] = vec4(0.);
Is it possible to have multiple outputs in a vertex shader?
I can't find any resources about that.
Is it possible to have multiple outputs in a vertex shader?
No, but that doesn't mean you can't get the effect of what you want. Well, you didn't describe in any real detail what you wanted, but this is as close as OpenGL can provide.
What you want seems rather like layered rendering. It is the ability of a Geometry Shader to generate primitives that go to different layers. So you can generate one triangle that renders to one layer, then generate a second triangle that goes to a different layer.
Of course, that raises a question: what's a layer? Well, that has to do with layered framebuffers. See, if you attach a layered image to a framebuffer (an array texture or cubemap texture), each array layer/cubemap face represents a different 2D layer that can be rendered to. The Geometry Shader can send each output primitive to a specific layer in the layered framebuffer. So if you have 3 array layers in an image, your GS can output a primitive to layer 0, 1, or 2, and that primitive will only be rendered to that particular image in the array texture.
Depth buffers can be layered as well, and you must use a layered depth buffer if you want depth testing to work at all with layered rendering. All aspects of the primitive's rendering are governed by the layer it gets sent to. So when the fragment shader runs, its outputs go only to the layer the primitive was rendered to. When the depth test is done, the read for that test will only read from that layer of the depth buffer. And so on, including blending.
Of course, using layered framebuffers means that all of the layers in a particular image attachment have to be from the same texture. And therefore they must have the same Image Format. So there are limitations. But overall, it more or less does what you asked.
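As a minimal sketch of routing a primitive to a layer (here the layer comes from a uniform for brevity; a real geometry shader would typically compute it per primitive, and uLayer is an illustrative name):

#version 150
layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

uniform int uLayer;  // which layer of the layered framebuffer to render into

void main()
{
    for (int i = 0; i < 3; i++)
    {
        gl_Layer = uLayer;   // route this primitive to the chosen layer
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}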
I want to render multiple 3D cubes from one vbo. Each cube has a uniform color.
At this time, I create a vbo where each vertex has a color information.
Is it possible to upload only one color for one shape (list of vertices)?
I also want to mix GL_TRIANGLES and GL_LINES in the glDrawElements method of the same shader. Is that possible?
//Edit: I only have OpenGL 2.1. Later I want to build this project on Android.
//Edit 2:
I want to render a large number of cubes (up to 150,000). One cube has 24 vertices of geometry and color and 34 indices. Now my idea is to create several VBOs (maybe 50) and distribute the cubes among them. I hope that this minimizes the overhead.
Drawing lots of cubes
Yes, if you want to draw a bunch of cubes, you can specify the color for each cube once.
Create a VBO containing the vertexes for one cube.
// cube = 36 vertexes with glDrawArrays(GL_TRIANGLES)
vbo1 = [v1] [v2] [v3] ... [v36]
Create another VBO with the transform matrix and color for each cube, and use an attribute divisor of 1. (You can use the same VBO, but I would use a separate one.)
vbo2 = [cube 1 mat, color] [cube 2 mat, color] ... [cube N mat, color]
Call glDrawElementsInstanced() or glDrawArraysInstanced(). This will draw the cube over and over again.
Alternatively, you can set uniforms for each cube, but this will limit the number of cubes you can draw. The method above will let you draw thousands, easily.
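The shader side of that setup could look like the sketch below, assuming the application binds vbo2 with glVertexAttribDivisor set to 1 for the matrix and color locations (this requires GL 3.3 or ARB_instanced_arrays; the attribute names are illustrative):

#version 330 core
layout(location = 0) in vec3 aPosition;       // per-vertex, from vbo1
layout(location = 1) in mat4 aInstanceMatrix; // per-instance, occupies locations 1-4
layout(location = 5) in vec3 aInstanceColor;  // per-instance

uniform mat4 uViewProjection;

out vec3 vColor;

void main()
{
    vColor = aInstanceColor;
    gl_Position = uViewProjection * aInstanceMatrix * vec4(aPosition, 1.0);
}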
Mixing GL_TRIANGLES and GL_LINES
You will have to call glDraw????() once for each type of primitive. You can use the same shader for both times, if you like.
Regarding your questions:
Is it possible to upload only one color for one shape?
Yes, you can use a uniform instead of a vertex attribute (of course, this means changes in more places). However, you will need to set the uniform for each shape and issue a separate draw call for each differently colored shape.
Is it possible to mix GL_TRIANGLES and GL_LINES in glDrawElements?
Yes and no. Yes, but you will need a new draw call (which is obvious). You cannot, in the same draw call, draw some shapes with GL_TRIANGLES and some shapes with GL_LINES.
In pseudocode this looks like:
draw shapes 1,2,10 from the vbo using color red and GL_TRIANGLES
draw shapes 3,4,6 from the vbo using color blue and GL_LINES
draw shapes 7,8,9 from the vbo using color blue and GL_TRIANGLES
With OpenGL 2.1, I don't think there's a reasonable way of specifying the color only once per cube, and still draw everything in a single draw call.
The most direct approach is that, instead of having the color attribute in a VBO, you specify it directly before the draw call. Assuming that you're using generic vertex attributes, where you would currently have:
glEnableVertexAttribArray(colorLoc);
glVertexAttribPointer(colorLoc, ...);
you do this:
glDisableVertexAttribArray(colorLoc);
glVertexAttrib3f(colorLoc, r, g, b);
where glDisableVertexAttribArray() is only needed if the array was previously enabled for the location.
The big disadvantage is that you can only draw cubes with the same color in one draw call. In the extreme case, that's one draw call per cube. Of course if you have multiple cubes with the same color, you could still batch those into a single draw call.
You wonder whether this is more efficient than having a color for each vertex in the VBO? Impossible to say in general. You'll always get the same answer in cases like this: try both, and benchmark. I'm skeptical that you will find it beneficial. In my experience, it's fairly rare for fetching vertex data to be a major performance bottleneck, so cutting out one attribute will likely not give you much of a gain. On the other hand, making many small draw calls absolutely can (and often will) hurt performance.
There is one option you can use that is sort of a hybrid. I'm not necessarily recommending it, but just in the interest of brainstorming. If you use a fairly limited number of colors, you can use a single scalar attribute in the VBO that encodes a "color index". Then in the vertex shader, you can use a texture lookup to translate the "color index" to the actual color.
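A sketch of that hybrid under OpenGL 2.1: a single scalar attribute indexes a small 1D palette texture in the vertex shader (aColorIndex, uPalette, and uPaletteSize are illustrative names; note that vertex texture fetch support is spotty on old GL 2.1 hardware):

// GLSL 1.20 (OpenGL 2.1) vertex shader
attribute float aColorIndex;   // per-vertex scalar color index
uniform sampler1D uPalette;    // 1D texture holding the color palette
uniform float uPaletteSize;    // number of entries in the palette

varying vec3 vColor;

void main()
{
    // Sample at the center of the palette texel for this index.
    vColor = texture1DLod(uPalette, (aColorIndex + 0.5) / uPaletteSize, 0.0).rgb;
    gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}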
The really good options are beyond OpenGL 2.1. @DietrichEpp nicely explained instanced rendering, which is an elegant solution for cases like this.
And no, you can not have lines and triangles in the same draw call. Even the most flexible draw calls in OpenGL 4.x, like glDrawElementsIndirect(), still take only one primitive type.
I am trying to create shadow maps of many objects in a sceneRoom, with their shadows being projected onto the sceneRoom. Until now I've been able to project the shadows of the sceneRoom onto itself, but I want to project the shadows of other objects in the sceneRoom onto the sceneRoom's floor.
Is it possible to create multiple depth textures in one framebuffer? Or should I use several framebuffers, each with one depth texture?
There is only one GL_DEPTH_ATTACHMENT point, so you can only have at most one attached depth buffer at any time. So you have to use some other method.
No, there is only one attachment point (well, technically two if you count GL_DEPTH_STENCIL_ATTACHMENT) for depth in an FBO. You can only attach one thing to the depth attachment point, but that does not mean you are limited to a single image.
You can use an array texture to store multiple depth images and then attach this array texture to GL_DEPTH_ATTACHMENT.
However, the only way to draw into an explicit array layer of this texture would be to use a Geometry Shader to do layered rendering. Since it sounds like each of these depth images you are interested in is actually a completely different set of geometry, this does not sound like the approach you want. If you used a Geometry Shader to do this, you would process the same set of geometry for each layer.
One thing you could consider is using a single depth buffer but packing your shadow maps into an atlas. If each of your shadow maps is 512x512, you can store 4 of them in a single 1024x1024 texture and adjust texture coordinates (and the viewport when you draw into the atlas) appropriately. The reason to consider this is that changing the render target (FBO state) tends to be the most expensive thing you do between draw calls in a series of depth-only draws. You might change a few uniforms or vertex pointers, but those are dirt cheap to change.
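The coordinate adjustment for such an atlas is just a scale and offset. For four 512x512 maps packed into a 1024x1024 texture, a GLSL sketch (uSlot is an illustrative uniform):

// Remap a shadow-map lookup coordinate from [0,1]^2 into one
// quadrant of a 2x2 atlas. uSlot is (column, row), each 0 or 1.
uniform vec2 uSlot;

vec2 atlasCoord(vec2 uv)
{
    return (uv + uSlot) * 0.5;
}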
My game renders lots of cubes, each of which randomly has 1 of 12 textures. I already Z-order the geometry, so I can't just render all the cubes with texture 1, then 2, then 3, etc., because that would defeat the Z ordering. I already keep track of the previous texture, and if they are equal I do not call glBindTexture, but there are still way too many calls to it. What else can I do?
Thanks
The ultimate and fastest way would be to have an array of textures (normal ones or cubemaps), then dynamically fetch the texture slice according to an id stored in each cube's instance data (or cube-face data, if you want a different texture on a per-face basis) using the GLSL built-ins gl_InstanceID or gl_PrimitiveID.
With this implementation you would bind your texture array just once.
This of course requires use of the gpu_shader4 and texture_array extensions:
http://developer.download.nvidia.com/opengl/specs/GL_EXT_gpu_shader4.txt
http://developer.download.nvidia.com/opengl/specs/GL_EXT_texture_array.txt
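With those extensions, the lookup might be sketched like this (the slice index is forwarded from the vertex shader, since gl_InstanceID is only available there; uTextures, vTexCoord, and vSlice are illustrative names):

#version 120
#extension GL_EXT_gpu_shader4 : require
#extension GL_EXT_texture_array : require

// Fragment shader: pick the cube's texture slice by index.
uniform sampler2DArray uTextures;
varying vec2 vTexCoord;
varying float vSlice;   // written in the vertex shader from gl_InstanceID

void main()
{
    gl_FragColor = texture2DArray(uTextures, vec3(vTexCoord, vSlice));
}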
I have used this mechanism (using D3D10, but the principle applies here too) and it worked very well.
I had to map different textures onto sprites (3D points of a constant screen size of 9x9 or 15x15 pixels IIRC), each indicating a different meaning to the user.
Edit:
If you don't feel comfortable with all the shader stuff, I would simply sort cubes by texture and not Z-order the geometry, then measure the performance gains.
Also, I would try adding a pre-Z pass where you render all your cubes to the Z buffer only, then render the normal scene, and see if it speeds things up (if you are fragment bound, it could help).
You can pack your textures into one texture atlas and offset the texture coordinates accordingly.
glMatrixMode(GL_TEXTURE) will also allow you to perform transformations in texture space (to avoid changing all the texture coords).
Also from NVIDIA:
Bindless Graphics