I am trying to create a spectrogram efficiently. Currently I do everything on the CPU: I keep a texture buffer, loop through the whole buffer, and push the new data into the "queue". However, this costs a lot of CPU time. I want to add a new column of pixel data to the texture and move the old data to the right, so that the new data appears on the left side while the old data moves to the right. Doing this each frame creates a waterfall/side-scrolling effect.
I am using glTexSubImage2D() to add new data, but this does not advance the old data to the right. How can I achieve this using OpenGL?
I don't see a need to move any data around. Simply treat the texture as circular in the horizontal direction, basically a circular buffer of columns. Then take care of the scrolling during rendering by choosing the texture coordinates accordingly.
Say you want to display n columns at a time. Create a texture of width n, and in each step k store the data in column k % n of the texture:
glTexSubImage2D(GL_TEXTURE_2D, 0, k % n, 0, 1, height, ...);
Then use texture coordinates in the range 1 + (k % n) / n to (k % n) / n in the horizontal direction, with the texture wrap mode set to GL_REPEAT. Or pass an offset to your shader, and add it to the texture coordinates in the GLSL code.
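A minimal sketch of the bookkeeping described above, written as plain C so that it runs without a GL context. It assumes k counts the columns already uploaded, so that k % n is the slot the next column will overwrite. The struct and function names are made up for illustration and are not part of OpenGL.

```c
#include <assert.h>

/* Horizontal texture coordinates for the left and right edge of the quad. */
typedef struct {
    float u_left;
    float u_right;
} ScrollRange;

/* Texture column that step k writes to (the xoffset for glTexSubImage2D). */
static int ring_column(unsigned k, unsigned n)
{
    assert(n > 0);
    return (int)(k % n);
}

/* Range 1 + (k % n) / n .. (k % n) / n; relies on GL_REPEAT wrapping. */
static ScrollRange scroll_range(unsigned k, unsigned n)
{
    assert(n > 0);
    float offset = (float)(k % n) / (float)n;
    ScrollRange range = { 1.0f + offset, offset };
    return range;
}
```

Because the left coordinate is larger than the right one, the sampled columns run from the most recently written one at the left edge towards older ones on the right.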
And of course, if you have all data ahead of time, and it's not very large, you can simply store all of it in a texture right from the start, and scroll through it by shifting the texture coordinates.
If you do this the way you describe, it will be expensive. You would need to copy the data from the texture back to the CPU (or keep a copy on the CPU), add your new data there, and then use glTexSubImage2D to upload the whole new image again.
An alternative if you already know all the data is to place it all in the texture, then slowly move the texture to the right. If you need to you could make a black square to cover parts of the texture you don't want visible.
You could also go in between and create multiple textures, a new one each time you get enough data, and move them in succession.
It can be done in a fragment shader (also called a pixel shader), which is executed by the GPU of your system. Several shading languages are available: Cg from NVIDIA, GLSL for OpenGL, and HLSL from Microsoft. Shaders are well suited to this kind of task.
Your CPU time will definitely go down, because the work is executed by the GPU.
I am using SDL2 to create a context for OpenGL. I use SDL_image to load the images, and I bind them to OpenGL textures. But because the coordinate system isn't the same the textures are flipped.
I found two ways to correct this:
Modify the texture after loading it
Advantage: Only done once per texture
Disadvantage: Done using the CPU which slows down the loading of each texture
Apply a rotation of 180° on the Y and Z axis when rendering
Advantage: Using super fast functions
Disadvantage: Needs to be done multiple times per frame
Is there another way to flip back the textures after they have been loaded with SDL_Image? And if not, which method is usually used?
There are a bunch of options. Some that come to mind:
Edit original assets
You can flip the image files upside down with an image processing tool, and use the flipped images as your assets. They will look upside down when viewed in an image viewer, but will then turn out correct when used as textures.
This is the ideal solution if you're in full control of the images. It obviously won't work if you get images from external sources at runtime.
Flip during image load
Some image loading libraries allow you to flip the image during loading. In the documentation of SDL_image that I could find, I did not see this option. But you might be able to find an alternate library that supports it. And of course you can do this if you write your own image loading.
This is a good solution. The overhead is minimal since you do the flipping while you're touching the data anyway. One common approach is to read the data row by row and store it in the texture in the opposite order, using glTexSubImage2D().
Flip between loading and first use
You can create a flipped copy of the texture after you already loaded it. The typical way to do this would be by drawing a screen sized quad while sampling the original texture and rendering to an FBO that has the resulting flipped texture as a rendering target. Or, more elegant, use glBlitFramebuffer().
This is not very appealing because it involves copying the memory. While it should be quite efficient if you let the GPU create the copy, extra copying is always undesirable. Even if it happens only once for each texture, it can increase your startup/loading time.
Apply transformation to texture coordinates
You can apply a transformation to the texture coordinates in either the vertex or fragment shader. You're talking about rotations in your question, but the transformation you need is in fact trivial. You basically just map the y of the texture coordinate to 1.0 - y, and leave the x unchanged.
This adds a small price to shader execution. But the operation is very simple and fast compared to the texture sampling operation it goes along with. In reality, the added overhead is probably insignificant. While I don't think it's very pretty, it's a perfectly fine solution.
Invert the texture coordinates
This is similar to the previous option, but instead of inverting the texture coordinates in the shader, you already specify them inverted in the vertex attribute data.
This is often trivial to do. For example, it is very common to texture quads by using texture coordinates of (0, 0), (1, 0), (0, 1), (1, 1) for the 4 corners. Instead, you simply replace 0 with 1 and 1 with 0 in the second components of the texture coordinates.
Or say you load a model containing texture coordinates from a file. You simply replace each y in the texture coordinates by 1.0f - y during reading, and before storing away the texture coordinates for later rendering.
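As a rough illustration of that replacement, here is a small C sketch. It assumes the texture coordinates are stored as interleaved (s, t) float pairs; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Replace every t with 1 - t in an interleaved (s, t) array of `count` pairs. */
static void flip_t_coords(float *st, size_t count)
{
    assert(st != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        st[2 * i + 1] = 1.0f - st[2 * i + 1];
    }
}
```

Running this once over the coordinates right after reading them means nothing has to change in the shaders or in the per-frame rendering code.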
IMHO, this is often the best solution. It's very simple to do, and has basically no performance penalty.
I would disagree with most of the previous answer's point, except for flipping the image either on load, or before first use.
The reason being that if you are following data driven software development practices, you should never allow code to dictate the nature of data. The software should be designed to support the data accurately. Anything else is not fit for purpose.
Modifying texture coordinates is hackery, despite its ease of use. What happens if you decide at some later stage to use a different image library which doesn't flip the image? Now your image will be inverted again during rendering.
Instead, deal with the problem at the source, and flip the image during load or before first use (I advocate on load, as it can be integrated into the code that loads the image via SDL_Image, and therefore is more easily maintained).
To flip an image, I'll post some simple pseudo code which will illustrate how to do it:
function flip_image( char* bytes, int width, int height, int bytes_per_pixel ):
    char buffer[bytes_per_pixel*width]
    for ( i = 0 -> height/2 ) loop
        offset = bytes + bytes_per_pixel*width * i
        copy row (offset -> offset + bytes_per_pixel*width) -> buffer
        offset2 = bytes + bytes_per_pixel*width * (height - 1 - i)
        copy row (offset2 -> offset2 + bytes_per_pixel*width) -> (offset -> offset + bytes_per_pixel*width)
        copy row (buffer -> buffer + bytes_per_pixel*width) -> offset2
    end loop
Here is a visual illustration of one iteration of this code loop:
Copy current row N to buffer
Copy row (rows - 1 - N) to row N
Copy buffer to row (rows - 1 - N)
Increment N and repeat until N == rows/2
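With height/2 rounded down, an image with an odd number of rows is handled too, because its middle row simply stays where it is. In practice most textures have an even number of rows anyway, since OpenGL has traditionally preferred power-of-two dimensions.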
It should also be noted that if the loaded image does not have a power-of-two width, SDL_Image pads it. Therefore, the "width" passed to the function should be the pitch of the image, not its width.
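For reference, the same row-swapping idea can be sketched in C. This version takes the pitch in bytes directly, as suggested above, instead of a width and a bytes-per-pixel value. The signature and the error-code convention are choices made for this sketch, not something SDL_Image provides.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Flip an image vertically in place. `pitch` is the byte length of one row,
   including any padding. Returns 0 on success, -1 if allocation fails. */
static int flip_image(unsigned char *bytes, size_t pitch, size_t height)
{
    assert(bytes != NULL && pitch > 0);
    unsigned char *buffer = malloc(pitch);   /* scratch space for one row */
    if (buffer == NULL) {
        return -1;
    }
    for (size_t i = 0; i < height / 2; ++i) {
        unsigned char *top = bytes + pitch * i;
        unsigned char *bottom = bytes + pitch * (height - 1 - i);
        memcpy(buffer, top, pitch);      /* row i -> buffer */
        memcpy(top, bottom, pitch);      /* row height-1-i -> row i */
        memcpy(bottom, buffer, pitch);   /* buffer -> row height-1-i */
    }
    free(buffer);
    return 0;
}
```

With an SDL surface, the arguments would presumably be the surface's pixel pointer, its pitch, and its height, passed before the pixels are handed to OpenGL.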
I have the following problem (no code yet):
We have a data set of 4000 x 256 with a 16 bit resolution, and I need to code a program to display this data.
I wanted to use DirectX or OpenGL to do so, but I don't know what the proper approach is.
Do I create a buffer with 4000 x 256 triangles with the resolution being the y axis, or would I go ahead and create a single quad and then manipulate the data by using tessellation?
When would I use a big vertex buffer over tessellation and vice versa?
It really depends on a lot of factors.
You want to render a map of about 1 million pixels/vertices. Depending on your hardware, this could be doable with the most straightforward technique.
Off the top of my head I can think of 3 techniques:
1) Create a grid of 4000x256 vertices and set their height according to the height map image of your data.
You set the data once upon creation. The shaders will just draw the static buffer and apply a single transform matrix (world/view/projection) to all the vertices.
2) Create a grid of 4000x256 vertices with height 0 and translate each vertex's height inside the vertex shader by the sampled height map data.
3) The same as 2) only you add a tessellation phase.
The advantage of doing tessellation is that you can use a smaller vertex buffer AND you can dynamically tessellate in run time.
This means you can make part of your grid more tessellated and part of it less tessellated. For instance, maybe you want to tessellate more only where the user is viewing the grid.
By the way, you can't tessellate one quad into a million quads; there is a limit to how far a single quad can be tessellated. But you can tessellate it quite a lot, so in any case the grid you have to store shrinks by a large factor.
If you have never used DirectX or OpenGL, I would go with 1. See if it's fast enough, and only if it's not fast enough go with 2, and last go to 3.
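To make option 1 a bit more concrete, here is a hedged C sketch of filling a static vertex array from the 16-bit data set. The axis assignment (x = column, z = row, y = height scaled to 0..1) and the function name are assumptions made for this example. Uploading the array to a vertex buffer is left out, since that part depends on the API you choose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write one (x, y, z) vertex per sample: x = column, z = row,
   y = sample scaled to 0..1. `out` must hold 3 * width * height floats. */
static void build_height_grid(const uint16_t *samples, size_t width,
                              size_t height, float *out)
{
    assert(samples != NULL && out != NULL);
    for (size_t row = 0; row < height; ++row) {
        for (size_t col = 0; col < width; ++col) {
            size_t i = row * width + col;
            out[3 * i + 0] = (float)col;
            out[3 * i + 1] = (float)samples[i] / 65535.0f;
            out[3 * i + 2] = (float)row;
        }
    }
}
```

For the 4000 x 256 data set this produces 1,024,000 vertices, which at three 4-byte floats each is roughly 12 MB of vertex data.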
The fact that you know the theory behind 3D graphics rendering doesn't mean it will be easy for you to learn DirectX or OpenGL. They are difficult to understand and learn because they are quite complex as an API.
If you want you can take a look at some tessellation stuff I did using DirectX11:
http://pompidev.net/2012/09/25/tessellation-simplified/
http://pompidev.net/2012/09/29/tessellation-update/
I am currently experimenting with various ways of displaying 2D sprites in DirectX 10. I began by using the ID3DX10Sprite interface to batch draw my sprites in a single call. Eventually, however, I wanted a little more control over how my sprites were rendered, so I decided to look into quad-based sprite rendering (ie each sprite being represented by a quad with a texture applied).
I started out simple: I created a single vertex buffer consisting of 4 vertices that was applied once before the sprites were drawn. I then looped through my sprites, setting the appropriate properties to be passed into the shader, and making a draw call for each sprite, like so: d3dDevice->Draw( 4, 0);. Though it worked, the draw call for every sprite bugged me, so I looked for a more efficient method.
After searching about, I learned about object instancing, and decided to try it out. Everything went well until I tried implementing the most important part of sprites--textures. In short, though I had a texture array (declared at the top of my shader like so Texture2D textures[10];) that could be successfully sampled within my pixel shader using literals/constants as indexes, I could not figure out how to control which textures were applied to which instances via a texture index.
The idea would be for me to pass in a texture index per instance, that could then be used to sample the appropriate texture in the array within the pixel shader. However, after searching around more, I could not find an example of how it could be done (and found many things suggesting that it could not be done without moving to DirectX 11).
Is that to say that the only way to successfully render sprites via object instancing in DirectX 10 is to render them in batches based on texture? So, for example, if my scene consists of 100 sprites with 20 different textures (each texture referenced by 5 sprites), then it would take 20 separate draw calls to display the scene, and I would only be sending 5 sprites at a time.
In the end, I am rather at a loss. I have done a lot of searching, and seem to be coming up with conflicting information. For example, in this article in paragraph 6 it states:
Using DirectX 10, it is possible to apply different textures in the array to different instances of the same object, thus making them look different
In addition, on page 3 of this whitepaper, it mentions the option to:
Read a custom texture per instance from a texture array
However, I cannot seem to find a concrete example of how the shader can be setup to access a texture array using a per instance texture index.
In the end, the central question is: What is the most efficient method of rendering sprites using DirectX 10?
If the answer is instancing, then is it possible to control which texture is applied to each specific instance within the shader--thereby making it possible to send in much larger batches of sprites along with their appropriate texture index with only a single draw call? Or must I be content with only instancing sprites with the same texture at a time?
If the answer is returning to the use of the provided DX10 Sprite interface, then is there a way for me to have more control over how it is rendered?
As a side note, I have also looked into using a Geometry Shader to create the actual quad, so I would only have to pass in a series of points instead of managing a vertex and instance buffer. Again, though, unless there is a way to control which textures are applied to the generated quads, then I'm back to only batching sprites by textures.
There are a few ways (as usual) to do what you describe.
Please note that using
Texture2D textures[10];
will not allow you to use a variable index for the lookup in the pixel shader (technically, this declaration allocates one slot per texture).
So what you need is to create a Texture2DArray instead. This is a bit like a volume texture, but the z component is an integer slice index and there is no filtering across slices.
You will need to generate this texture array, though. An easy way is to do one full-screen quad draw call per texture on startup, drawing each texture into a slice of the array (you can create a RenderTargetView for a specific slice). The shader is a simple pass-through here.
To create a Texture Array (code is in SlimDX but, options are similar):
var texBufferDesc = new Texture2DDescription
{
ArraySize = TextureCount,
BindFlags = BindFlags.RenderTarget | BindFlags.ShaderResource,
CpuAccessFlags = CpuAccessFlags.None,
Format = format,
Height = h,
Width = w,
OptionFlags = ResourceOptionFlags.None,
SampleDescription = new SampleDescription(1,0),
Usage = ResourceUsage.Default,
};
Then shader resource view is like this:
ShaderResourceViewDescription srvd = new ShaderResourceViewDescription()
{
ArraySize = TextureCount,
FirstArraySlice = 0,
Dimension = ShaderResourceViewDimension.Texture2DArray,
Format = format,
MipLevels = 1,
MostDetailedMip = 0
};
Finally, to get a render target for a specific slice:
RenderTargetViewDescription rtd = new RenderTargetViewDescription()
{
ArraySize = 1,
FirstArraySlice = SliceIndex,
Dimension = RenderTargetViewDimension.Texture2DArray,
Format = this.Format
};
Bind that to your pass-through shader, set the desired texture as input and the slice as output, and draw a full-screen quad (or full-screen triangle).
Please note that this texture array can also be saved in DDS format (which saves you from regenerating it every time you start your program).
Looking up your Texture is like:
Texture2DArray myarray;
In Pixel Shader:
myarray.Sample(mySampler, float3(uv, SliceIndex));
Now about rendering sprites, you also have the option of GS expansion.
So you create a vertex buffer containing only the position/size/texture index/whatever else you need, with one vertex per sprite.
Send a draw call with n sprites (the topology needs to be set to point list).
Pass the data through from the vertex shader to the geometry shader.
Expand each point into a quad in the geometry shader. The ParticlesGS sample in the Microsoft SDK is an example of this; it's a bit overkill for your case, since you only need the rendering part of it, not the animation. If you need some cleaned-up code, let me know and I'll quickly make a DX10-compatible sample (in my case I use StructuredBuffers instead of a VertexBuffer).
Using a pre-made quad and passing the above data in a per-instance vertex buffer is also possible, but with a very high number of sprites it will easily overwhelm your graphics card (by high I mean something like over 3 million particles, which is not much by today's standards; if you're under half a million sprites you'll be totally fine ;)
Include the texture index within the instance buffer and use this to select the correct texture from the texture array per instance:
struct VS
{
    float3 Position: POSITION;
    float2 TexCoord: TEXCOORD0;
    float TexIndex: TexIndex; // From the instance buffer, not the vertex buffer
};
Then pass this value on through to the pixel shader:
struct PS
{
    float4 Position: SV_POSITION;
    float3 TexCoord: TEXCOORD0;
};
..
vout.TexCoord = float3(vin.TexCoord, vin.TexIndex);
I was just looking at my animated sprite code and got an idea.
The animation is done by altering texture coordinates. There is a buffer object that holds the texture coordinates of the current frame; when a new frame is requested, the new texture coordinates are uploaded to the buffer with glBufferData().
What if we pre-calculate the texture coordinates for all animation frames, put them in a buffer object, and create an index buffer object that holds just the number of the frame we need to draw?
GLbyte cur_frames = 0; //1,2,3 etc
Then, when we need to update the animation, all we have to update with glBufferData is 1 byte of our IBO (the frame number), instead of 4 (quad vertex count) * 2 (s, t) * sizeof(GLfloat) bytes for a quad drawn with TRIANGLE_STRIP. We also don't need to keep any texture coordinates around after initializing our BO.
Am I missing something? What are the drawbacks?
Edit: of course, your vertex data need not be GLfloat; that is just an example.
As Tim correctly states, this depends on your application. Let us talk some numbers: you mention both IBOs and putting the texture coordinates for all frames into one VBO, so let us look at the impact of each.
Suppose a typical vertex looks like this:
struct vertex
{
float x,y,z; //position
float tx,ty; //Texture coordinates
};
I added a z-component, but the calculations are similar if you don't use it or if you have more attributes. With 4-byte floats, each vertex takes 20 bytes.
Let's assume a simple sprite: a quad, consisting of 2 triangles. In a very naive mode you just send 2x3 vertices and send 6*20=120 bytes to the GPU.
In comes indexing: you know you actually have only four vertices (1, 2, 3, 4) and two triangles (1,2,3 and 2,3,4). So we send two buffers to the GPU: one containing 4 vertices (4*20=80 bytes) and one containing the list of indices for the triangles ([1,2,3,2,3,4]). Let's say we can store each index in 2 bytes (65535 indices should be enough), so this comes down to 6*2=12 bytes. In total that is 92 bytes, so we saved 28 bytes, or about 23%. Also, when rendering, the GPU is likely to process each vertex only once in the vertex shader, which saves some processing power as well.
So, now you want to add all texture coordinates for all animations at once. The first thing to note is that a vertex in indexed rendering is defined by all its attributes; you can't split it into an index for positions and an index for texture coordinates. So if you want to add extra texture coordinates, you will have to repeat the positions. Each 'frame' that you add therefore adds 80 bytes to the VBO and 12 bytes to the IBO. Suppose you have 64 frames: you end up with 64*(80+12)=5888 bytes. Let's say you have 1000 sprites; then this becomes about 6 MB. That does not seem too bad, but note that it scales quite rapidly: each frame adds to the size, and so does each attribute (because they have to be repeated).
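The byte counts in the last two paragraphs can be reproduced with a few lines of C, which also makes it easy to try other vertex layouts. The function names are invented for this sketch; the 20-byte vertex and 2-byte index are the values used in the text.

```c
#include <assert.h>
#include <stddef.h>

enum { QUAD_VERTICES = 4, QUAD_INDICES = 6 };

/* Bytes for one quad sent as two independent triangles (no index buffer). */
static size_t plain_quad_bytes(size_t vertex_bytes)
{
    assert(vertex_bytes > 0);
    return QUAD_INDICES * vertex_bytes;
}

/* Bytes for one quad sent as 4 vertices plus 6 indices. */
static size_t indexed_quad_bytes(size_t vertex_bytes, size_t index_bytes)
{
    assert(vertex_bytes > 0 && index_bytes > 0);
    return QUAD_VERTICES * vertex_bytes + QUAD_INDICES * index_bytes;
}

/* Bytes needed when every animation frame of every sprite gets its own quad. */
static size_t animation_bytes(size_t frames, size_t sprites,
                              size_t vertex_bytes, size_t index_bytes)
{
    return frames * sprites * indexed_quad_bytes(vertex_bytes, index_bytes);
}
```

Calling animation_bytes(64, 1000, 20, 2) gives the figure of roughly 6 MB quoted above; adding attributes to the vertex raises vertex_bytes and scales the total accordingly.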
So, what does it gain you?
You don't have to send data to the GPU dynamically. Note that updating the whole VBO would require sending 80 bytes, or 640 bits. Suppose you need to do this for 1000 sprites per frame at 30 frames per second: you get to 19200000 bps, or 19.2 Mbps (no overhead included). This is quite low (e.g. 16x PCI-e can handle 32 Gbps), but it could be worthwhile if you have other bandwidth issues (e.g. due to texturing). Also, if you construct your VBOs carefully (e.g. separate VBOs or non-interleaved), you could reduce it to updating only the texture part, which is only 32 bytes per sprite in the above example (4 vertices * 2 floats * 4 bytes); this could reduce the bandwidth even more.
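The bandwidth estimate is a single multiplication, sketched here in C so other sprite counts or frame rates can be plugged in. The function name is hypothetical, and protocol overhead is ignored, as in the text.

```c
#include <assert.h>

/* Bits per second uploaded when every sprite is rewritten every frame. */
static unsigned long long upload_bits_per_second(unsigned long long bytes_per_sprite,
                                                 unsigned long long sprites,
                                                 unsigned long long frames_per_second)
{
    assert(frames_per_second > 0);
    return bytes_per_sprite * 8ULL * sprites * frames_per_second;
}
```

With 80 bytes per sprite, 1000 sprites and 30 frames per second, this evaluates to the 19.2 Mbps mentioned above.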
You don't have to waste time computing the next frame position. However, this is usually just a few additions and few if's to handle the edges of your textures. I doubt you will gain much CPU power here.
Finally, you also have the possibility to simply split the animation image over a lot of textures. I have absolutely no idea how this scales, but in this case you don't even have to work with more complex vertex attributes, you just activate another texture for each frame of animation.
Edit: another method could be to pass the frame number in a uniform and do the calculations in your fragment shader, before sampling. Setting a single integer uniform should not be much of an overhead.
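The per-frame calculation that such a shader would do can be sketched on the CPU in C. The sketch assumes the animation is stored as a regular grid of equally sized cells in one texture, numbered row by row starting from the cell at the smallest (u, v); that layout and all the names are assumptions for the example, not something the question specifies.

```c
#include <assert.h>

/* Texture-coordinate rectangle covered by one animation frame. */
typedef struct {
    float u0, v0, u1, v1;
} FrameRect;

/* Rectangle of frame `frame` in an atlas of `cols` x `rows` equal cells. */
static FrameRect frame_rect(int frame, int cols, int rows)
{
    assert(cols > 0 && rows > 0 && frame >= 0 && frame < cols * rows);
    int col = frame % cols;
    int row = frame / cols;
    FrameRect r;
    r.u0 = (float)col / (float)cols;
    r.v0 = (float)row / (float)rows;
    r.u1 = (float)(col + 1) / (float)cols;
    r.v1 = (float)(row + 1) / (float)rows;
    return r;
}
```

In a shader, the same arithmetic would turn the frame number from the uniform into an offset and scale applied to the quad's base texture coordinates.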
For a modern GPU, accessing/unpacking single bytes is not necessarily faster than accessing integer types or even vectors (register sizes & load instructions, etc.). You can just save memory and therefore memory bandwidth, but I wouldn't expect this to give much of a difference in relation to all other vertex attribute array accesses.
I think the fastest way to supply a frame index for animated sprites is either a uniform or, if multiple sprites have to be rendered with one draw call, instanced vertex attribute arrays. With the latter, you could provide a single index for fixed-size subsequences of vertices.
For example, when drawing 'sprite-quads', you'd have one frame index fetch per 4 vertices.
A third approach would be a buffer-texture, when using instanced rendering.
I recommend a global (shared) uniform for the time/frame index calculation, so you can calculate the animation index on the fly within your shader. This doesn't require you to update the index buffer (which then just represents the relative animation state among sprites).
I've searched for a while and I've heard of different ways to do this, so I thought I'd come here and see what I should do.
From what I've gathered, I should use glBitmap with 0 and 0xFF values in the array to make the terrain. Any input on this?
I tried switching to quads, but I'm not sure that is efficient or the way it's meant to be done.
I want the terrain to be able to have tunnels, such as worms. 2 Dimensional.
Here is what I've tried so far,
I've tried to make a glBitmap, so..
pixels = pow(2 * radius, 2);
ras = new GLubyte[pixels];
and then set them all to 0xFF, and drew it using glBitmap(x, y, 0, 0, ras);
This could then be checked for explosions and so on, and the affected pixels could be set to zero. Is this a plausible approach? I'm not too good with OpenGL; can I put a texture on a glBitmap? From what I've seen, I don't think you can.
I would suggest using the stencil buffer. You mark destroyed parts of the terrain in the stencil buffer and then draw your terrain as a simple quad with stencil testing enabled, without manually testing each pixel.
OK, this is a high-level overview, and I'm assuming you're familiar with OpenGL basics like buffer objects already. Let me know if something doesn't make sense or if you'd like more details.
The most common way to represent terrain in computer graphics is a heightfield: a grid of points that are spaced regularly on the X and Y axes, but whose Z (height) can vary. A heightfield can only have one Z value per (X,Y) grid point, so you can't have "overhangs" in the terrain, but it's usually sufficient anyway.
A simple way to draw a heightfield terrain is with a triangle strip (or quads, but they're deprecated). For simplicity, start in one corner and issue vertices in a zig-zag order down the column, then go back to the top and do the next column, and so on. There are optimizations that can be done for better performance, and more sophisticated ways of constructing the geometry for better appearance, but that'll get you started.
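One possible way to produce that zig-zag order is with an index list per strip, sketched below in C. It assumes the grid vertices are stored row by row with `cols` vertices per row; this indexed variant and its names are just one illustration of the idea, not the only way to issue the vertices.

```c
#include <assert.h>
#include <stddef.h>

/* Fill `out` with the 2 * rows indices of the triangle strip between grid
   columns `col` and `col + 1`. Returns the number of indices written. */
static size_t strip_indices(size_t col, size_t cols, size_t rows, unsigned *out)
{
    assert(out != NULL && col + 1 < cols);
    size_t n = 0;
    for (size_t row = 0; row < rows; ++row) {
        out[n++] = (unsigned)(row * cols + col);       /* left vertex */
        out[n++] = (unsigned)(row * cols + col + 1);   /* right vertex */
    }
    return n;
}
```

Each strip would then be drawn with its own GL_TRIANGLE_STRIP call, one per pair of adjacent columns.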
(I'm assuming a rectangular terrain here since that's how it's commonly done; if you really want a circle, you can substitute 𝑟 and 𝛩 for X and Y so you have a polar grid.)
The coordinates for each vertex will need to be stored in a buffer object, as usual. When you call glBufferData() to load the vertex data into the GPU, specify a usage parameter of either GL_STREAM_DRAW if the terrain will usually change from one frame to the next, or GL_DYNAMIC_DRAW if it will change often but not (close to) every frame. To change the terrain, call glBufferData() again to copy a different set of vertex data to the GPU.
For the vertex data itself, you can specify all three coordinates (X, Y, and Z) for each vertex; that's the simplest thing to do. Or, if you're using a recent enough GL version and you want to be sophisticated, you should be able to calculate the X and Y coordinates in the vertex shader using gl_VertexID and the dimensions of the grid (passed to the shader as a uniform value). That way, you only have to store the Z values in the buffer, which means less GPU memory and bandwidth consumed.
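The calculation the vertex shader would perform with gl_VertexID can be prototyped in C first. The sketch assumes the vertices are numbered row by row and that grid points are `spacing` units apart; both the numbering and the parameter names are assumptions made here.

```c
#include <assert.h>

/* Horizontal position of a grid vertex. */
typedef struct {
    float x;
    float y;
} GridXY;

/* Recover (X, Y) from a running vertex number, as a shader could do
   with gl_VertexID and the grid width passed in as a uniform. */
static GridXY grid_xy(int vertex_id, int grid_width, float spacing)
{
    assert(grid_width > 0 && vertex_id >= 0);
    GridXY p;
    p.x = (float)(vertex_id % grid_width) * spacing;
    p.y = (float)(vertex_id / grid_width) * spacing;
    return p;
}
```

Only the Z value then has to come from the buffer, which is what cuts the memory and bandwidth cost.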