I'm currently writing a gravity simulation, and I have a small problem displaying the particles with OpenGL.
To get "round" particles, I create a small float array like this:
for (int n = 0; n < 16; n++)
    for (int m = 0; m < 16; m++)
    {
        // 1.0f inside a circle of radius 8, 0.0f outside (bool converts to float)
        AlphaData[n * 16 + m] = ((n - 8) * (n - 8) + (m - 8) * (m - 8) < 64);
    }
I then put this in a GL_TEXTURE_2D with format GL_RED, roughly like this (a sketch; the exact parameters in my code are abbreviated):
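GLuint Texture;
glGenTextures(1, &Texture);
glBindTexture(GL_TEXTURE_2D, Texture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
// One float per texel; the red channel carries the alpha mask
glTexImage2D(GL_TEXTURE_2D, 0, GL_RED, 16, 16, 0, GL_RED, GL_FLOAT, AlphaData);
In the fragment shader (the quads are drawn via glDrawArraysInstanced), I draw the particles like this: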
color = vec4(ParticleColor.rgb, texture(Sampler, UV).r);
This works as it should, producing a picture like this (particles enlarged for demonstration):
As you can see, no artifacts. Every particle here is the same size, so every smaller-looking one you see on a "larger" particle is actually in the background and should not be visible. When I turn on depth testing with
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_LESS);
I get something like this:
So for the most part this looks correct ("smaller" particles being behind the "bigger" ones). But I now have artifacts from the underlying quads. Oddly, not ALL particles show this behavior.
Can anybody tell me what I'm doing wrong? Or do depth testing and blending not work nicely together?
I'm not sure what other code you might need for a diagnosis (everything else seems to work correctly), so just tell me if you need additional code.
I'm using a perspective projection here (the particles are in 3D space, of course).
You're in a special case where your fragments are either fully opaque or fully transparent, so it is possible to get depth testing and blending to work at the same time. The actual problem is that, with depth testing, even a fully transparent fragment still stores its depth value. You can prevent that write by explicitly discarding the fragment in the shader. Something like:
color = vec4(ParticleColor.rgb, texture(Sampler, UV).r);
if (color.a == 0.0)
    discard;
Note that conditional branching might introduce some additional overhead, but I wouldn't expect it to cause problems in your case.
For the general case with semi-transparent fragments, blending and depth testing at the same time will not work. For blending to produce the correct result, you have to depth-sort your geometry prior to rendering and render from back to front.
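For illustration, a minimal sketch of such a back-to-front sort on the CPU (the Particle type and view matrix handling are assumptions for the example):
#include <algorithm>
#include <vector>
#include <glm/glm.hpp>

struct Particle { glm::vec3 position; };

// Sort back to front: in OpenGL view space the camera looks down -z,
// so the most distant particles have the most negative z and must come first.
void SortBackToFront(std::vector<Particle>& particles, const glm::mat4& view)
{
    std::sort(particles.begin(), particles.end(),
        [&](const Particle& a, const Particle& b) {
            return (view * glm::vec4(a.position, 1.0f)).z
                 < (view * glm::vec4(b.position, 1.0f)).z;
        });
}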
So I wanted to store all my meshes in one large VBO. The problem is: how do you have just one draw call, but let every mesh have its own model-to-world matrix?
My idea was to submit an array of matrices to a uniform before drawing. In the VBO I would make the color of every first vertex of a mesh negative (so I'd be using the sign bit to mark whether a vertex is the first of a mesh).
Okay, so I can detect when a new mesh has started, I have an array of matrices ready, and presumably a uniform called 'index'. But how do I increase this index by one every time I encounter a new mesh?
Can you modify a uniform from within the shader? If so, how?
Can you modify a uniform from within the shader?
If you could, it wouldn't be uniform anymore, would it?
Furthermore, what you're wanting to do cannot be done even with Image Load/Store or SSBOs, both of which allow shaders to write data. It won't work because vertex shader invocations are not required to be executed sequentially. Many happen at the same time, and there's no way for any shader invocation to know that it will happen "after" the "first vertex" in a mesh.
The simplest way to deal with this is the obvious solution. Render each mesh individually, but set the uniforms for each mesh before each draw call. Without changing buffers between draws, of course. Uniform changes, while not exactly cheap, aren't the most expensive state changes that exist.
There are more complicated drawing methods that could allow you more performance. But that form is adequate for most needs. You've already done the hard part: you removed the need for any state change (textures, buffers, vertex formats, etc) except uniform state.
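A minimal sketch of that loop, assuming one shared VAO/VBO/IBO and a Mesh struct holding the per-mesh offsets (the names are illustrative, not from the question):
// One buffer set bound once; only the uniform changes between draws.
glUseProgram(program);
GLint modelLoc = glGetUniformLocation(program, "modelMatrix");
for (const Mesh& mesh : meshes)
{
    glUniformMatrix4fv(modelLoc, 1, GL_FALSE, glm::value_ptr(mesh.modelMatrix));
    // baseVertex lets all meshes share one index/vertex buffer (GL 3.2+)
    glDrawElementsBaseVertex(GL_TRIANGLES, mesh.indexCount, GL_UNSIGNED_INT,
        (void*)(mesh.firstIndex * sizeof(GLuint)), mesh.baseVertex);
}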
There are two approaches to minimizing draw calls: instancing and batching. The first, instancing, lets you draw multiple copies of the same mesh in one draw call, but it depends on the API version (it has been available since OpenGL 3.1); a sketch is given at the end of this answer. Batching is similar to instancing, but lets you draw different meshes. Both approaches share a restriction: the meshes must use the same material and shader.
If you want to draw different meshes from one VBO, then instancing is not an option. Batching instead requires keeping all meshes in one "big" VBO with their world transforms already applied. That is no problem for static meshes, but it is somewhat awkward for animated ones. Here is some pseudocode for a batching implementation:
struct SGeometry
{
    uint64_t offsetVB;
    uint64_t offsetIB;
    uint64_t sizeVB;
    uint64_t sizeIB;
    glm::mat4 oldTransform;
    glm::mat4 transform;
};
std::vector<SGeometry> cachedGeometries;
...
void CommitInstances()
{
    uint64_t vertexOffset = 0;
    uint64_t indexOffset = 0;
    for (auto instance : allInstances)
    {
        Copy(instance->Vertexes(), VBO);
        for (uint64_t i = 0; i < instance->Indices().size(); ++i)
        {
            // Rebase each index by the number of vertices already in the big VBO
            auto index = instance->Indices()[i];
            index += vertexOffset;
            IBO[indexOffset + i] = index;
        }
        cachedGeometries.push_back({vertexOffset, indexOffset,
                                    instance->Vertexes().size(),
                                    instance->Indices().size(),
                                    glm::mat4(1.0f), glm::mat4(1.0f)});
        vertexOffset += instance->Vertexes().size();
        indexOffset += instance->Indices().size();
    }
    Commit(VBO);
    Commit(IBO);
}
void ApplyTransform(glm::mat4 modelMatrix, uint64_t instanceId)
{
    SGeometry& geom = cachedGeometries[instanceId];
    glm::mat4 inverseOldTransform = glm::inverse(geom.oldTransform);
    VertexStream& stream = VBO->GetStream(Position, geom.offsetVB);
    for (uint64_t i = 0; i < geom.sizeVB; ++i)
    {
        glm::vec3 pos = stream.Get(i);
        // We need to revert the old absolute transformation before applying the new one
        pos = glm::vec3(inverseOldTransform * glm::vec4(pos, 1.0f));
        pos = glm::vec3(modelMatrix * glm::vec4(pos, 1.0f));
        stream.Set(i, pos);
    }
    geom.oldTransform = modelMatrix;
    // ... apply the matching transformation to the normals,
    // using the inverse transpose of the model matrix
}
GPU Gems 2 has a good article about geometry instancing http://www.amazon.com/GPU-Gems-Programming-High-Performance-General-Purpose/dp/0321335597
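And the promised instancing sketch, passing the per-instance model matrix as a vertex attribute (OpenGL 3.3's glVertexAttribDivisor; buffer names and attribute locations are illustrative):
// Upload one model matrix per instance into a dedicated VBO
glBindBuffer(GL_ARRAY_BUFFER, instanceMatrixVBO);
glBufferData(GL_ARRAY_BUFFER, instanceMatrices.size() * sizeof(glm::mat4),
             instanceMatrices.data(), GL_DYNAMIC_DRAW);

// A mat4 attribute occupies four consecutive vec4 slots (locations 3..6 here)
for (int col = 0; col < 4; ++col)
{
    glEnableVertexAttribArray(3 + col);
    glVertexAttribPointer(3 + col, 4, GL_FLOAT, GL_FALSE, sizeof(glm::mat4),
                          (void*)(col * sizeof(glm::vec4)));
    glVertexAttribDivisor(3 + col, 1); // advance once per instance, not per vertex
}
// The vertex shader then declares e.g. "layout(location = 3) in mat4 instanceModel;"
glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr,
                        (GLsizei)instanceMatrices.size());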
(Edit) I got geometry picking working with a framebuffer. My goal is to draw a huge scene in one draw call, but I need to render to a multisampled color texture attachment (GL_COLOR_ATTACHMENT0) and at the same time to a non-multisampled picking texture attachment (GL_COLOR_ATTACHMENT1). The problem is that if I use a multisampled texture for picking, the picking is corrupted by the multisampling.
I write the geometry ID in the fragment shader like this:
//...
// Given geometry ID
uniform int in_object_id;
// Drawn to the screen (GL_COLOR_ATTACHMENT0)
out vec4 out_frag_color0;
// Drawn to the pick texture (GL_COLOR_ATTACHMENT1)
out vec4 out_frag_color1;
// ...
void main() {
    out_frag_color0 = ...; // calculating lighting and other stuff
    //...
    // Split the ID into three bytes and store them in the RGB channels
    const int max_byte1 = 256;
    const int max_byte2 = 65536;
    const float fmax_byte = 255.0;
    int a1 = in_object_id % max_byte1;
    int a2 = (in_object_id / max_byte1) % max_byte1;
    int a3 = (in_object_id / max_byte2) % max_byte1;
    //out_frag_color0 = vec4(a3 / fmax_byte, a2 / fmax_byte, a1 / fmax_byte, 1);
    out_frag_color1 = vec4(a3 / fmax_byte, a2 / fmax_byte, a1 / fmax_byte, 1);
}
(The point of that code is to use the RGB channels to store the geometry ID, which is then read back and used to change the color of the picked cube.)
This happens when I move the cursor one pixel to the left:
It happens because of the alpha value of the cube's edge pixels:
Without multisampling it works well. But multisampling blends my output colors, so the geometry ID gets corrupted and a random cube matching the blended value is selected.
(Edit) I can't attach a multisampled texture target to color attachment 0 and a non-multisampled texture target to color attachment 1; it's not supported. How can I do this in one draw call?
Multisampling is not my friend, and I'm not sure I understand it (or framebuffers in general) well. Anyway, this way of picking geometry (encoding the ID into a color) looks horrible to me. Am I doing it right? How can I solve the multisampling problem? Is there a better way?
PS: Sorry for my poor English. :)
Thanks.
You can't do multisampled and non-multisampled rendering in a single draw call.
As you already found, using two color targets in an FBO, with only one of them being multisampled, is not supported. From the "Framebuffer Completeness" section in the spec:
The value of RENDERBUFFER_SAMPLES is the same for all attached renderbuffers; the value of TEXTURE_SAMPLES is the same for all attached textures; and, if the attached images are a mix of renderbuffers and textures, the value of RENDERBUFFER_SAMPLES matches the value of TEXTURE_SAMPLES.
You also can't render to multiple framebuffers at the same time. There is always one single current framebuffer.
The only reasonable option I can think of is to do picking in a separate pass. Then you can easily switch the framebuffer/attachment to a non-multisampled renderbuffer, and avoid all these issues.
Using a separate pass for picking seems cleaner to me anyway. This also allows you to use a specialized shader for each case, instead of always producing two outputs even if one of them is mostly unused.
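A rough sketch of such a picking pass (the FBO setup is omitted and the names are assumptions; the decode mirrors the byte split in the shader above):
// Render the scene once more into a non-multisampled FBO with the picking shader,
// then read back the pixel under the cursor and decode the ID.
glBindFramebuffer(GL_FRAMEBUFFER, pickingFBO);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
RenderSceneWithPickingShader(); // hypothetical: draws only the ID-encoding output

glReadBuffer(GL_COLOR_ATTACHMENT0);
unsigned char rgb[3] = {0};
// glReadPixels uses a bottom-left origin, hence the y flip
glReadPixels(cursorX, windowHeight - cursorY - 1, 1, 1,
             GL_RGB, GL_UNSIGNED_BYTE, rgb);
// The shader wrote (a3, a2, a1) into RGB, so reassemble the ID from the bytes
int objectId = rgb[0] * 65536 + rgb[1] * 256 + rgb[2];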
I think it is possible...
You can keep the picking texture multisampled; after rendering the scene, you render two triangles over the whole screen, and inside another fragment shader you can read out each sample individually. To do that you have to use the GLSL function:
texelFetch(sampler, pixelPosition /* in texels, [0 .. textureSize-1] */, sampleIndex /* important */);
Then you can render the result into a single-sampled texture and read the color via glReadPixels.
I haven't tested it, but I think it would work.
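For illustration, the fullscreen pass's fragment shader might look roughly like this (a sketch; for picking it can simply copy one sample instead of averaging, so the encoded ID stays exact):
#version 150
uniform sampler2DMS pick_texture;
out vec4 out_color;

void main() {
    // For a fullscreen pass, gl_FragCoord maps 1:1 to texels
    ivec2 texel = ivec2(gl_FragCoord.xy);
    // Fetch a single sample instead of letting the hardware resolve average them,
    // so the encoded ID is never blended with neighbouring geometry
    out_color = texelFetch(pick_texture, texel, 0);
}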
I have a strange issue where my normals just do not work when I render terrain. The terrain itself renders just fine, so I left out the code for calculating the terrain points from the height map and for building the indices. I know I should be using shaders, but I want to get this fixed before I move on. I assume the issue is something obvious I have overlooked in my normal-generation code, which is as follows:
for (currentind = 0; currentind < indices.size() - 3; currentind += 3)
{
    // points of the first index
    indtopt = indices[currentind] * 3;
    point1.vects[0] = terrainpoints[indtopt];     // x
    point1.vects[1] = terrainpoints[indtopt + 1]; // y
    point1.vects[2] = terrainpoints[indtopt + 2]; // z

    // points of the second index
    indtopt = indices[currentind + 1] * 3;
    point2.vects[0] = terrainpoints[indtopt];     // x
    point2.vects[1] = terrainpoints[indtopt + 1]; // y
    point2.vects[2] = terrainpoints[indtopt + 2]; // z

    // points of the third index
    indtopt = indices[currentind + 2] * 3;
    point3.vects[0] = terrainpoints[indtopt];     // x
    point3.vects[1] = terrainpoints[indtopt + 1]; // y
    point3.vects[2] = terrainpoints[indtopt + 2]; // z

    // two edge vectors of the triangle
    point4.vects[0] = point2.vects[0] - point1.vects[0];
    point4.vects[1] = point2.vects[1] - point1.vects[1];
    point4.vects[2] = point2.vects[2] - point1.vects[2];
    point5.vects[0] = point3.vects[0] - point2.vects[0];
    point5.vects[1] = point3.vects[1] - point2.vects[1];
    point5.vects[2] = point3.vects[2] - point2.vects[2];

    // cross product gives the face normal
    point6.vects[0] = point4.vects[1] * point5.vects[2] - point4.vects[2] * point5.vects[1];
    point6.vects[1] = point4.vects[2] * point5.vects[0] - point4.vects[0] * point5.vects[2];
    point6.vects[2] = point4.vects[0] * point5.vects[1] - point4.vects[1] * point5.vects[0];
    point6 = point6.normalize();

    ternormals[currentind]     = point6.vects[0];
    ternormals[currentind + 1] = point6.vects[1];
    ternormals[currentind + 2] = point6.vects[2];
}
Below is a picture of the issue in both wireframe and filled renders:
I can post more code if need be, but I wanted to keep this post short, so I tried to narrow it down to where I think the issue is.
Well, for every "dark" band you're accidentally flipping the normal, probably because the surface tangent vectors are passed into the cross product in the wrong order:
a × b = -(b × a)
If your terrain is made of triangle strips, then you've got a bidirectional ordering, which means that you have to either flip the operands or negate the result of the cross product for every odd row.
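For illustration, a minimal sketch of that fix on top of the code above (here the flip is per odd triangle; depending on how your strip rows are generated, the correct parity may instead be per row):
// After normalizing the face normal, negate it for every other triangle
// so all normals end up on the same side of the surface.
bool oddTriangle = ((currentind / 3) % 2) != 0;
if (oddTriangle)
{
    point6.vects[0] = -point6.vects[0];
    point6.vects[1] = -point6.vects[1];
    point6.vects[2] = -point6.vects[2];
}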
I am using the D3DXSPRITE method to draw my map tiles to the screen. I just added a zoom function which zooms in when you hold the up arrow, but I noticed you can now see gaps between the tiles. Here are some screenshots:
normal size (32x32 per tile)
zoomed in (you can see white gaps between the tiles)
zoomed out (even worse!)
Here's the code snippet where I translate and scale the world:
D3DXMATRIX matScale, matPos;
D3DXMatrixScaling(&matScale, zoom_, zoom_, 0.0f);
D3DXMatrixTranslation(&matPos, xpos_, ypos_, 0.0f);
device_->SetTransform(D3DTS_WORLD, &(matPos * matScale));
And this is my drawing of the map (tiles are in a vector of vectors of tiles, and I haven't done culling yet):
LayerInfo *p_linfo = NULL;
RECT rect = {0};
D3DXVECTOR3 pos;
pos.x = 0.0f;
pos.y = 0.0f;
pos.z = 0.0f;
for (short y = 0; y < BottomTile(); ++y)
{
    for (short x = 0; x < RightTile(); ++x)
    {
        for (int i = 0; i < TILE_LAYER_COUNT; ++i)
        {
            p_linfo = tile_grid_[y][x].Layer(i);
            if (p_linfo->Visible())
            {
                p_linfo->GetTextureRect(&rect);
                sprite_batch->Draw(
                    p_engine_->GetTexture(p_linfo->texture_id),
                    &rect, NULL, &pos, 0xFFFFFFFF);
            }
        }
        pos.x += p_engine_->TileWidth();
    }
    pos.x = 0;
    pos.y += p_engine_->TileHeight();
}
Your texture indices are wrong. 0,0,32,32 is not the correct value; it should be 0,0,31,31. A zero-based index into your texture atlas of 256 pixels yields values of 0 to 255, not 0 to 256, and a 32x32 texture should yield 0,0,31,31. In this case, the colour of the incorrect pixels depends on the colour of the next texture to the right and below.
That's the problem of magnification and minification. Your textures should have an invisible border populated with part of the adjacent texture. Then the magnification and minification filters will use that border to calculate the colour of edge pixels rather than the default (white) colour.
I also had a similar problem with texture mapping. What worked for me was changing the texture address mode in the sampler state description. The texture address mode controls what Direct3D does with texture coordinates outside the [0.0f, 1.0f] range: I changed the ADDRESS_U, ADDRESS_V, and ADDRESS_W members to D3D11_TEXTURE_ADDRESS_CLAMP, which clamps all out-of-range texture coordinates to the [0.0f, 1.0f] range.
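For reference, a minimal sketch of that sampler state setup in D3D11 (an existing device and context are assumed; error handling omitted):
D3D11_SAMPLER_DESC samplerDesc = {};
samplerDesc.Filter   = D3D11_FILTER_MIN_MAG_MIP_LINEAR;
samplerDesc.AddressU = D3D11_TEXTURE_ADDRESS_CLAMP; // clamp out-of-range
samplerDesc.AddressV = D3D11_TEXTURE_ADDRESS_CLAMP; // texture coordinates
samplerDesc.AddressW = D3D11_TEXTURE_ADDRESS_CLAMP; // to [0.0f, 1.0f]
samplerDesc.ComparisonFunc = D3D11_COMPARISON_NEVER;
samplerDesc.MaxLOD = D3D11_FLOAT32_MAX;

ID3D11SamplerState* samplerState = nullptr;
device->CreateSamplerState(&samplerDesc, &samplerState);
context->PSSetSamplers(0, 1, &samplerState);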
After a long time searching and testing other people's solutions, I found these rules to be the most complete I've ever read:
pixel-perfect-2d from the official Unity website
In addition, from my own experience: if a sprite's PPI is 72 (for example), try using a higher PPI for that image (96 or more). It makes the sprite denser and leaves no room for white gaps to show up.
Welcome to the world of floating point. Those gaps exist due to imperfections in floating-point math.
You might be able to improve the situation by being really careful with your floating-point math, but those seams will be there unless you make one whole mesh out of your terrain.
It's the rasterizer: given the view and projection matrices and the vertex positions, it is slightly off. You may be able to improve on that, but I don't know how successful you'll be.
Instead of drawing separate quads, you can index only the visible vertices that make up your terrain and use texture-tiling techniques to paint different tiles onto it. I believe that won't give you the ugly seam, because in that case there technically isn't one.
I am using Nvidia Cg with Direct3D9 and have a question about the following code.
It compiles, but doesn't "load" (using a cgLoadProgram wrapper), and the resulting failure is described simply as "D3D failure happened".
It's part of a pixel shader compiled with shader model 3.0.
What may be interesting is that this shader loads fine in the following cases:
1) Manually unrolling the while statement (into many if { } statements).
2) Removing the line with the tex2D call in the loop.
3) Switching to shader model 2_X and manually unrolling the loop.
Problematic part of the shader code:
float2 tex = float2(1, 1);
float2 dtex = float2(0.01, 0.01);
float h = 1.0 - tex2D(height_texture1, tex);
float height = 1.00;
while (h < height)
{
    height -= 0.1;
    tex += dtex;
    // Remove the next line and it works (not as expected,
    // of course)
    h = tex2D(height_texture1, tex);
}
If someone knows why this can happen, or could test similar code in a non-Cg environment, or could help me in some other way, I'm waiting for you ;)
Thanks.
I think you need to determine the gradients before the loop using ddx/ddy on the texture coordinates, and then use tex2D(sampler2D samp, float2 s, float2 dx, float2 dy).
The GPU always renders quads, not pixels (even on polygon borders; superfluous pixels are discarded by the render backend). This is done because it allows the GPU to always calculate the screen-space texture derivatives, even when you use calculated texture coordinates: it just takes the difference between the values at the pixel centers.
But this doesn't work with dynamic branching like the code in the question, because the shader processors at the individual pixels can diverge in control flow. So you need to calculate the derivatives manually via ddx/ddy before the program flow can diverge.
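Applied to the loop from the question, that would look roughly like this (untested; a sketch following the suggestion above):
float2 tex = float2(1, 1);
float2 dtex = float2(0.01, 0.01);
// Compute the screen-space gradients while control flow is still uniform
float2 dx = ddx(tex);
float2 dy = ddy(tex);
float h = 1.0 - tex2D(height_texture1, tex);
float height = 1.00;
while (h < height)
{
    height -= 0.1;
    tex += dtex;
    // Explicit gradients, so no implicit derivatives are needed inside the branch
    h = tex2D(height_texture1, tex, dx, dy);
}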