I am trying to compute a sum of absolute differences within my shader and write the single result back to a uniform float in Unity.
In the shader I have 2 geometries with the same number of vertices that map one to one.
// subtract vertices
float norm = 10;
float error=infereCrater.vertex.y-v.vertex.y;
error = error*error*norm;
o.debugColor = float3(error,1-error ,0.0f);
//////
o.posWorld =mul(_Object2World,v.vertex);
o.normalWorld = normalize(mul(float4(v.normal,0.0),_World2Object).xyz);
o.tangentWorld = normalize(mul(float4(v.tangent,0.0),_World2Object).xyz);
o.binormalWorld = cross(o.normalWorld,o.tangentWorld);
o.tex = v.texcoord;
o.pos = mul(UNITY_MATRIX_MVP,v.vertex);
TRANSFER_VERTEX_TO_FRAGMENT(o);
return o;
}
I am able to calculate the error for each individual vertex and change the color of the surface based on the difference.
I hit a roadblock: I don't know how to sync all the threads and start adding up the values.
Is there a way to call another vertex shader after the first one is done?
How can the vertex shader read the values of the vertices adjacent to it? (I don't think it's possible, because they live in each thread's local memory.)
Or is it possible to have a global array to store the difference values, copy it to the CPU (which I'd rather avoid because of latency), and add them up on the CPU?
I don't want to use a compute shader because I am not on Windows.
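As a rough illustration of the reduction being asked about (not a Unity API), here is a minimal C++ sketch that emulates on the CPU what a multi-pass GPU reduction would do: compute one error value per vertex pair, then add values pairwise so that each pass halves the number of live values. The names perVertexError and reduceSum are made up for this sketch, and the error formula simply mirrors the shader snippet above (squared difference times norm).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-vertex error, mirroring the shader: (a.y - b.y)^2 * norm.
std::vector<float> perVertexError(const std::vector<float>& ya,
                                  const std::vector<float>& yb,
                                  float norm) {
    assert(ya.size() == yb.size());  // the two geometries map one to one
    std::vector<float> err(ya.size());
    for (std::size_t i = 0; i < ya.size(); ++i) {
        const float d = ya[i] - yb[i];
        err[i] = d * d * norm;
    }
    return err;
}

// Pairwise reduction: every pass folds the upper half onto the lower half,
// which is the access pattern a multi-pass GPU reduction would use.
float reduceSum(std::vector<float> values) {
    std::size_t n = values.size();
    if (n == 0) return 0.0f;
    while (n > 1) {
        const std::size_t half = (n + 1) / 2;
        for (std::size_t i = 0; i + half < n; ++i) {
            values[i] += values[i + half];
        }
        n = half;
    }
    return values[0];
}
```

Without compute shaders, the same halving pattern could in principle be expressed as successive render passes into smaller and smaller render targets; that mapping is an assumption here and has not been tried in Unity.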
I have many Esri Grid files (https://en.wikipedia.org/wiki/Esri_grid#ASCII) and I would like to render them in 3D without losing precision. I am using OpenSceneGraph.
The problem is that these grids are around 1000x1000 points (or more), so when I extract the vertices and then compute the triangles to create the geometry, I end up with millions of them and interacting with the scene is impossible (the frame rate drops to 0).
I've tried several approaches:
Triangle list
Basically, as I read the file, I fill an array with 3 vertices per triangle (this leads to duplication):
osg::ref_ptr<osg::Geode> l_pGeodeSurface = new osg::Geode;
osg::ref_ptr<osg::Geometry> l_pGeometrySurface = new osg::Geometry;
osg::ref_ptr<osg::Vec3Array> l_pvTrianglePoints = new osg::Vec3Array;
osg::ref_ptr<osg::Vec3Array> l_pvOriginalPoints = new osg::Vec3Array;
... // Read the file and fill l_pvOriginalPoints
for(*triangle inside the file*)
{
    ... // Compute correct triangle indices (l_iP1, l_iP2, l_iP3)
    // Push triangle vertices inside the array
    l_pvTrianglePoints->push_back(l_pvOriginalPoints->at(l_iP1));
    l_pvTrianglePoints->push_back(l_pvOriginalPoints->at(l_iP2));
    l_pvTrianglePoints->push_back(l_pvOriginalPoints->at(l_iP3));
}
l_pGeometrySurface->setVertexArray(l_pvTrianglePoints);
// DrawArrays takes (mode, first, count); a fourth argument would be the instance count
l_pGeometrySurface->addPrimitiveSet(new osg::DrawArrays(GL_TRIANGLES, 0, l_pvTrianglePoints->size()));
Indexed triangle list
Same as before, but the array contains every vertex just once and I create a second array of indices (basically I tell OSG how to build the triangles, with no duplication):
osg::ref_ptr<osg::Geode> l_pGeodeSurface = new osg::Geode;
osg::ref_ptr<osg::Geometry> l_pGeometrySurface = new osg::Geometry;
osg::ref_ptr<osg::DrawElementsUInt> l_pIndices = new osg::DrawElementsUInt(osg::PrimitiveSet::TRIANGLES, *number of indices*);
osg::ref_ptr<osg::Vec3Array> l_pvOriginalPoints = new osg::Vec3Array;
... // Read the file and fill l_pvOriginalPoints
// Advance by 3 so that each iteration writes exactly one triangle
for(i = 0; i < *number of indices*; i += 3)
{
    ... // Compute correct triangle indices (l_iP1, l_iP2, l_iP3)
    // Push vertex indices inside the array
    l_pIndices->at(i) = l_iP1;
    l_pIndices->at(i+1) = l_iP2;
    l_pIndices->at(i+2) = l_iP3;
}
l_pGeometrySurface->setVertexArray(l_pvOriginalPoints);
l_pGeometrySurface->addPrimitiveSet(l_pIndices.get());
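For a regular grid, the "compute correct triangle indices" step above can be written down directly. The following is a minimal sketch in plain C++ (no OSG types), assuming the points are stored row by row and each grid cell is split into two triangles; buildGridIndices is a hypothetical helper name, not an OSG function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build triangle indices for a rows x cols grid stored in row-major order.
// Each cell becomes two triangles, i.e. 6 indices per cell.
std::vector<std::uint32_t> buildGridIndices(std::uint32_t rows, std::uint32_t cols) {
    assert(rows >= 2 && cols >= 2);
    std::vector<std::uint32_t> indices;
    indices.reserve(static_cast<std::size_t>(rows - 1) * (cols - 1) * 6);
    for (std::uint32_t r = 0; r + 1 < rows; ++r) {
        for (std::uint32_t c = 0; c + 1 < cols; ++c) {
            const std::uint32_t i0 = r * cols + c;  // this row, this column
            const std::uint32_t i1 = i0 + 1;        // this row, next column
            const std::uint32_t i2 = i0 + cols;     // next row, this column
            const std::uint32_t i3 = i2 + 1;        // next row, next column
            indices.push_back(i0); indices.push_back(i2); indices.push_back(i1);
            indices.push_back(i1); indices.push_back(i2); indices.push_back(i3);
        }
    }
    return indices;
}
```

For a 1000x1000 grid this gives 1,000,000 shared vertices and 5,988,006 indices (1,996,002 triangles), whereas the plain triangle list needs 5,988,006 full vertices. The triangle count is the same either way, which is why indexing alone saves memory but does not reduce the amount of geometry drawn.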
Instancing
This was a bit of an experiment. Since I've never used shaders, I thought I could instance a single triangle and then manipulate its coordinates in a vertex shader for every triangle in my scene, using transformation matrices (passed as a uniform array, one per triangle). I ended up with too many uniforms even with a 20x20 grid.
I used these links as a reference:
https://learnopengl.com/Advanced-OpenGL/Instancing,
https://books.google.it/books?id=x_RkEBIJeFQC&pg=PT265&lpg=PT265&dq=osg+instanced+geometry&source=bl&ots=M8ii8zn8w7&sig=ACfU3U0_92Z5EGCyOgbfGweny4KIUfqU8w&hl=en&sa=X&ved=2ahUKEwj-7JD0nq7qAhUXxMQBHcLaAiUQ6AEwAnoECAkQAQ#v=onepage&q=osg%20instanced%20geometry&f=false
None of the above solved my issue. What else can I try? Am I missing something in terms of rendering techniques? I thought it was a fairly simple task, but I'm kind of stuck.
I feel like you should consider taking a step back. If you're visualizing GIS-based terrain data, osgEarth is designed for exactly this and has fairly efficient LOD tools for large terrains. Do you need the data always represented at full LOD, or are you looking for dynamic LOD to improve the frame rate?
Depending on your goals and requirements, you might want to look at some more advanced terrain rendering techniques, such as heightfield tracing. If the terrain is always static, you can precompute quadtrees and signed distance functions and trace against the heightfield.
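To make the LOD idea concrete, here is a deliberately naive C++ sketch that builds a coarser level of a height grid by keeping every step-th sample. This is not how osgEarth works; HeightGrid and subsample are hypothetical names, and a coarse level like this discards detail, so it would only be suitable for distant views where full precision is not visible anyway.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct HeightGrid {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<float> heights;  // row-major, rows * cols values
};

// Keep every `step`-th sample in both directions to get a coarser level.
HeightGrid subsample(const HeightGrid& src, std::size_t step) {
    assert(step >= 1);
    assert(src.heights.size() == src.rows * src.cols);
    HeightGrid dst;
    dst.rows = (src.rows + step - 1) / step;  // ceil(rows / step)
    dst.cols = (src.cols + step - 1) / step;  // ceil(cols / step)
    dst.heights.reserve(dst.rows * dst.cols);
    for (std::size_t r = 0; r < src.rows; r += step) {
        for (std::size_t c = 0; c < src.cols; c += step) {
            dst.heights.push_back(src.heights[r * src.cols + c]);
        }
    }
    return dst;
}
```

With step = 4, a 1000x1000 grid shrinks to 250x250, which is roughly a 16x reduction in triangles for that level.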
I am running a marching cubes algorithm in a compute shader. The vertices generated by the compute stage will be the input to the vertex stage.
Compute -> Vertices -> Render
There is no way of knowing in advance how many vertices the compute stage will output, so I need a storage buffer looking something like this:
layout(set = 1, binding = 0) buffer Count{
int value;
} count;
layout(set = 2, binding = 0) buffer Mesh {
vec4 vertices[1<<15];
} mesh;
The vertices do not need a roundtrip to the CPU, but the count is a variable used by the vkCmdDraw command. So I need to put the count buffer in host visible memory, map that memory and do a memcpy after the compute stage. Is this a good way of solving this problem or is there some other way where I don't have to read back data to the CPU?
Well, this is exactly what vkCmdDrawIndirect is for. The vertex count is stored in a VkBuffer, which makes the CPU round-trip unnecessary.
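A hedged sketch of what the GPU-side count could look like: the indirect draw reads a VkDrawIndirectCommand from the buffer, and its first field is the vertex count. The C++ below uses a local mirror of that struct so it compiles without the Vulkan headers; initialCommand and appendTriangle are hypothetical helpers that only emulate what the compute shader would do (for example with atomicAdd on vertexCount).

```cpp
#include <cassert>
#include <cstdint>

// Local mirror of VkDrawIndirectCommand: four 32-bit unsigned fields.
struct DrawIndirectCommand {
    std::uint32_t vertexCount;
    std::uint32_t instanceCount;
    std::uint32_t firstVertex;
    std::uint32_t firstInstance;
};
static_assert(sizeof(DrawIndirectCommand) == 16,
              "expected four tightly packed 32-bit fields");

// Contents to place in the buffer before the compute dispatch:
// no vertices yet, one instance.
DrawIndirectCommand initialCommand() {
    return DrawIndirectCommand{0u, 1u, 0u, 0u};
}

// Emulates the compute stage reserving room for one triangle.
// Returns the index of the first of the three reserved vertices.
std::uint32_t appendTriangle(DrawIndirectCommand& cmd) {
    const std::uint32_t base = cmd.vertexCount;
    cmd.vertexCount += 3;
    return base;
}

// After the dispatch, the draw would be recorded as:
//   vkCmdDrawIndirect(commandBuffer, buffer, offset, 1, sizeof(VkDrawIndirectCommand));
```

The buffer would need VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT in addition to storage-buffer usage, and a barrier between the compute write and the indirect read, so the count never has to be mapped on the host.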
So I wanted to store all my meshes in one large VBO. The problem is, how do you issue just one draw call but let every mesh have its own model-to-world matrix?
My idea was to submit an array of matrices to a uniform before drawing. In the VBO I would make the color of the first vertex of every mesh negative (so I'd be using the sign bit to check whether a vertex is the first of a mesh).
Okay, so I can detect when a new mesh has started and I have an array of matrices ready and probably a uniform called 'index'. But how do I increase this index by one every time I encounter a new mesh?
Can you modify a uniform from within the shader? If so, how?
Can you modify a uniform from within the shader?
If you could, it wouldn't be uniform anymore, would it?
Furthermore, what you're wanting to do cannot be done even with Image Load/Store or SSBOs, both of which allow shaders to write data. It won't work because vertex shader invocations are not required to be executed sequentially. Many happen at the same time, and there's no way for any shader invocation to know that it will happen "after" the "first vertex" in a mesh.
The simplest way to deal with this is the obvious solution. Render each mesh individually, but set the uniforms for each mesh before each draw call. Without changing buffers between draws, of course. Uniform changes, while not exactly cheap, aren't the most expensive state changes that exist.
There are more complicated drawing methods that could allow you more performance. But that form is adequate for most needs. You've already done the hard part: you removed the need for any state change (textures, buffers, vertex formats, etc) except uniform state.
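A minimal sketch of the bookkeeping behind "one VBO, one draw call per mesh", assuming non-indexed triangle meshes laid out back to back. MeshRange and packMeshes are hypothetical names; the OpenGL calls are shown only as comments because they need a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MeshRange {
    int firstVertex;  // offset of the mesh inside the shared vertex buffer
    int vertexCount;  // number of vertices that belong to the mesh
};

// Lay the meshes out back to back and remember where each one starts.
std::vector<MeshRange> packMeshes(const std::vector<int>& vertexCounts) {
    std::vector<MeshRange> ranges;
    ranges.reserve(vertexCounts.size());
    int first = 0;
    for (int count : vertexCounts) {
        assert(count >= 0);
        ranges.push_back(MeshRange{first, count});
        first += count;
    }
    return ranges;
}

// Per frame, with the shared VBO/VAO bound once:
//   for (std::size_t i = 0; i < ranges.size(); ++i) {
//       glUniformMatrix4fv(modelLocation, 1, GL_FALSE, modelMatrices[i]);
//       glDrawArrays(GL_TRIANGLES, ranges[i].firstVertex, ranges[i].vertexCount);
//   }
```

Only the matrix uniform changes between draws; the buffer bindings stay untouched, which is the cheap case described above.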
There are two approaches to minimizing draw calls: instancing and batching. The first (instancing) allows you to draw multiple copies of the same mesh in one draw call, but it depends on the API (it is available from OpenGL 3.1). Batching is similar to instancing but allows you to draw different meshes. Both approaches have a restriction: the meshes must share the same material and shaders.
If you want to draw different meshes from one VBO, then instancing is not an option. Batching requires keeping all meshes in one 'big' VBO with the world transform already applied. That is not a problem for static meshes, but it is inconvenient for animated ones. Here is some pseudocode for a batching implementation:
struct SGeometry
{
    uint64_t offsetVB;
    uint64_t offsetIB;
    uint64_t sizeVB;
    uint64_t sizeIB;
    glm::mat4 oldTransform;
    glm::mat4 transform;
};
std::vector<SGeometry> cachedGeometries;
...
void CommitInstances()
{
    uint64_t vertexOffset = 0;
    uint64_t indexOffset = 0;
    for (auto* instance : allInstances)
    {
        // Append this mesh's vertices after the ones already in the VBO
        Copy(instance->Vertexes(), VBO);
        for (uint64_t i = 0; i < instance->Indices().size(); ++i)
        {
            // Rebase each index by the number of vertices written so far
            IBO[indexOffset + i] = instance->Indices()[i] + vertexOffset;
        }
        // Vertices start untransformed, so both transforms begin as identity
        cachedGeometries.push_back({vertexOffset, indexOffset,
                                    instance->Vertexes().size(), instance->Indices().size(),
                                    glm::mat4(1.0f), glm::mat4(1.0f)});
        vertexOffset += instance->Vertexes().size();
        indexOffset += instance->Indices().size();
    }
    Commit(VBO);
    Commit(IBO);
}
void ApplyTransform(glm::mat4 modelMatrix, uint64_t instanceId)
{
    SGeometry& geom = cachedGeometries[instanceId];
    glm::mat4 inverseOldTransform = glm::inverse(geom.oldTransform);
    VertexStream& stream = VBO->GetStream(Position, geom.offsetVB);
    for (uint64_t i = 0; i < geom.sizeVB; ++i)
    {
        glm::vec3 pos = stream.Get(i);
        // We need to revert the previous absolute transformation before applying the new one
        pos = glm::vec3(inverseOldTransform * glm::vec4(pos, 1.0f));
        pos = glm::vec3(modelMatrix * glm::vec4(pos, 1.0f));
        stream.Set(i, pos);
    }
    // .. Apply normal transformation
    // Remember the transform so the next call can revert it
    geom.oldTransform = modelMatrix;
}
GPU Gems 2 has a good article about geometry instancing http://www.amazon.com/GPU-Gems-Programming-High-Performance-General-Purpose/dp/0321335597
I'm attempting to hack and modify several rendering features of an old OpenGL fixed-pipeline game by hooking into OpenGL calls, and my current mission is to implement shader lighting. I've already created a shader program that lights most of my objects correctly, but this game's terrain is drawn with no normal data provided.
The game calls:
void glVertexPointer(GLint size, GLenum type, GLsizei stride, const GLvoid * pointer);
and
void glDrawElements(GLenum mode, GLsizei count, GLenum type, const GLvoid * indices);
to define and draw the terrain, so I have both of these functions hooked. I hope to loop through the vertex array at the given pointer and calculate normals for each surface, on either every glDrawElements call or every glVertexPointer call, but I'm having trouble coming up with an approach. Specifically, I don't know how to read, iterate over, and interpret the data at the pointer. In this case, the usual parameters for the glVertexPointer calls are size = 3, type = GL_FLOAT, stride = 16, pointer = some pointer. When hooking glVertexPointer, I don't know how to iterate through the pointer and grab all the vertices of the mesh, since I don't know the total vertex count, nor do I understand how the data is laid out at the pointer given the stride, or how I should structure the normal array.
Would it be a better idea to calculate the normals in glDrawElements for each index specified in the index array?
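On the layout question: with size = 3, type = GL_FLOAT and stride = 16, vertex i starts i * 16 bytes after the pointer and its first 12 bytes are x, y, z (the remaining 4 bytes belong to something else). A minimal C++ sketch of reading that, assuming client-side arrays (if a buffer object is bound, the pointer is an offset into it rather than an address) and assuming 16-bit indices; readVertex and vertexCountFromIndices are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Vec3 { float x, y, z; };

// The highest index used by a draw call, plus one, bounds how many
// vertices that call can touch. Assumes GL_UNSIGNED_SHORT indices.
std::size_t vertexCountFromIndices(const std::vector<std::uint16_t>& indices) {
    std::size_t count = 0;
    for (std::uint16_t idx : indices) {
        const std::size_t needed = static_cast<std::size_t>(idx) + 1;
        if (needed > count) count = needed;
    }
    return count;
}

// Read vertex i from data laid out as glVertexPointer(3, GL_FLOAT, stride, pointer).
// A stride of 0 means tightly packed, i.e. 3 * sizeof(float).
Vec3 readVertex(const void* pointer, std::size_t stride, std::size_t i) {
    if (stride == 0) stride = 3 * sizeof(float);
    const unsigned char* base = static_cast<const unsigned char*>(pointer);
    Vec3 v;
    std::memcpy(&v, base + i * stride, sizeof(Vec3));  // copy x, y, z only
    return v;
}
```

This suggests doing the work in the glDrawElements hook, where both the index array and the last pointer and stride passed to glVertexPointer are known.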
Depending on how your vertex array is built, the indices may be the only information you need to build your normals.
Defining an averaged normal for a vertex is simple if you add a normal field to your vertex array and sum all the normals calculated while parsing your index array.
You then have to divide each normal sum by the number of times the vertex appears in the indices; you can keep that count in a temporary array indexed by vertex (incremented each time a normal is added to the vertex).
So, to be clearer:
Vertex[vertexCount]: {Pos,Normal}
normalCount[vertexCount]: int count
Indices[indicesCount]: int vertexIndex
You may have 6 normals per vertex, so add a temporary array of normals per vertex in order to average them:
NormalTemp[vertexCount][6] {x,y,z}
Then parse your index array (if it holds triangles):
for i=0 to indicesCount step 3
  for each triangle corner (t from 0 to 2)
    NormalTemp[indices[i+t]][normalCount[indices[i+t]]] = normal calculated as the cross product of the two edge vectors from this corner to the other two vertices of the triangle
    normalCount[indices[i+t]]++
Then divide your sums by the count:
for i=0 to vertexCount step 1
  sum = 0
  for j=0 to normalCount[i] step 1
    sum += NormalTemp[i][j]
  normal[i] = sum / normalCount[i]
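The pseudocode above can be condensed into a short C++ sketch. Note one deliberate swap: instead of storing up to 6 normals per vertex and dividing by the count, this version accumulates the face normals directly into one sum per vertex and normalizes at the end, which gives a unit-length result without the temporary array. Float3 and computeVertexNormals are hypothetical names, and 32-bit indices are assumed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Float3 { float x, y, z; };

static Float3 sub(const Float3& a, const Float3& b) {
    return Float3{a.x - b.x, a.y - b.y, a.z - b.z};
}

static Float3 cross(const Float3& a, const Float3& b) {
    return Float3{a.y * b.z - a.z * b.y,
                  a.z * b.x - a.x * b.z,
                  a.x * b.y - a.y * b.x};
}

static void accumulate(Float3& dst, const Float3& n) {
    dst.x += n.x; dst.y += n.y; dst.z += n.z;
}

// One smoothed normal per vertex, from positions and a triangle index list.
std::vector<Float3> computeVertexNormals(const std::vector<Float3>& positions,
                                         const std::vector<std::uint32_t>& indices) {
    std::vector<Float3> normals(positions.size(), Float3{0.0f, 0.0f, 0.0f});
    for (std::size_t i = 0; i + 2 < indices.size(); i += 3) {
        const std::uint32_t a = indices[i], b = indices[i + 1], c = indices[i + 2];
        assert(a < positions.size() && b < positions.size() && c < positions.size());
        // Face normal from two edges; its length is proportional to the area.
        const Float3 n = cross(sub(positions[b], positions[a]),
                               sub(positions[c], positions[a]));
        accumulate(normals[a], n);
        accumulate(normals[b], n);
        accumulate(normals[c], n);
    }
    for (Float3& n : normals) {
        const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
        if (len > 0.0f) { n.x /= len; n.y /= len; n.z /= len; }
    }
    return normals;
}
```

Because the cross products are not normalized before summing, larger triangles weigh more in the average; whether that is desirable for this terrain is a judgment call.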
While I like and have upvoted j-p's answer, I would still like to point out that you could get away with calculating one normal per face and using it for all 3 vertices. It would be faster and easier, and sometimes even more accurate.
My goal was to color the vertices according to their order.
EDIT: long-term goal: access the preceding and following vertices to simulate gravity behavior.
I've used the following code:
#version 120
#extension GL_EXT_geometry_shader4 : enable
void main( void ) {
for( int i = 0 ; i < gl_VerticesIn ; i++ ) {
gl_FrontColor = vec4(float(i)/float(gl_VerticesIn),0.0,0.0,1.0);
gl_Position = gl_PositionIn[i];
EmitVertex();
}
}
But all vertices are drawn black; it seems that i is always evaluated as 0. Am I missing something or doing it wrong?
EDIT: figured out the meta-problem: how to feed all my model geometry into a single geometry shader call, so that the main loop iterates over all the vertices rather than running once for every triangle.
You can't make a single geometry shader invocation iterate over all your vertices; the geometry shader is invoked once for every input primitive (point, line, triangle, ...).
The solution is much easier: In the vertex shader (that is actually called for every vertex) you can read the special variable gl_VertexID, which contains the vertex's index. That index is either just a counter incremented for every vertex (if using glDrawArrays) and reset by every draw call, or the index from the index array (if using glDrawElements).
EDIT: Regarding the long-term goal: not directly, but you might use a texture buffer for that. This basically gives you direct linear array access to a buffer object (in your case the vertex buffer), which you can then index with this vertex index. There might also be other ways to accomplish that, but those would be better suited to a separate question.
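To illustrate which values gl_VertexID takes, here is a small C++ emulation (CPU only, no GL calls). The helper names are hypothetical, and the total vertex count used for the color is assumed to be passed to the shader as a uniform, since the shader has no built-in way to know it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// IDs seen by the vertex shader for glDrawArrays(mode, first, count):
// consecutive values starting at `first`.
std::vector<std::uint32_t> vertexIdsForDrawArrays(std::uint32_t first,
                                                  std::uint32_t count) {
    std::vector<std::uint32_t> ids;
    ids.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i) ids.push_back(first + i);
    return ids;
}

// IDs seen for glDrawElements: the index values themselves, in draw order.
std::vector<std::uint32_t> vertexIdsForDrawElements(
        const std::vector<std::uint32_t>& indices) {
    return indices;
}

// Red channel a vertex shader could derive from its ID,
// i.e. float(gl_VertexID) / float(totalVertices).
float redFromVertexId(std::uint32_t id, std::uint32_t totalVertices) {
    assert(totalVertices > 0);
    return static_cast<float>(id) / static_cast<float>(totalVertices);
}
```

With indexed drawing, a vertex shared by several triangles keeps the same ID everywhere it is used, so the color follows the position of the vertex in the buffer rather than the order of the triangles.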