Do the vertex and fragment shaders run over all points or just the ones left after clipping? - opengl

I'm trying to wrap my head around the GPU pipeline and the performance implications...
I create a coordinate system and put a million vertices on it; all of them are now in memory usable by the GPU. I assume the performance hit of this step is moving all the floating-point values into GPU memory, given that the points were already created.
Then I transform my million points coordinates into clipping coordinates. Here I’m applying a transformation to each point.
As a result of this transformation some points are now outside of the clip coordinates; let’s say only a thousand points are in. Does the vertex shader run on the thousand or on all million points? What about the fragment shader? And the building of triangles? Does the transformation into the final device coordinates only take the thousand points?
My guess is that the vertex shader runs on all of them but the fragment shader only on the interpolation of the visible vertices.
Is the only possible optimization then to include as few vertices as possible in the first place? If I’m looking at a full 3D world with buildings, trees, roads... and then zoom in on just one rock, I’m running all the shaders on all the objects anyway... so would the only solution be to not put those trees and buildings there in the first place? Or can I have this world in GPU memory but just compute the rock? Could I apply the transformation of coordinates to just the rock somehow? Where in the pipeline do techniques like GPU culling, level of detail or dynamic tessellation take place?

Vertex shaders are executed for each vertex submitted with the glDrawArrays and glDrawElements function families, possibly even multiple times per vertex. The transformed vertices are then assembled into primitives and clipped; if a primitive is entirely outside the view volume, its processing ends there. To reduce the overhead of processing vertices of objects outside the viewport, multiple techniques are employed. The simplest one is "frustum culling": submit the object for rendering only if its bounding box intersects the camera frustum.
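To make the frustum culling idea concrete, here is a minimal CPU-side sketch in Python. It assumes the frustum is given as six planes `(a, b, c, d)` with inward-pointing normals; the names `aabb_intersects_frustum`, `UNIT_FRUSTUM` and the `objects` table are made up for illustration, and a real application would extract the planes from its own camera matrices.

```python
def aabb_outside_plane(box_min, box_max, plane):
    """True if the box lies entirely on the negative side of the plane."""
    a, b, c, d = plane
    # Pick the box corner farthest along the plane normal (the "positive vertex").
    px = box_max[0] if a >= 0 else box_min[0]
    py = box_max[1] if b >= 0 else box_min[1]
    pz = box_max[2] if c >= 0 else box_min[2]
    return a * px + b * py + c * pz + d < 0


def aabb_intersects_frustum(box_min, box_max, planes):
    """Conservative test: False only if the box is fully outside some plane."""
    return not any(aabb_outside_plane(box_min, box_max, p) for p in planes)


# Hypothetical frustum: the cube [-1, 1]^3. Normals point inward, so a point
# is inside when a*x + b*y + c*z + d >= 0 holds for every plane.
UNIT_FRUSTUM = [
    (1, 0, 0, 1), (-1, 0, 0, 1),
    (0, 1, 0, 1), (0, -1, 0, 1),
    (0, 0, 1, 1), (0, 0, -1, 1),
]

# Hypothetical scene: name -> (box_min, box_max)
objects = {
    "rock": ((-0.5, -0.5, -0.5), (0.5, 0.5, 0.5)),
    "tree": ((2.0, 0.0, 0.0), (3.0, 1.0, 1.0)),
}

# Only objects that pass the test would be submitted for drawing.
visible = [name for name, (lo, hi) in objects.items()
           if aabb_intersects_frustum(lo, hi, UNIT_FRUSTUM)]
```

The test is conservative: it can keep a box that is actually just outside a frustum corner, but it never rejects a visible one, which is the safe direction for culling.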
Fragment shaders are executed for each fragment ("pixel") in the framebuffer that passes the depth test. One way to reduce their count is to render front to back, so that fragments hidden behind already drawn surfaces fail the depth test and are never shaded.
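The effect of draw order can be illustrated with a toy model of a single pixel. This sketch assumes the depth test happens before shading with a "less than" comparison (as with an early depth test); `count_shaded_fragments` is a hypothetical helper, not an OpenGL call.

```python
def count_shaded_fragments(depths):
    """Count fragment-shader runs for one pixel, given depths in draw order.

    Assumption: a fragment is shaded only if it is closer than the value
    currently stored in the depth buffer.
    """
    depth_buffer = float("inf")
    shaded = 0
    for z in depths:
        if z < depth_buffer:   # passes the depth test
            depth_buffer = z
            shaded += 1        # the fragment shader runs for this fragment
    return shaded


layers = [0.9, 0.2, 0.5, 0.7]   # depths of four overlapping surfaces
front_to_back = count_shaded_fragments(sorted(layers))
back_to_front = count_shaded_fragments(sorted(layers, reverse=True))
```

With these four layers, front-to-back order shades one fragment for the pixel, while back-to-front order shades all four.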


How to multiply vertices with model matrix outside the vertex shader

I am using OpenGL ES2 to render a fairly large number of mostly 2d items and so far I have gotten away by sending a premultiplied model/view/projection matrix to the vertex shader as a uniform and then multiplying my vertices with the resulting MVP in there.
All items are batched using texture atlases and I use one MVP per batch. So all my vertices are relative to the translation of that MVP.
Now I want to have rotation and scaling for each of the separate items, which means I need a different model matrix for each of them. So I modified my vertex to include the model matrix (16 floats!) and added a mat4 attribute in my shader, and it all works well. But I'm kinda disappointed with this solution since it dramatically increased the vertex size.
So as I was staring at my screen trying to think of a different solution, I thought about transforming my vertices to world space before I send them over to the shader. Or even to screen space if it's possible. The vertices I use are unnormalized coordinates in pixels.
So the question is, is such a thing possible? And if yes, how do you do it? I can't think why it shouldn't be since it's just maths, but after a fairly long search on Google, it doesn't look like a lot of people are actually doing this...
Strange, because if it is indeed possible, it would be quite a major optimization in cases like this one.
If the number of matrices per batch is limited, then you can pass all those matrices as uniforms (preferably in a UBO) and expand the vertex data with an index which specifies which matrix you need to use.
This is similar to GPU skinning used for skeletal animation.
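A rough Python emulation of that lookup, assuming 2D items and 3x3 affine matrices: each vertex carries a single index instead of a full matrix, and the "shader" fetches the matrix from a small palette. The names `rotation_scale_translation`, `transform_vertex` and `palette` are invented for this sketch; in GLSL the palette would be a uniform array and the index a vertex attribute.

```python
import math


def rotation_scale_translation(angle, scale, tx, ty):
    """Row-major 3x3 affine matrix: rotate, scale uniformly, then translate."""
    c, s = math.cos(angle) * scale, math.sin(angle) * scale
    return [[c, -s, tx],
            [s, c, ty],
            [0.0, 0.0, 1.0]]


def transform_vertex(vertex, palette):
    """Emulates the vertex shader: look up the matrix by index, then multiply."""
    x, y, matrix_index = vertex
    m = palette[matrix_index]
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# One matrix per item in the batch (would live in a uniform array or UBO).
palette = [
    rotation_scale_translation(0.0, 1.0, 100.0, 50.0),      # item 0: moved only
    rotation_scale_translation(math.pi / 2, 2.0, 0.0, 0.0),  # item 1: rotated, scaled
]

# Each vertex is (x, y, matrix_index): one extra value instead of 16 floats.
batch = [(1.0, 0.0, 0), (1.0, 0.0, 1)]
transformed = [transform_vertex(v, palette) for v in batch]
```

The per-vertex cost drops from a whole matrix to one index, at the price of an upper bound on the number of items per batch.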

How do I get started with a GPU voxelizer?

I've been reading various articles about how to write a GPU voxelizer. From my understanding the process goes like this:
Inspect the triangles individually and decide the axis that displays the triangle in the largest way. Call this the dominant axis.
Render the triangle on its dominant axis and sample the texels that come out.
Write that texel data onto a 3D texture and then do what you will with the data.
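For the first step, one common way to pick the dominant axis is to take the largest absolute component of the triangle's normal, since projecting along that axis gives the largest projected area. A minimal sketch under that assumption (`dominant_axis` is a hypothetical helper name):

```python
def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dominant_axis(p0, p1, p2):
    """Index (0=x, 1=y, 2=z) of the axis the triangle faces most directly."""
    e1 = tuple(b - a for a, b in zip(p0, p1))
    e2 = tuple(b - a for a, b in zip(p0, p2))
    n = cross(e1, e2)   # unnormalized face normal
    return max(range(3), key=lambda i: abs(n[i]))
```

For example, a triangle lying in the xy-plane yields axis 2 (z), so it would be projected along z.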
Disregarding conservative rasterization, I have a lot of questions regarding this process.
I've gotten as far as rendering each triangle, choosing a dominant axis and orthogonally projecting it. What should the values of the orthogonal projection be? Should it be some value based around the size of the voxels or how large of an area the map should cover?
What am I supposed to do in the fragment shader? How do I write to my 3D texture such that it stores the voxel data? From my understanding, due to choosing the dominant axis we can't have more than a depth of 1 voxel for each fragment. However, since we projected orthogonally I don't see how that would reflect onto the 3D texture.
Finally, I am wondering where to store the texture data. I know it's a bad idea to store data CPU side since you have to pass it all in to use it on the GPU; however, the source code I am kind of following chooses to store all its textures on the CPU side, such as those for a light map. My assumption is that data that will only be used on the GPU should be stored there and data used on both should be stored on the CPU side of things. So, from this I store my data on the CPU side. Is that correct?
My main sources have been: OpenGL Insights, and an SVO that uses a voxelizer. The issue is that the shader code is not in the GitHub repository.
In my own implementation, the whole scene is positioned and scaled into one unit cube centered on the world origin. The model-view-projection matrices are then straightforward, and the viewport is simply the desired voxel resolution.
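A small sketch of that normalization step, assuming the unit cube spans [-0.5, 0.5] on each axis and that voxel coordinates are plain integer grid indices. `fit_to_unit_cube` and `voxel_coord` are hypothetical names, not taken from the implementation described here.

```python
import math


def fit_to_unit_cube(points):
    """Translate and uniformly scale points so they fit in [-0.5, 0.5]^3."""
    lo = [min(p[i] for p in points) for i in range(3)]
    hi = [max(p[i] for p in points) for i in range(3)]
    center = [(a + b) / 2 for a, b in zip(lo, hi)]
    # Largest side of the bounding box; fall back to 1.0 for a single point.
    extent = max(b - a for a, b in zip(lo, hi)) or 1.0
    return [tuple((p[i] - center[i]) / extent for i in range(3)) for p in points]


def voxel_coord(p, resolution):
    """Map a point in the unit cube to integer voxel coordinates."""
    coord = []
    for v in p:
        i = math.floor((v + 0.5) * resolution)
        # Clamp so that points exactly on the far face stay inside the grid.
        coord.append(min(max(i, 0), resolution - 1))
    return tuple(coord)


scene = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0)]
unit = fit_to_unit_cube(scene)
voxels = [voxel_coord(p, 8) for p in unit]
```

Because the scale is uniform, the scene keeps its proportions; only its longest side fills the cube completely.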
I use a two-pass approach to output the voxel fragments: the first pass calculates the number of output voxel fragments by accumulating a single variable using an atomic counter. I then use that count to allocate a linear buffer.
In the second pass, the rasterized voxel fragments are stored into the allocated linear buffer, using an atomic counter to avoid write conflicts.
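The count-then-fill pattern can be sketched sequentially in Python. `AtomicCounter` here is only a stand-in for a GPU atomic counter (its `increment` mirrors the "return old value, then add one" behaviour of GLSL's `atomicCounterIncrement`), and `voxelize_two_pass` is a hypothetical name.

```python
class AtomicCounter:
    """Stand-in for a GPU atomic counter; sequential Python needs no locking."""

    def __init__(self):
        self.value = 0

    def increment(self):
        """Return the old value, then add one."""
        old = self.value
        self.value += 1
        return old


def voxelize_two_pass(fragments):
    # Pass 1: only count how many voxel fragments would be emitted.
    counter = AtomicCounter()
    for _ in fragments:
        counter.increment()

    # Allocate a linear buffer of exactly that size.
    buffer = [None] * counter.value

    # Pass 2: each fragment reserves a unique slot via the counter and writes to it.
    counter = AtomicCounter()
    for frag in fragments:
        slot = counter.increment()
        buffer[slot] = frag
    return buffer
```

On a real GPU the fragments run in parallel, so the order of entries in the buffer is not deterministic; the counter only guarantees that no two fragments get the same slot.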

Difference between tessellation shaders and Geometry shaders

I'm trying to develop a high-level understanding of the graphics pipeline. One thing that doesn't make much sense to me is why the geometry shader exists. Both the tessellation and geometry shaders seem to do the same thing to me. Can someone explain what the geometry shader does differently from the tessellation shader that justifies its existence?
The tessellation shader is for variable subdivision. An important part is adjacency information so you can do smoothing correctly and not wind up with gaps. You could do some limited subdivision with a geometry shader, but that's not really what it's for.
Geometry shaders operate per-primitive. For example, if you need to do stuff for each triangle (such as this), do it in a geometry shader. I've heard of shadow volume extrusion being done. There's also "conservative rasterization" where you might extend triangle borders so every intersected pixel gets a fragment. Examples are pretty application specific.
Yes, they can also generate more geometry than the input but they do not scale well. They work great if you want to draw particles and turn points into very simple geometry. I've implemented marching cubes a number of times using geometry shaders too. Works great with transform feedback to save the resulting mesh.
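As an illustration of "turning points into very simple geometry", here is a Python sketch of what a geometry shader might emit for one particle: four corners in triangle-strip order. The function names are hypothetical, and a real shader would emit the vertices with `EmitVertex()` rather than return a list.

```python
def expand_point_to_quad(center, half_size):
    """Return 4 corners in triangle-strip order for a square around `center`."""
    x, y = center
    h = half_size
    return [(x - h, y - h), (x + h, y - h), (x - h, y + h), (x + h, y + h)]


def strip_to_triangles(strip):
    """Unroll a triangle strip into its individual triangles.

    Every second triangle has its first two vertices swapped so that all
    triangles keep the same winding order.
    """
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


quad = expand_point_to_quad((0.0, 0.0), 1.0)
triangles = strip_to_triangles(quad)   # one point in, two triangles out
```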
Transform feedback has also been used with the geometry shader to do more compute operations. One particularly useful mechanism is that it does stream compaction for you (packs its varying amount of output tightly so there are no gaps in the resulting array).
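Stream compaction itself is easy to picture on the CPU: every input may produce zero or more outputs, and the results end up packed back to back. The sketch below is only an analogy for what transform feedback does with geometry shader output; `compact_stream` and `emit_if_positive` are made-up names and the emit rule is arbitrary.

```python
def compact_stream(inputs, emit):
    """Run `emit` on each input and pack all outputs into one gap-free list."""
    output = []
    for item in inputs:
        output.extend(emit(item))   # each input may emit zero or more outputs
    return output


def emit_if_positive(value):
    """Hypothetical per-primitive rule: keep positive values, twice if large."""
    if value <= 0:
        return []
    return [value, value] if value > 10 else [value]


packed = compact_stream([3, -1, 0, 12, 7], emit_if_positive)
```

Here five inputs produce four tightly packed outputs, with nothing stored for the two inputs that emitted no data.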
The other very important thing a geometry shader provides is routing to layered render targets (texture arrays, faces of a cube, multiple viewports), something which must be done per-primitive. For example you can render cube shadow maps for point lights in a single pass by duplicating and projecting geometry 6 times to each of the cube's faces.
Not exactly a complete answer but hopefully gives the gist of the differences.

Moving/rotating shapes in the vertex shader

I'm writing a program that draws a number of moving/rotating polygons using OpenGL. Each polygon has a location in world coordinates while its vertices are expressed in local coordinates (relative to polygon location). Each polygon also has a rotation.
The only way I can think of doing this is to calculate vertex positions by translation/rotation in each frame and push them to the GPU to be drawn, but I was wondering if this could be performed in the vertex shader.
I thought I might express vertex locations in local coordinates and then add location and rotation attributes to each vertex, but then it occurred to me that this won't be any better than pushing new vertex positions on each frame.
Should I do this kind of calculation on the CPU, or is there a way to do it efficiently in the vertex shader?
The vertex shader is indeed responsible for transforming your geometry. However, the vertex shader is run for every single vertex of your scene. If you do transformations inside the vertex shader, you'll do the same calculation over and over again which yields the same result every time (as opposed to simply multiplying the model view projection matrix with the vertex coordinate). So in terms of efficiency you're best off doing that on the CPU side.
If the models are small, like in your case, I don't expect there to be too much of a difference, because you still have to set the coordinates where the polygons are supposed to be drawn somehow. In this case doing the calculations once on the CPU side is still the best, given that it does the calculation once independent of the vertex count of your polygons, as well as that it will probably result in clearer code since it's easier to see what you're doing.
These calculations are usually done on the CPU, since doing them there is efficient in general. Your best shot is to send these rotation matrices in as uniforms and do the multiplication on the GPU. Sending uniforms is not a very expensive operation in general, so you shouldn't be worrying about that.
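A sketch of that split in Python, assuming 2D polygons: the sine and cosine are computed once per polygon (the part that would be uploaded as a uniform), and the per-vertex work is just a multiply-add (the part a vertex shader would do). `place_polygon` is a hypothetical name.

```python
import math


def place_polygon(local_vertices, position, angle):
    """Rotate local-space vertices by `angle` (radians), then translate."""
    c, s = math.cos(angle), math.sin(angle)   # computed once per polygon
    px, py = position
    # Per-vertex part: the same small multiply-add for every vertex.
    return [(c * x - s * y + px, s * x + c * y + py) for x, y in local_vertices]


# Local coordinates stay constant; only position and angle change per frame.
square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
world = place_polygon(square, position=(10.0, 5.0), angle=math.pi / 2)
```

With this layout the vertex buffer never has to be re-uploaded; per frame, only the position and rotation of each polygon change.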

How should I use glNormal() for a vertex shared between a triangle and a quad?

Let there be a vertex which is part of a triangle, and of a quad.
To my best understanding, the normal of that vertex is the average of the normal of the quad and the normal of the triangle.
The triangle is drawn before the quad. When should I call glNormal and with what vector?
Should I call glNormal 2 times, each time with the same vector (the average normal vector)?
Should I call glNormal the last time the vertex is drawn, with the average normal vector?
To my best understanding, the normal of that vertex is the average of
the normal of the quad and the normal of the triangle.
Ideally, the normal vector should be orthogonal to the surface that you are rendering, at any point. However, the GL only supports rendering surfaces as polygonal models (at least directly). So there are two principal possibilities:
The polygonal representation does exactly represent the object you want to visualize. A simple example would be a cube.
The polygonal representation is just a (piecewise linear) approximation of the surface you want to visualize. Think of smooth surfaces.
In case 1, you need one normal per triangle (as the normal is unchanging for a flat surface defined by a triangle). However, this means that for neighboring triangles that share an edge or corner, the normals will have to be different. From GL's point of view, each of the triangles uses different vertices, even if those vertices share the same position in space. A vertex is the set of all attributes, not just the position. For the cube, that means that you will need not just 8 different vertices, but 24, so you have 3 at each corner.
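The 8 versus 24 count for the cube can be checked with a few lines of Python. This sketch treats a vertex as a (position, normal) pair and uses a cube spanning [-1, 1] on each axis; `cube_vertices` is a hypothetical helper.

```python
from itertools import product


def cube_vertices():
    """Return (position, normal) pairs for every corner of every cube face."""
    vertices = []
    for axis in range(3):
        for sign in (-1, 1):
            # Face normal: +/-1 along one axis, 0 along the others.
            normal = tuple(sign if i == axis else 0 for i in range(3))
            others = [i for i in range(3) if i != axis]
            for u, v in product((-1, 1), repeat=2):
                position = [0, 0, 0]
                position[axis] = sign
                position[others[0]] = u
                position[others[1]] = v
                vertices.append((tuple(position), normal))
    return vertices


verts = cube_vertices()
unique_positions = {position for position, _ in verts}   # corners in space
unique_vertices = set(verts)                             # full attribute sets
```

There are 8 distinct positions but 24 distinct (position, normal) pairs, because each corner is used by three faces with three different normals.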
In case 2, you do want to cover up the polygonal structure of the model as well as possible. One aspect of this is using smooth shading techniques. Averaging the normals of adjacent triangles at each vertex is one heuristic for doing so. In this case, neighboring primitives actually can share vertices, as the normal and the position of a corner point are the same for any triangle connected to it.
This heuristic has some drawbacks, especially if your surface contains both smooth parts and "sharp edges" you want to preserve. There are some improved heuristics which try to detect sharp edges and split vertices to allow different normals for the connected triangles, so that such edges are not smoothed. But all such heuristics might fail in some cases; ideally, the normals are provided when the model is created in the first place.
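The plain averaging heuristic can be sketched as follows, assuming an indexed triangle list with no degenerate triangles and no unused vertices (either would cause a division by zero here). The unweighted average is just one choice; area- or angle-weighted variants also exist. All function names are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def smooth_normals(positions, triangles):
    """Average the unit normals of all triangles touching each vertex."""
    sums = [(0.0, 0.0, 0.0) for _ in positions]
    for i0, i1, i2 in triangles:
        n = normalize(cross(sub(positions[i1], positions[i0]),
                            sub(positions[i2], positions[i0])))
        for i in (i0, i1, i2):
            sums[i] = tuple(a + b for a, b in zip(sums[i], n))
    return [normalize(s) for s in sums]


# Two triangles meeting at a right angle along the edge between vertices 0 and 1.
positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, -1)]
triangles = [(0, 1, 2), (0, 1, 3)]
normals = smooth_normals(positions, triangles)
```

The two shared vertices end up with a normal halfway between +y and +z, while the unshared vertices keep the normal of their single triangle. This is exactly the smoothing that is unwanted on a sharp edge.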
The triangle is drawn before the quad. When should I call glNormal and
with what vector?
OpenGL is a state machine, meaning that things you set stay that way until you change them again, and setting normals is no exception. The second thing to note is that normals are a vertex attribute. So for every vertex, every attribute always has some value (but depending on the rest of your GL state, not all of these attributes are used when rendering).
Since you use the fixed-function GL, normals are builtin vertex attributes - so every vertex you issue in some way has some value as its normal attribute - in immediate mode rendering with glBegin()/End(), it will be the one you set with the most recent glNormal() call (or it will have the initial default value if you never called glNormal()).
So to answer your question:
You have to set that normal before you issue the glVertex() call for that particular vertex for the first time, and you have to re-issue that normal command when drawing with "this" vertex the second time (which technically is a different vertex anyway) if you changed it in between while specifying some other vertices.
To my best understanding, the normal of that vertex is the average of the normal of the quad and the normal of the triangle.
No. The normal of a plane is a vector pointing 'out of' the plane at a 90 degree angle. In OpenGL, this is used in shading calculations, and to support various effects, OpenGL lets you specify whatever normal you want instead of calculating it from the primitive. For flat lighting, the normal should be set to the mathematical definition of the normal for each primitive, while for smooth lighting, the normal should be set to the average normal of all primitives that share the vertex.
glNormal sets a value in OpenGL that is read whenever you call glVertex, and is persistent until you call glNormal again. So this code
specifies 4 vertices, each with a normal of (0,0,1).
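The snippet this answer refers to is not included above. As a stand-in, here is a toy Python model of the behaviour it describes; `ImmediateModeRecorder` is a made-up class and not an OpenGL binding. Its `normal` and `vertex` methods mimic `glNormal3f` and `glVertex3f` only in the sense that the normal is sticky state and each vertex captures whatever normal is current.

```python
class ImmediateModeRecorder:
    """Toy model of fixed-function immediate mode; not a real OpenGL binding."""

    def __init__(self):
        self.current_normal = (0.0, 0.0, 1.0)   # OpenGL's initial current normal
        self.vertices = []

    def normal(self, x, y, z):
        """Like glNormal3f: only updates the current state."""
        self.current_normal = (x, y, z)

    def vertex(self, x, y, z):
        """Like glVertex3f: captures the position plus the current normal."""
        self.vertices.append(((x, y, z), self.current_normal))


gl = ImmediateModeRecorder()
gl.normal(0.0, 0.0, 1.0)                       # set once...
for x, y in [(0, 0), (1, 0), (1, 1), (0, 1)]:
    gl.vertex(x, y, 0.0)                       # ...used by all four vertices
gl.normal(1.0, 0.0, 0.0)                       # changing it affects later vertices only
gl.vertex(2.0, 0.0, 0.0)
```

The first four recorded vertices all carry the normal (0, 0, 1); the fifth carries (1, 0, 0), and the earlier ones are unaffected by the later change.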