How to do particle binning in OpenGL? - opengl

Currently I'm creating a particle system and I would like to transfer most of the work to the GPU using OpenGL, for gaining experience and performance reasons. At the moment, there are multiple particles scattered through the space (these are currently still created on the CPU). I would more or less like to create a histogram of them. If I understand correctly, for this I would first translate all the particles from world coordinates to screen coordinates in a vertex shader. However, now I want to do the following:
So, for each pixel a hit count of how many particles are inside. Each particle will also have several properties (e.g. a colour) and I would like to sum them for every pixel (as shown in the lower-right corner). Would this be possible using OpenGL? If so, how?

The best tool I recomend for having the whole data (if it fits on GPU memory) is the use of SSBO.
Nevertheless, you need data after transforming them (e.g. by a projection). Still SSBO is your best option:
In the fragment shader you read the properties of already handled particles (let's say, the rendered pixel) and write modified properties (number of particles at this pixel, color, etc) to the same index in the buffer.
Due to parallel nature of GPU, several instances coming from different particles may be doing concurrently the work for the same index. Thus you need to handle this on your own. Read Memory model and Atomic operations
Another approach, but limited, is using Blending
The idea is that each fragment increments the actual color value of the frame buffer. This can be done using GL_FUNC_ADD for glBlendEquationSeparate and using as fragment-output-color a value of 1/255 (normalized integer) for each RGB/a component.
Limitations come from the [0-255] range: Only up to 255 particles in the same pixel, the rest amount is clamped to this range and so "lost".
You have four components RGBA, thus four properties can be handled. But can have several renderbuffers in a FBO.
You can read the FBO by glReadPixels. Use glReadBuffer first with a GL_COLOR_ATTACHMENTi if you use a FBO instead of the default frame buffer.

Related

How would I store vertex, color, and index data separately? (OpenGL)

I'm starting to learn openGL (working with version 3.3) with intent to get a small 3d falling sand simulation up, akin to this:
https://www.youtube.com/watch?v=R3Ji8J2Kprw&t=41s
I have a little experience with setting up a voxel environment like Minecraft from some Udemy tutorials for Unity, but I want to build something simple from the ground up and not deal with all the systems already laid on top of things with Unity.
The first issue I've run into comes early. I want to build a system for rendering quads, because instancing a ton of cubes is ridiculously inefficient. I also want to be efficient with storage of vertices, colors, etc. Thus far in the opengl tutorials I've worked with the way to do this is to store each vertex in a float array with both position and color data, and set up the buffer object to read every set of six entries as three floats for position and three for color, using glVertexAttribPointer. The problem is that for each neighboring quad, the same vertices will be repeated because if they are made of different "blocks" they will be different colors, and I want to avoid this.
What I want to do instead to make things more efficient is store the vertices of a cube in one int array (positions will all be ints), then add each quad out of the terrain to an indices array (which will probably turn into each chunk's mesh later on). The indices array will store each quad's position, and a separate array will store each quad's color. I'm a little confused on how to set this up since I am rather new to opengl, but I know this should be doable based on what other people have done with minecraft clones, if not even easier since I don't need textures.
I just really want to get the framework for the chunks, blocks, world, etc, up and running so that I can get to the fun stuff like adding new elements. Anyone able to verify that this is a sensible way to do this (lol) and offer guidance on how to set this up in the rendering, I would very much appreciate.
Thus far in the opengl tutorials I've worked with the way to do this is to store each vertex in a float array with both position and color data, and set up the buffer object to read every set of six entries as three floats for position and three for color, using glVertexAttribPointer. The problem is that for each neighboring quad, the same vertices will be repeated because if they are made of different "blocks" they will be different colors, and I want to avoid this.
Yes, and perhaps there's a reason for that. You seem to be trying to save.. what, a few bytes of RAM? Your graphics card has 8GB of RAM on it, what it doesn't have is a general processing unit or an unlimited bus to do random lookups in other buffers for every single rendered pixel.
The indices array will store each quad's position, and a separate array will store each quad's color.
If you insist on doing it this way, nothing's stopping you. You don't even need the quad vertices, you can synthesize them in a geometry shader.
Just fill in a buffer with X|Y|Width|Height|Color(RGB) with glVertexAttribPointer like you already know, then run a geometry shader to synthesize two triangles for each entry in your input buffer (a quad), then your vertex shader projects it to world units (you mentioned integers, so you're not in world units initially), and then your fragment shader can color each rastered pixel according to its color entry.
ridiculously inefficient
Indeed, if that sounds ridiculously inefficient to you, it's because it is. You're essentially packing your data on the CPU, transferring it to the GPU, unpacking it and then processing it as normal. You can skip at least two of the steps, and even more if you consider that vertex shader outputs get cached within rasterized primitives.
There are many more variations of this insanity, like:
store vertex positions unpacked as normal, and store an index for the colors. Then store the colors in a linear buffer of some kind (texture, SSBO, generic buffer, etc) and look up each color index. That's even more inefficient, but it's closer to the algorithm you were suggesting.
store vertex positions for one quad and set up instanced rendering with a multi-draw command and a buffer to feed individual instance data (positions and colors). If you also have textures, you can use bindless textures for each quad instance. It's still rendering multiple objects, but it's slightly more optimized by your graphics driver.
or just store per-vertex data in a buffer and render it. Done. No pre-computations, no unlimited expansions, no crazy code, you have your vertex data and you render it.

(Modern) OpenGL Different Colored Faces on a Cube - Using Shaders

A cube with different colored faces in intermediate mode is very simple. But doing this same thing with shaders seems to be quite a challenge.
I have read that in order to create a cube with different coloured faces, I should create 24 vertices instead of 8 vertices for the cube - in other words, (I visualies this as 6 squares that don't quite touch).
Is perhaps another (better?) solution to texture the faces of the cube using a real simple texture a flat color - perhaps a 1x1 pixel texture?
My texturing idea seems simpler to me - from a coder's point of view.. but which method would be the most efficient from a GPU/graphic card perspective?
I'm not sure what your overall goal is (e.g. what you're learning to do in the long term), but generally for high performance applications (e.g. games) your goal is to reduce GPU load. Every time you switch certain states (e.g. change textures, render targets, shader uniform values, etc..) the GPU stalls reconfiguring itself to meet your demands.
So, you can pass in a 1x1 pixel texture for each face, but then you'd need six draw calls (usually not so bad, but there is some prep work and potential cache misses) and six texture sets (can be very bad, often as bad as changing shader uniform values).
Suppose you wanted to pass in one texture and use that as a texture map for the cube. This is a little less trivial than it sounds -- you need to express each texture face on the texture in a way that maps to the vertices. Often you need to pass in a texture coordinate for each vertex, and due to the spacial configuration of the texture this normally doesn't end up meaning one texture coordinate for one spatial vertex.
However, if you use an environmental/reflection map, the complexities of mapping are handled for you. In this way, you could draw a single texture on all sides of your cube. (Or on your sphere, or whatever sphere-mapped shape you wanted.) I'm not sure I'd call this easier since you have to form the environmental texture carefully, and you still have to set a different texture for each new colors you want to represent -- or change the texture either via the GPU or in step with the GPU, and that's tricky and usually not performant.
Which brings us back to the canonical way of doing as you mentioned: use vertex values -- they're fast, you can draw many, many cubes very quickly by only specifying different vertex data, and it's easy to understand. It really is the best way, and how GPUs are designed to run quickly.
Additionally..
And yes, you can do this with just shaders... But it'd be ugly and slow, and the GPU would end up computing it per each pixel.. Pass the object space coordinates to the fragment shader, and in the fragment shader test which side you're on and output the corresponding color. Highly not recommended, it's not particularly easier, and it's definitely not faster for the GPU -- to change colors you'd again end up changing uniform values for the shaders.

How do I get started with a GPU voxelizer?

I've been reading various articles about how to write a GPU voxelizer. From my understanding the process goes like this:
Inspect the triangles individually and decide the axis that displays the triangle in the largest way. Call this the dominant axis.
Render the triangle on its dominant axis and sample the texels that come out.
Write that texel data onto a 3D texture and then do what you will with the data
Disregarding conservative rasterization, I have a lot of questions regarding this process.
I've gotten as far as rendering each triangle, choosing a dominant axis and orthogonally projecting it. What should the values of the orthogonal projection be? Should it be some value based around the size of the voxels or how large of an area the map should cover?
What am I supposed to do in the fragment shader? How do I write to my 3D texture such that it stores the voxel data? From my understanding, due to choosing the dominant axis we can't have more than a depth of 1 voxel for each fragment. However, since we projected orthogonally I don't see how that would reflect onto the 3D texture.
Finally, I am wondering on where to store the texture data. I know it's a bad idea to store data CPU side since you have to pass it all in to use it on the GPU, however the sourcecode I am kind of following chooses to store all its texture on the CPU side, such as those for a light map. My assumption is that data that will only be used on the GPU should be stored there and data used on both should be stored on the CPU side of things. So, from this I store my data on the CPU side. Is that correct?
My main sources have been: https://www.seas.upenn.edu/~pcozzi/OpenGLInsights/OpenGLInsights-SparseVoxelization.pdf OpenGL Insights
https://github.com/otaku690/sparsevoxeloctree A SVO using a voxelizer. The issue is that the shader code is not in the github.
In my own implementation, the whole scene is positioned and scaled into one unit cube centered on world origin. The modelview-project matrices are straightforward then. And the viewport is simply the desired voxel resolution.
I use 2-pass approach to output those voxel fragments: the 1st pass calculate the number of output voxel fragments by accumulating a single variable using atomic counter. Then I use the info to allocate a linear buffer.
In the 2nd pass the rasterized voxel fragments are stored into the allocated linear buffer, using atomic counter to avoid write conflict.

Rendering visualization of spectrogram efficiently

I'm trying to find a clever way to render a large spectrogram (say, fullscreen). A spectrogram is a coordinate-system, where the x-axis is time, the y-axis is frequency and the colour intensity is the magnitude of the frequency component, and it looks like this (youtube).
What's interesting to note is that each frame, a new column (1 pixel wide) is new, but the whole rest of the spectrum is the same, only shifted left one pixel. Currently I'm just writing to a circular software buffer acting like an image, and drawing that - but it is obviously slow at high framerates and screensizes.
Is there any obvious solution to this problem, using OpenGL (or some software trick - has to be cross-platform, though)? Perhaps through some use of buffer on the GPU memory, with a shader that fills it (admittedly, i have a very vague understanding of OpenGL beyond drawing simple stuff)? It revolves around keeping the old data on the GPU memory as i see it.
Use a single channel texture for the waterfall (this is what you're drawing, a waterfall plot) in which you update one column or row at a time using glTexSubImage. By using GL_WRAP mode you can simply advance the texture coordinates beyond the bounds of the texture and it will, well, wrap. By moving the texture opposing to the update you can get the waterfall effect (i.e. moving spectrogram, with the updates coming in at the right edge).
To give the whole thing color, use the texture's values as an index into a transfer function LUT texture using a fragment shader.
You can use the GPU library for spectrogram calculations: nnAudio
https://github.com/KinWaiCheuk/nnAudio

Texture Image processing on the GPU?

I'm rendering a certain scene into a texture and then I need to process that image in some simple way. How I'm doing this now is to read the texture using glReadPixels() and then process it on the CPU. This is however too slow so I was thinking about moving the processing to the GPU.
The simplest setup to do this I could think of is to display a simple white quad that takes up the entire viewport in an orthogonal projection and then write the image processing bit as a fragment shader. This will allow many instances of the processing to run in parallel as well as to access any pixel of the texture it requires for the processing.
Is this a viable course of action? is it common to do things this way?
Is there maybe a better way to do it?
Yes, this is the usual way of doing things.
Render something into a texture.
Draw a fullscreen quad with a shader that reads that texture and does some operations.
Simple effects (e.g. grayscale, color correction, etc.) can be done by reading one pixel and outputting one pixel in the fragment shader. More complex operations (e.g. swirling patterns) can be done by reading one pixel from offset location and outputting one pixel. Even more complex operations can be done by reading multiple pixels.
In some cases multiple temporary textures would be needed. E.g. blur with high radius is often done this way:
Render into a texture.
Render into another (smaller) texture, with a shader that computes each output pixel as average of multiple source pixels.
Use this smaller texture to render into another small texture, with a shader that does proper Gaussian blur or something.
... repeat
In all of the above cases though, each output pixel should be independent of other output pixels. It can use one more more input pixels just fine.
An example of processing operation that does not map well is Summed Area Table, where each output pixel is dependent on input pixel and the value of adjacent output pixel. Still, it is possible to do those kinds on the GPU (example pdf).
Yes, it's the normal way to do image processing. The color of the quad doesn't really matter if you'll be setting the color for every pixel. Depending on your application, you might need to careful about pixel sampling issues (i.e. ensuring that you sample from exactly the correct pixel on the source texture, rather than halfway between two pixels).