Currently, WebGL in my browser only supports 1024 * 4 bytes of uniform data. However, I can put the data in a texture instead. What's the difference between storing data in a uniform array and storing it in a texture? And what is the size limit for a uniform sampler2D?
Textures are sampled, uniforms are not. When you call texture2D(someSampler, someUV), the GPU can compute a value by choosing 2 of N mip levels, reading up to 8 texels (4 from each mip), and linearly interpolating between them (up to 16 texels if it's a 3D texture in WebGL2).
Because of this, it's slower to read from a texture than from a uniform. You could say "well, I turned off linear interpolation", but the texture is still read through a sampler, and there are a limited number of samplers (even top-end GPUs only have something like 32).
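To make the difference concrete, here is a minimal GLSL ES 3.00 (WebGL2) sketch; the names and sizes below are invented for illustration, not taken from the question:

#version 300 es
// Minimal sketch (WebGL2 / GLSL ES 3.00); names and sizes are placeholders.
precision highp float;

// Option A: data in a uniform array. Read directly, no sampler involved,
// but limited by MAX_FRAGMENT_UNIFORM_VECTORS (e.g. 1024 vec4s).
uniform vec4 dataUniform[256];

// Option B: data in a texture. Much larger capacity, but every read goes
// through one of a limited number of samplers.
uniform sampler2D dataTex;    // e.g. a 256 x 1 RGBA32F texture

out vec4 outColor;

void main() {
    vec4 a = dataUniform[17];
    // texelFetch reads one exact texel by integer coordinate,
    // skipping filtering and coordinate normalization entirely.
    vec4 b = texelFetch(dataTex, ivec2(17, 0), 0);
    outColor = a + b;
}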
Related
If I understand correctly, if I were to set TEXTURE_MIN_FILTER to NEAREST, then there's not much difference between sampler2DArray/TEXTURE_2D_ARRAY and sampler3D/TEXTURE_3D.
The differences seem to be:
GenerateMipmap will blend across layers for 3D textures but not for 2D arrays.
The Z coordinate passed to texture in GLSL is 0 to 1 for 3D textures but 0 to N (the layer index) for 2D arrays.
If filtering is not NEAREST, a 3D texture will blend across layers; a 2D array will not.
Correct?
Incorrect. There's one more difference: mipmap sizes.
In a 3D texture, the width, height, and depth all decrease at lower mipmap levels. In a 2D array texture, every mipmap level has the same number of array layers; only width and height decrease.
It's not just a matter of blending and some texture coordinate oddities; the very size of the texture data is different. It is very much a different kind of texture, as different from 3D textures as 2D textures are from 1D textures.
This is also why you cannot create a view texture of a 3D texture that is a 2D array, or vice-versa.
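To make both points concrete, here is a small GLSL sketch (the sampler names and dimensions are invented for illustration): a 256 x 256 x 16 3D texture versus a 256 x 256 array texture with 16 layers.

#version 330 core
// Sampler names and sizes are assumptions for this sketch.
uniform sampler3D      volumeTex;   // 256 x 256 x 16
uniform sampler2DArray arrayTex;    // 256 x 256, 16 layers

in  vec2 uv;
out vec4 color;

void main()
{
    // Coordinate convention: the 3D texture's third coordinate is normalized
    // (0..1 across the depth); the array texture's is the raw layer index.
    vec4 a = texture(volumeTex, vec3(uv, (5.0 + 0.5) / 16.0)); // center of slice 5
    vec4 b = texture(arrayTex,  vec3(uv, 5.0));                // layer 5, never blended

    // Mip sizes: for the 3D texture all three dimensions shrink per level;
    // for the array texture only width/height shrink, the layer count stays.
    ivec3 mip1Volume = textureSize(volumeTex, 1); // e.g. (128, 128, 8)
    ivec3 mip1Array  = textureSize(arrayTex,  1); // e.g. (128, 128, 16)

    color = (mip1Volume.z == mip1Array.z) ? a : b; // here always b
}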
Apart from the answer already given, there is another difference worth noting: the size limits are also quite different. A single layer of an array texture may be as big as a standard 2D texture, with an extra limit on the number of layers, while for 3D textures there is a single limit constraining the maximum size in all three dimensions.
For example, OpenGL 4.5 guarantees the following minimum values:
GL_MAX_TEXTURE_SIZE 16384
GL_MAX_ARRAY_TEXTURE_LAYERS 2048
GL_MAX_3D_TEXTURE_SIZE 2048
So a 16384 x 16384 x 16 array texture is fine (and should also fit into memory on every GL 4.5 capable GPU found in the real world), while a 3D texture of the same dimensions would be unsupported on most of today's implementations (even though the complete mipmap pyramid would consume less memory in the 3D texture case).
P.S. Yes, I posted this question on Computer Graphics Stack Exchange as well, but I'm also posting it here in the hope that more people will see it.
Intro
I'm trying to render multi-channel images (more than 4 channels, for the purpose of feeding them to a neural network). Since OpenGL doesn't support this natively, I have multiple 4-channel render buffers, into which I render a corresponding portion of the channels.
For example, I need a multi-channel image of size 512 x 512 x 16, so in OpenGL I have 4 render buffers of size 512 x 512 x 4. The problem is that the neural network expects the data with strides 512 x 512 x 16, i.e. the 16 channel values of one pixel are followed by the 16 channel values of the next pixel. Currently I can efficiently read my 4 render buffers with 4 calls to glReadPixels, but that leaves the data with strides 4 x 512 x 512 x 4. Manually reordering the data on the client side is not an option for me; it's too slow.
Main question
I've got an idea: render to a single 4-channel render buffer of size 512*4 x 512 x 4, because stride-wise it's equivalent to 512 x 512 x 16; we just treat a group of 4 consecutive pixels in a row as a single pixel of the 16-channel output image. Let's call it "interleaved rendering".
But this requires me to magically adjust my fragment shader so that every group of 4 consecutive fragments gets exactly the same interpolated vertex attributes. Is there any way to do that?
This rough illustration, with one render buffer holding a 1024 x 512 4-channel image, shows an example of how it should be rendered. With that, a single glReadPixels call extracts the data with strides 512 x 512 x 8.
EDIT: better pictures
What I have now (4 render buffers)
What I want to do natively in OpenGL (this image is done in Python offline)
But this requires me to magically adjust my fragment shader so that every group of 4 consecutive fragments gets exactly the same interpolated vertex attributes.
No, it would require a bit more than that. You have to fundamentally change how rasterization works.
Rendering at 4x the width is rendering at 4x the width. That means stretching the resulting primitives relative to a square area. But that's not the effect you want. You need the rasterizer to rasterize at the original resolution, then replicate the rasterization products.
That's not possible.
From the comments:
It just occurred to me that I can try to render a 512 x 512 x 2 image of texture coordinates from the vertex+fragment shaders, then stitch it with itself to make it 4 times wider (so we get the same interpolation), and form the final image from that.
This is a good idea. You'll need to render whatever interpolated values you need to the original-size texture, similar to how deferred rendering works, so it may be more than just 2 values. You could store just the gl_FragCoord.xy values and then use them to compute whatever you need, but it's probably easier to store the interpolated values directly.
I would suggest doing a texelFetch when reading the texture, as you can specify exact integer texel coordinates. The integer coordinates you need can be computed from gl_FragCoord as follows:
ivec2 texCoords = ivec2(int(gl_FragCoord.x * 0.25f), int(gl_FragCoord.y));
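For example, the second (stitching) pass could look roughly like this. It's a minimal sketch assuming pass 1 wrote the interpolated UVs into a 512 x 512 texture named interpTex, with shadeGroup standing in for your real per-group shading; these names are placeholders, not from the question:

#version 330 core
// Stitching pass: a full-screen quad drawn into the 2048 x 512 interleaved target.
uniform sampler2D interpTex;   // 512 x 512, holds the interpolated UVs from pass 1
out vec4 FragColor;

// Placeholder for whatever produces channels 4*g .. 4*g+3 of the 16-channel
// image from the interpolated UV; replace with the real shading code.
vec4 shadeGroup(int g, vec2 uv)
{
    return vec4(uv, float(g), 1.0);
}

void main()
{
    // Four consecutive output fragments map to the same source texel ...
    ivec2 srcCoord = ivec2(int(gl_FragCoord.x) / 4, int(gl_FragCoord.y));
    // ... and the position within the group selects the 4-channel slice.
    int group = int(gl_FragCoord.x) % 4;

    vec2 uv = texelFetch(interpTex, srcCoord, 0).xy;
    FragColor = shadeGroup(group, uv);
}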
I'm using a texture array to render Minecraft-style voxel terrain. It's working fantastically, but I noticed recently that GL_MAX_ARRAY_TEXTURE_LAYERS is a lot smaller than GL_MAX_TEXTURE_SIZE.
My textures are very small, 8x8, but I need to be able to support rendering from an array of hundreds to thousands of them; I just need GL_MAX_ARRAY_TEXTURE_LAYERS to be larger.
OpenGL 4.5 requires GL_MAX_ARRAY_TEXTURE_LAYERS to be at least 2048, which might suffice, but my application is targeting OpenGL 3.3, which only guarantees 256.
I'm drawing a blank trying to figure out a prudent workaround for this limitation; dividing up the terrain rendering based on the maximum number of supported texture layers does not sound trivial at all to me.
I looked into whether ARB_sparse_texture could help, but GL_MAX_SPARSE_ARRAY_TEXTURE_LAYERS_ARB is the same as GL_MAX_ARRAY_TEXTURE_LAYERS; that extension is just a workaround for VRAM usage rather than layer usage.
Can I just have my GLSL shader access an array of sampler2DArrays? GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS has to be at least 80, so 80 * 256 = 20480, which would be enough layers for my purposes. So, in theory, could I do something like this?
const int MAXLAYERS = 256;
vec3 texCoord;
uniform sampler2DArray[] tex;
void main()
{
int arrayIdx = int(texCoord.z + 0.5f) / MAXLAYERS;
float arrayOffset = mod(texCoord.z, float(MAXLAYERS));
FragColor = texture(tex[arrayIdx],
vec3(texCoord.x, texCoord.y, arrayOffset));
}
It would be better to ditch array textures and just use a texture atlas (or use an array texture with each layer containing lots of sub-textures, but as I will show, that's highly unnecessary). If you're using textures of such low resolution, you probably aren't using linear interpolation, so you can easily avoid bleed-over from neighboring texels. And even if you have trouble with bleed-over, it can easily be fixed by adding some space between the sub-textures.
Even if your sub-textures need to be 10x10 to avoid bleed-over, a 1024x1024 texture (the minimum size GL 3.3 requires) gives you 102x102 sub-textures, which is 10,404 textures. That ought to be plenty. And if it's not, then make it an array texture with however many layers you need.
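As an illustration, the per-fragment atlas lookup is just a scale and offset of the tile-local UV. This is a hedged sketch: only the 8x8 tile size comes from the question; the 1024x1024 atlas size and all names are assumptions.

#version 330 core
uniform sampler2D atlas;        // 1024 x 1024, NEAREST filtering, no mipmaps

flat in int  tileIndex;         // which sub-texture this face uses
in      vec2 tileUV;            // 0..1 within the tile
out     vec4 FragColor;

const int   TILES_PER_ROW = 128;          // 1024 / 8
const float TILE_SCALE    = 1.0 / 128.0;  // one tile's size in atlas UV space

void main()
{
    vec2 tileOrigin = vec2(tileIndex % TILES_PER_ROW,
                           tileIndex / TILES_PER_ROW) * TILE_SCALE;
    // With NEAREST filtering and no mipmapping there is no bleed-over;
    // otherwise pad the tiles and inset tileUV slightly.
    FragColor = texture(atlas, tileOrigin + tileUV * TILE_SCALE);
}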
Arrays of samplers will not work for your purpose. First, you cannot declare an unsized uniform array of any kind. Well, you can, but you have to redeclare it with a size at some point in your shader, so there's not much point to the unsized declaration. The only unsized arrays you can have are in SSBOs, as the last element of the SSBO.
Second, even with a size, the index used for arrays of opaque types must be dynamically uniform. And since you're trying to draw all of the faces of the cubes in one draw call, with each face able to select a different layer, this expression's value is not intended to be dynamically uniform.
Third, even if you did this with bindless texturing, you would run into the same problem: unless you're on NVIDIA hardware, the sampler you pick must be a dynamically uniform sampler. Which requires the index into the array of samplers to be dynamically uniform. Which yours is not.
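To illustrate what "dynamically uniform" means here, a small sketch (GLSL 4.00, since GLSL 3.30 is stricter still and only allows constant indices into sampler arrays; all names are invented):

#version 400 core
uniform sampler2DArray texArrays[4];
uniform int uChunkIndex;   // one value for the whole draw call

flat in int  vLayer;       // differs per face, hence per fragment
in      vec2 vUV;
out     vec4 FragColor;

void main()
{
    // Fine: uChunkIndex is the same for every invocation in the draw call,
    // so it is dynamically uniform and may index the sampler array.
    FragColor = texture(texArrays[uChunkIndex], vec3(vUV, float(vLayer % 256)));

    // Not fine (undefined behaviour on most hardware): vLayer / 256 varies
    // between fragments, so it is not dynamically uniform.
    // FragColor = texture(texArrays[vLayer / 256], vec3(vUV, float(vLayer % 256)));
}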
I am working on a project that requires drawing a lot of data as it is acquired by an ADC...something like 50,000 lines per frame on a monitor 1600 pixels wide. It runs great on a system with a 2007-ish Quadro FX 570, but basically can't keep up with the data on machines with Intel HD 4000 class chips. The data load is 32 channels of 200 Hz data received in batches of 5 samples per channel 40 times per second. So, in other words, the card only needs to achieve 40 frames per second or better.
I am using a single VBO for all 32 channels with space for 10,000 vertices each. The VBO is essentially treated like a series of ring buffers for each channel. When the data comes in, I decimate it based on the time scale being used. So, basically, it tracks the min/max for each channel. When enough data has been received for a single pixel column, it sets the next two vertices in the VBO for each channel and renders a new frame.
I use glMapBuffer() to access the data once, update all of the channels, call glUnmapBuffer(), and then render as necessary.
I manually calculate the transform matrix ahead of time (using an orthographic transform calculated in a non-generic way to reduce multiplications), and the vertex shader looks like:
#version 120
varying vec4 _outColor;
uniform vec4 _lBound=vec4(-1.0);
uniform vec4 _uBound=vec4(1.0);
uniform mat4 _xform=mat4(1.0);
attribute vec2 _inPos;
attribute vec4 _inColor;
void main()
{
gl_Position=clamp(_xform*vec4(_inPos, 0.0, 1.0), _lBound, _uBound);
_outColor=_inColor;
}
The _lBound, _uBound, and _xform uniforms are updated once per channel. So, 32 times per frame. The clamp is used to limit certain channels to a range of y-coordinates on the screen.
The fragment shader is simply:
#version 120
varying vec4 _outColor;
void main()
{
gl_FragColor=_outColor;
}
There is other stuff being rendered to the screen (channel labels, for example, using quads and a texture atlas), but profiling in gDEBugger seems to indicate that the line rendering takes the overwhelming majority of time per frame.
Still, 50,000 lines does not seem like a horrendously large number to me.
So, after all of that, the question is: are there any tricks to speeding up line drawing? I tried rendering the lines to the stencil buffer and then clipping a single quad, but that was slower. I thought about drawing the lines to a texture and then drawing a quad with that texture, but that does not seem scalable, or even faster, given that large textures would have to be uploaded constantly. I saw a technique that stores the y values in a single-row texture, but that seems more like a memory optimization than a speed optimization.
Mapping a VBO might slow you down, because the driver may have to synchronize the GPU with the CPU. A more performant approach is to just hand your data to the GPU, so the CPU and GPU can run more independently.
Recreate the VBO every time, and do create it with STATIC_DRAW.
If you do need to map your data, do NOT map it as readable (use GL_WRITE_ONLY).
Thanks, everyone. I finally settled on blitting between framebuffers backed by renderbuffers. Works well enough. Many suggested using textures, and I may go that route in the future if I eventually need to draw behind the data.
If you're just scrolling a line graph (GDI style), draw the new column on the CPU and use glTexSubImage2D to update a single column in the texture. Draw it as a pair of quads and update the st coordinates to handle scrolling/wrapping.
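The wrapping part of that approach can be done entirely in the fragment shader; here is a minimal sketch in the same GLSL 1.20 style as the shaders above (the uniform/varying names are placeholders):

#version 120
// The waveform lives in a texture updated one column at a time via
// glTexSubImage2D; the shader shifts/wraps the s coordinate instead of
// moving any data.
uniform sampler2D _graphTex;
uniform float _headOffset;   // s coordinate just past the newest column (0..1)
varying vec2 _uv;            // quad texture coordinates, 0..1

void main()
{
    // Wrap horizontally so the newest column always lands at the right edge.
    float s = fract(_uv.x + _headOffset);
    gl_FragColor = texture2D(_graphTex, vec2(s, _uv.y));
}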
If you need to update all the lines all the time, use a VBO created with GL_DYNAMIC_DRAW and use glBufferSubData to update the buffer.