Can an OpenGL shader do a mix of nearest and linear scaling? - opengl

I'm porting some old OpenGL 1.2 bitmap font rendering code to modern OpenGL (at least OpenGL 3.2+), and I'm wondering if I can use a GLSL shader to achieve what I've been doing manually.
When I want to draw the string "123", scaled to particular size, I do the following steps with the sprites below.
I draw the sprite to the screen, scaled 2x with GL_NEAREST. However, to get a black outline, I actually draw the sprite several times.
x + 1, y + 0, BLACK
x + 0, y + 1, BLACK
x - 1, y + 0, BLACK
x + 0, y - 1, BLACK
x + 0, y + 0, COLOR (RED)
After the sprites have been drawn to the screen, I copy the screen to a texture, via glCopyTexSubImage2D.
I draw that texture back to the screen, but with GL_LINEAR.
The end result is a more visually appealing form of scaling pixel sprites. When upscaling small pixel sprites to arbitrary dimensions, using just GL_NEAREST (bottom-right) or just GL_LINEAR (bottom-left) gives an effect I don't like. Pixel doubling with GL_NEAREST, followed by the remaining scaling with GL_LINEAR, gives a result that I prefer (top).
I'm pretty sure GLSL can do the black outline (thus saving me from having to do lots of draws), but could it also do the combination of GL_NEAREST and GL_LINEAR scaling?

You could achieve the effect of "2x nearest-neighbour upscaling followed by linear sampling" by pretending to sample a 4-texel neighbourhood from the upscaled texture while in reality sampling them from the original one. Then you'll have to implement bilinear interpolation manually. If you were targeting OpenGL 4+, textureGather() would be useful, though do keep this issue in mind. In my proposed solution below, I'll be using 4 texelFetch() calls, rather than textureGather(), as textureGather() would complicate things quite a bit.
Suppose you have an unscaled texture with black borders around the glyphs already present. Let's assume you have a normalized texture coordinate of vec2 pn = ... into that texture, where pn.x and pn.y are between 0 and 1. The following code should achieve the desired effect, though I haven't tested it:
ivec2 origTexSize = textureSize(sampler, 0);
int upscaleFactor = 2;
// Floating point texel coordinate into the upscaled texture.
vec2 ptu = pn * vec2(origTexSize * upscaleFactor);
// Decompose "ptu - 0.5" into the integer and fractional parts.
// modf() returns the fractional part and writes the integer part to its second argument.
vec2 ptui;
vec2 ptuf = modf(ptu - 0.5, ptui);
// Integer texel coordinates into the upscaled texture.
ivec2 ptu00 = ivec2(ptui);
ivec2 ptu01 = ptu00 + ivec2(0, 1);
ivec2 ptu10 = ptu00 + ivec2(1, 0);
ivec2 ptu11 = ptu00 + ivec2(1, 1);
// Integer texel coordinates into the original texture.
ivec2 pt00 = clamp(ptu00 / upscaleFactor, ivec2(0), origTexSize - 1);
ivec2 pt01 = clamp(ptu01 / upscaleFactor, ivec2(0), origTexSize - 1);
ivec2 pt10 = clamp(ptu10 / upscaleFactor, ivec2(0), origTexSize - 1);
ivec2 pt11 = clamp(ptu11 / upscaleFactor, ivec2(0), origTexSize - 1);
// Sampled colours.
vec4 clr00 = texelFetch(sampler, pt00, 0);
vec4 clr01 = texelFetch(sampler, pt01, 0);
vec4 clr10 = texelFetch(sampler, pt10, 0);
vec4 clr11 = texelFetch(sampler, pt11, 0);
// Bilinear interpolation.
vec4 clr0x = mix(clr00, clr01, ptuf.y);
vec4 clr1x = mix(clr10, clr11, ptuf.y);
vec4 clrFinal = mix(clr0x, clr1x, ptuf.x);
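To sanity-check the shader logic without a GL context, the same sampling can be reproduced on the CPU. The following is only a sketch under stated assumptions: the `Image` struct and `sampleNearestThenLinear` are hypothetical names invented for this illustration, the image is single-channel, and `std::floor` is used instead of `modf` so that the fractional weight stays in [0, 1) near the left and bottom edges.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image, used only for this illustration.
struct Image {
    int width;
    int height;
    std::vector<float> texels;  // row-major, width * height values

    float at(int x, int y) const {
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// Samples "img" as if it had first been upscaled by "factor" with
// nearest-neighbour filtering and were then read with bilinear filtering
// at the normalized coordinate (u, v).
float sampleNearestThenLinear(const Image& img, float u, float v, int factor = 2) {
    const int upW = img.width * factor;
    const int upH = img.height * factor;

    // Texel coordinate in the virtual upscaled image, shifted by half a texel
    // so that integer values fall on texel centres.
    const float x = u * upW - 0.5f;
    const float y = v * upH - 0.5f;

    // Integer and fractional parts (assumption: floor instead of modf).
    const float xi = std::floor(x);
    const float yi = std::floor(y);
    const float fx = x - xi;
    const float fy = y - yi;

    // Fetch a texel of the virtual upscaled image from the original image.
    auto fetch = [&](int ux, int uy) {
        ux = std::min(std::max(ux, 0), upW - 1);
        uy = std::min(std::max(uy, 0), upH - 1);
        return img.at(ux / factor, uy / factor);
    };

    const int x0 = static_cast<int>(xi);
    const int y0 = static_cast<int>(yi);
    const float c00 = fetch(x0, y0);
    const float c10 = fetch(x0 + 1, y0);
    const float c01 = fetch(x0, y0 + 1);
    const float c11 = fetch(x0 + 1, y0 + 1);

    // Manual bilinear interpolation.
    const float row0 = c00 + (c10 - c00) * fx;
    const float row1 = c01 + (c11 - c01) * fx;
    return row0 + (row1 - row0) * fy;
}
```

For a 2x1 image holding the values 0 and 1, sampling at u = 0.25 or u = 0.75 lands inside a doubled pixel and returns the original value, while u = 0.5 lies on the boundary between the two and returns the blend.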

Related

OpenGL compute shader normal map generation poor performance

I have a height cube map and I want to generate a normal cube map texture from it. My height cube map is just a 2048x2048 image that I load at the beginning of the application for each face of the cube, and I can modify in real time a "maximum height" value which is used as a multiplier when retrieving a pixel from the height map.
Initially I was calculating the normals in the vertex shader, but it gave me bad lighting results, so I decided to move the calculations into the fragment shader.
As the height map does not change every frame (only when I modify the "maximum height" value), I want to generate a normal map texture from it, using a compute shader because I don't need any rasterization, but it gives me very poor performance.
With the fragment shader I ran at 200FPS but using the compute shader I run at 40 FPS.
Here is how I bind my images and start the compute work:
_computeShaderProgram.use();
glUniform1f(_computeShaderProgram.getUniformLocation("maxHeight"), maxHeight);
glBindImageTexture(
0,
static_cast<GLuint>(heightMap),
0,
GL_TRUE,
0,
GL_READ_ONLY,
GL_RGBA32F
);
glBindImageTexture(
1,
static_cast<GLuint>(normalMap),
0,
GL_TRUE,
0,
GL_WRITE_ONLY,
GL_RGBA32F
);
// Start compute work
// I only compute for one face of the cube map
glDispatchCompute(normalMap.getWidth() / 16, normalMap.getWidth() / 16, 1);
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
And the compute shader:
#version 430 core
#extension GL_ARB_compute_shader : enable
layout(local_size_x = 16, local_size_y = 16, local_size_z = 1) in;
layout(rgba32f, binding = 0) readonly uniform imageCube heightMap;
layout(rgba32f, binding = 1) writeonly uniform imageCube normalMap;
uniform float maxHeight;
float getHeight(ivec3 heightMapCoord) {
vec4 heightMapValue = imageLoad(heightMap, heightMapCoord);
return heightMapValue.r * maxHeight;
}
void main() {
ivec3 textCoord = ivec3(gl_GlobalInvocationID);
// Calculate height of neighbors
float leftCubePosHeight = getHeight(textCoord + ivec3(-1, 0, 0));
float rightCubePosHeight = getHeight(textCoord + ivec3(1, 0, 0));
float topCubePosHeight = getHeight(textCoord + ivec3(0, -1, 0));
float bottomCubePosHeight = getHeight(textCoord + ivec3(0, 1, 0));
// Calculate normal using central differences method
vec3 horizontal = vec3(2.0, rightCubePosHeight - leftCubePosHeight, 0.0);
vec3 vertical = vec3(0.0, bottomCubePosHeight - topCubePosHeight, 2.0);
vec3 normal = normalize(cross(vertical, horizontal));
imageStore(normalMap, textCoord, vec4(normal, 1.0));
}
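The central-differences step in the shader above can be checked on the CPU with a few lines. This is a minimal sketch, not part of the original code: `Vec3`, `cross`, `normalize` and `normalFromHeights` are hypothetical helpers written for this illustration, and the four arguments stand for the already scaled neighbour heights.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type for this illustration.
struct Vec3 {
    float x, y, z;
};

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Same arithmetic as the compute shader: two tangent vectors built from
// the height differences across a 2-texel span, then their cross product.
Vec3 normalFromHeights(float left, float right, float top, float bottom) {
    const Vec3 horizontal{2.0f, right - left, 0.0f};
    const Vec3 vertical{0.0f, bottom - top, 2.0f};
    return normalize(cross(vertical, horizontal));
}
```

A flat neighbourhood yields a normal pointing straight along +Y, and a height that rises towards the right tilts the normal towards -X.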
I tried different work group counts (width, width / 8, width / 16, width / 32) and local sizes (1, 8, 16, 32), but the performance is always poor: around 40 FPS, or 20 FPS when the work group count equals the full width.
I know I can use shared memory for threads in the same work group to avoid fetching the same texture coordinate 4 times, but later the height map will be generated procedurally and will probably be larger than 2048x2048.
What is the difference between the fragment shader and the compute shader that makes it so slow? Am I doing something wrong?
Are there any other solutions to generate this normal map?
EDIT:
The FPS figures I gave above are not right, because I was generating only 1/16 of the normal map (when I had 40 FPS). I also used the central differences technique to calculate the normals, which is cheap but does not give good lighting results, so I switched to the Sobel technique, which is a little more expensive.
I made some tests to know which technique could give the best performance.
Each frame I generate the normal map (this will not be the case later, but it's just to test the performance). Here are my tests:
CPU side single thread: 1.5FPS
Compute shader with local sizes of 1 and one worker group for each image pixel: 4FPS
Compute shader with local sizes of 16 and one worker group for each 16x16 image pixels block: 11FPS
Fragment shader using framebuffer and MRT with 6 color attachments (one for each face of the normal map): 12.5FPS
This is a little laggy when I modify the max height (which regenerates the normal map), but I think it's okay as I won't modify it often.

Wrapping texture co-ordinates on a variable-size quad?

Here's my situation: I need to draw a rectangle on the screen for my game's Gui. I don't really care how big this rectangle is or might be, I want to be able to handle any situation. How I'm doing it right now is I store a single VAO that contains only a very basic quad, then I re-draw this quad using uniforms to modify the size and position of it on the screen each time.
The VAO contains 4 vec4 vertices:
0, 0, 0, 0;
1, 0, 1, 0;
0, 1, 0, 1;
1, 1, 1, 1;
And then I draw it as a GL_TRIANGLE_STRIP. The XY of each vertex is its position, and the ZW is its texture co-ordinates*. I pass in the rect for the gui element I'm currently drawing as a uniform vec4, which offsets the vertex positions in the vertex shader like so:
vertex.xy *= guiRect.zw;
vertex.xy += guiRect.xy;
And then I convert the vertex from screen pixel co-ordinates into OpenGL NDC co-ordinates:
gl_Position = vec4(((vertex.xy / screenSize) * 2) -1, 0, 1);
This changes the range from [0, screenWidth | screenHeight] to [-1, 1].
My problem comes in when I want to do texture wrapping. Simply passing vTexCoord = vertex.zw; is fine when I want to stretch a texture, but not for wrapping. Ideally, I want to modify the texture co-ordinates such that 1 pixel on the screen is equal to 1 texel in the gui texture. Texture co-ordinates going beyond [0, 1] is fine at this stage, and is in fact exactly what I'm looking for.
I plan to implement texture atlases for my gui textures, but managing the offsets and bounds of the appropriate sub-texture will be handled in the fragment shader - as far as the vertex shader is concerned, our quad is using one solid texture with [0, 1] co-ordinates, and wrapping accordingly.
*Note: I'm aware that this particular vertex format isn't necessarily useful for this particular case; I could be using vec2 vertices instead. For the sake of convenience I'm using the same vertex format for all of my 2D rendering, and other objects, e.g. text, actually do need those ZW components. I might change this in the future.
TL/DR: Given the size of the screen, the size of a texture, and the location/size of a quad, how do you calculate texture co-ordinates in a vertex shader such that pixels and texels have a 1:1 correspondence, with wrapping?
That is really very easy math: You just need to relate the two spaces in some way. And you already formulated a rule which allows you to do so: a window space pixel is to map to a texel.
Let's assume we have both vec2 screenSize and vec2 texSize which are the unnormalized dimensions in pixels/texels.
I'm not 100% sure what exactly you want to achieve. There is something missing: you actually did not specify where the origin of the texture should lie. Should it always be put at the bottom left corner of the quad? Or should it be globally at the bottom left corner of the viewport? I'll assume the latter here, but it should be easy to adjust this for the first case.
What we now need is a mapping between the [-1,1]^2 NDC in x and y to s and t. Let's first map it to [0,1]^2. If we have that, we can simply multiply the coords by screenSize/texSize to get the desired effect. So in the end, you get
vec2 texcoords = ((gl_Position.xy * 0.5) + 0.5) * screenSize/texSize;
You of course have already calculated ((gl_Position.xy * 0.5) + 0.5) * screenSize implicitly (it is the vertex position in pixels before the NDC conversion), so this could be changed to:
vec2 texcoords = vertex.xy / texSize;
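That the two expressions agree can be verified on the CPU. The sketch below is only an illustration: `Vec2`, `pixelToNdc`, `texCoordFromNdc` and `texCoordFromPixel` are hypothetical names, and the pixel position stands for `vertex.xy` after the `guiRect` scale and offset have been applied.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type for this illustration.
struct Vec2 {
    float x, y;
};

// Mirrors the question's vertex shader: pixel position -> NDC in [-1, 1].
Vec2 pixelToNdc(Vec2 pixel, Vec2 screenSize) {
    return {pixel.x / screenSize.x * 2.0f - 1.0f,
            pixel.y / screenSize.y * 2.0f - 1.0f};
}

// First form from the answer: NDC -> [0, 1]^2 -> scaled by screenSize / texSize.
Vec2 texCoordFromNdc(Vec2 ndc, Vec2 screenSize, Vec2 texSize) {
    return {(ndc.x * 0.5f + 0.5f) * screenSize.x / texSize.x,
            (ndc.y * 0.5f + 0.5f) * screenSize.y / texSize.y};
}

// Simplified form: the pixel position divided by the texture size.
Vec2 texCoordFromPixel(Vec2 pixel, Vec2 texSize) {
    return {pixel.x / texSize.x, pixel.y / texSize.y};
}
```

With a 64x32 texture, the pixel (128, 96) maps to the texture coordinate (2, 3) in both forms, so the texture repeats once every 64 pixels horizontally and every 32 pixels vertically, provided the wrap mode is GL_REPEAT.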

GLSL Distance Field transparency

I am after smooth texture based outline effect in OpenGL. So far I tried mostly all kinds of edge detection algorithms which result mostly in crude and jagged outlines. Then I read about Distance Field. I found an example which does pretty nice distance field. Here is the GLSL code:
#version 420
layout(binding=0) uniform sampler2D colorMap;
flat in vec4 diffuseOut;
in vec2 uvsOut;
out vec4 outputColor;
const float ALPHA_THRESHOLD = 0.9;
const float NUM_SPOKES = 36.0; // Number of radiating lines to check in.
const float ANGULAR_STEP =360.0 / NUM_SPOKES;
const int ZERO_VALUE =128; // Color channel containing 0 => -128, 128 => 0, 255 => +127
int in_StepSize=15; // Distance to check each time (larger steps will be faster, but less accurate).
int in_MaxDistance=30; // Maximum distance to search out to. Cannot be more than 127!
vec4 distField(){
vec2 pixel_size = 1.0 / vec2(textureSize(colorMap, 0));
vec2 screenTexCoords = gl_FragCoord.xy * pixel_size;
int distance;
if(texture(colorMap, screenTexCoords).a == 0.0)
{
// Texel is transparent, search for nearest opaque.
distance = ZERO_VALUE + 1;
for(int i = in_StepSize; i < in_MaxDistance; i += in_StepSize)
{
if(find_alpha_at_distance(screenTexCoords, float(i) * pixel_size, 1.0))
{
i = in_MaxDistance + 1; // BREAK!
}
else
{
distance = ZERO_VALUE + 1 + i;
}
}
}
else
{
// Texel is opaque, search for nearest transparent.
distance = ZERO_VALUE;
for(int i = in_StepSize; i <= in_MaxDistance; i += in_StepSize)
{
if(find_alpha_at_distance(screenTexCoords, float(i) * pixel_size, 0.0))
{
i = in_MaxDistance + 1; // BREAK!
}
else
{
distance = ZERO_VALUE - i;
}
}
}
return vec4(vec3(float(distance) / 255.0) * diffuseOut.rgb, 1.0 - texture(colorMap, screenTexCoords).a);
}
void main()
{
outputColor= distField();
}
The result of this shader covers the whole screen, using the diffuse color to fill the screen area outside the distance field outline. Here is what it looks like:
What I need is to leave all the area which has the solid red fill outside the distance field as transparent.
I arrived at a solution by using a distance field stored as an 8-bit grayscale alpha map. Stefan Gustavson describes in detail how to do it. Basically, one needs to generate the distance field version of the original texture. This texture is then rendered with the primitive as usual in the first pass into an FBO. In the second pass, alpha blending should be enabled, and the texture from the first pass is used with the screen quad. At this stage the fragment shader samples the alpha from that texture. This results in both smooth edges and alpha transparency around the edges.
Here is the result:
Based on the screenshot I'm assuming you're rendering a fullscreen quad? If that's the case Tim just provided the answer, try:
glEnable( GL_BLEND );
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
Before you render the quad. Obviously if you're going to render non-transparent stuff too, I advise you to render those first so you won't get depth buffer problems. When you're done drawing the transparent stuff, call:
glDisable( GL_BLEND );
To turn alphablending off again.
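For reference, the arithmetic that glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) selects (with the default GL_FUNC_ADD blend equation) can be written out on the CPU. This is a sketch for illustration only; `Rgba` and `blendSrcAlpha` are hypothetical names, and clamping and framebuffer format effects are ignored.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical colour type for this illustration.
struct Rgba {
    float r, g, b, a;
};

// result = src * src.a + dst * (1 - src.a), applied to all four channels,
// which is what a single glBlendFunc call (not glBlendFuncSeparate) sets up.
Rgba blendSrcAlpha(Rgba src, Rgba dst) {
    const float sa = src.a;
    const float ia = 1.0f - src.a;
    return {src.r * sa + dst.r * ia,
            src.g * sa + dst.g * ia,
            src.b * sa + dst.b * ia,
            src.a * sa + dst.a * ia};
}
```

A fragment written with alpha 0 leaves the destination untouched, which is why the area outside the outline becomes transparent once the shader outputs alpha 0 there and blending is enabled.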

How do I get my textures to bind properly for multitexturing?

I'm trying to render colored text to the screen. I've got a texture containing a black (RGBA 0, 0, 0, 255) representation of the text to display, and I've got another texture containing the color pattern I want to render the text in. This should be a fairly simple multitexturing exercise, but I can't seem to get the second texture to work. Both textures are Rectangle textures, because the integer coordinate values are easier to work with.
Rendering code:
glActiveTextureARB(GL_TEXTURE0_ARB);
glEnable(GL_TEXTURE_RECTANGLE_ARB);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, TextHandle);
glActiveTextureARB(GL_TEXTURE1_ARB);
glEnable(GL_TEXTURE_RECTANGLE_ARB);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, ColorsHandle);
glBegin(GL_QUADS);
glMultiTexCoord2iARB(GL_TEXTURE0_ARB, 0, 0);
glMultiTexCoord2iARB(GL_TEXTURE1_ARB, colorRect.Left, colorRect.Top);
glVertex2f(x, y);
glMultiTexCoord2iARB(GL_TEXTURE0_ARB, 0, textRect.Height);
glMultiTexCoord2iARB(GL_TEXTURE1_ARB, colorRect.Left, colorRect.Top + colorRect.Height);
glVertex2f(x, y + textRect.Height);
glMultiTexCoord2iARB(GL_TEXTURE0_ARB, textRect.Width, textRect.Height);
glMultiTexCoord2iARB(GL_TEXTURE1_ARB, colorRect.Left + colorRect.Width, colorRect.Top + colorRect.Height);
glVertex2f(x + textRect.Width, y + textRect.Height);
glMultiTexCoord2iARB(GL_TEXTURE0_ARB, textRect.Width, 0);
glMultiTexCoord2iARB(GL_TEXTURE1_ARB, colorRect.Left + colorRect.Width, colorRect.Top);
glVertex2f(x + textRect.Width, y);
glEnd;
Vertex shader:
void main()
{
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
gl_TexCoord[0] = gl_MultiTexCoord0;
gl_TexCoord[1] = gl_MultiTexCoord1;
}
Fragment shader:
uniform sampler2DRect texAlpha;
uniform sampler2DRect texRGB;
void main()
{
float alpha = texture2DRect(texAlpha, gl_TexCoord[0].st).a;
vec3 rgb = texture2DRect(texRGB, gl_TexCoord[1].st).rgb;
gl_FragColor = vec4(rgb, alpha);
}
This seems really straightforward, but it ends up rendering solid black text instead of colored text. I get the exact same result if the last line of the fragment shader reads gl_FragColor = texture2DRect(texAlpha, gl_TexCoord[0].st);. Changing the last line to gl_FragColor = texture2DRect(texRGB, gl_TexCoord[1].st); causes it to render nothing at all.
Based on this, it appears that calling texture2DRect on texRGB always returns (0, 0, 0, 0). I've made sure that GL_MULTISAMPLE is enabled, and bound the texture on unit 1, but for whatever reason I don't seem to actually get access to it inside my fragment shader. What am I doing wrong?
Overall, the code looks fine. It is possible that your texcoords for unit 1 are messed up, causing sampling outside the colored portion of your texture.
Is your color texture fully filled with color?
What do you mean by "causes it to render nothing at all"? This should not happen unless the alpha channel in your color texture is set to 0.
Did you try the following code, to override the alpha channel?
gl_FragColor = vec4( texture2DRect(texRGB, gl_TexCoord[1].st).rgb, 1.0 );
Are you sure that the font outline texture contains valid alpha values? You said that the texture is black and white, but you are using the alpha value! Instead of using the a component, try using the r one.
Blending affects the fragment shader output: it blends the fragment color with the corresponding color already in the framebuffer.

Texture wrong value in fragment shader

I'm loading a custom data into 2D texture GL_RGBA16F:
glActiveTexture(GL_TEXTURE0);
int Gx = 128;
int Gy = 128;
GLuint grammar;
glGenTextures(1, &grammar);
glBindTexture(GL_TEXTURE_2D, grammar);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA16F, Gx, Gy);
float* grammardata = new float[Gx * Gy * 4](); // set default to zero
*(grammardata) = 1;
glTexSubImage2D(GL_TEXTURE_2D,0,0,0,Gx,Gy,GL_RGBA,GL_FLOAT,grammardata);
int grammarloc = glGetUniformLocation(p_myGLSL->getProgramID(), "grammar");
if (grammarloc < 0) {
printf("grammar missing!\n");
exit(0);
}
glUniform1i(grammarloc, 0);
When I read the value of uniform sampler2D grammar in GLSL, it returns 0.25 instead of 1. How do I fix the scaling problem?
if (texture(grammar, vec2(0,0)).r == 0.25) {
FragColor = vec4(0,1,0,1);
} else
{
FragColor = vec4(1,0,0,1);
}
By default texture interpolation is set to the following values:
GL_TEXTURE_MIN_FILTER = GL_NEAREST_MIPMAP_LINEAR,
GL_TEXTURE_MAG_FILTER = GL_LINEAR
GL_TEXTURE_WRAP_[R|S|T] = GL_REPEAT
This means that, in cases where the mapping between texels of the texture and pixels on the screen is not one-to-one, the hardware will interpolate for you. There can be two cases:
The texture is displayed smaller than it actually is: In this case interpolation is performed between two mipmap levels. If no mipmaps are generated, these are treated as being 0, which could lead to 0.25.
The texture is displayed larger than it actually is (and I think this will be the case here): Here, the hardware does not interpolate between mipmap levels, but between adjacent texels in the texture. The problem now comes from the fact that (0,0) in texture coordinates is NOT the center of texel [0,0], but the lower left corner of it.
Have a look at the following drawing, which illustrates how texture coordinates are defined (here with 4 texels)
tex-coord: 0           0.25        0.5         0.75        1
texels     |-----0-----|-----1-----|-----2-----|-----3-----|
As you can see, 0 is on the boundary of a texel, while the first texel's center is at 1/(2 * |texels|).
This means for you that, with the wrap mode set to GL_REPEAT, texture coordinate (0,0) interpolates uniformly between the texels [0,0], [-1,0], [-1,-1], [0,-1]. Since -1 == 127 (due to repeat) and everything except [0,0] is 0, this results in
([0,0] + [-1,0] + [-1,-1] + [0,-1]) / 4 =
(1 + 0 + 0 + 0) / 4 = 0.25
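The same arithmetic can be reproduced with a small CPU model of GL_LINEAR filtering with GL_REPEAT wrapping. This is a simplified sketch, not how any particular driver implements it: `wrapRepeat` and `sampleBilinearRepeat` are hypothetical names, the texture is square and single-channel, and real hardware typically uses fixed-point weights.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Wraps an integer texel index the way GL_REPEAT does.
int wrapRepeat(int i, int size) {
    return ((i % size) + size) % size;
}

// Bilinear sample of a size x size single-channel texture at the
// normalized coordinate (s, t), with repeat wrapping.
float sampleBilinearRepeat(const std::vector<float>& texels, int size, float s, float t) {
    // Shift by half a texel so that integer values fall on texel centres.
    const float x = s * size - 0.5f;
    const float y = t * size - 0.5f;
    const float xf = std::floor(x);
    const float yf = std::floor(y);
    const float fx = x - xf;
    const float fy = y - yf;

    const int x0 = wrapRepeat(static_cast<int>(xf), size);
    const int x1 = wrapRepeat(static_cast<int>(xf) + 1, size);
    const int y0 = wrapRepeat(static_cast<int>(yf), size);
    const int y1 = wrapRepeat(static_cast<int>(yf) + 1, size);

    auto at = [&](int px, int py) {
        return texels[static_cast<std::size_t>(py) * size + px];
    };

    const float row0 = at(x0, y0) + (at(x1, y0) - at(x0, y0)) * fx;
    const float row1 = at(x0, y1) + (at(x1, y1) - at(x0, y1)) * fx;
    return row0 + (row1 - row0) * fy;
}
```

With a 128x128 texture that is 0 everywhere except for a 1 in texel [0,0], sampling at (0, 0) gives 0.25, while sampling at the texel centre (0.5 / 128, 0.5 / 128) gives 1. Under this model, either sampling at the texel centre, using texelFetch(), or setting both filters to GL_NEAREST would return the stored value.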