Wrapping texture co-ordinates on a variable-size quad? - c++

Here's my situation: I need to draw a rectangle on the screen for my game's GUI. I don't really care how big this rectangle is or might be; I want to be able to handle any situation. The way I'm doing it right now is to store a single VAO that contains only a very basic quad, then re-draw this quad using uniforms to modify its size and position on the screen each time.
The VAO contains 4 vec4 vertices:
0, 0, 0, 0;
1, 0, 1, 0;
0, 1, 0, 1;
1, 1, 1, 1;
And then I draw it as a GL_TRIANGLE_STRIP. The XY of each vertex is its position, and the ZW is its texture co-ordinates*. I pass in the rect for the GUI element I'm currently drawing as a uniform vec4, which offsets the vertex positions in the vertex shader like so:
vertex.xy *= guiRect.zw;
vertex.xy += guiRect.xy;
And then I convert the vertex from screen pixel co-ordinates into OpenGL NDC co-ordinates:
gl_Position = vec4(((vertex.xy / screenSize) * 2) -1, 0, 1);
This changes the range from [0, screenWidth | screenHeight] to [-1, 1].
My problem comes in when I want to do texture wrapping. Simply passing vTexCoord = vertex.zw; is fine when I want to stretch a texture, but not for wrapping. Ideally, I want to modify the texture co-ordinates such that 1 pixel on the screen is equal to 1 texel in the GUI texture. Texture co-ordinates going beyond [0, 1] are fine at this stage; in fact, that's exactly what I'm looking for.
I plan to implement texture atlases for my GUI textures, but managing the offsets and bounds of the appropriate sub-texture will be handled in the fragment shader; as far as the vertex shader is concerned, our quad is using one solid texture with [0, 1] co-ordinates, and wrapping accordingly.
*Note: I'm aware that this particular vertex format isn't necessarily useful for this particular case; I could be using vec2 vertices instead. For the sake of convenience I'm using the same vertex format for all of my 2D rendering, and other objects, e.g. text, actually do need those ZW components. I might change this in the future.
TL/DR: Given the size of the screen, the size of a texture, and the location/size of a quad, how do you calculate texture co-ordinates in a vertex shader such that pixels and texels have a 1:1 correspondence, with wrapping?

That is really simple math: you just need to relate the two spaces in some way, and you have already formulated a rule which allows you to do so: one window-space pixel maps to one texel.
Let's assume we have both vec2 screenSize and vec2 texSize which are the unnormalized dimensions in pixels/texels.
I'm not 100% sure what exactly you want to achieve. There is something missing: you actually did not specify where the origin of the texture should lie. Should it always be at the bottom-left corner of the quad, or globally at the bottom-left corner of the viewport? I'll assume the latter here, but it should be easy to adjust for the first case.
What we now need is a mapping from the [-1,1]^2 NDC range in x and y to s and t. Let's first map it to [0,1]^2. Once we have that, we can simply multiply the coords by screenSize/texSize to get the desired effect. So in the end, you get
vec2 texcoords = ((gl_Position.xy * 0.5) + 0.5) * screenSize/texSize;
You have of course already calculated ((gl_Position.xy * 0.5) + 0.5) * screenSize implicitly, so this can be simplified to:
vec2 texcoords = vertex.xy / texSize;
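Putting the pieces together, here is a minimal sketch of the complete vertex shader. The GLSL version, the attribute layout, and the vTexCoord varying name are my assumptions; guiRect, screenSize and texSize are the uniforms discussed above.
#version 330 core

layout(location = 0) in vec4 vertex;  // xy = quad position, zw = base texcoords

uniform vec4 guiRect;     // xy = position in pixels, zw = size in pixels
uniform vec2 screenSize;  // window size in pixels
uniform vec2 texSize;     // texture size in texels

out vec2 vTexCoord;

void main() {
    // Scale and position the unit quad in pixel space
    vec2 pos = vertex.xy * guiRect.zw + guiRect.xy;

    // Pixel space -> NDC
    gl_Position = vec4((pos / screenSize) * 2.0 - 1.0, 0.0, 1.0);

    // One screen pixel == one texel; values beyond [0, 1] wrap
    vTexCoord = pos / texSize;
}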

Related

Limiting texture coordinate at edge of polygon

I've written a GLSL shader to emulate a vintage arcade game's indexed color tile-based graphics. I made a couple of shaders, one that does this with point sprites, and another using polygons. The point sprite shader converts gl_PointCoord to a pixel coordinate within each tile like so:
vec2 pixelFloat = gl_PointCoord * tileSizeInPixels;
ivec2 pixel = ivec2(int(pixelFloat.x), int(pixelFloat.y));
// pixel is now used in conjunction with a tile 'ID' uniform
// to locate indexed colors with a texture lookup from a
// large texture representing the game's ROM, with GL_NEAREST filtering.
// very clever 😋
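For context, here is a hedged sketch of the lookup those comments describe. The uniform names and the assumption that tiles sit side by side in a single row of the ROM texture are mine, not the poster's:
// Sketch of the tile lookup described above (assumed layout: tiles
// packed left-to-right in one row of the ROM texture).
uniform sampler2D romTexture;    // game ROM as one large texture, GL_NEAREST
uniform int tileID;              // which tile this sprite displays
uniform float tileSizeInPixels;

vec4 fetchTileTexel(ivec2 pixel) {
    ivec2 romCoord = ivec2(tileID * int(tileSizeInPixels), 0) + pixel;
    return texelFetch(romTexture, romCoord, 0);
}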
The polygon shader instead uses an attribute buffer to pass pixel coordinates (which range over {0.0 … 32.0} for a 32-pixel square tile, for example). After conversion to int, each fragment within the tile sees pixel coordinate values ranging over x {0 … 31}, y {0 … 31}, except:
This worked fine apart from artefacts sometimes showing at the edge of the tile with the higher numbered pixel coordinate at certain resolutions. I guessed that would be due to the fragment being at just the right location to be right on the maximum value of either gl_PointCoord or the vertex attribute value of 32.0, causing that fragment to sample the wrong tile.
These artefacts went away when I clamped the pixel ivec like this:
vec2 pixelFloat = gl_PointCoord * tileSizeInPixels;
ivec2 pixel = ivec2(
    min(int(pixelFloat.x), int(tileSizeInPixels) - 1),
    min(int(pixelFloat.y), int(tileSizeInPixels) - 1));
which solved the problem and didn't introduce any new artefacts.
My question is: Is there some way of controlling the interpolation of gl_PointCoord or my pixel coordinate attribute such that we can guarantee the interpolated value will range
minimum value <= interpolated value < maximum value
as opposed to
minimum value <= interpolated value <= maximum value
Is there some way I can avoid using min() here?
NB: GL_CLAMP_* is not an option here, as the pixel coordinate is used to look up the pixel's index color from a much larger texture, which is essentially the game's sprite ROM loaded into a single large texture buffer.

OpenGL compute shader normal map generation poor performance

I have a height cube map and I want to generate a normal cube map texture from it. My height cube map is just a 2048x2048 image that I load at the beginning of the application for each face of the cube, and I can modify in real time a "maximum height" value which is used as a multiplier when retrieving a pixel from the height map.
Initially I was calculating the normals in the vertex shader, but it gave me bad lighting results, so I decided to move the calculations to the fragment shader.
As the height map does not change every frame (only when I modify the "maximum height" value), I want to generate a normal map texture from it using a compute shader, since I don't need any rasterization, but it gives me very poor performance.
With the fragment shader I ran at 200 FPS, but using the compute shader I run at 40 FPS.
Here is how I bind my images and start the compute work:
_computeShaderProgram.use();
glUniform1f(_computeShaderProgram.getUniformLocation("maxHeight"), maxHeight);
glBindImageTexture(
    0,
    static_cast<GLuint>(heightMap),
    0,
    GL_TRUE,
    0,
    GL_READ_ONLY,
    GL_RGBA32F
);
glBindImageTexture(
    1,
    static_cast<GLuint>(normalMap),
    0,
    GL_TRUE,
    0,
    GL_WRITE_ONLY,
    GL_RGBA32F
);
// Start compute work
// I only compute for one face of the cube map
glDispatchCompute(normalMap.getWidth() / 16, normalMap.getWidth() / 16, 1);
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
And the compute shader:
#version 430 core
#extension GL_ARB_compute_shader : enable

layout(local_size_x = 16, local_size_y = 16, local_size_z = 1) in;

layout(rgba32f, binding = 0) readonly uniform imageCube heightMap;
layout(rgba32f, binding = 1) writeonly uniform imageCube normalMap;

uniform float maxHeight;

float getHeight(ivec3 heightMapCoord) {
    vec4 heightMapValue = imageLoad(heightMap, heightMapCoord);
    return heightMapValue.r * maxHeight;
}

void main() {
    ivec3 textCoord = ivec3(gl_GlobalInvocationID);

    // Calculate height of neighbors
    float leftCubePosHeight   = getHeight(textCoord + ivec3(-1,  0, 0));
    float rightCubePosHeight  = getHeight(textCoord + ivec3( 1,  0, 0));
    float topCubePosHeight    = getHeight(textCoord + ivec3( 0, -1, 0));
    float bottomCubePosHeight = getHeight(textCoord + ivec3( 0,  1, 0));

    // Calculate normal using the central differences method
    vec3 horizontal = vec3(2.0, rightCubePosHeight - leftCubePosHeight, 0.0);
    vec3 vertical   = vec3(0.0, bottomCubePosHeight - topCubePosHeight, 2.0);
    vec3 normal     = normalize(cross(vertical, horizontal));

    imageStore(normalMap, textCoord, vec4(normal, 1.0));
}
I tried different work group counts (width, width / 8, width / 16, width / 32) and local sizes (1, 8, 16, 32), but the performance is always poor: around 40 FPS, or 20 FPS for a work group the size of the full width.
I know I can use shared memory for threads in the same work group to prevent fetching the same texture coordinate 4 times, but later the height map will be generated procedurally and will, I think, be larger than 2048x2048.
What is the difference between the fragment shader and the compute shader that makes it so slow? Am I doing something wrong?
Are there any other solutions for generating this normal map?
EDIT:
The FPS numbers I gave above are not right, because I was generating only 1/16 of the normal map (when I had 40 FPS), and I was also using the central differences technique to calculate the normals, which is cheap but does not give good lighting results, so I switched to the Sobel technique, which is a little more expensive (a sketch of that variant follows the results below).
I ran some tests to find out which technique gives the best performance.
Each frame I generate the normal map (this will not be the case later; it's just to test the performance). Here are my tests:
CPU side, single thread: 1.5 FPS
Compute shader with local sizes of 1 and one work group for each image pixel: 4 FPS
Compute shader with local sizes of 16 and one work group for each 16x16 block of image pixels: 11 FPS
Fragment shader using a framebuffer and MRT with 6 color attachments (one for each face of the normal map): 12.5 FPS
This is a little laggy when I modify the max height (which generates the normal map again), but I think it's okay, as I won't modify it often.
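For reference, here is a minimal sketch of the Sobel variant mentioned above, reusing getHeight() and normalMap from the compute shader; the kernel orientation and the 2.0 scale factor are my assumptions:
// Hedged sketch: Sobel-filtered normals in the same compute shader.
void main() {
    ivec3 p = ivec3(gl_GlobalInvocationID);

    // 3x3 neighbourhood heights
    float tl = getHeight(p + ivec3(-1, -1, 0));
    float t  = getHeight(p + ivec3( 0, -1, 0));
    float tr = getHeight(p + ivec3( 1, -1, 0));
    float l  = getHeight(p + ivec3(-1,  0, 0));
    float r  = getHeight(p + ivec3( 1,  0, 0));
    float bl = getHeight(p + ivec3(-1,  1, 0));
    float b  = getHeight(p + ivec3( 0,  1, 0));
    float br = getHeight(p + ivec3( 1,  1, 0));

    // Sobel kernels for the horizontal and vertical gradients
    float dx = (tr + 2.0 * r + br) - (tl + 2.0 * l + bl);
    float dy = (bl + 2.0 * b + br) - (tl + 2.0 * t + tr);

    vec3 normal = normalize(vec3(-dx, 2.0, -dy));
    imageStore(normalMap, p, vec4(normal, 1.0));
}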

While drawing in orthographic view, is there any performance advantage of using glDrawElements

I am drawing a large number of orthographic representations, around one million, in my model drawing.
(I will draw these things with some flag.)
A camera is also implemented; rotation etc. is possible.
All these orthographic representations change their positions when I rotate the model, so that they appear to stay in the same place on the model.
Now I would like to draw these orthographic things on the graphics card, because when they are huge in number, model rotation is very, very slow.
I feel like there would not be any advantage, because every time I would have to recompute the positions based on the projection matrix.
1) Am I correct?
2) Also, please let me know how to improve performance when drawing bulk orthographic representations in OpenGL.
3) I also feel that instancing will not work here, because each orthographic rep is drawn between 2 or 3 positions. Am I correct?
Usually, OpenGL does the projection calculation for you while drawing: the positions handed over to GL are world or model coordinates, and rendering uses the model-view-projection matrix to calculate the screen coordinates for the current projection. If the camera moves, the only thing that changes is the MVP matrix handed to GL.
This shouldn't really depend on the kind of projection you are using, so I don't think you need to (or should) update the positions in your array.
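In shader terms, that per-frame update is just a uniform change; here is a minimal sketch (the names are mine):
// Sketch: buffer contents stay in model space; only the mvp uniform
// is updated when the camera moves.
#version 330 core
uniform mat4 mvp;
in vec3 position;

void main() {
    gl_Position = mvp * vec4(position, 1.0);
}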
Here is my approach:
You create a vertex buffer that contains each vertex position six times, together with six texture coordinates (which you need anyway if you want to draw your representation with textures), and build a quad from them in the vertex shader. There you emulate the OpenGL projection and then offset each vertex by its texture coordinate to create a quad of constant size.
When constructing the model:
vector<vec3>* positionList = new vector<vec3>();
vector<vec2>* texCoordList = new vector<vec2>();

for (vector<vec3>::iterator it = originalPositions->begin(); it != originalPositions->end(); ++it) {
    for (int i = 0; i < 6; i++) // each quad consists of 2 triangles, therefore 6 vertices
        positionList->push_back(vec3(*it));

    texCoordList->push_back(vec2(0, 0)); // corresponding texture coordinates
    texCoordList->push_back(vec2(1, 0));
    texCoordList->push_back(vec2(0, 1));
    texCoordList->push_back(vec2(1, 0));
    texCoordList->push_back(vec2(1, 1));
    texCoordList->push_back(vec2(0, 1));
}
vertexCount = positionList->size();
glGenBuffers(1, &VBO_Positions); //Generate the buffer for the vertex positions
glBindBuffer(GL_ARRAY_BUFFER, VBO_Positions);
glBufferData(GL_ARRAY_BUFFER, positionList->size() * sizeof(vec3), positionList->data(), GL_STATIC_DRAW);
glGenBuffers(1, &VBO_texCoord); //Generate the buffer for texture coordinates, which we are also going to use as offset values
glBindBuffer(GL_ARRAY_BUFFER, VBO_texCoord);
glBufferData(GL_ARRAY_BUFFER, texCoordList->size() * sizeof(vec2), texCoordList->data(), GL_STATIC_DRAW);
Vertex Shader:
void main() {
    fs_texCoord = vs_texCoord;

    vec4 transformed = (transform * vec4(vs_position, 1));
    transformed.xyz /= transformed.w; // this is how the OpenGL pipeline does projection

    // Map the texture coordinates from [0, 1] to [-offsetScale, offsetScale]
    vec2 offset = (vs_texCoord * 2 - 1) * offsetScale;
    offset.x *= invAspectRatio;

    // Pass the new position to the pipeline with w = 1 so OpenGL keeps the position we calculated
    gl_Position = vec4(transformed.xy + offset, 0, 1);
}
Note that you need to adapt to the aspect ratio yourself, since there is no actual orthographic projection matrix involved that would do it for you; that is the purpose of this line:
offset.x *= invAspectRatio;
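To complete the picture, here is a hedged sketch of the matching attribute setup and draw call. The attribute locations (0, 1), the uniform handles, and the mvp matrix type are assumptions and must match how the shader above is linked:
// Assumed attribute locations: 0 = vs_position, 1 = vs_texCoord.
glBindBuffer(GL_ARRAY_BUFFER, VBO_Positions);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, nullptr);

glBindBuffer(GL_ARRAY_BUFFER, VBO_texCoord);
glEnableVertexAttribArray(1);
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 0, nullptr);

glUseProgram(shaderProgram);
glUniformMatrix4fv(uTransform, 1, GL_FALSE, &mvp[0][0]); // "transform" in the shader
glUniform1f(uOffsetScale, offsetScale);
glUniform1f(uInvAspectRatio, invAspectRatio);
glDrawArrays(GL_TRIANGLES, 0, (GLsizei)vertexCount);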

Rendering a circle with a pixel shader in DirectX

I would like to render a circle onto a triangle pair using a pixel shader written in HLSL. There is some pseudocode for this here, but I am running into one problem after another while implementing it. This will be running on Windows RT, so I am limited to the DirectX 9.3 members of the Direct3D v11 API.
What is the general shape of the shader? For example, what parameters should be passed to the main method?
Any advice or working code would be greatly appreciated!
You should use the texture coordinates as inputs to render the parametric circle.
Here is a question that I asked about anti-aliasing a circle rendered in HLSL. Here is the code from that question, with a minor change to make the use of TEXCOORD0 clearer:
float4 PixelShaderFunction(float2 texCoord : TEXCOORD0) : COLOR0
{
    float dist = texCoord.x * texCoord.x
               + texCoord.y * texCoord.y;

    if (dist < 1)
        return float4(0, 0, 0, 1);
    else
        return float4(1, 1, 1, 1);
}
It uses the formula for a circle, r² = x² + y², with the constant 1 for the radius squared (i.e. a circle of radius 1, as measured in texture coordinates). All points inside the circle are coloured black, and those outside, white.
To use it, provide triangles with texture coordinates in the range [-1, 1]. If you want to use the more traditional [0, 1], you will have to include some code to scale and offset the coordinates in the shader (see the sketch below), or you will get a quarter-circle.
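A minimal sketch of that variant, which only adds the scale-and-offset step to the function above:
float4 PixelShaderFunction(float2 texCoord : TEXCOORD0) : COLOR0
{
    float2 p = texCoord * 2 - 1;          // remap [0, 1] to [-1, 1]
    float dist = p.x * p.x + p.y * p.y;   // squared distance from the centre

    return (dist < 1) ? float4(0, 0, 0, 1)   // inside: black
                      : float4(1, 1, 1, 1);  // outside: white
}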
Once you have this up and running, you can experiment to add other features (for example: anti-aliasing as per my linked question).

OpenGL pixel-perfect 2D drawing

I'm working on a 2D engine. It already works quite well, but I keep getting pixel errors.
For example, my window is 960x540 pixels and I draw a line from (0, 0) to (959, 0). I would expect every pixel on scan line 0 to be set to a color, but no: the right-most pixel is not drawn. The same problem occurs when I draw vertically down to pixel 539. I really need to draw to (960, 0) or (0, 540) to have those pixels drawn.
As I was born in the pixel era, I am convinced that this is not the correct result. When my screen was 320x200 pixels, I could draw from 0 to 319 and from 0 to 199, and my screen would be full. Now I end up with a screen whose right/bottom pixels are not drawn.
This can be due to different things:
where I expect the OpenGL line primitive to be drawn from one pixel to another pixel inclusively, that last pixel is actually exclusive? Is that it?
my projection matrix is incorrect?
I am under a false assumption that when I have a backbuffer of 960x540, it actually has one pixel more?
Something else?
Can someone please help me? I have been looking into this problem for a long time now, and every time when I thought it was ok, I saw after a while that it actually wasn't.
Here is some of my code; I tried to strip it down as much as possible. When I call my line function, 0.375 is added to each coordinate to make it correct on both ATI and NVIDIA adapters.
int width = resX();
int height = resY();

for (int i = 0; i < height; i += 2)
    rm->line(0, i, width - 1, i, vec4f(1, 0, 0, 1));
for (int i = 1; i < height; i += 2)
    rm->line(0, i, width - 1, i, vec4f(0, 1, 0, 1));
// when I do this, one pixel to the right remains undrawn

void rendermachine::line(int x1, int y1, int x2, int y2, const vec4f &color)
{
    ... some code to decide what std::vector the coordinates should be pushed into

    // m_z is a z-coordinate, I use z-buffering to preserve correct drawing orders
    // vec2f(0, 0) is a texture-coordinate, the line is drawn without texturing
    target->push_back(vertex(vec3f((float)x1 + 0.375f, (float)y1 + 0.375f, m_z), color, vec2f(0, 0)));
    target->push_back(vertex(vec3f((float)x2 + 0.375f, (float)y2 + 0.375f, m_z), color, vec2f(0, 0)));
}
void rendermachine::update(...)
{
    ... the render target object is queried for width and height; in my test it is just the back buffer, so the window client resolution is returned

    mat4f mP;
    mP.setOrthographic(0, (float)width, (float)height, 0, 0, 8000000);

    ... all vertices are copied to video memory
    ... drawing
    if (there are lines to draw)
        glDrawArrays(GL_LINES, (int)offset, (int)lines.size());
    ...
}
// And the (very simple) shader to draw these lines
// Vertex shader
#version 120
attribute vec3 aVertexPosition;
attribute vec4 aVertexColor;
uniform mat4 mP;
varying vec4 vColor;
void main(void) {
    gl_Position = mP * vec4(aVertexPosition, 1.0);
    vColor = aVertexColor;
}
// Fragment shader
#version 120
#ifdef GL_ES
precision highp float;
#endif
varying vec4 vColor;
void main(void) {
    gl_FragColor = vColor;
}
In OpenGL, lines are rasterized using the "Diamond Exit" rule. This is almost the same as saying that the end coordinate is exclusive, but not quite...
This is what the OpenGL spec has to say:
http://www.opengl.org/documentation/specs/version1.1/glspec1.1/node47.html
Also have a look at the OpenGL FAQ, http://www.opengl.org/archives/resources/faq/technical/rasterization.htm, item "14.090 How do I obtain exact pixelization of lines?". It says "The OpenGL specification allows for a wide range of line rendering hardware, so exact pixelization may not be possible at all."
Many will argue that you should not use lines in OpenGL at all. Their behaviour is based on how ancient SGI hardware worked, not on what makes sense. (And lines with widths >1 are nearly impossible to use in a way that looks good!)
Note that OpenGL coordinate space has no notion of integers; everything is a float, and the "centre" of an OpenGL pixel is really at 0.5,0.5 rather than at its top-left corner. Therefore, if you want a 1px wide line from 0,0 to 10,10 inclusive, you really have to draw a line from 0.5,0.5 to 10.5,10.5.
This will be especially apparent if you turn on anti-aliasing: if you try to draw from 50,0 to 50,100 with anti-aliasing enabled, you may see a blurry 2px wide line, because the line fell in between two pixels.
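Applied to the question's helper, here is a minimal sketch of the half-pixel-centre version, following the advice above (everything other than the +0.5f offsets is unchanged from the question's code):
// Hedged sketch: the same line helper using half-pixel centres (+0.5)
// instead of the 0.375 workaround.
void rendermachine::line(int x1, int y1, int x2, int y2, const vec4f &color)
{
    target->push_back(vertex(vec3f(x1 + 0.5f, y1 + 0.5f, m_z), color, vec2f(0, 0)));
    target->push_back(vertex(vec3f(x2 + 0.5f, y2 + 0.5f, m_z), color, vec2f(0, 0)));
}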