Declaring certain variables in shader causes it to stop working? (GLSL)

I'm using GLSL.
I have a simple fragment shader here:
"uniform sampler2D backBuffer;",
"uniform float r;",
"uniform float g;",
"uniform float b;",
"uniform float ratio;",
"void main() {",
" vec4 color;",
" float avg, dr, dg, db, multiplier;",
" color = texture2D(backBuffer, vec2(gl_TexCoord[0].x * 1,gl_TexCoord[0].y * 1));",
" avg = (color.r + color.g + color.b) / 3.0;",
" dr = avg * r;",
" dg = avg * g;",
" db = avg * b;",
" color.r = color.r * (gl_TexCoord[0].x * gl_TexCoord[0].y);",
" color.g = color.g * (gl_TexCoord[0].x * gl_TexCoord[0].y);",
" color.b = color.b * (gl_TexCoord[0].x * gl_TexCoord[0].y);",
" gl_FragColor = color;",
"}"
Now it works just fine.
However, for some very strange reason, adding any more variables such as a vec2 or float causes it to have no effect on my scene:
"uniform sampler2D backBuffer;",
"uniform float r;",
"uniform float g;",
"uniform float b;",
"uniform float ratio;",
"void main() {",
" vec4 color;",
" float avg, dr, dg, db, multiplier;",
" vec2 divisors;",
" color = texture2D(backBuffer, vec2(gl_TexCoord[0].x * 1,gl_TexCoord[0].y * 1));",
" avg = (color.r + color.g + color.b) / 3.0;",
" dr = avg * r;",
" dg = avg * g;",
" db = avg * b;",
" color.r = color.r * (gl_TexCoord[0].x * gl_TexCoord[0].y);",
" color.g = color.g * (gl_TexCoord[0].x * gl_TexCoord[0].y);",
" color.b = color.b * (gl_TexCoord[0].x * gl_TexCoord[0].y);",
" gl_FragColor = color;",
"}"
In this one I added a vec2 called divisors. That is the only change, and the shader no longer does anything to the pixels.
Why is this? Is there something I do not understand about variable declaration in GLSL?
Thanks

I notice that each line is a quoted string separated by commas. In C/C++ you would usually just juxtapose quoted strings when creating a single big string, so I wonder if you are doing something strange like initializing an array of strings and not taking into account that its size has changed after adding a new line?
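To make the distinction concrete, here is a minimal C++ sketch. The question does not show its host code, so `kShaderSingle`, `kShaderLines`, `kShaderLineCount` and `joinLines` are hypothetical names, and the shader text is shortened. It contrasts adjacent literals, which the compiler merges into one string, with a comma-separated array, whose element count should be derived from the array rather than hard-coded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Adjacent literals (no commas) are merged by the compiler into ONE string.
static const char* kShaderSingle =
    "uniform float r;\n"
    "void main() {\n"
    "}\n";

// Comma-separated literals form an ARRAY of strings; adding a line changes its size.
static const char* kShaderLines[] = {
    "uniform float r;",
    "void main() {",
    " vec2 divisors;",
    "}",
};

// Derive the element count from the array itself so it cannot go stale.
static const std::size_t kShaderLineCount =
    sizeof(kShaderLines) / sizeof(kShaderLines[0]);

// Join the array into a single source string, one element per line.
std::string joinLines(const char* const* lines, std::size_t count) {
    assert(lines != nullptr);
    std::string out;
    for (std::size_t i = 0; i < count; ++i) {
        out += lines[i];
        out += '\n';
    }
    return out;
}
```

If the host code passes a fixed line count to the shader loader, a newly added line would push the closing brace out of the source, which would be consistent with the shader silently failing to compile.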

Related

Slower read/write to shared memory in CUDA than in Compute Shader [closed]

I am currently comparing implementations of an n-body simulation on the GPU using CUDA and OpenGL (compute shaders) for a project, but I ran into a problem using shared memory.
First I implemented the version with no shared memory as follows:
CUDA
#include "helper_math.h"
//...
__device__ float dist2(float3 A, float3 B)
{
float3 C = A - B;
return dot(C, C);
}
__global__ void n_body_vel_calc(float3* positions, float3* velocities,
unsigned numParticles, float mass, float deltaTime)
{
unsigned i = blockDim.x * blockIdx.x + threadIdx.x;
if (i >= numParticles)
return;
const float G = 6.6743e-11f;
float3 cur_position = positions[i];
float3 force = make_float3(0.0f, 0.0f, 0.0f);
for (unsigned j = 0; j < numParticles; ++j)
{
if (i == j)
continue;
float3 neighbor_position = positions[j];
float inv_distance2 = 1.0f / dist2(cur_position, neighbor_position);
float3 direction = normalize(neighbor_position - cur_position);
force += G * mass * mass * inv_distance2 * direction;
}
float3 acceleration = force / mass;
velocities[i] += acceleration * deltaTime;
}
OpenGL
// glBufferStorage(GL_SHADER_STORAGE_BUFFER, ..., ..., ...);
#version 460
layout(local_size_x=128) in;
layout(location = 0) uniform int numParticles;
layout(location = 1) uniform float mass;
layout(location = 2) uniform float dt;
layout(std430, binding=0) buffer pblock { vec3 positions[]; };
layout(std430, binding=1) buffer vblock { vec3 velocities[]; };
float dist2(vec3 A, vec3 B)
{
vec3 C = A - B;
return dot( C, C );
}
void main()
{
int i = int(gl_GlobalInvocationID);
if (i >= numParticles)
return;
const float G = 6.6743e-11f;
vec3 cur_position = positions[i];
vec3 force = vec3(0.0);
for (uint j = 0; j < numParticles; ++j)
{
if (i == j)
continue;
vec3 neighbor_position = positions[j];
float inv_distance2 = 1.0 / dist2(cur_position, neighbor_position);
vec3 direction = normalize(neighbor_position - cur_position);
force += G * mass * mass * inv_distance2 * direction;
}
vec3 acceleration = force / mass;
velocities[i] += acceleration * dt;
}
With the same number of threads per group, the same number of particles, and the same number of kernel executions, the CUDA version takes 82 ms and the OpenGL version takes 70 ms. It is odd that their speeds differ this much, but I can attribute that to GLSL having geometric operations optimized somehow.
My problem comes next, when I write the versions with shared memory, which should increase the performance by not reading from global memory multiple times.
CUDA
__global__ void n_body_vel_calc(float3* positions, float3 * velocities, unsigned workgroupSize,
unsigned numParticles, float mass, float deltaTime)
{
// size of array == workgroupSize
extern __shared__ float3 temp_tile[];
unsigned i = blockDim.x * blockIdx.x + threadIdx.x;
if (i >= numParticles)
return;
const float G = 6.6743e-11f;
float3 cur_position = positions[i];
float3 force = make_float3(0.0f, 0.0f, 0.0f);
for (unsigned tile = 0; tile < numParticles; tile += workgroupSize)
{
temp_tile[threadIdx.x] = positions[tile + threadIdx.x];
__syncthreads();
for (unsigned j = 0; j < workgroupSize; ++j)
{
if ((tile + j) == i || ((tile + j) >= numParticles))
continue;
float3 neighbor_position = temp_tile[j];
float inv_distance2 = 1.0f / dist2(cur_position, neighbor_position);
float3 direction = normalize(neighbor_position - cur_position);
force += G * mass * mass * inv_distance2 * direction;
}
__syncthreads();
}
float3 acceleration = force / mass;
velocities[i] += acceleration * deltaTime;
}
OpenGL
#version 460
layout(local_size_x=128) in;
layout(location = 0) uniform int numParticles;
layout(location = 1) uniform float mass;
layout(location = 2) uniform float dt;
layout(std430, binding=0) buffer pblock { vec3 positions[]; };
layout(std430, binding=1) buffer vblock { vec3 velocities[]; };
// Shared variables
shared vec3 temp_tile[gl_WorkGroupSize.x];
void main()
{
int i = int(gl_GlobalInvocationID);
if (i >= numParticles)
return;
const float G = 6.6743e-11f;
vec3 cur_position = positions[i];
vec3 force = vec3(0.0);
for (uint tile = 0; tile < numParticles; tile += gl_WorkGroupSize.x)
{
temp_tile[gl_LocalInvocationIndex] = positions[tile + gl_LocalInvocationIndex];
groupMemoryBarrier();
barrier();
for (uint j = 0; j < gl_WorkGroupSize.x; ++j)
{
if ((tile + j) == i || (tile + j) >= numParticles)
continue;
vec3 neighbor_position = temp_tile[j];
float inv_distance2 = 1.0 / dist2(cur_position, neighbor_position);
vec3 direction = normalize(neighbor_position - cur_position);
force += G * mass * mass * inv_distance2 * direction;
}
groupMemoryBarrier();
barrier();
}
vec3 acceleration = force / mass;
velocities[i] += acceleration * dt;
}
My main problem comes next. With the same parameters as above, the CUDA version's execution time increases to 128 ms (a large drop in performance), while the OpenGL one takes 68 ms (a small improvement over the other version).
I have compiled the CUDA version with toolkit versions 11.7 and 10.0, with MSVC V143 and V142, and the results are more or less the same.
Why is the OpenGL implementation faster with shared memory, while the CUDA one is not? Am I missing something?
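When comparing kernels like these, a CPU reference can help confirm that the tiled and untiled versions compute the same thing before timing them. The following is a minimal sketch under stated assumptions: `Vec3`, `pull`, `forceNaive` and `forceTiled` are hypothetical names, a local buffer stands in for shared memory, and G and mass are set to 1 since they only scale the result. The self-interaction test compares `i` against the global index `base + j`, and the last tile is clipped to the particle count.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Pull of b on a for unit masses and G = 1: unit direction / distance^2.
// Coincident particles would divide by zero; the callers skip the self pair.
static Vec3 pull(const Vec3& a, const Vec3& b) {
    const float dx = b.x - a.x, dy = b.y - a.y, dz = b.z - a.z;
    const float d2 = dx * dx + dy * dy + dz * dz;
    const float inv = 1.0f / (d2 * std::sqrt(d2));
    return {dx * inv, dy * inv, dz * inv};
}

// Reference: visit every other particle directly.
Vec3 forceNaive(const std::vector<Vec3>& pos, std::size_t i) {
    Vec3 f{0.0f, 0.0f, 0.0f};
    for (std::size_t j = 0; j < pos.size(); ++j) {
        if (j == i) continue;
        const Vec3 p = pull(pos[i], pos[j]);
        f.x += p.x; f.y += p.y; f.z += p.z;
    }
    return f;
}

// Tiled: stage up to tileSize positions in a local buffer, then accumulate from it.
Vec3 forceTiled(const std::vector<Vec3>& pos, std::size_t i, std::size_t tileSize) {
    assert(tileSize > 0);
    Vec3 f{0.0f, 0.0f, 0.0f};
    std::vector<Vec3> tile(tileSize);
    for (std::size_t base = 0; base < pos.size(); base += tileSize) {
        const std::size_t count = std::min(tileSize, pos.size() - base);
        const auto first = pos.begin() + static_cast<std::ptrdiff_t>(base);
        std::copy(first, first + static_cast<std::ptrdiff_t>(count), tile.begin());
        for (std::size_t j = 0; j < count; ++j) {
            if (base + j == i) continue;  // skip self by GLOBAL index
            const Vec3 p = pull(pos[i], tile[j]);
            f.x += p.x; f.y += p.y; f.z += p.z;
        }
    }
    return f;
}
```

Both functions visit neighbours in the same order, so their results should agree to within floating-point noise for any tile size.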

What is varying lowp float vT

I'm trying to understand the graph example in Qt and am stuck on a few things.
GLSL vertex and fragment shaders are used to draw a graph.
Here's the vertex shader code:
attribute highp vec4 pos;
attribute highp float t;
uniform lowp float size;
uniform highp mat4 qt_Matrix;
varying lowp float vT;
void main(void)
{
vec4 adjustedPos = pos;
adjustedPos.y += (t * size );
gl_Position = qt_Matrix * adjustedPos;
vT = t;
}
What is vT?
Also,
struct LineVertex {
float x, y, t;
inline void set(float xx, float yy, float tt) {x = xx; y = yy; t = tt;}
};
void LineNode::updateGeometry(const QRectF &bounds, const QList<qreal> &samples) {
m_geometry.allocate(samples.size() * 2);
qreal x = bounds.x();
qreal y = bounds.y();
qreal w = bounds.width();
qreal h = bounds.height();
qreal dx = w / (samples.size() - 1);
LineVertex *v = (LineVertex *) m_geometry.vertexData();
for(int i = 0; i < samples.size(); ++i) {
v[i*2 + 0].set(x + dx * i, y + samples.at(i) * h, 0);
v[i*2 + 1].set(x + dx * i, y + samples.at(i) * h, 1);
}
markDirty(QSGNode::DirtyGeometry);
}
I can't understand why, in the for loop
for(int i = 0; i < samples.size(); ++i) {
v[i*2 + 0].set(x + dx * i, y + samples.at(i) * h, 0);
v[i*2 + 1].set(x + dx * i, y + samples.at(i) * h, 1);
}
they create a pair of vertices with identical positions, and the shader then adds t * size (where t is 0 for the first vertex of the pair and 1 for the second).
Isn't it enough that the x, y positions are already set properly?
I tried commenting out adjustedPos.y += (t * size ); and the graph just disappeared.
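The effect of the pair can be traced on the CPU. The sketch below is a hypothetical stand-in, not Qt code: `buildStrip` mirrors the loop in updateGeometry with the bounds passed as plain floats, and `shadedY` mirrors the shader line adjustedPos.y += (t * size). It assumes the geometry is drawn as a triangle strip, which the question does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct LineVertex { float x, y, t; };

// CPU stand-in for the vertex shader's position tweak: y + t * size.
float shadedY(const LineVertex& v, float size) { return v.y + v.t * size; }

// Build the vertex pairs the same way updateGeometry does.
std::vector<LineVertex> buildStrip(const std::vector<float>& samples,
                                   float x, float y, float w, float h) {
    assert(samples.size() >= 2);
    std::vector<LineVertex> v(samples.size() * 2);
    const float dx = w / static_cast<float>(samples.size() - 1);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const float px = x + dx * static_cast<float>(i);
        const float py = y + samples[i] * h;
        v[i * 2 + 0] = {px, py, 0.0f};  // t = 0: stays on the sample position
        v[i * 2 + 1] = {px, py, 1.0f};  // t = 1: offset by size in the shader
    }
    return v;
}
```

After the shader runs, the two vertices of each pair sit `size` apart vertically, so under the strip assumption they enclose a band of that thickness. Without the offset both vertices coincide and every triangle has zero area, which matches the graph disappearing. Since vT is a varying set to t, the fragment shader receives it interpolated from 0 at one edge of the band to 1 at the other.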

How to avoid extra calculations in fragment shader

I'm trying to fix this shader. The effect is a radial blur around each point position, passed from the CPU in an array. The calculation works fine for each point and generates the effect, but as you can see in the picture, the shader keeps generating samples on every loop iteration, and I don't know how to avoid that. I only want the blur for each point in the array.
#version 150
in vec2 varyingtexcoord;
uniform sampler2DRect tex0;
uniform int size;
float exposure = 0.79;
float decay = 0.9;
float density = .9;
float weight = .1;
int samples = 25;
out vec4 fragColor;
const int MAX_SAMPLES = 25;
const int N = 3;
uniform vec2 ligthPos [N];
int a = 1;
vec4 halo(vec2 pos){
float illuminationDecay = 1.2;
vec2 texCoord = varyingtexcoord;
vec2 current = pos.xy;
vec2 deltaTextCoord = texCoord - current;
deltaTextCoord *= 1.0 / float(samples) * density;
vec4 color = texture(tex0, texCoord);
for(int i=0; i < MAX_SAMPLES; i++){
texCoord -= deltaTextCoord;
vec4 sample = texture(tex0, texCoord);
sample *= illuminationDecay * weight;
color += sample;
illuminationDecay *= decay;
}
return color;
}
void main(){
vec4 accum = vec4(0.0);
for(int e = 0; e < N;e++){
vec2 current =ligthPos[e];
accum += halo(current);
}
fragColor = (accum) * exposure;
}
This is what happens:

Depth of field artefacts

I started implementing depth of field in my application, but I ran into a problem: artifacts appear as a non-smooth transition between depths.
I'm doing the depth of field in the following way:
While rendering the main scene, I record the blur value in the alpha channel: fragColor.a = clamp(abs(focalDepth + fragPos.z) / focalRange, 0.0, 1.0), where focalDepth = 8 and focalRange = 20.
After that I apply a two-pass (horizontal and vertical) Gaussian blur with dynamic size and sigma, depending on the blur value previously recorded in the alpha channel (shader below).
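The blur value from the first step can be checked on the CPU. This is a minimal sketch; `blurFactor` is a hypothetical name, and reading fragPos.z as a view-space depth that is negative in front of the camera is an assumption, not something the question states.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// CPU mirror of: clamp(abs(focalDepth + fragPos.z) / focalRange, 0.0, 1.0)
float blurFactor(float fragZ, float focalDepth, float focalRange) {
    assert(focalRange > 0.0f);
    const float t = std::fabs(focalDepth + fragZ) / focalRange;
    return std::min(std::max(t, 0.0f), 1.0f);
}
```

With focalDepth = 8 and focalRange = 20, a fragment at z = -8 gets 0 (in focus), one at z = -18 gets 0.5, and anything 20 or more units from the focal plane saturates at 1.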
But I have an artifact, where you see a clear transition between the depths.
The whole scene:
And with an increased scale:
My fragment blur shader:
#version 330
precision mediump float;
#define BLOOM_KERNEL_SIZE 8
#define DOF_KERNEL_SIZE 8
/* ^^^ definitions ^^^ */
layout (location = 0) out vec4 bloomFragColor;
layout (location = 1) out vec4 dofFragColor;
in vec2 texCoords;
uniform sampler2D image; // bloom
uniform sampler2D image2; // dof
uniform bool isHorizontal;
uniform float kernel[BLOOM_KERNEL_SIZE];
float dof_kernel[DOF_KERNEL_SIZE];
vec4 tmp;
vec3 bloom_result;
vec3 dof_result;
float fdof;
float dofSigma;
int dofSize;
void makeDofKernel(int size, float sigma) {
size = size * 2 - 1;
float tmpKernel[DOF_KERNEL_SIZE * 2 - 1];
int mean = size / 2;
float sum = 0; // For accumulating the kernel values
for (int x = 0; x < size; x++) {
tmpKernel[x] = exp(-0.5 * pow((x - mean) / sigma, 2.0));
// Accumulate the kernel values
sum += tmpKernel[x];
}
// Normalize the kernel
for (int x = 0; x < size; x++)
tmpKernel[x] /= sum;
// need center and right part
for (int i = 0; i < mean + 1; i++) dof_kernel[i] = tmpKernel[size / 2 + i];
}
void main() {
vec2 texOffset = 1.0 / textureSize(image, 0); // gets size of single texel
tmp = texture(image2, texCoords);
fdof = tmp.a;
dofSize = clamp(int(tmp.a * DOF_KERNEL_SIZE), 1, DOF_KERNEL_SIZE);
if (dofSize % 2 == 0) dofSize++;
makeDofKernel(dofSize, 12.0 * fdof + 1);
bloom_result = texture(image, texCoords).rgb * kernel[0]; // current fragment’s contribution
dof_result = tmp.rgb * dof_kernel[0];
if(isHorizontal) {
for(int i = 1; i < kernel.length(); i++) {
bloom_result += texture(image, texCoords + vec2(texOffset.x * i, 0.0)).rgb * kernel[i];
bloom_result += texture(image, texCoords - vec2(texOffset.x * i, 0.0)).rgb * kernel[i];
}
for(int i = 1; i < dofSize; i++) {
dof_result += texture(image2, texCoords + vec2(texOffset.x * i, 0.0)).rgb * dof_kernel[i];
dof_result += texture(image2, texCoords - vec2(texOffset.x * i, 0.0)).rgb * dof_kernel[i];
}
} else {
for(int i = 1; i < kernel.length(); i++) {
bloom_result += texture(image, texCoords + vec2(0.0, texOffset.y * i)).rgb * kernel[i];
bloom_result += texture(image, texCoords - vec2(0.0, texOffset.y * i)).rgb * kernel[i];
}
for(int i = 1; i < dofSize; i++) {
dof_result += texture(image2, texCoords + vec2(0.0, texOffset.y * i)).rgb * dof_kernel[i];
dof_result += texture(image2, texCoords - vec2(0.0, texOffset.y * i)).rgb * dof_kernel[i];
}
}
bloomFragColor = vec4(bloom_result, 1.0);
dofFragColor = vec4(dof_result, fdof);
}
And the settings for the DOF texture: glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, SCR_W, SCR_H, 0, GL_RGBA, GL_FLOAT, NULL)
I'll optimize the shader later; right now I'm mainly concerned about this artifact. How can it be eliminated? I would prefer not to change the way the depth of field is implemented, but if you know a more efficient approach, please share it.
I will be grateful for help.
The problem is solved. My mistake was that I changed the size of the DOF blur kernel, when I should have changed only the sigma. Corrected shader code:
#version 330
precision mediump float;
#define BLOOM_KERNEL_SIZE 8
#define DOF_KERNEL_SIZE 8
/* ^^^ definitions ^^^ */
layout (location = 0) out vec4 bloomFragColor;
layout (location = 1) out vec4 dofFragColor;
in vec2 texCoords;
uniform sampler2D image; // bloom
uniform sampler2D image2; // dof
uniform bool isHorizontal;
uniform float max_sigma = 12.0;
uniform float min_sigma = 0.0001;
uniform float kernel[BLOOM_KERNEL_SIZE];
float dof_kernel[DOF_KERNEL_SIZE];
vec4 tmp;
vec3 bloom_result;
vec3 dof_result;
float fdof;
const int DOF_LCR_SIZE = DOF_KERNEL_SIZE * 2 - 1; // left-center-right (lllcrrr)
const int DOF_MEAN = DOF_LCR_SIZE / 2;
void makeDofKernel(float sigma) {
float sum = 0; // For accumulating the kernel values
for (int x = DOF_MEAN; x < DOF_LCR_SIZE; x++) {
dof_kernel[x - DOF_MEAN] = exp(-0.5 * pow((x - DOF_MEAN) / sigma, 2.0));
// Accumulate the kernel values
sum += dof_kernel[x - DOF_MEAN];
}
sum += sum - dof_kernel[0];
// Normalize the kernel
for (int x = 0; x < DOF_KERNEL_SIZE; x++) dof_kernel[x] /= sum;
}
void main() {
vec2 texOffset = 1.0 / textureSize(image, 0); // gets size of single texel
tmp = texture(image2, texCoords);
fdof = tmp.a;
makeDofKernel(max_sigma * fdof + min_sigma);
bloom_result = texture(image, texCoords).rgb * kernel[0]; // current fragment’s contribution
dof_result = tmp.rgb * dof_kernel[0];
if(isHorizontal) {
for(int i = 1; i < BLOOM_KERNEL_SIZE; i++) {
bloom_result += texture(image, texCoords + vec2(texOffset.x * i, 0.0)).rgb * kernel[i];
bloom_result += texture(image, texCoords - vec2(texOffset.x * i, 0.0)).rgb * kernel[i];
}
for(int i = 1; i < DOF_KERNEL_SIZE; i++) {
dof_result += texture(image2, texCoords + vec2(texOffset.x * i, 0.0)).rgb * dof_kernel[i];
dof_result += texture(image2, texCoords - vec2(texOffset.x * i, 0.0)).rgb * dof_kernel[i];
}
} else {
for(int i = 1; i < BLOOM_KERNEL_SIZE; i++) {
bloom_result += texture(image, texCoords + vec2(0.0, texOffset.y * i)).rgb * kernel[i];
bloom_result += texture(image, texCoords - vec2(0.0, texOffset.y * i)).rgb * kernel[i];
}
for(int i = 1; i < DOF_KERNEL_SIZE; i++) {
dof_result += texture(image2, texCoords + vec2(0.0, texOffset.y * i)).rgb * dof_kernel[i];
dof_result += texture(image2, texCoords - vec2(0.0, texOffset.y * i)).rgb * dof_kernel[i];
}
}
bloomFragColor = vec4(bloom_result, 1.0);
dofFragColor = vec4(dof_result, fdof);
}
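The fixed-size kernel construction in the corrected shader can be mirrored on the CPU to confirm that the weights the blur loops apply sum to 1 for any sigma. This is a sketch only: `kDofKernelSize` and `mirroredSum` are hypothetical names, and `makeDofKernel` here is a C++ transcription of the shader function rather than the shader itself.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr std::size_t kDofKernelSize = 8;  // mirrors DOF_KERNEL_SIZE

// Centre-plus-right half of a Gaussian, scaled so the full mirrored kernel sums to 1.
std::array<float, kDofKernelSize> makeDofKernel(float sigma) {
    assert(sigma > 0.0f);
    std::array<float, kDofKernelSize> k{};
    float sum = 0.0f;
    for (std::size_t x = 0; x < kDofKernelSize; ++x) {
        const float d = static_cast<float>(x) / sigma;
        k[x] = std::exp(-0.5f * d * d);
        sum += k[x];
    }
    sum += sum - k[0];  // count each side tap twice, the centre tap once
    for (float& w : k) w /= sum;
    return k;
}

// Weight total as the blur loops apply it: centre once, every other tap on both sides.
float mirroredSum(const std::array<float, kDofKernelSize>& k) {
    float s = k[0];
    for (std::size_t i = 1; i < k.size(); ++i) s += 2.0f * k[i];
    return s;
}
```

At the minimum sigma of 0.0001 every side tap underflows to zero and the centre weight becomes 1, so an in-focus pixel passes through unblurred while the tap count stays constant across the image.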
Result:

OpenGL Sphere vertices and UV coordinates

I know there are many similar questions for this issue, such as this one, but I can't seem to figure out what is going wrong in my program.
I am attempting to create a unit sphere using the naive longitude/latitude method, then I attempt to wrap a texture around the sphere using UV coordinates.
I am seeing the classic vertical seam issue, but I'm also seeing some strangeness at both poles.
North Pole...
South Pole...
Seam...
The images are from a sphere with 180 stacks and 360 slices.
I create it as follows.
First, here are a couple of convenience structures I'm using...
struct Point {
float x;
float y;
float z;
float u;
float v;
};
struct Quad {
Point lower_left; // Lower left corner of quad
Point lower_right; // Lower right corner of quad
Point upper_left; // Upper left corner of quad
Point upper_right; // Upper right corner of quad
};
I first specify a sphere which is '_stacks' high and '_slices' wide.
float* Sphere::generate_glTriangle_array(int& num_elements) const
{
int elements_per_point = 5; //xyzuv
int points_per_triangle = 3;
int triangles_per_mesh = _stacks * _slices * 2; // 2 triangles makes a quad
num_elements = triangles_per_mesh * points_per_triangle * elements_per_point;
float *buff = new float[num_elements];
int i = 0;
Quad q;
for (int stack=0; stack<_stacks; ++stack)
{
for (int slice=0; slice<_slices; ++slice)
{
q = generate_sphere_quad(stack, slice);
load_quad_into_array(q, buff, i);
}
}
return buff;
}
Quad Sphere::generate_sphere_quad(int stack, int slice) const
{
Quad q;
std::cout << "Stack " << stack << ", Slice: " << slice << std::endl;
std::cout << " Lower left...";
q.lower_left = generate_sphere_coord(stack, slice);
std::cout << " Lower right...";
q.lower_right = generate_sphere_coord(stack, slice+1);
std::cout << " Upper left...";
q.upper_left = generate_sphere_coord(stack+1, slice);
std::cout << " Upper right...";
q.upper_right = generate_sphere_coord(stack+1, slice+1);
std::cout << std::endl;
return q;
}
Point Sphere::generate_sphere_coord(int stack, int slice) const
{
Point p;
p.y = 2.0 * stack / _stacks - 1.0;
float r = sqrt(1 - p.y * p.y);
float angle = 2.0 * M_PI * slice / _slices;
p.x = r * sin(angle);
p.z = r * cos(angle);
p.u = (0.5 + ( (atan2(p.z, p.x)) / (2 * M_PI) ));
p.v = (0.5 + ( (asin(p.y)) / M_PI ));
std::cout << " Point: (x: " << p.x << ", y: " << p.y << ", z: " << p.z << ") [u: " << p.u << ", v: " << p.v << "]" << std::endl;
return p;
}
I then load my array, specifying vertices of two CCW triangles for each Quad...
void Sphere::load_quad_into_array(const Quad& q, float* buff, int& buff_idx, bool counter_clockwise=true)
{
if (counter_clockwise)
{
// First triangle
load_point_into_array(q.lower_left, buff, buff_idx);
load_point_into_array(q.upper_right, buff, buff_idx);
load_point_into_array(q.upper_left, buff, buff_idx);
// Second triangle
load_point_into_array(q.lower_left, buff, buff_idx);
load_point_into_array(q.lower_right, buff, buff_idx);
load_point_into_array(q.upper_right, buff, buff_idx);
}
else
{
// First triangle
load_point_into_array(q.lower_left, buff, buff_idx);
load_point_into_array(q.upper_left, buff, buff_idx);
load_point_into_array(q.upper_right, buff, buff_idx);
// Second triangle
load_point_into_array(q.lower_left, buff, buff_idx);
load_point_into_array(q.upper_right, buff, buff_idx);
load_point_into_array(q.lower_right, buff, buff_idx);
}
}
void Sphere::load_point_into_array(const Point& p, float* buff, int& buff_idx)
{
buff[buff_idx++] = p.x;
buff[buff_idx++] = p.y;
buff[buff_idx++] = p.z;
buff[buff_idx++] = p.u;
buff[buff_idx++] = p.v;
}
My vertex and fragment shaders are simple...
// Vertex shader
#version 450 core
in vec3 vert;
in vec2 texcoord;
uniform mat4 matrix;
out FS_INPUTS {
vec2 i_texcoord;
} tex_data;
void main(void) {
tex_data.i_texcoord = texcoord;
gl_Position = matrix * vec4(vert, 1.0);
}
// Fragment shader
#version 450 core
in FS_INPUTS {
vec2 i_texcoord;
};
layout (binding=1) uniform sampler2D tex_id;
out vec4 color;
void main(void) {
color = texture(tex_id, i_texcoord);
}
My draw command is:
glDrawArrays(GL_TRIANGLES, 0, num_elements/5);
Thanks!
First of all, this code does some funny extra work:
Point Sphere::generate_sphere_coord(int stack, int slice) const
{
Point p;
p.y = 2.0 * stack / _stacks - 1.0;
float r = sqrt(1 - p.y * p.y);
float angle = 2.0 * M_PI * slice / _slices;
p.x = r * sin(angle);
p.z = r * cos(angle);
p.u = (0.5 + ( (atan2(p.z, p.x)) / (2 * M_PI) ));
p.v = (0.5 + ( (asin(p.y)) / M_PI ));
return p;
}
Calling cos and sin just to call atan2 on the result is extra work in the best case, and in the worst case you might get the wrong branch cut. You can calculate p.u directly from slice and _slices instead.
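As a minimal sketch of that direct calculation (the helper name `sliceU` is hypothetical, and having u simply increase with the slice index is an assumption; the atan2 formula above yields a mirrored and offset mapping):

```cpp
#include <cassert>

// Texture u straight from the grid index, with no trig round trip.
// slice may equal slices, in which case u is exactly 1.0 rather than wrapping to 0.
float sliceU(int slice, int slices) {
    assert(slices > 0);
    assert(slice >= 0 && slice <= slices);
    return static_cast<float>(slice) / static_cast<float>(slices);
}
```

Because no angle is wrapped through atan2, the last column of quads ends at u = 1.0 instead of jumping back to 0.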
The Seam
You are going to have a seam in your sphere. This is normal, most models will have a seam (or many seams) in their UV maps somewhere. The problem is that the UV coordinates should still increase linearly next to the seam. For example, think about a loop of vertices that go around the globe's equator. At some point, the UV coordinates will wrap around, something like this:
0.8, 0.9, 0.0, 0.1, 0.2
The problem is that you'll get four quads, but one of them will be wrong:
quad 1: u = 0.8 ... 0.9
quad 2: u = 0.9 ... 0.0 <<----
quad 3: u = 0.0 ... 0.1
quad 4: u = 0.1 ... 0.2
Look at how messed up quad 2 is. You will have to generate instead the following data:
quad 1: u = 0.8 ... 0.9
quad 2: u = 0.9 ... 1.0
quad 3: u = 0.0 ... 0.1
quad 4: u = 0.1 ... 0.2
A Fixed Version
Here is a sketch of a fixed version.
#include <cmath>
#include <vector>

namespace {
const float pi = std::atan(1.0f) * 4.0f;
// Generate point from the u, v coordinates in (0..1, 0..1)
Point sphere_point(float u, float v) {
    float r = std::sin(pi * v);
    return Point{
        r * std::cos(2.0f * pi * u),
        r * std::sin(2.0f * pi * u),
        std::cos(pi * v),
        u,
        v
    };
}
}
// Create array of points with quads that make a unit sphere.
std::vector<Point> sphere(int hSize, int vSize) {
    std::vector<Point> pt;
    for (int i = 0; i < hSize; i++) {
        for (int j = 0; j < vSize; j++) {
            // Each quad computes its own u0..u1 range, so the last column ends at 1.0.
            float u0 = (float)i / (float)hSize;
            float u1 = (float)(i + 1) / (float)hSize;
            float v0 = (float)j / (float)vSize;
            float v1 = (float)(j + 1) / (float)vSize;
            // Create quad as two triangles.
            pt.push_back(sphere_point(u0, v0));
            pt.push_back(sphere_point(u1, v0));
            pt.push_back(sphere_point(u0, v1));
            pt.push_back(sphere_point(u0, v1));
            pt.push_back(sphere_point(u1, v0));
            pt.push_back(sphere_point(u1, v1));
        }
    }
    return pt;
}
Note that there is some easy optimization you could do, and also note that due to rounding errors, the seam might not line up quite correctly. These are left as an exercise for the reader.
More Problems
Even with the fixed version, you will likely see artifacts at the poles. This is because the screen space texture coordinate derivatives have a singularity at the poles.
The recommended way to fix this is to use a cube map texture instead. This will also greatly simplify the sphere geometry data, since you can completely eliminate the UV coordinates and you won't have a seam.
As a kludge, you can enable anisotropic filtering instead.