Math performed on CPU has a different result on GPU - OpenGL

I am attempting to have a GLSL fragment shader distort incoming fragments based on their texture coordinates to poorly simulate a CRT.
After the code failed to work, I ported it to C++ to modify the RGB values of a texture. The code worked as expected.
This leads me to believe that something is wrong with my GLSL code, even though the C++ mirror of it works perfectly.
Is there something that I don't know about GLSL math that could be causing this?
C++ code
const unsigned int RED = 0xFFFF0000;
const unsigned int BLUE = 0xFF0000FF;
const float X_MAX = 429.0f/448.0f;
const float Y_MAX = 320.0f/336.0f;
const float X_CORNER = 410.0f/448.0f;
const float Y_CORNER = 306.0f/336.0f;
const float X_COEF = (X_MAX-X_CORNER) / (Y_CORNER * Y_CORNER);
const float Y_COEF = (Y_MAX-Y_CORNER) / (X_CORNER * X_CORNER);
float FUNCX(float y)
{
return X_MAX-X_COEF*y*y;
}
float FUNCY(float x)
{
return Y_MAX-Y_COEF*x*x;
}
unsigned int get(glm::vec2 intex)
{
intex *= 2.0; // Transform the texture rectangle from 0..1
intex.x -= 1.0; // to
intex.y -= 1.0; // -1 .. 1
glm::vec2 d = glm::vec2(0.0,0.0);
d.x = FUNCX(intex.y); // get the curve amount for X values based on Y input
d.y = FUNCY(intex.x); // get the curve amount for Y values based on X input
if (abs(intex.x/d.x) > 1.0) // if the X value is outside of the curve
return RED; // draw RED for debugging
if (abs(intex.y/d.y) > 1.0) // if the Y value is outside of the curve
return BLUE; // draw BLUE for debugging
glm::vec2 outtex = glm::vec2(0.0f,0.0f);
outtex.x = 1.0 + intex.x/d.x; // Now the -1 .. 1 values get shifted back
outtex.y = 1.0 + intex.y/d.y; // to
outtex /= 2.0; // 0 .. 1
return texture.get(512*outtex.x,512*outtex.y);
}
GLSL fragment shader
const vec4 RED = vec4(1.0,0.0,0.0,1.0);
const vec4 BLUE = vec4(0.0,0.0,1.0,1.0);
const float X_MAX = 429.0/448.0;
const float Y_MAX = 320.0/336.0;
const float X_CORNER = 410.0/448.0;
const float Y_CORNER = 306.0/336.0;
const float X_COEF = (X_MAX-X_CORNER) / (Y_CORNER * Y_CORNER);
const float Y_COEF = (Y_MAX-Y_CORNER) / (X_CORNER * X_CORNER);
float FUNCX(float y)
{
return X_MAX-X_COEF*y*y;
}
float FUNCY(float x)
{
return Y_MAX-Y_COEF*x*x;
}
vec4 get(vec2 intex)
{
intex *= 2.0; // Transform the texture rectangle from 0..1
intex.x -= 1.0; // to
intex.y -= 1.0; // -1 .. 1
vec2 d = vec2(0.0,0.0);
d.x = FUNCX(intex.y); // get the curve amount for X values based on Y input
d.y = FUNCY(intex.x); // get the curve amount for Y values based on X input
if (abs(intex.x/d.x) > 1.0) // if the X value is outside of the curve
return RED; // draw RED for debugging
if (abs(intex.y/d.y) > 1.0) // if the Y value is outside of the curve
return BLUE; // draw BLUE for debugging
vec2 outtex = vec2(0.0,0.0);
outtex.x = 1.0 + intex.x/d.x; // Now the -1 .. 1 values get shifted back
outtex.y = 1.0 + intex.y/d.y; // to
outtex /= 2.0; // 0 .. 1
return texture2D(texture,outtex);
}
Note: The Super Mario World image is for testing purposes only.
Note 2: The 512 values in the C++ code are the size of the texture used.
Edit: There was a typo in the GLSL code where a y value was divided by an x value instead of a y value. This has been fixed, and the new output looks the same because the two denominator values are so close.

The code is not the same. In the C++ code:
if (abs(intex.y/d.y) > 1.0) // if the Y value is outside of the curve
return BLUE; // draw BLUE for debugging
In the GLSL code:
if (abs(intex.y/d.x) > 1.0) // if the Y value is outside of the curve
return BLUE; // draw BLUE for debugging
The C++ version divides by d.y, the GLSL version by d.x.

The formula used only supported UV values from 0 to 1. If values outside of that range were used, the formula broke down.
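If coordinates outside that range can reach the function, a small guard keeps the formula in its supported domain. A minimal sketch, assuming the glm types used above (the function name is illustrative):
#include <glm/glm.hpp>
// The curve formula assumes UVs in 0..1, so clamp the incoming
// coordinates into that range before get() is called.
glm::vec2 sanitizeUV(glm::vec2 uv)
{
return glm::clamp(uv, glm::vec2(0.0f), glm::vec2(1.0f));
}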

Related

OpenGL double overflow

I have a double 'radius' = 2.0E-45; when I set it to ~2.0E-46 the calculation collapses, resulting in a white screen, so it seems like the issue is overflow. I wrote the same algorithm using numba cuda and an f64 (double precision) 'radius', and everything works fine. I am using an f32 texture buffer for 'depth_array' (there is no float64 dtype for this), but the numba implementation works fine with f32, and the OpenGL implementation also works fine as long as 'radius' is bigger than ~2.0E-46. Why does the numba implementation work while the OpenGL one does not? I want to stick with OpenGL. Is there any way to fix it?
I have only included the parts that use 'radius'. All other variables are of double type. (The code is messy and just a scratch.)
#version 150
#extension GL_ARB_gpu_shader_fp64 : enable
double radius = 2.0E-45;
...
dvec2 pixel = dvec2(gl_FragCoord.xy) + dvec2(-0.5+(double(x)+0.5)/double(AA),-0.5+(double(y)+0.5)/double(AA));
dvec2 c = pixel/dvec2(width, height) * dvec2(radius, radius) + dvec2(-radius/2, -radius/2);
color.rgb += sample(c);
...
vec3 sample(dvec2 dn)
{
vec3 color = vec3(0.0,0.0,0.0);
dvec2 d0 = dn;
double zn_size = 0.0;
int i = 0;
while (i < depth)
{
int x = i % depth;
dvec2 value = dvec2(texelFetch(depth_array, x).rg);
dn = complex_mul(dn, value + dn);
dn = dn + d0;
i++;
x = i % depth;
value = dvec2(texelFetch(depth_array, x).rg);
dvec2 zn = value * 0.5 + dn;
zn_size = dot(zn, zn);
if (zn_size > r)
{
double fraciter = (zn_size-r)/(r2-r);
double iter = double(i) - fraciter;
double m = sqrt(iter)*mul*2.0;
color = sin(vec3(.1, .15, .2)*float(m)*0.5)*.5+0.5;
break;
}
}
return color;
}
In GLSL, the literal value 2.0E-45 has the type float. That means the value will be squashed into the valid range of a float before it gets assigned to the variable.
If you want a literal to be a double, then it needs to use the proper suffix: 2.0E-45lf.
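The same squashing can be reproduced on the CPU, assuming IEEE-754 single precision on both sides (which is what GLSL floats are in practice). A small self-contained C++ check:
#include <cstdio>
int main()
{
float f = 2.0E-45f; // still representable, but only as a denormal
float g = 2.0E-46f; // below half the smallest denormal, so it rounds to 0
double d = 2.0E-46; // comfortably inside the double range
std::printf("%g %g %g\n", f, g, d); // prints roughly 1.4013e-45 0 2e-46
return 0;
}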

Perlin Noise getting wrong values in Y axis (C++)

Issue
I'm trying to implement the Perlin Noise algorithm in 2D, with a single octave on a 16x16 grid. I'm using this as heightmap data for a terrain; however, it only seems to work along one axis. Whenever the sample point moves to a new Y section in the Perlin Noise grid, the gradient is very different from what I expect (for example, it often flips from 0.98 to -0.97, which is a very sudden change).
This image shows the staggered terrain in the z direction (which is the y axis in the 2D Perlin Noise grid).
Code
I've put the code that calculates which sample point to use at the end, since it's quite long and I believe it's not where the issue is. Essentially, I scale down the terrain to match the Perlin Noise grid (16x16) and then sample through all the points.
Gradient At Point
So the code that calculates out the gradient at a sample point is the following:
// Find the gradient at a certain sample point
float PerlinNoise::gradientAt(Vector2 point)
{
// Decimal part of float
float relativeX = point.x - (int)point.x;
float relativeY = point.y - (int)point.y;
Vector2 relativePoint = Vector2(relativeX, relativeY);
vector<float> weights(4);
// Find the weights of the 4 surrounding points
weights = surroundingWeights(point);
float fadeX = fadeFunction(relativePoint.x);
float fadeY = fadeFunction(relativePoint.y);
float lerpA = MathUtils::lerp(weights[0], weights[1], fadeX);
float lerpB = MathUtils::lerp(weights[2], weights[3], fadeX);
float lerpC = MathUtils::lerp(lerpA, lerpB, fadeY);
return lerpC;
}
Surrounding Weights of Point
I believe the issue is somewhere here, in the function that calculates the weights for the 4 surrounding points of a sample point, but I can't seem to figure out what is wrong, since all the values seem sensible when stepping through the function.
// Find the surrounding weight of a point
vector<float> PerlinNoise::surroundingWeights(Vector2 point){
// Produces correct values
vector<Vector2> surroundingPoints = surroundingPointsOf(point);
vector<float> weights;
for (unsigned i = 0; i < surroundingPoints.size(); ++i) {
// The corner to the sample point
Vector2 cornerToPoint = surroundingPoints[i].toVector(point);
// Getting the seeded vector from the grid
float x = surroundingPoints[i].x;
float y = surroundingPoints[i].y;
Vector2 seededVector = baseGrid[x][y];
// Dot product between the seededVector and corner to the sample point vector
float dotProduct = cornerToPoint.dot(seededVector);
weights.push_back(dotProduct);
}
return weights;
}
OpenGL Setup and Sample Point
Setting up the heightmap and getting the sample point. Variables 'wrongA' and 'wrongB' are an example of when the gradient flips and changes suddenly.
void HeightMap::GenerateRandomTerrain() {
int perlinGridSize = 16;
PerlinNoise perlin_noise = PerlinNoise(perlinGridSize, perlinGridSize);
numVertices = RAW_WIDTH * RAW_HEIGHT;
numIndices = (RAW_WIDTH - 1) * (RAW_HEIGHT - 1) * 6;
vertices = new Vector3[numVertices];
textureCoords = new Vector2[numVertices];
indices = new GLuint[numIndices];
float perlinScale = RAW_HEIGHT/ (float) (perlinGridSize -1);
float height = 50;
float wrongA = perlin_noise.gradientAt(Vector2(0, 68.0f / perlinScale));
float wrongB = perlin_noise.gradientAt(Vector2(0, 69.0f / perlinScale));
for (int x = 0; x < RAW_WIDTH; ++x) {
for (int z = 0; z < RAW_HEIGHT; ++z) {
int offset = (x* RAW_WIDTH) + z;
float xVal = (float)x / perlinScale;
float yVal = (float)z / perlinScale;
float noise = perlin_noise.gradientAt(Vector2( xVal , yVal));
vertices[offset] = Vector3(x * HEIGHTMAP_X, noise * height, z * HEIGHTMAP_Z);
textureCoords[offset] = Vector2(x * HEIGHTMAP_TEX_X, z * HEIGHTMAP_TEX_Z);
}
}
numIndices = 0;
for (int x = 0; x < RAW_WIDTH - 1; ++x) {
for (int z = 0; z < RAW_HEIGHT - 1; ++z) {
int a = (x * (RAW_WIDTH)) + z;
int b = ((x + 1)* (RAW_WIDTH)) + z;
int c = ((x + 1)* (RAW_WIDTH)) + (z + 1);
int d = (x * (RAW_WIDTH)) + (z + 1);
indices[numIndices++] = c;
indices[numIndices++] = b;
indices[numIndices++] = a;
indices[numIndices++] = a;
indices[numIndices++] = d;
indices[numIndices++] = c;
}
}
BufferData();
}
It turned out the issue was in the interpolation stage:
float lerpA = MathUtils::lerp(weights[0], weights[1], fadeX);
float lerpB = MathUtils::lerp(weights[2], weights[3], fadeX);
float lerpC = MathUtils::lerp(lerpA, lerpB, fadeY);
I had the interpolation in the y axis the wrong way around, so it should have been:
lerp(lerpB, lerpA, fadeY)
Instead of:
lerp(lerpA, lerpB, fadeY)
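In other words, a self-contained sketch of the fixed blend, assuming lerp(a, b, t) = a + t*(b - a) and the weights ordered {top-left, top-right, bottom-left, bottom-right}:
float lerp(float a, float b, float t) { return a + t * (b - a); }
// Bilinear blend of the four corner weights with the corrected y order.
float blendWeights(const float w[4], float fadeX, float fadeY)
{
float lerpA = lerp(w[0], w[1], fadeX); // along x, top row
float lerpB = lerp(w[2], w[3], fadeX); // along x, bottom row
return lerp(lerpB, lerpA, fadeY); // y direction flipped, per the fix
}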

How do I return an array of float3's from my compute shader?

Basically I'm trying to handle ray-tracing on my compute shader, and I've tested that it does work and outputs any individual float3 correctly. I made a much slower implementation where individual pixels were rendered on the GPU and copied back to the CPU every time; very slow, but it proved the maths in my GPU ray-tracing function was sound and gave the right results.
However, I'm having difficulty outputting an array of float3's, as the RWStructuredBuffer confuses me somewhat.
Here is my RWStructuredBuffer
RWStructuredBuffer<float3> Data: register(u0);
Nothing special, but included for reference.
Here is my function on the compute shader that I call with Dispatch
groupshared uint things;
[numthreads(1, 1, 1)]
void RenderRay(uint3 Gid : SV_GroupID, uint3 DTid : SV_DispatchThreadID, uint3 GTid : SV_GroupThreadID, uint GI : SV_GroupIndex)
{
float4 empty = { 0, 0, 0, 0 };
uint offset = 0;
float3 pixel;
float invWidth = 1 / float(width.x), invHeight = 1 / float(height.x);
float fov = 30, aspectratio = width.x / float(height.x);
float angle = tan(M_PI * 0.5 * fov / 180.);
GroupMemoryBarrierWithGroupSync();
for (uint y = 0; y < height.x; ++y)
{
if(y>0)
offset += height.x;
for (uint x = 0; x < width.x; ++x)
{
float xx = (2 * ((x + 0.5) * invWidth) - 1) * angle * aspectratio;
float yy = (1 - 2 * ((y + 0.5) * invHeight)) * angle;
float4 raydirection = { xx, yy, -1, 0 };
normalize(raydirection);
pixel = trace(empty, raydirection, 0);
Data[0][x] = pixel;
}
}
GroupMemoryBarrierWithGroupSync();
}
Setting Data[0] = pixel worked for the pixel-by-pixel implementation, since I only needed to return the one value, but I can't achieve the same result when trying to output the whole image at once, as I can't quite figure out how to write each pixel into the array.
Writing it out, it sounds like a silly problem, but nonetheless I'm quite stuck.
Thanks in advance!
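For reference, the row-major indexing the loop is circling around looks like this in plain C++ (a stand-in sketch for the HLSL buffer; the names are illustrative, not from the original code):
#include <vector>
struct float3 { float x, y, z; };
// One float3 per pixel in a flat buffer of width*height entries;
// the HLSL equivalent would be Data[y * width + x] = pixel;
void writePixel(std::vector<float3>& data, unsigned width, unsigned x, unsigned y, float3 pixel)
{
data[y * width + x] = pixel;
}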

Negative row and column in terrain following algorithm

I'm trying to do terrain following, and I get a negative camera position on the xz plane. Now I get an out-of-bounds exception, because the row or the col is negative. How would I transform the cell of the grid to the origin correctly, given negative camera coordinates?
Here are the two functions:
int cGrid::getHeightmapEntry(int row, int col)
{
return m_heightmap[row * 300 + col];
}
float cGrid::getHeight(float x, float z, float _width, float _depth, int _cellSpacing)
{
// Translate on xz-plane by the transformation that takes
// the terrain START point to the origin.
x = ((float)_width / 2.0f) + x;
z = ((float)_depth / 2.0f) - z;
// Scale down by the transformation that makes the
// cellspacing equal to one. This is given by
// 1 / cellspacing, since cellspacing * (1 / cellspacing) = 1.
x /= (float)_cellSpacing;
z /= (float)_cellSpacing;
// From now on, we will interpret our positive z-axis as
// going in the 'down' direction, rather than the 'up' direction.
// This allows us to extract the row and column simply by 'flooring'
// x and z:
float col = ::floorf(x);
float row = ::floorf(z);
if (row < 0 || col<0)
{
row = 0;
}
// get the heights of the quad we're in:
//
// A B
// *---*
// | / |
// *---*
// C D
float A = getHeightmapEntry(row, col);
float B = getHeightmapEntry(row, col + 1);
float C = getHeightmapEntry(row + 1, col);
float D = getHeightmapEntry(row + 1, col + 1);
//
// Find the triangle we are in:
//
// Translate by the transformation that takes the upper-left
// corner of the cell we are in to the origin. Recall that our
// cellspacing was normalized to 1. Thus we have a unit square
// at the origin of our +x -> 'right' and +z -> 'down' system.
float dx = x - col;
float dz = z - row;
// Note: the computations of u and v below are unnecessary; we really
// only need the height, but we compute the entire vector to emphasize
// the book's discussion.
float height = 0.0f;
if (dz < 1.0f - dx) // upper triangle ABC
{
float uy = B - A; // A->B
float vy = C - A; // A->C
// Linearly interpolate on each vector. The height is the vertex
// height the vectors u and v originate from {A}, plus the heights
// found by interpolating on each vector u and v.
height = A + Lerp(0.0f, uy, dx) + Lerp(0.0f, vy, dz);
}
else // lower triangle DCB
{
float uy = C - D; // D->C
float vy = B - D; // D->B
// Linearly interpolate on each vector. The height is the vertex
// height the vectors u and v originate from {D}, plus the heights
// found by interpolating on each vector u and v.
height = D + Lerp(0.0f, uy, 1.0f - dx) + Lerp(0.0f, vy, 1.0f - dz);
}
return height;
}
float height = m_Grid.getHeight(position.x, position.y, 49 * 300, 49 * 300, 6.1224489795918367f);
if (height != 0)
{
position.y = height + 10.0f;
}
m_Camera.SetPosition(position.x, position.y, position.z);
bool cGrid::readRawFile(std::string fileName, int m, int n)
{
// A height for each vertex
std::vector<BYTE> in(m*n);
std::ifstream inFile(fileName.c_str(), std::ios_base::binary);
if (!inFile)
return false;
inFile.read(
(char*)&in[0], // buffer
in.size());// number of bytes to read into buffer
inFile.close();
// copy BYTE vector to int vector
m_heightmap.resize(n*m);
for (int i = 0; i < in.size(); i++)
m_heightmap[i] = (float)((in[i])/255)*50.0f;
return true;
}
m_Grid.readRawFile("castlehm257.raw", 50, 50);
I infer that you’re storing a 50 by 50 matrix inside a 300 by 300 matrix, to represent a grid of 49 by 49 cells. I also infer that m_Grid is an object of type cGrid. Your code appears to contain the following errors:
Argument(2) of call m_Grid.getHeight is not a z value.
Argument(3) of call m_Grid.getHeight is inconsistent with argument(5).
Argument(4) of call m_Grid.getHeight is inconsistent with argument(5).
Implicit cast of literal float to int in argument(5) of call m_Grid.getHeight - the value will be truncated.
Try changing your function call to this:
float height = m_Grid.getHeight(position.x, position.z, 49 * cellspacing, 49 * cellspacing, cellspacing);
-- where cellspacing is as defined in your diagram.
Also, try changing parameter(5) of cGrid::getHeight from int _cellSpacing to float _cellSpacing.
(I have edited this answer a couple of times as my understanding of your code has evolved.)
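One more thing worth checking, since the question is specifically about negative indices: the guard in cGrid::getHeight resets only row when either index is negative, and getHeightmapEntry also reads row + 1 and col + 1. A minimal sketch of a two-sided clamp, assuming 49 by 49 cells as inferred above (the helper name is illustrative):
#include <algorithm>
// Clamp a cell index so that index and index + 1 both stay inside
// a grid with `cells` cells per side (49 here).
float clampCellIndex(float index, int cells)
{
return std::min(std::max(index, 0.0f), (float)(cells - 1));
}
-- applied to both row and col before the four getHeightmapEntry calls.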

How do I use texture-mapping in a simple ray tracer?

I am attempting to add features to a ray tracer in C++. Namely, I am trying to add texture mapping to the spheres. For simplicity, I am using an array to store the texture data. I obtained the texture data by using a hex editor and copying the correct byte values into an array in my code; this was just for my testing purposes. When the values of this array correspond to an image that is simply red, it appears to work close to what is expected, except there is no shading.
first image http://dl.dropbox.com/u/367232/Texture.jpg
The bottom right of the image shows what a correct sphere should look like. That sphere's colour comes from one set colour, not a texture map.
Another problem is that when the texture map contains anything other than a single colour, it turns white. My test image is a picture of water, and when it maps, it shows only one ring of bluish pixels surrounding the white colour.
bmp http://dl.dropbox.com/u/367232/vPoolWater.bmp
When this is done, it simply appears as this:
second image http://dl.dropbox.com/u/367232/texture2.jpg
Here are a few code snippets:
Color getColor(const Object *object,const Ray *ray, float *t)
{
if (object->materialType == TEXTDIF || object->materialType == TEXTMATTE) {
float distance = *t;
Point pnt = ray->origin + ray->direction * distance;
Point oc = object->center;
Vector ve = Point(oc.x,oc.y,oc.z+1) - oc;
Normalize(&ve);
Vector vn = Point(oc.x,oc.y+1,oc.z) - oc;
Normalize(&vn);
Vector vp = pnt - oc;
Normalize(&vp);
double phi = acos(-vn.dot(vp));
float v = phi / M_PI;
float u;
float num1 = (float)acos(vp.dot(ve));
float num = (num1 /(float) sin(phi));
float theta = num /(float) (2 * M_PI);
if (theta < 0 || theta == NAN) {theta = 0;}
if (vn.cross(ve).dot(vp) > 0) {
u = theta;
}
else {
u = 1 - theta;
}
int x = (u * IMAGE_WIDTH) -1;
int y = (v * IMAGE_WIDTH) -1;
int p = (y * IMAGE_WIDTH + x)*3;
return Color(TEXT_DATA[p+2],TEXT_DATA[p+1],TEXT_DATA[p]);
}
else {
return object->color;
}
};
I call the colour code here in Trace:
if (object->materialType == MATTE)
return getColor(object, ray, &t);
Ray shadowRay;
int isInShadow = 0;
shadowRay.origin.x = pHit.x + nHit.x * bias;
shadowRay.origin.y = pHit.y + nHit.y * bias;
shadowRay.origin.z = pHit.z + nHit.z * bias;
shadowRay.direction = light->object->center - pHit;
float len = shadowRay.direction.length();
Normalize(&shadowRay.direction);
float LdotN = shadowRay.direction.dot(nHit);
if (LdotN < 0)
return 0;
Color lightColor = light->object->color;
for (int k = 0; k < numObjects; k++) {
if (Intersect(objects[k], &shadowRay, &t) && !objects[k]->isLight) {
if (objects[k]->materialType == GLASS)
lightColor *= getColor(objects[k], &shadowRay, &t); // attenuate light color by glass color
else
isInShadow = 1;
break;
}
}
lightColor *= 1.f/(len*len);
return (isInShadow) ? 0 : getColor(object, &shadowRay, &t) * lightColor * LdotN;
}
I left out the rest of the code so as not to bog down the post, but it can be seen here. Any help is greatly appreciated. The only portion not included in the code is where I define the texture data, which, as I said, is simply taken straight from a bitmap file of the above image.
Thanks.
It could be that the texture is just washed out because the light is so bright and so close. Notice how in the solid red case, there doesn't seem to be any gradation around the sphere. The red looks like it's saturated.
Your u,v mapping looks right, but there could be a mistake there. I'd add some assert statements to make sure u and v are really between 0 and 1 and that the p index into your TEXT_DATA array is also within range.
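For example, a sketch using the names from getColor above (the array-size parameter is an assumption, since the snippet doesn't show how TEXT_DATA is sized):
#include <cassert>
// Sanity checks for the texture lookup: u and v must be in 0..1, and
// p must leave room for the three bytes read per texel.
void checkTexcoords(float u, float v, int p, int textDataSize)
{
assert(u >= 0.0f && u <= 1.0f);
assert(v >= 0.0f && v <= 1.0f);
assert(p >= 0 && p + 2 < textDataSize);
}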
If you're debugging your textures, you should use a constant material whose color is determined only by the texture and not the lights. That way you can make sure you are correctly mapping your texture to your primitive and filtering it properly before doing any lighting on it. Then you know that part isn't the problem.
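A minimal sketch of that idea, reusing the getColor from the question (the wrapper name is illustrative):
// Debug path: return the raw texture lookup and skip lighting entirely,
// so any remaining artifact must come from the u,v mapping or the data.
Color traceUnlitDebug(const Object *object, const Ray *ray, float *t)
{
return getColor(object, ray, t);
}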