I am working on an implementation of the Separating Axis Theorem for use in 2D games. It kind of works, but only just.
I use it like this:
bool penetration = sat(c1, c2) && sat(c2, c1);
Where c1 and c2 are of type Convex, defined as:
class Convex
{
public:
    float tx, ty;
public:
    std::vector<Point> p;

    void translate(float x, float y) {
        tx = x;
        ty = y;
    }
};
(Point is a structure of float x, float y)
The points are entered in clockwise order.
My current code (ignore Qt debug):
bool sat(Convex c1, Convex c2, QPainter *debug)
{
    //Debug
    QColor col[] = {QColor(255, 0, 0), QColor(0, 255, 0), QColor(0, 0, 255), QColor(0, 0, 0)};

    bool ret = true;
    int c1_faces = c1.p.size();
    int c2_faces = c2.p.size();

    //For every face in c1
    for(int i = 0; i < c1_faces; i++)
    {
        //Grab a face (face x, face y)
        float fx = c1.p[i].x - c1.p[(i + 1) % c1_faces].x;
        float fy = c1.p[i].y - c1.p[(i + 1) % c1_faces].y;

        //Create a perpendicular axis to project on (axis x, axis y)
        float ax = -fy, ay = fx;

        //Normalize the axis
        float len_v = sqrt(ax * ax + ay * ay);
        ax /= len_v;
        ay /= len_v;

        //Debug graphics (ignore)
        debug->setPen(col[i]);
        //Draw the face
        debug->drawLine(QLineF(c1.tx + c1.p[i].x, c1.ty + c1.p[i].y, c1.p[(i + 1) % c1_faces].x + c1.tx, c1.p[(i + 1) % c1_faces].y + c1.ty));
        //Draw the axis
        debug->save();
        debug->translate(c1.p[i].x, c1.p[i].y);
        debug->drawLine(QLineF(c1.tx, c1.ty, ax * 100 + c1.tx, ay * 100 + c1.ty));
        debug->drawEllipse(QPointF(ax * 100 + c1.tx, ay * 100 + c1.ty), 10, 10);
        debug->restore();

        //Carve out the min and max values
        float c1_min = FLT_MAX, c1_max = FLT_MIN;
        float c2_min = FLT_MAX, c2_max = FLT_MIN;

        //Project every point in c1 on the axis and store min and max
        for(int j = 0; j < c1_faces; j++)
        {
            float c1_proj = (ax * (c1.p[j].x + c1.tx) + ay * (c1.p[j].y + c1.ty)) / (ax * ax + ay * ay);
            c1_min = min(c1_proj, c1_min);
            c1_max = max(c1_proj, c1_max);
        }

        //Project every point in c2 on the axis and store min and max
        for(int j = 0; j < c2_faces; j++)
        {
            float c2_proj = (ax * (c2.p[j].x + c2.tx) + ay * (c2.p[j].y + c2.ty)) / (ax * ax + ay * ay);
            c2_min = min(c2_proj, c2_min);
            c2_max = max(c2_proj, c2_max);
        }

        //Return if the projections do not overlap
        if(!(c1_max >= c2_min && c1_min <= c2_max))
            ret = false; //return false;
    }
    return ret; //return true;
}
What am I doing wrong? It registers collisions, but it is over-sensitive on one edge (in my test, using a triangle and a diamond):
//Triangle
push_back(Point(0, -150));
push_back(Point(0, 50));
push_back(Point(-100, 100));
//Diamond
push_back(Point(0, -100));
push_back(Point(100, 0));
push_back(Point(0, 100));
push_back(Point(-100, 0));
I am getting really frustrated over this, please help me out :)
http://u8999827.fsdata.se/sat.png
OK, I was wrong the first time. Looking at your picture of a failure case, it is obvious that a separating axis exists and that it is one of the face normals (the normal to the long edge of the triangle). The projection is correct; however, your bounds are not.
I think the error is here:
float c1_min = FLT_MAX, c1_max = FLT_MIN;
float c2_min = FLT_MAX, c2_max = FLT_MIN;
FLT_MIN is the smallest normal positive number representable by a float, not the most negative number. In fact you need:
float c1_min = FLT_MAX, c1_max = -FLT_MAX;
float c2_min = FLT_MAX, c2_max = -FLT_MAX;
or even better for C++
float c1_min = std::numeric_limits<float>::max(), c1_max = -c1_min;
float c2_min = std::numeric_limits<float>::max(), c2_max = -c2_min;
because you're probably seeing negative projections onto the axis.
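To make the fix concrete, here is a minimal sketch of the projection bookkeeping with the corrected bounds. projectOntoAxis is a hypothetical helper name, and the Convex/Point types are assumed to be the ones from the question:

#include <algorithm>
#include <limits>

// Project every (translated) point of a convex shape onto the unit axis (ax, ay)
// and report the minimum and maximum scalar projections.
static void projectOntoAxis(const Convex &c, float ax, float ay,
                            float &outMin, float &outMax)
{
    outMin = std::numeric_limits<float>::max();
    outMax = -std::numeric_limits<float>::max(); // not FLT_MIN, which is a tiny positive value
    for (std::size_t j = 0; j < c.p.size(); ++j) {
        float proj = ax * (c.p[j].x + c.tx) + ay * (c.p[j].y + c.ty);
        outMin = std::min(outMin, proj);
        outMax = std::max(outMax, proj);
    }
}

With both shapes projected this way, the overlap test c1_max >= c2_min && c1_min <= c2_max behaves correctly even when all projections are negative.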
I'm using the Visual Studio profiler for the first time and I'm trying to interpret the results. Looking at the percentages on the left, I found this subtraction's time cost a bit strange:
Other parts of the code contain more complex expressions, like:
Even a simple multiplication seems way faster than the subtraction:
Other multiplications take way longer and I really don't get why, like this:
So, I guess my question is whether there is anything weird going on here.
Complex expressions take longer than that subtraction, and some expressions take way longer than other similar ones. I ran the profiler several times and the distribution of the percentages is always like this. Am I just interpreting this wrong?
Update:
I was asked to give the profile for the whole function, so here it is, even though it's a bit big. I ran the function inside a for loop for 1 minute and got 50k samples. The function contains a double loop. I include the code as text first for ease of reading, followed by the profiling pictures. Note that the code in the text is a bit updated.
for (int i = 0; i < NUMBER_OF_CONTOUR_POINTS; i++) {
    vec4 contourPointV(contour3DPoints[i], 1);
    float phi = angles[i];
    float xW = pose[0][0] * contourPointV.x + pose[1][0] * contourPointV.y + contourPointV.z * pose[2][0] + pose[3][0];
    float yW = pose[0][1] * contourPointV.x + pose[1][1] * contourPointV.y + contourPointV.z * pose[2][1] + pose[3][1];
    float zW = pose[0][2] * contourPointV.x + pose[1][2] * contourPointV.y + contourPointV.z * pose[2][2] + pose[3][2];
    float x = -G_FU_STRICT * xW / zW;
    float y = -G_FV_STRICT * yW / zW;
    x = (x + 1) * G_WIDTHo2;
    y = (y + 1) * G_HEIGHTo2;
    y = G_HEIGHT - y;
    phi -= extraTheta;
    if (phi < 0) phi += CV_PI2;
    int indexForTable = phi * oneKoverPI;
    //vec2 ray(cos(phi), sin(phi));
    vec2 ray(cos_pre[indexForTable], sin_pre[indexForTable]);
    vec2 ray2(-ray.x, -ray.y);
    float outerStepX = ray.x * step;
    float outerStepY = ray.y * step;
    cv::Point2f outerPoint(x + outerStepX, y + outerStepY);
    cv::Point2f innerPoint(x - outerStepX, y - outerStepY);
    cv::Point2f contourPointCV(x, y);
    cv::Point2f contourPointCVcopy(x, y);
    bool cut = false;
    if (!isInView(outerPoint.x, outerPoint.y) || !isInView(innerPoint.x, innerPoint.y)) {
        cut = true;
    }
    bool outside2 = true; bool outside1 = true;
    if (cut) {
        outside2 = myClipLine(contourPointCV.x, contourPointCV.y, outerPoint.x, outerPoint.y, G_WIDTH - 1, G_HEIGHT - 1);
        outside1 = myClipLine(contourPointCVcopy.x, contourPointCVcopy.y, innerPoint.x, innerPoint.y, G_WIDTH - 1, G_HEIGHT - 1);
    }
    myIterator innerRayMine(contourPointCVcopy, innerPoint);
    myIterator outerRayMine(contourPointCV, outerPoint);
    if (!outside1) {
        innerRayMine.end = true;
        innerRayMine.prob = true;
    }
    if (!outside2) {
        outerRayMine.end = true;
        innerRayMine.prob = true;
    }
    vec2 normal = -ray;
    float dfdxTerm = -normal.x;
    float dfdyTerm = normal.y;
    vec3 point3D = vec3(xW, yW, zW);
    cv::Point contourPoint((int)x, (int)y);
    float Xc = point3D.x; float Xc2 = Xc * Xc; float Yc = point3D.y; float Yc2 = Yc * Yc; float Zc = point3D.z; float Zc2 = Zc * Zc;
    float XcYc = Xc * Yc; float dfdxFu = dfdxTerm * G_FU; float dfdyFv = dfdyTerm * G_FU; float overZc2 = 1 / Zc2; float overZc = 1 / Zc;
    pixelJacobi[0] = (dfdyFv * (Yc2 + Zc2) + dfdxFu * XcYc) * overZc2;
    pixelJacobi[1] = (-dfdxFu * (Xc2 + Zc2) - dfdyFv * XcYc) * overZc2;
    pixelJacobi[2] = (-dfdyFv * Xc + dfdxFu * Yc) * overZc;
    pixelJacobi[3] = -dfdxFu * overZc;
    pixelJacobi[4] = -dfdyFv * overZc;
    pixelJacobi[5] = (dfdyFv * Yc + dfdxFu * Xc) * overZc2;
    float commonFirstTermsSum = 0;
    float commonFirstTermsSquaredSum = 0;
    int test = 0;
    while (!innerRayMine.end) {
        test++;
        cv::Point xy = innerRayMine.pos(); innerRayMine++;
        int x = xy.x;
        int y = xy.y;
        float dx = x - contourPoint.x;
        float dy = y - contourPoint.y;
        vec2 dxdy(dx, dy);
        float raw = -glm::dot(dxdy, normal);
        float heavisideTerm = heaviside_pre[(int)raw * 100 + 1000];
        float deltaTerm = delta_pre[(int)raw * 100 + 1000];
        const Vec3b rgb = ante[y * 640 + x];
        int red = rgb[0]; int green = rgb[1]; int blue = rgb[2];
        red = red >> 3; red = red << 10; green = green >> 3; green = green << 5; blue = blue >> 3;
        int colorIndex = red + green + blue;
        pF = pFPointer[colorIndex];
        pB = pBPointer[colorIndex];
        float denAsMul = 1 / (pF + pB + 0.000001);
        pF = pF * denAsMul;
        float pfMinusPb = 2 * pF - 1;
        float denominator = heavisideTerm * pfMinusPb + pB + 0.000001;
        float commonFirstTerm = -pfMinusPb / denominator * deltaTerm;
        commonFirstTermsSum += commonFirstTerm;
        commonFirstTermsSquaredSum += commonFirstTerm * commonFirstTerm;
    }
}
Visual Studio profiles by sampling: it interrupts execution often and records the value of the instruction pointer; it then maps it to the source and calculates the frequency of hitting that line.
There are a few issues with that: it's not always possible to figure out which line produced a specific assembly instruction in optimized code.
One trick I use is to move the code of interest into a separate function and declare it with __declspec(noinline).
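For illustration (this is not from your code), the isolation could look like the sketch below; isolateSubtraction is a made-up name for the example:

// Hypothetical example: pull the suspect expression into its own non-inlined
// function so the sampling profiler attributes its cost to a distinct frame
// instead of a line that the optimizer may have merged with its neighbours.
__declspec(noinline) float isolateSubtraction(float phi, float extraTheta)
{
    return phi - extraTheta;
}

// ... then inside the hot loop:
// phi = isolateSubtraction(phi, extraTheta);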
In your example, are you sure the subtraction was performed as many times as the multiplication? I would be more puzzled by the difference between the subsequent multiplications (0.39% and 0.53%).
Update:
I believe that the following lines:
float phi = angles[i];
and
phi -= extraTheta;
got moved together in assembly and the time spent getting angles[i] was added to that subtraction line.
I am a beginner in C++ and have coded a for loop that shows a hollow circle when I run the code. However, I was wondering how I could achieve a filled-in circle using the distance formula (d = sqrt((ax-bx)^2 + (ay-by)^2)). Here's what I have so far! Any help would be appreciated!
int MAX = 728;
for (float t = 0; t < 2 * 3.14; t += 0.01)
    SetPixel(MAX / 4 + MAX / 6 * sin(t), MAX / 4 + MAX / 6 * cos(t), 255, 255, 0);
#include <windows.h>
#include <iostream>

using namespace std;

int main()
{
    HWND consoleWindow = GetConsoleWindow(); // Get a console handle
    HDC consoleDC = GetDC(consoleWindow);    // Get a handle to device context
    int max = 628;
    float i = 0;
    float t;
    float doublePi = 6.29;
    for (i = 0.0; i < max; i += 2.0) {
        for (t = 0.0; t < doublePi; t += 0.01) {
            SetPixel(consoleDC, max / 4 + (max - i) / 6 * sin(t), max / 4 + (max - i) / 6 * cos(t), RGB(255, 255, 0));
        }
    }
    ReleaseDC(consoleWindow, consoleDC);
    cin.ignore();
    return 0;
}
It's working almost well. It draws and fills in! A little slow, though...
Pffff... do not use sin and cos! Instead, use the sqrt(1-x^2) approach. You can see the formula for rendering a circle on Google, for example: https://www.google.com/search?q=sqrt(1-x^2)
I am editing this answer because it seems it was not clear:
float radius = 50.0f;
for (int x = -radius; x <= radius; ++x) {
    int d = round(sqrt(1.0f - (x * x / radius / radius)) * radius);
    for (int y = -d; y <= d; ++y) {
        SetPixel(x, y, 255, 255, 0);
    }
}
Note: each graphics library is different, so I assumed that you are using the "SetPixel" function correctly.
Now, for most people the sqrt(1-x^2) approach should be enough, but it seems that some downvoters do not think the same XD.
Inefficient as can be, and probably the last way you really want to draw a circle ... but ...
Over the entire square encompassing your circle, calculate each pixel's distance from the center and set if under or equal the radius.
// Draw a circle centered at (XCenter, YCenter) with given radius using distance formula
void drawCircle(HDC dc, int XCenter, int YCenter, int radius, COLORREF c) {
    double fRad = radius * 1.0; // Just a shortcut to avoid thrashing data types
    for (int x = XCenter - radius; x < XCenter + radius; x++) {
        for (int y = YCenter - radius; y < YCenter + radius; y++) {
            double d = sqrt(((x - XCenter) * (x - XCenter)) + ((y - YCenter) * (y - YCenter)));
            if (d <= fRad) SetPixel(dc, x, y, c);
        }
    }
}
Caveat: no more caveats; I used a C++ environment and tested it this time. :-)
Call thusly:
int main()
{
    HWND consoleWindow = GetConsoleWindow();
    HDC consoleDC = GetDC(consoleWindow);
    drawCircle(consoleDC, 50, 50, 20, RGB(255, 0, 255));
    ReleaseDC(consoleWindow, consoleDC);
    return 0;
}
Issue
I'm trying to implement the Perlin Noise algorithm in 2D with a single octave of size 16x16. I'm using this as heightmap data for a terrain; however, it only seems to work correctly along one axis. Whenever the sample point moves to a new Y section in the Perlin Noise grid, the gradient is very different from what I expect (for example, it often flips from 0.98 to -0.97, which is a very sudden change).
This image shows the staggered terrain in the z direction (which is the y axis in the 2D Perlin Noise grid)
Code
I've put the code that calculates which sample point to use at the end since it's quite long and I believe it's not where the issue is, but essentially I scale down the terrain to match the Perlin Noise grid (16x16) and then sample through all the points.
Gradient At Point
So the code that calculates the gradient at a sample point is the following:
// Find the gradient at a certain sample point
float PerlinNoise::gradientAt(Vector2 point)
{
    // Decimal part of float
    float relativeX = point.x - (int)point.x;
    float relativeY = point.y - (int)point.y;
    Vector2 relativePoint = Vector2(relativeX, relativeY);

    vector<float> weights(4);
    // Find the weights of the 4 surrounding points
    weights = surroundingWeights(point);

    float fadeX = fadeFunction(relativePoint.x);
    float fadeY = fadeFunction(relativePoint.y);

    float lerpA = MathUtils::lerp(weights[0], weights[1], fadeX);
    float lerpB = MathUtils::lerp(weights[2], weights[3], fadeX);
    float lerpC = MathUtils::lerp(lerpA, lerpB, fadeY);

    return lerpC;
}
Surrounding Weights of Point
I believe the issue is somewhere here, in the function that calculates the weights for the 4 surrounding points of a sample point, but I can't seem to figure out what is wrong, since all the values seem sensible when stepping through the function.
// Find the surrounding weight of a point
vector<float> PerlinNoise::surroundingWeights(Vector2 point){
// Produces correct values
vector<Vector2> surroundingPoints = surroundingPointsOf(point);
vector<float> weights;
for (unsigned i = 0; i < surroundingPoints.size(); ++i) {
// The corner to the sample point
Vector2 cornerToPoint = surroundingPoints[i].toVector(point);
// Getting the seeded vector from the grid
float x = surroundingPoints[i].x;
float y = surroundingPoints[i].y;
Vector2 seededVector = baseGrid[x][y];
// Dot product between the seededVector and corner to the sample point vector
float dotProduct = cornerToPoint.dot(seededVector);
weights.push_back(dotProduct);
}
return weights;
}
OpenGL Setup and Sample Point
Setting up the heightmap and getting the sample point. The variables 'wrongA' and 'wrongB' are an example of where the gradient flips and changes suddenly.
void HeightMap::GenerateRandomTerrain() {
    int perlinGridSize = 16;
    PerlinNoise perlin_noise = PerlinNoise(perlinGridSize, perlinGridSize);

    numVertices = RAW_WIDTH * RAW_HEIGHT;
    numIndices = (RAW_WIDTH - 1) * (RAW_HEIGHT - 1) * 6;
    vertices = new Vector3[numVertices];
    textureCoords = new Vector2[numVertices];
    indices = new GLuint[numIndices];

    float perlinScale = RAW_HEIGHT / (float)(perlinGridSize - 1);
    float height = 50;

    float wrongA = perlin_noise.gradientAt(Vector2(0, 68.0f / perlinScale));
    float wrongB = perlin_noise.gradientAt(Vector2(0, 69.0f / perlinScale));

    for (int x = 0; x < RAW_WIDTH; ++x) {
        for (int z = 0; z < RAW_HEIGHT; ++z) {
            int offset = (x * RAW_WIDTH) + z;
            float xVal = (float)x / perlinScale;
            float yVal = (float)z / perlinScale;
            float noise = perlin_noise.gradientAt(Vector2(xVal, yVal));
            vertices[offset] = Vector3(x * HEIGHTMAP_X, noise * height, z * HEIGHTMAP_Z);
            textureCoords[offset] = Vector2(x * HEIGHTMAP_TEX_X, z * HEIGHTMAP_TEX_Z);
        }
    }

    numIndices = 0;
    for (int x = 0; x < RAW_WIDTH - 1; ++x) {
        for (int z = 0; z < RAW_HEIGHT - 1; ++z) {
            int a = (x * (RAW_WIDTH)) + z;
            int b = ((x + 1) * (RAW_WIDTH)) + z;
            int c = ((x + 1) * (RAW_WIDTH)) + (z + 1);
            int d = (x * (RAW_WIDTH)) + (z + 1);

            indices[numIndices++] = c;
            indices[numIndices++] = b;
            indices[numIndices++] = a;
            indices[numIndices++] = a;
            indices[numIndices++] = d;
            indices[numIndices++] = c;
        }
    }
    BufferData();
}
It turned out the issue was in the interpolation stage:
float lerpA = MathUtils::lerp(weights[0], weights[1], fadeX);
float lerpB = MathUtils::lerp(weights[2], weights[3], fadeX);
float lerpC = MathUtils::lerp(lerpA, lerpB, fadeY);
I had the interpolation in the y axis the wrong way around, so it should have been:
lerp(lerpB, lerpA, fadeY)
Instead of:
lerp(lerpA, lerpB, fadeY)
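Putting that fix back into gradientAt, the end of the function reads:

float lerpA = MathUtils::lerp(weights[0], weights[1], fadeX);
float lerpB = MathUtils::lerp(weights[2], weights[3], fadeX);
float lerpC = MathUtils::lerp(lerpB, lerpA, fadeY); // y interpolation order swapped
return lerpC;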
I copied this ellipse code directly from the OpenGL textbook:
void ellipseMidpoint (int xCenter, int yCenter, int Rx, int Ry)
{
    int Rx2 = Rx * Rx;
    int Ry2 = Ry * Ry;
    int twoRx2 = 2 * Rx2;
    int twoRy2 = 2 * Ry2;
    int p;
    int x = 0;
    int y = Ry;
    int px = 0;
    int py = twoRx2 * y;

    //initial points in both quadrants
    ellipsePlotPoints (xCenter, yCenter, x, y);

    //Region 1
    p = round (Ry2 - (Rx2 * Ry) + (0.25 * Rx2));
    while (px < py) {
        x++;
        px += twoRy2;
        if (p < 0)
            p += Ry2 + px;
        else {
            y--;
            py -= twoRx2;
            p += Ry2 + px - py;
        }
        ellipsePlotPoints (xCenter, yCenter, x, y);
    }

    //Region 2
    p = round (Ry2 * (x+0.5) * (x+0.5) + Rx2 * (y-1) * (y-1) - Rx2 * Ry2);
    while (y > 0) {
        y--;
        py -= twoRx2;
        if (p > 0)
            p += Rx2 - py;
        else {
            x++;
            px += twoRy2;
            p += Rx2 - py + px;
        }
        ellipsePlotPoints (xCenter, yCenter, x, y);
    }
}

void ellipsePlotPoints (int xCenter, int yCenter, int x, int y)
{
    setPixel (xCenter + x, yCenter + y);
    setPixel (xCenter - x, yCenter + y);
    setPixel (xCenter + x, yCenter - y);
    setPixel (xCenter - x, yCenter - y);
}

void setPixel (GLint xPos, GLint yPos)
{
    glBegin (GL_POINTS);
    glVertex2i(xPos, yPos);
    glEnd();
}
The smaller ellipses seem to be fine but the larger ones are pointy and sort of flat at the ends.
Any ideas why?
Here is a current screenshot:
I think you're encountering overflow. I played with your code. While I never saw exactly the same "lemon" type shapes from your pictures, things definitely fell apart at large sizes, and it was caused by overflowing the range of the int variables used in the code.
For example, look at one of the first assignments:
int py = twoRx2 * y;
If you substitute, this becomes:
int py = 2 * Rx * Rx * Ry;
If you use a value of 1000 each for Rx and Ry, this is 2,000,000,000, which is very close to 2^31 - 1, the top of the range of a 32-bit int.
If you want to use this algorithm for larger sizes, you could use 64-bit integer variables. Depending on your system, the type would be long or long long. Or more robustly, int64_t after including <stdint.h>.
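As a sketch (only the declarations are shown; the rest of the algorithm is unchanged), the widened types might look like this:

#include <stdint.h>

void ellipseMidpoint (int xCenter, int yCenter, int Rx, int Ry)
{
    // 64-bit integers keep terms like 2 * Rx * Rx * Ry from overflowing at large radii
    int64_t Rx2 = (int64_t)Rx * Rx;
    int64_t Ry2 = (int64_t)Ry * Ry;
    int64_t twoRx2 = 2 * Rx2;
    int64_t twoRy2 = 2 * Ry2;
    int64_t p;
    int x = 0;
    int y = Ry;
    int64_t px = 0;
    int64_t py = twoRx2 * y;
    // ... the loops for Region 1 and Region 2 stay the same
}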
Now, if all you want to do is draw an ellipse with OpenGL, there are much better ways. The Bresenham-type algorithms used in your code are ideal if you need to draw a curve pixel by pixel. But OpenGL is a higher-level API, which knows how to render more complex primitives than just pixels. For a curve, you will most typically use a connected set of line segments to approximate the curve. OpenGL will then take care of turning those line segments into pixels.
The simplest way to draw an ellipse is to directly apply the parametric representation. With phi an angle between 0 and 2 * PI, and using the naming from your code, the points on the ellipse are:
x = xCenter + Rx * cos(phi)
y = yCenter + Ry * sin(phi)
You can use an increment for phi that meets your precision requirements, and the code to generate an ellipse approximated by DIV_COUNT points will look something like this:
float angInc = 2.0f * M_PI / (float)DIV_COUNT;
float ang = 0.0f;

glBegin(GL_LINE_LOOP);
for (int iDiv = 0; iDiv < DIV_COUNT; ++iDiv) {
    ang += angInc;
    float x = xCenter + Rx * cos(ang);
    float y = yCenter + Ry * sin(ang);
    glVertex2f(x, y);
}
glEnd();
If you care about efficiency, you can avoid calculating the trigonometric functions for each point, and apply an incremental rotation to calculate each point from the previous one:
float angInc = 2.0f * M_PI / (float)DIV_COUNT;
float cosInc = cos(angInc);
float sinInc = sin(angInc);
float cosAng = 1.0f;
float sinAng = 0.0f;

glBegin(GL_LINE_LOOP);
for (int iDiv = 0; iDiv < DIV_COUNT; ++iDiv) {
    // Rotate (cosAng, sinAng) by angInc using the angle-addition formulas.
    float newCosAng = cosInc * cosAng - sinInc * sinAng;
    sinAng = sinInc * cosAng + cosInc * sinAng;
    cosAng = newCosAng;
    float x = xCenter + Rx * cosAng;
    float y = yCenter + Ry * sinAng;
    glVertex2f(x, y);
}
glEnd();
This code is of course just for illustrating the math, and to get you started. In reality, you should use current OpenGL rendering methods, which include vertex buffers, etc.
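For completeness, here is a sketch of what the vertex-buffer route could look like. It assumes a GL 3+ context with a suitable shader program already bound (a vec2 attribute at location 0), and uploadEllipse is a made-up name for the example:

#include <cmath>
#include <vector>

// Build the ellipse outline on the CPU and upload it to a vertex buffer once.
GLuint uploadEllipse(float xCenter, float yCenter, float Rx, float Ry, int divCount)
{
    std::vector<float> verts;
    verts.reserve(divCount * 2);
    float angInc = 2.0f * (float)M_PI / (float)divCount;
    for (int i = 0; i < divCount; ++i) {
        float ang = angInc * i;
        verts.push_back(xCenter + Rx * std::cos(ang));
        verts.push_back(yCenter + Ry * std::sin(ang));
    }

    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(float), verts.data(), GL_STATIC_DRAW);
    return vbo;
}

// Later, each frame:
// glBindBuffer(GL_ARRAY_BUFFER, vbo);
// glEnableVertexAttribArray(0);
// glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (void*)0);
// glDrawArrays(GL_LINE_LOOP, 0, divCount);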
I am trying to implement a bilinear interpolation function, but for some reason I am getting bad output. I can't seem to figure out what's wrong; any help getting on the right track would be appreciated.
double lerp(double c1, double c2, double v1, double v2, double x)
{
    if (v1 == v2) return c1;
    double inc = ((c2 - c1) / (v2 - v1)) * (x - v1);
    double val = c1 + inc;
    return val;
}
void bilinearInterpolate(int width, int height)
{
    // if the current size is the same, do nothing
    if (width == GetWidth() && height == GetHeight())
        return;

    //Create a new image
    std::unique_ptr<Image2D> image(new Image2D(width, height));

    // x and y ratios
    double rx = (double)(GetWidth()) / (double)(image->GetWidth());   // oldWidth / newWidth
    double ry = (double)(GetHeight()) / (double)(image->GetHeight()); // oldHeight / newHeight

    // loop through destination image
    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            double sx = x * rx;
            double sy = y * ry;

            uint xl = std::floor(sx);
            uint xr = std::floor(sx + 1);
            uint yt = std::floor(sy);
            uint yb = std::floor(sy + 1);

            for (uint d = 0; d < image->GetDepth(); ++d)
            {
                uchar tl = GetData(xl, yt, d);
                uchar tr = GetData(xr, yt, d);
                uchar bl = GetData(xl, yb, d);
                uchar br = GetData(xr, yb, d);

                double t = lerp(tl, tr, xl, xr, sx);
                double b = lerp(bl, br, xl, xr, sx);
                double m = lerp(t, b, yt, yb, sy);

                uchar val = std::floor(m + 0.5);
                image->SetData(x, y, d, val);
            }
        }
    }

    //Cleanup
    mWidth = width;
    mHeight = height;
    std::swap(image->mData, mData);
}
Input Image (4 pixels wide and high)
My Output
Expected Output (Photoshop's Bilinear Interpolation)
Photoshop's algorithm assumes that each source pixel's color is in the center of the pixel, while your algorithm assumes that the color is at its top-left corner. This causes your results to be shifted half a pixel up and left compared to Photoshop.
Another way to look at it is that your algorithm maps the x coordinate range (0, srcWidth) to (0, dstWidth), while Photoshop maps (-0.5, srcWidth-0.5) to (-0.5, dstWidth-0.5), and the same in y coordinate.
Instead of:
double sx = x * rx;
double sy = y * ry;
You can use:
double sx = (x + 0.5) * rx - 0.5;
double sy = (y + 0.5) * ry - 0.5;
to get similar results. Note that this can give you a negative value for sx and sy.
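Since the adjusted sx and sy can fall outside the source image (negative near the top-left edge, or past the last row or column near the bottom-right), a common way to handle this is to clamp before sampling. A minimal sketch, reusing the names from the question (it additionally needs <algorithm> and <cmath>):

double sx = (x + 0.5) * rx - 0.5;
double sy = (y + 0.5) * ry - 0.5;

// Clamp so the four sampled neighbours stay inside the source image.
sx = std::max(0.0, std::min(sx, (double)GetWidth() - 1));
sy = std::max(0.0, std::min(sy, (double)GetHeight() - 1));

uint xl = (uint)std::floor(sx);
uint xr = std::min(xl + 1, (uint)(GetWidth() - 1));
uint yt = (uint)std::floor(sy);
uint yb = std::min(yt + 1, (uint)(GetHeight() - 1));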