I just wrote a small netpbm parser and I am having fun with it, drawing mostly parametric equations. They look OK for a first attempt, but how can I expand on this and get something that looks legit? This picture is how my method recreated the Arctic Monkeys logo, which is just
0.5[cos(19t) - cos(21t)]
(I was trying to plot each cosine separately first, before superposing them)
It obviously looks very "crispy" and sharp. I used as small of a step size as I could without it taking forever to finish. (0.0005, takes < 5 sec)
The only idea I had was that when drawing a white pixel, I should also draw its immediate neighbors with some slightly lighter gray. And then draw the neighbors of THOSE pixels with even lighter gray. Almost like the white color is "dissolving" or "dissipating".
I didn't try to implement this because it felt like a really bad way to do it, and I'm not even sure it would produce anything near the desired effect, so I thought I'd ask first.
EDIT: here's a sample code that draws just a small spiral
the draw loop:
for (double t = 0; t < 6 * M_PI; t += 0.0005)
{
    double r = t;
    double new_x = 10 * r * cos(0.1 * M_PI * t);
    double new_y = -10 * r * sin(0.1 * M_PI * t);
    img.SetPixel(new_x + img.img_width / 2, new_y + img.img_height / 2, 255);
}
// img is a netpbm image with magic number P5 (binary grayscale, i.e. a PGM despite the class name)
SetPixel:
void PPMimage::SetPixel(const uint16_t x, const uint16_t y, const uint16_t pixelVal)
{
    assert(pixelVal <= max_greys && "pixelVal larger than image's max_greys");
    assert(x < img_width && "X value larger than image width");
    assert(y < img_height && "Y value larger than image height");
    img_raster[y * img_width + x] = pixelVal;
}
This is what this code produces
A very basic form of antialiasing for a scatter plot (made of points rather than lines) can be achieved with coverage-based weighting, somewhat like stochastic rounding: consider the brush to be a pixel-sized square (but note the severe limitations of this model) centered at the non-integer coordinates of the plotted point, and compute its overlap with each of the four pixels that share the corner closest to that point. Treat each overlap fraction as a grayscale fraction: when a large number of points approximates a line, set each pixel to the largest value written to it; for a small number of discrete points, do alpha blending instead.
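For instance, here is a minimal sketch of that idea in C++ against your PPMimage class. SetPixelMax is a hypothetical helper (not in your code) that behaves like SetPixel but keeps the brighter of the old and new values and ignores out-of-range coordinates:

#include <cmath>

// Splat a point at non-integer (px, py) onto the four nearest pixels,
// weighting each by the overlap of a pixel-sized square brush with it.
void SplatPoint(PPMimage &img, double px, double py, double brightness)
{
    int x0 = (int)std::floor(px);
    int y0 = (int)std::floor(py);
    double fx = px - x0; // fractional position inside the pixel
    double fy = py - y0;
    // Bilinear coverage weights for the 2x2 neighborhood; they sum to 1.
    double w[2][2] = {
        { (1 - fx) * (1 - fy), fx * (1 - fy) },
        { (1 - fx) * fy,       fx * fy       }
    };
    for (int dy = 0; dy < 2; ++dy)
        for (int dx = 0; dx < 2; ++dx)
            img.SetPixelMax(x0 + dx, y0 + dy,
                            (uint16_t)(brightness * w[dy][dx] + 0.5));
}

In the draw loop you would then call SplatPoint(img, new_x + img.img_width / 2, new_y + img.img_height / 2, 255) instead of SetPixel, without rounding the coordinates first.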
I wanted to draw a circle using graphics.h in C++, but not directly with the circle() function. The circle I want to draw uses smaller circles as its points, i.e. the smaller circles constitute the circumference of the larger circle. So I thought, if I did something like this, it would work:
{
int radius = 4;
// Points at which smaller circles would be drawn
int x, y;
int maxx = getmaxx();
int maxy = getmaxy();
// Co-ordinates of center of the larger circle (centre of the screen)
int h = maxx/2;
int k = maxy/2;
//Cartesian circle formula >> (X-h)^2 + (Y-k)^2 = radius^2
//Effectively, this nested loop goes through every single coordinate on the screen
int gdriver = DETECT;
int gmode;
initgraph(&gdriver, &gmode, "");
for(x = 0; x<maxx; x++)
{
for(y = 0; y<maxy; y++)
{
if((((x-h)*(x-h)) + ((y-k)*(y-k))) == (radius*radius))
{
circle(x, y, 5); //Draw smaller circle with radius 5
} //at points which satisfy circle equation only!
}
}
getch();
}
This is with graphics.h on Turbo C++, as that is the compiler we're learning with at school.
I know it's ancient.
So, theoretically, since the nested for loops check all the points on the screen and draw a small circle only at every point that satisfies the circle equation, I thought I would get a large circle of the entered radius, whose circumference consists of the smaller circles I make in the for loop.
However, when I try the program, I get four hyperbolas (all pointing towards the center of the screen), and when I increase the radius, the pointiness (for lack of a better word) of the hyperbolas increases, until finally, when the radius is 256 or more, the two hyperbolas on the top and bottom intersect to make a large cross on my screen, as if to say: "That's it, user, I give up!"
I came to the value 256 as I noticed that if the radius was a multiple of 4, the figures looked ... better?
I looked around for a solution for quite some time, but couldn't get any answers, so here I am.
Any suggestions???
EDIT >> Here's a rough diagram of the output I got...
There are two issues in your code:
First: You should really call initgraph before you call getmaxx and getmaxy, otherwise they will not necessarily return the correct dimensions of the graphics mode. This may or may not be a contributing factor depending on your setup.
Second, and most importantly: in Turbo C++, int is 16-bit. For example, here is the circle with radius 100 (after the previous initgraph ordering issue was fixed):
Note the stray circles in the four corners. If we do a little debugging and add some print-outs (a useful strategy that you should file away for future reference):
if((((x-h)*(x-h)) + ((y-k)*(y-k))) == (radius*radius))
{
printf(": (%d-%d)^2 + (%d-%d)^2 = %d^2\n", x, h, y, k, radius);
circle(x, y, 5); //Draw smaller circle with radius
} //at points which satisfy circle equation only!
You can see what's happening (first line is maxx and maxy, not shown in above snippet):
In particular that circle at (63, 139) is one of the corners. If you do the math, you see that:
(63 - 319)² + (139 - 239)² = 75536
And since your ints are 16-bit, 75536 modulo 65536 = 10000 = the value that actually gets computed = 100² = a circle where it shouldn't be.
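You can reproduce the wrap-around on a modern compiler by forcing the arithmetic back into 16 bits (a sketch; Turbo C++ does this implicitly, since its int is 16 bits wide):

#include <cstdio>
#include <cstdint>

int main()
{
    int16_t x = 63, h = 319, y = 139, k = 239;
    // Truncate each squared term back to 16 bits: 65536 wraps to 0,
    // so the left-hand side of the circle test becomes 10000 = 100^2.
    int16_t lhs = (int16_t)((x - h) * (x - h)) + (int16_t)((y - k) * (y - k));
    printf("lhs = %d\n", lhs); // prints 10000
    return 0;
}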
An easy solution to this is to just change the relevant variables to long:
maxx, maxy
x, y
h, k
So:
long x, y;
...
initgraph(...);
...
long maxx = getmaxx();
long maxy = getmaxy();
...
long h = maxx / 2;
long k = maxy / 2;
And then you'll end up with correct output:
Note of course that like other answers point out, since you are using ints, you'll miss a lot of points. This may or may not be OK, but some values will produce noticeably poorer results (e.g. radius 256 only seems to have 4 integer solutions). You could introduce a tolerance if you want. You could also use a more direct approach but that might defeat the purpose of your exercise with the Cartesian circle formula. If you're into this sort of thing, here is a 24-page document containing a bunch of discussion, proofs, and properties about integers that are the sum of two squares.
I don't know enough about Turbo C++ to say whether you can make it use 32-bit ints; I'll leave that as an exercise for you.
First of all, maxx and maxy are integers, which you initialize from functions that return the borders of the screen, but later you use them as if they were functions themselves. Just remove the parentheses:
// Co-ordinates of center of the larger circle (centre of the screen)
int h = maxx/2;
int k = maxy/2;
Then, you are checking for exact equality to check whether a point is on a circle. Since the screen is a grid of pixels, many of your points will be missed. You need to add a tolerance, a maximum distance between the point you check and the actual circle. So change this line:
if(((x-h)*(x-h)) + ((y-k)*(y-k)) == radius*radius)
to this:
if(abs(((x-h)*(x-h)) + ((y-k)*(y-k)) - radius*radius) < 2)
Introduction of some level of tolerance will solve the problem.
But it is not wise to check every point in the graphical window. Would you consider a different approach? You can draw the needed small circles without any checks at all:
To cover the whole circumference of the big circle (radius RBig), you need NCircles small circles of radius RSmall:
NCircles = round to integer (Pi / ArcSin(RSmall / RBig));
The center of the i-th small circle is at position
cx = mx + Round(RBig * Cos(i * 2 * Pi / NCircles));
cy = my + Round(RBig * Sin(i * 2 * Pi / NCircles));
where mx, my is the center of the big circle.
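A sketch of this in graphics.h terms, assuming initgraph has already been called (M_PI and the rounding are my additions):

#include <graphics.h>
#include <math.h>

// Draw NCircles small circles along the circumference of the big one.
void drawCircleOfCircles(int mx, int my, double rBig, double rSmall)
{
    int nCircles = (int)(M_PI / asin(rSmall / rBig) + 0.5);
    for (int i = 0; i < nCircles; i++)
    {
        double angle = i * 2 * M_PI / nCircles;
        int cx = mx + (int)(rBig * cos(angle) + 0.5);
        int cy = my + (int)(rBig * sin(angle) + 0.5);
        circle(cx, cy, (int)rSmall);
    }
}

For the question's setup you would call drawCircleOfCircles(getmaxx() / 2, getmaxy() / 2, radius, 5).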
For my work I have to convert a point cloud to a grey scale (depth) image, meaning that the z coordinate of each XYZ point in the cloud represents a shade of grey. For mapping a Z coordinate from the [z_min, z_max] interval to the [0..255] interval I used the map function of Arduino:
float map(float x, float in_min, float in_max, float out_min, float out_max)
{
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}
With that done I need to write the result to an image, the problem being that the clouds that I have can have millions of points so I can't just write them 1 by 1 to an image in order. Let's say that I have 3000x1000 ordered XY points. How would I do if I wanted to write them to a 700x300 pixels image? I hope the question is clear, thanks in advance for answering.
I have managed to find a solution to my problem. It is a fairly long algorithm for Stack Overflow, but bear with me. The idea is to write a vector of XY grey scale points as a pgm file.
Step 1: cloud_to_greyscale - function that converts an XYZ Point Cloud into a vector of XY grey scale points and that receives a cloud as a parameter:
for each point pt in cloud
point_xy_greyscale.x <- pt.x
point_xy_greyscale.y <- pt.y
point_xy_greyscale.greyscale <- map(pt.z, z_min, z_max, 0, 255)
greyscale_vector.add(point_xy_greyscale)
loop
return greyscale_vector
Step 2: greyscale_to_image - function that writes the previously returned vector as a greyscale_image, a class with a width, a height and a _pixels member, a two-dimensional array of (usually) unsigned short. The function receives the following parameters: a greyscale_vector (to be turned into the image) and an x_epsilon that helps us delimit the x pixel coordinates for our points, given that the x point coordinates are floats (and thus not usable as array indices).
A little background info: I work on something called widop clouds, so in my 3D space x is the width, y is the depth and z is the height. Also worth noting is that y is an integer, so for my problem the height of the image is easy to find: it's y_max - y_min + 1. To find the width of the image, follow the algorithm below; if anything is unclear I will answer questions, and I'm open to suggestions.
img_width <- 0; // image width
img_height <- y_max - y_min + 1 // image height
// determining image width
for each point greyscale_xy_point in greyscale_vector
point_x_cell <- (greyscale_xy_point.x - x_min) * x_epsilon * 10
if point_x_cell > img_width
img_width <- point_x_cell + 1
loop
// defining and initializing image with the calculated height and width
greyscale_img(img_width, img_height)
// initializing greyscale image points
for y <- 0 to greyscale_img.height
for x <- 0 to greyscale_img.width
greyscale_img[y][x] = 0
loop
loop
// filling image with vector data
for each point point_xy_greyscale in greyscale_vector
image_x = (point_xy_greyscale.x - x_min) * x_epsilon * 10
image_y = point_xy_greyscale.y - y_min
greyscale_image[image_y][image_x] = point_xy_greyscale.greyscale
loop
return greyscale_image
The only thing left to do is write the image to a file, but that is easy; you can find the format rules at the previously mentioned link about the pgm format. I hope this helps someone.
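For reference, a minimal P5 writer might look like this (a sketch; it assumes 8-bit grey values in a row-major buffer):

#include <cstdio>

// Write an 8-bit binary PGM (magic number P5). pixels holds
// width * height bytes, one grey value per pixel, row by row.
bool writePGM(const char *path, const unsigned char *pixels,
              int width, int height)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return false;
    fprintf(f, "P5\n%d %d\n255\n", width, height);
    size_t written = fwrite(pixels, 1, (size_t)width * height, f);
    fclose(f);
    return written == (size_t)width * height;
}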
EDIT_1: I added a picture of the result. It is supposed to be a railway; the reason it's fairly dark is that some objects are tall, so ground-level objects come out darker.
depth image of railway
On SO, I found the following simple algorithm for drawing filled circles:
for(int y=-radius; y<=radius; y++)
for(int x=-radius; x<=radius; x++)
if(x*x+y*y <= radius*radius)
setpixel(origin.x+x, origin.y+y);
Is there an equally simple algorithm for drawing filled ellipses?
Simpler, with no double and no division (but be careful of integer overflow):
for(int y=-height; y<=height; y++) {
for(int x=-width; x<=width; x++) {
if(x*x*height*height+y*y*width*width <= height*height*width*width)
setpixel(origin.x+x, origin.y+y);
}
}
We can take advantage of two facts to optimize this significantly:
Ellipses have vertical and horizontal symmetry;
As you progress away from an axis, the contour of the ellipse slopes more and more.
The first fact saves three-quarters of the work (almost); the second fact tremendously reduces the number of tests (we test only along the edge of the ellipse, and even there we don't have to test every point).
int hh = height * height;
int ww = width * width;
int hhww = hh * ww;
int x0 = width;
int dx = 0;
// do the horizontal diameter
for (int x = -width; x <= width; x++)
setpixel(origin.x + x, origin.y);
// now do both halves at the same time, away from the diameter
for (int y = 1; y <= height; y++)
{
int x1 = x0 - (dx - 1); // try slopes of dx - 1 or more
for ( ; x1 > 0; x1--)
if (x1*x1*hh + y*y*ww <= hhww)
break;
dx = x0 - x1; // current approximation of the slope
x0 = x1;
for (int x = -x0; x <= x0; x++)
{
setpixel(origin.x + x, origin.y - y);
setpixel(origin.x + x, origin.y + y);
}
}
This works because each scan line is shorter than the previous one by at least as much as that one was shorter than the one before it. Because of rounding to integer pixel coordinates, that's not perfectly accurate: the line can be shorter by one pixel less than that.
In other words, starting from the longest scan line (the horizontal diameter), the amount by which each line is shorter than the previous one, denoted dx in the code, decreases by at most one, stays the same, or increases. The first inner for finds the exact amount by which the current scan line is shorter than the previous one, starting at dx - 1 and up, until we land just inside the ellipse.
| x1 dx x0
|###### |<-->|
current scan line --> |########### |<>|previous dx
previous scan line --> |################ |
two scan lines ago --> |###################
|#####################
|######################
|######################
+------------------------
To compare the number of inside-ellipse tests, each asterisk is one pair of coordinates tested in the naive version:
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
*********************************************
... and in the improved version:
*
**
****
***
***
***
**
**
An ellipse (about the origin) is a circle that has been linearly stretched along the x or y axes. So you can modify your loop like this:
for(int y=-height; y<=height; y++) {
for(int x=-width; x<=width; x++) {
double dx = (double)x / (double)width;
double dy = (double)y / (double)height;
if(dx*dx+dy*dy <= 1)
setpixel(origin.x+x, origin.y+y);
}
}
You can see that if width == height == radius, then this is equivalent to your code for drawing a circle.
Replace
x*x+y*y <= radius*radius
with
Axx*x*x + 2*Axy*x*y + Ayy*y*y < radius*radius
where you have three constants, Axx, Axy, Ayy. When Axy=0, the ellipse will have its axes straight horizontal and vertical. Axx=Ayy=1 makes a circle. The bigger Axx, the smaller the width. Similar for Ayy and height. For an arbitrary ellipse tilted at any given angle, it takes a bit of algebra to figure out the constants.
Mathematically Axx, Axy, Ayy are a "tensor" but perhaps you don't want to get into that stuff.
UPDATE - detailed math. I don't think S.O. can make nice math like Math S.E. so this will look crude.
You want to draw (or do whatever) with an ellipse in x,y coordinates. The ellipse is tilted. We create an alternative coordinate system x',y' aligned with the ellipse. Clearly, points on the ellipse satisfy
(x'/a)^2 + (y'/b)^2 = 1
By contemplating some well-chosen random points we see that
x' = C*x + S*y
y' = -S*x + C*y
where S, C are sin(θ) and cos(θ), θ being the angle of the x' axis w.r.t. the x axis. We can shorten this with notation x = (x,y) and similar for primed, and R a 2x2 matrix involving C and S:
x' = R x
The ellipse equation can be written
T(x') A'' x' = 1
where T() indicates transpose and, dropping '^' to avoid poking everyone in the eyes, "a2" really means a^2:
A'' =
1/a2 0
0 1/b2
Using x' = Rx we find
T(Rx) A'' Rx = 1
T(x) T(R) A'' R x =1
T(x) A x = 1
where A, the thing you need to know to make your tilted drawing scan line algorithm work, is
A = T(R) A'' R =
C2/a2 + S2/b2      SC(1/a2 - 1/b2)
SC(1/a2 - 1/b2)    S2/a2 + C2/b2
Multiply these by x and y according to T(x)Ax and you've got it.
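As a sketch, the per-pixel test for a tilted ellipse could look like this (setpixel as in the question; here the equation is normalized so the right-hand side is 1 instead of radius*radius):

#include <cmath>

extern void setpixel(int x, int y); // as in the question

// Fill an ellipse with semi-axes a and b, rotated by theta, centered
// at (ox, oy), by testing T(x) A x <= 1 over a bounding box.
void fillTiltedEllipse(int ox, int oy, double a, double b, double theta)
{
    double C = std::cos(theta), S = std::sin(theta);
    double Axx = C * C / (a * a) + S * S / (b * b);
    double Ayy = S * S / (a * a) + C * C / (b * b);
    double Axy = S * C * (1 / (a * a) - 1 / (b * b));
    int extent = (int)std::ceil(a > b ? a : b); // conservative half-size
    for (int y = -extent; y <= extent; y++)
        for (int x = -extent; x <= extent; x++)
            if (Axx * x * x + 2 * Axy * x * y + Ayy * y * y <= 1.0)
                setpixel(ox + x, oy + y);
}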
A fast Bresenham type algorithm, as proposed by this paper, works really well. Here's an OpenGL implementation that I wrote for the same.
The basic premise is that you plot the curve on one quadrant, which we can mirror on to the other three quadrants. These vertices are computed using an error function, similar to what you use in the midpoint circle algorithm for circles. The paper I have linked above has a pretty nifty proof for the equation, and the algorithm distills down to just checking if a given vertex is within an ellipse or not, just by substituting its values in the error function. The algorithm also tracks the tangent line slope of the curve we are drawing in the first quadrant, and increments x or y depending on the slope value - which contributes further to the performance of the algorithm. Here's an image that shows what's going on:
As for filling the ellipse, once we know the vertices in each quadrant (which is essentially mirror reflections across x and y axes), we get 4 vertices for every vertex that we compute - which is sufficient to draw a quad (in OpenGL anyway). Once we draw quads for all such vertices, we get a filled ellipse. The implementation I have given employs VBO for performance reasons, but you don't strictly need it.
The implementation also shows you how to achieve a filled ellipse using triangles and lines instead of drawing quads - the quads are clearly better though, as it is a primitive and we only draw one quad for 4 vertices, as opposed to one triangle per vertex in the triangle implementation.
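If you want the flavor of the approach without OpenGL, here is a sketch of the textbook midpoint-style traversal (not the exact algorithm of the linked paper): it walks the first-quadrant contour with an error term and fills mirrored horizontal spans. drawSpan is a hypothetical span-fill routine:

extern void drawSpan(int x0, int x1, int y); // hypothetical: fill x0..x1 on row y

// Midpoint-style filled ellipse with semi-axes rx, ry centered at (cx, cy).
void fillEllipseMidpoint(int cx, int cy, int rx, int ry)
{
    long rx2 = (long)rx * rx, ry2 = (long)ry * ry;
    long x = 0, y = ry;
    long dx = 0, dy = 2 * rx2 * y;

    // Region 1: contour slope > -1, so x advances every step.
    long d1 = ry2 - rx2 * ry + rx2 / 4;
    while (dx < dy)
    {
        drawSpan(cx - x, cx + x, cy - y);
        drawSpan(cx - x, cx + x, cy + y);
        if (d1 < 0) { x++; dx += 2 * ry2; d1 += dx + ry2; }
        else        { x++; y--; dx += 2 * ry2; dy -= 2 * rx2; d1 += dx - dy + ry2; }
    }

    // Region 2: contour slope < -1, so y decreases every step.
    long d2 = ry2 * (2 * x + 1) * (2 * x + 1) / 4 + rx2 * (y - 1) * (y - 1) - rx2 * ry2;
    while (y >= 0)
    {
        drawSpan(cx - x, cx + x, cy - y);
        drawSpan(cx - x, cx + x, cy + y);
        if (d2 > 0) { y--; dy -= 2 * rx2; d2 += rx2 - dy; }
        else        { y--; x++; dx += 2 * ry2; dy -= 2 * rx2; d2 += dx - dy + rx2; }
    }
}

Region 1 overdraws spans on rows it revisits, which is harmless for a fill; a tighter version would emit each row only once.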
I am looking for optimized functions in C++ for calculating areal averages of floats. The function is passed a source float array, a destination float array (same size as the source), the array width and height, and the "blurring" area width and height.
The function should "wrap-around" edges for the blurring/averages calculations.
Here is example code that blur with a rectangular shape:
/*****************************************
* Find averages extended variations
*****************************************/
void findaverages_ext(float *floatdata, float *dest_data, int fwidth, int fheight, int scale, int aw, int ah, int weight, int xoff, int yoff)
{
printf("findaverages_ext scale: %d, width: %d, height: %d, weight: %d \n", scale, aw, ah, weight);
float total = 0.0;
int spos = scale * fwidth * fheight;
int apos;
int w = aw;
int h = ah;
float* f_temp = new float[fwidth * fheight];
// Horizontal
for(int y=0;y<fheight ;y++)
{
Sleep(10); // Do not burn your processor
total = 0.0;
// Process entire window for first pixel (including wrap-around edge)
for (int kx = 0; kx <= w; ++kx)
if (kx >= 0 && kx < fwidth)
total += floatdata[y*fwidth + kx];
// Wrap
for (int kx = (fwidth-w); kx < fwidth; ++kx)
if (kx >= 0 && kx < fwidth)
total += floatdata[y*fwidth + kx];
// Store first window
f_temp[y*fwidth] = (total / (w*2+1));
for(int x=1;x<fwidth ;x++) // x width changes with y
{
// Subtract pixel leaving window
if (x-w-1 >= 0)
total -= floatdata[y*fwidth + x-w-1];
// Add pixel entering window
if (x+w < fwidth)
total += floatdata[y*fwidth + x+w];
else
total += floatdata[y*fwidth + x+w-fwidth];
// Store average
apos = y * fwidth + x;
f_temp[apos] = (total / (w*2+1));
}
}
// Vertical
for(int x=0;x<fwidth ;x++)
{
Sleep(10); // Do not burn your processor
total = 0.0;
// Process entire window for first pixel
for (int ky = 0; ky <= h; ++ky)
if (ky >= 0 && ky < fheight)
total += f_temp[ky*fwidth + x];
// Wrap
for (int ky = fheight-h; ky < fheight; ++ky)
if (ky >= 0 && ky < fheight)
total += f_temp[ky*fwidth + x];
// Store first if not out of bounds
dest_data[spos + x] = (total / (h*2+1));
for(int y=1;y< fheight ;y++) // y width changes with x
{
// Subtract pixel leaving window
if (y-h-1 >= 0)
total -= f_temp[(y-h-1)*fwidth + x];
// Add pixel entering window
if (y+h < fheight)
total += f_temp[(y+h)*fwidth + x];
else
total += f_temp[(y+h-fheight)*fwidth + x];
// Store average
apos = y * fwidth + x;
dest_data[spos+apos] = (total / (h*2+1));
}
}
delete[] f_temp;
}
What I need are similar functions that, for each pixel, find the average (blur) over shapes other than rectangular.
The specific shapes are: "S" (sharp edges), "O" (rectangular but hollow), "+" and "X", where the average float is stored at the center pixel in the destination data array. The size of the blur shape should be variable in width and height.
The functions do not need to be pixel-perfect, only optimized for performance. There can be separate functions for each shape.
I would also be happy if anyone could give me tips on optimizing the example function above for rectangular blurring.
What you are trying to implement are various sorts of digital filters for image processing. This is equivalent to convolving two signals, where the second one is the filter's impulse response. So far, you recognized that a "rectangular average" is separable. By separable I mean you can split the filter into two parts: one that operates along the X axis and one that operates along the Y axis -- in each case a 1D filter. This is nice and can save you lots of cycles. But not every filter is separable. Averaging along other shapes (S, O, +, X) is not separable; you need to actually compute a 2D convolution for these.
As for performance, you can speed up your 1D averages by properly implementing a "moving average". A proper "moving average" implementation requires only a fixed, small amount of work per pixel, regardless of the size of the averaging window. It works by recognizing that neighbouring pixels of the target image are averages of almost the same source pixels, so you can reuse the previous sum by adding one new pixel intensity and subtracting an old one (for the 1D case), as sketched below.
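A sketch of that running sum for one row (clamping at the borders rather than wrapping, to keep it short; w is the half-window, as in the question's code):

// One-dimensional moving average over a single row of length n.
// The window for output i is [i - w, i + w], clamped to the row.
void movingAverageRow(const float *src, float *dst, int n, int w)
{
    float total = 0.0f;
    int count = 0;
    // Prime the window for the first output sample.
    for (int k = 0; k <= w && k < n; ++k) { total += src[k]; ++count; }
    dst[0] = total / count;
    for (int i = 1; i < n; ++i)
    {
        if (i + w < n)      { total += src[i + w];     ++count; } // pixel entering
        if (i - w - 1 >= 0) { total -= src[i - w - 1]; --count; } // pixel leaving
        dst[i] = total / count;
    }
}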
In case of arbitrary non-separable filters, your best bet performance-wise is "fast convolution", which is FFT-based. Check out www.dspguide.com. If I recall correctly, it even has a chapter on how to properly do "fast convolution" using the FFT algorithm. Although they explain it for 1-dimensional signals, it also applies to 2-dimensional signals; for images you have to perform 2D FFT/iFFT transforms.
To add to sellibitze's answer, you can use a summed area table for your O, S and + kernels (not for the X one though). That way you can convolve a pixel in constant time, and it's probably the fastest method to do it for kernel shapes that allow it.
Basically, a SAT is a data structure that lets you calculate the sum of any axis-aligned rectangle. For the O kernel, after you've built a SAT, you'd take the sum of the outer rect's pixels and subtract the sum of the inner rect's pixels. The S and + kernels can be implemented similarly.
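A sketch of that (buildSAT runs once per image; afterwards any axis-aligned rectangle sum, and hence the O kernel as outer minus inner, costs a handful of lookups):

// Build a summed area table: sat[y * (w + 1) + x] is the sum of src over
// the rectangle [0, x) x [0, y). The extra row/column of zeros removes
// all edge cases from queries.
void buildSAT(const float *src, double *sat, int w, int h)
{
    for (int x = 0; x <= w; ++x) sat[x] = 0.0;
    for (int y = 1; y <= h; ++y)
    {
        sat[y * (w + 1)] = 0.0;
        for (int x = 1; x <= w; ++x)
            sat[y * (w + 1) + x] = src[(y - 1) * w + (x - 1)]
                                 + sat[(y - 1) * (w + 1) + x]
                                 + sat[y * (w + 1) + x - 1]
                                 - sat[(y - 1) * (w + 1) + x - 1];
    }
}

// Sum of src over [x0, x1) x [y0, y1), in constant time.
double rectSum(const double *sat, int w, int x0, int y0, int x1, int y1)
{
    return sat[y1 * (w + 1) + x1] - sat[y0 * (w + 1) + x1]
         - sat[y1 * (w + 1) + x0] + sat[y0 * (w + 1) + x0];
}

The O kernel is then rectSum over the outer rectangle minus rectSum over the inner one, divided by the number of pixels in the ring.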
For the X kernel you can use a different approach. A skewed box filter is separable:
You can convolve with two long, thin skewed box filters, then add the two resulting images together. The center of the X will be counted twice, so you will need to convolve with another skewed box filter and subtract that.
Apart from that, you can optimize your box blur in many ways.
Remove the two ifs from the inner loop by splitting that loop into three loops: two short loops that do checks, and one long loop that doesn't. Or you could pad your array with extra elements in all directions; that way you can simplify your code.
Calculate values like h * 2 + 1 outside the loops.
An expression like f_temp[ky*fwidth + x] does two adds and one multiplication. You can initialize a pointer to &f_temp[ky*fwidth] outside the loop, and just increment that pointer in the loop.
Don't do the division by h * 2 + 1 in the horizontal step. Instead, divide by the square of that in the vertical step.
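For example, the pointer tip applied to the vertical priming loop of the question's code might look like this (a sketch using the same variable names):

// Sum column x over rows 0..h, stepping a pointer by the row stride
// instead of recomputing ky * fwidth + x every iteration.
float columnSum(const float *f_temp, int fwidth, int fheight, int x, int h)
{
    float total = 0.0f;
    const float *row = f_temp + x; // row 0, column x
    for (int ky = 0; ky <= h && ky < fheight; ++ky)
    {
        total += *row;
        row += fwidth; // advance one row
    }
    return total;
}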
I'm making a software rasterizer, and I've run into a bit of a snag: I can't seem to get perspective-correct texture mapping to work.
My algorithm is to first sort the coordinates to plot by y. This gives a highest, lowest and center point. I then walk across the scanlines using the deltas:
// ordering by y is put here
order[0] = &a_Triangle.p[v_order[0]];
order[1] = &a_Triangle.p[v_order[1]];
order[2] = &a_Triangle.p[v_order[2]];
float height1, height2, height3;
height1 = (float)((int)(order[2]->y + 1) - (int)(order[0]->y));
height2 = (float)((int)(order[1]->y + 1) - (int)(order[0]->y));
height3 = (float)((int)(order[2]->y + 1) - (int)(order[1]->y));
// x
float x_start, x_end;
float x[3];
float x_delta[3];
x_delta[0] = (order[2]->x - order[0]->x) / height1;
x_delta[1] = (order[1]->x - order[0]->x) / height2;
x_delta[2] = (order[2]->x - order[1]->x) / height3;
x[0] = order[0]->x;
x[1] = order[0]->x;
x[2] = order[1]->x;
And then we render from order[0]->y to order[2]->y, increasing x_start and x_end by a delta each step. When rendering the top part, the deltas are x_delta[0] and x_delta[1]; when rendering the bottom part, they are x_delta[0] and x_delta[2]. Then we linearly interpolate between x_start and x_end on our scanline. UV coordinates are interpolated in the same way, ordered by y, starting at begin and end, with deltas applied each step.
This works fine except when I try to do perspective correct UV mapping. The basic algorithm is to take UV/z and 1/z for each vertex and interpolate between them. For each pixel, the UV coordinate becomes UV_current * z_current. However, this is the result:
The inverted part shows where the deltas are flipped. As you can see, the two triangles both seem to be heading towards different points on the horizon.
Here's what I use to calculate the Z at a point in space:
float GetZToPoint(Vec3 a_Point)
{
Vec3 projected = m_Rotation * (a_Point - m_Position);
// #define FOV_ANGLE 60.f
// static const float FOCAL_LENGTH = 1 / tanf(_RadToDeg(FOV_ANGLE) / 2);
// static const float DEPTH = HALFHEIGHT * FOCAL_LENGTH;
float zcamera = DEPTH / projected.z;
return zcamera;
}
Am I right, is it a z buffer issue?
ZBuffer has nothing to do with it.
The ZBuffer is only useful when triangles overlap and you want to make sure they are drawn correctly (i.e. correctly ordered in Z). The ZBuffer will, for every pixel of the triangle, determine whether a previously placed pixel is nearer to the camera, and if so, not draw the pixel of your triangle.
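For reference, the test a ZBuffer performs per pixel is no more than this (a sketch; it assumes the buffer stores 1/z, so larger values are nearer the camera):

// Plot a pixel only if it is nearer than what is already stored there.
inline void plotWithDepthTest(unsigned int *framebuffer, float *zbuffer,
                              int screenWidth, int x, int y,
                              float oneOverZ, unsigned int color)
{
    int i = y * screenWidth + x;
    if (oneOverZ > zbuffer[i])
    {
        zbuffer[i] = oneOverZ;
        framebuffer[i] = color;
    }
}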
Since you are drawing 2 triangles which don't overlap, this can not be the issue.
I've made a software rasterizer in fixed point once (for a mobile phone), but I don't have the sources on my laptop, so let me check tonight how I did it. In essence, what you've got is not bad! A thing like this could be caused by a very small error.
A general tip for debugging this is to have a few test triangles (left-sloping, right-sloping, 90-degree angles, etc.) and step through the code with the debugger to see how your logic deals with each case.
EDIT:
Pseudocode of my rasterizer (only U, V and Z are taken into account; if you also want to do Gouraud shading, you have to do everything for R, G and B similar to what you are doing for U, V and Z):
The idea is that a triangle can be broken down in 2 parts. The top part and the bottom part. The top is from y[0] to y[1] and the bottom part is from y[1] to y[2]. For both sets you need to calculate the step variables with which you are interpolating. The below example shows you how to do the top part. If needed I can supply the bottom part too.
Please note that I do already calculate the needed interpolation offsets for the bottom part in the below 'pseudocode' fragment
first order the coords(x,y,z,u,v) in the order so that coord[0].y < coord[1].y < coord[2].y
next check if any 2 sets of coordinates are identical (only check x and y). If so don't draw
exception: does the triangle have a flat top? if so, the first slope will be infinite
exception2: does the triangle have a flat bottom (yes triangles can have these too ;^) ) then the last slope too will be infinite
calculate 2 slopes (left side and right side)
leftDeltaX = (x[1] - x[0]) / (y[1]-y[0]) and rightDeltaX = (x[2] - x[0]) / (y[2]-y[0])
the second part of the triangle is calculated depending on whether the left side of the triangle is really on the left side (or needs swapping)
code fragment:
if (leftDeltaX < rightDeltaX)
{
leftDeltaX2 = (x[2]-x[1]) / (y[2]-y[1])
rightDeltaX2 = rightDeltaX
leftDeltaU = (u[1]-u[0]) / (y[1]-y[0]) //for texture mapping
leftDeltaU2 = (u[2]-u[1]) / (y[2]-y[1])
leftDeltaV = (v[1]-v[0]) / (y[1]-y[0]) //for texture mapping
leftDeltaV2 = (v[2]-v[1]) / (y[2]-y[1])
leftDeltaZ = (z[1]-z[0]) / (y[1]-y[0]) //for texture mapping
leftDeltaZ2 = (z[2]-z[1]) / (y[2]-y[1])
}
else
{
swap(leftDeltaX, rightDeltaX);
leftDeltaX2 = leftDeltaX;
rightDeltaX2 = (x[2]-x[1]) / (y[2]-y[1])
leftDeltaU = (u[2]-u[0]) / (y[2]-y[0]) //for texture mapping
leftDeltaU2 = leftDeltaU
leftDeltaV = (v[2]-v[0]) / (y[2]-y[0]) //for texture mapping
leftDeltaV2 = leftDeltaV
leftDeltaZ = (z[2]-z[0]) / (y[2]-y[0]) //for texture mapping
leftDeltaZ2 = leftDeltaZ
}
set the currentLeftX and currentRightX both on x[0]
set currentLeftU on leftDeltaU, currentLeftV on leftDeltaV and currentLeftZ on leftDeltaZ
calc start and endpoint for first Y range: startY = ceil(y[0]); endY = ceil(y[1])
prestep x, u, v and z for the fractional part of y, for subpixel accuracy (I guess this is also needed for floats).
(For my fixed-point algorithms this was needed to make the lines and textures give the illusion of moving in much finer steps than the resolution of the display.)
calculate where x should be at y[1]: halfwayX = (x[2]-x[0]) * (y[1]-y[0]) / (y[2]-y[0]) + x[0]
and same for U and V and z: halfwayU = (u[2]-u[0]) * (y[1]-y[0]) / (y[2]-y[0]) + u[0]
and using the halfwayX calculate the stepper for the U and V and z:
if (halfwayX - x[1] == 0) { slopeU = 0, slopeV = 0, slopeZ = 0 } else { slopeU = (halfwayU - u[1]) / (halfwayX - x[1]) } //(and same for v and z)
do clipping for the Y top (so calculate where we are going to start to draw in case the top of the triangle is off screen (or off the clipping rectangle))
for (y = startY; y < endY; y++)
{
is Y past bottom of screen? stop rendering!
calc startX and endX for the first horizontal line
leftCurX = ceil(startx); leftCurY = ceil(endy);
clip the line to be drawn to the left horizontal border of the screen (or clipping region)
prepare a pointer to the destination buffer (doing it through array indexes everytime is too slow)
unsigned int *buf = destbuf + (y * pitch) + startX; (unsigned int in case you are doing 24-bit or 32-bit rendering)
also prepare your ZBuffer pointer here (if you are using this)
for(x=startX; x < endX; x++)
{
now for perspective texture mapping (using no bilinear interpolation) you do the following:
code fragment:
float tv = fmodf(startV / startZ, texturePitch); // make sure the texture coordinates stay on the texture if they are too wide/high
float tu = fmodf(startU / startZ, texturePitch); // I'm assuming square textures here. With fixed point you could have used &=
unsigned int *textPtr = textureBuf + (int)tu + ((int)tv * texturePitch); // with fixed point one could have shifted tv; now we have to multiply every time
unsigned int destColTm = *textPtr; // this is the color (if we only use texture mapping) we'll be needing for the pixel
optional: check the zbuffer if the previously plotted pixel at this coordinate is higher or lower then ours.
plot the pixel
startZ += slopeZ; startU+=slopeU; startV += slopeV; //update all interpolators
} end of x loop
leftCurX += leftDeltaX; rightCurX += rightDeltaX; leftCurU += leftDeltaU; leftCurV += leftDeltaV; leftCurZ += leftDeltaZ; //update Y interpolators
} end of y loop
//this is the end of the first part. We now have drawn half the triangle. from the top, to the middle Y coordinate.
// we now basically do the exact same thing but now for the bottom half of the triangle (using the other set of interpolators)
let me know if this helps you solve the problem you are facing!
I don't know that I can help with your question, but one of the best books on software rendering that I read at the time is available online: Graphics Programming Black Book by Michael Abrash.
If you are interpolating 1/z, you need to multiply UV/z by z, not 1/z. Assuming you have this:
UV = UV_current * z_current
and z_current is interpolating 1/z, you should change it to:
UV = UV_current / z_current
And then you might want to rename z_current to something like one_over_z_current.
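Putting it together, a perspective-correct span might be sketched like this (putPixel and fetchTexel are hypothetical stand-ins for your framebuffer write and texture lookup):

extern void putPixel(int x, int y, unsigned int c);  // hypothetical
extern unsigned int fetchTexel(float u, float v);    // hypothetical

// u/z, v/z and 1/z are linear in screen space, so interpolate those
// across the span and divide per pixel to recover u and v.
void drawSpanPerspective(int xStart, int xEnd, int y,
                         float u_over_z, float v_over_z, float one_over_z,
                         float du_over_z, float dv_over_z, float done_over_z)
{
    for (int x = xStart; x < xEnd; ++x)
    {
        float z = 1.0f / one_over_z; // recover true z at this pixel
        putPixel(x, y, fetchTexel(u_over_z * z, v_over_z * z));
        u_over_z += du_over_z;
        v_over_z += dv_over_z;
        one_over_z += done_over_z;
    }
}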