For my work I have to convert a point cloud to a grey scale (depth) image meaning that the z coordinate of each XYZ point in the cloud represents a shade of grey. For mapping a Z coordinate from the [z_min, z_max] interval to the [0..255] interval I used the map function of Arduino:
float map(float x, float in_min, float in_max, float out_min, float out_max)
{
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}
With that done I need to write the result to an image. The problem is that the clouds I have can contain millions of points, so I can't just write them 1 by 1 to an image in order. Let's say that I have 3000x1000 ordered XY points. How would I go about writing them to a 700x300 pixel image? I hope the question is clear, thanks in advance for answering.
I have managed to find a solution to my problem. It is a fairly long algorithm for Stack Overflow, but bear with me. The idea is to write a vector of XY greyscale points as a pgm file.
Step 1: cloud_to_greyscale - function that converts an XYZ Point Cloud into a vector of XY grey scale points and that receives a cloud as a parameter:
for each point pt in cloud
    point_xy_greyscale.x <- pt.x
    point_xy_greyscale.y <- pt.y
    point_xy_greyscale.greyscale <- map(pt.z, z_min, z_max, 0, 255)
    greyscale_vector.add(point_xy_greyscale)
loop
return greyscale_vector
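For reference, a minimal C++ sketch of this step, assuming a plain PointXYZ struct and the map() function shown above (the real point and vector types will differ depending on your point cloud library):

#include <vector>

struct PointXYZ { float x, y, z; };
struct PointXYGreyscale { float x, y; unsigned short greyscale; };

std::vector<PointXYGreyscale> cloud_to_greyscale(const std::vector<PointXYZ>& cloud,
                                                 float z_min, float z_max)
{
    std::vector<PointXYGreyscale> greyscale_vector;
    greyscale_vector.reserve(cloud.size());
    for (const PointXYZ& pt : cloud)
    {
        PointXYGreyscale p;
        p.x = pt.x;
        p.y = pt.y;
        // map() is the Arduino-style helper shown in the question
        p.greyscale = static_cast<unsigned short>(map(pt.z, z_min, z_max, 0.0f, 255.0f));
        greyscale_vector.push_back(p);
    }
    return greyscale_vector;
}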
Step 2: greyscale_to_image - function that writes the previously returned vector as a greyscale_image, a class that has a width, a height and a _pixels member, which is a two-dimensional array of unsigned short values. The function receives the following parameters: a greyscale_vector (to be turned into the image) and an x_epsilon that will help us delimit the x pixel coordinates for our points, knowing that the x point coordinates are floats (and thus not suitable as array indices).
A little background info: I work on something called widop clouds, so in my 3D space x is the width, y is the depth and z is the height. Also worth noting is the fact that y is an integer, so for my problem the height of the image is easy to find: it's y_max - y_min + 1. To find the width of the image, follow the algorithm below; if it isn't clear I will answer any questions, and I'm open to suggestions.
img_width <- 0 // image width
img_height <- y_max - y_min + 1 // image height

// determining image width
for each point greyscale_xy_point in greyscale_vector
    point_x_cell <- (greyscale_xy_point.x - x_min) * x_epsilon * 10
    if point_x_cell > img_width
        img_width <- point_x_cell + 1
loop

// defining and initializing image with the calculated height and width
greyscale_img(img_width, img_height)

// initializing greyscale image points
for y <- 0 to greyscale_img.height
    for x <- 0 to greyscale_img.width
        greyscale_img[y][x] = 0
    loop
loop

// filling image with vector data
for each point point_xy_greyscale in greyscale_vector
    image_x = (point_xy_greyscale.x - x_min) * x_epsilon * 10
    image_y = point_xy_greyscale.y - y_min
    greyscale_img[image_y][image_x] = point_xy_greyscale.greyscale
loop

return greyscale_img
The only thing left to do is to write the image to a file, but that is easy to do once you follow the PGM format rules. I hope this helps someone.
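For completeness, a minimal sketch of writing such an image as a binary PGM (P5) file, assuming the pixel values are already clamped to 0..255 and stored row by row (the free function and its signature are illustrative, not the exact greyscale_image class above):

#include <fstream>
#include <string>

void write_pgm(const std::string& path, const unsigned short* pixels, int width, int height)
{
    std::ofstream out(path, std::ios::binary);
    out << "P5\n" << width << " " << height << "\n255\n";         // magic number, size, max grey value
    for (int i = 0; i < width * height; ++i)
    {
        unsigned char v = static_cast<unsigned char>(pixels[i]);  // values assumed to already be 0..255
        out.write(reinterpret_cast<const char*>(&v), 1);
    }
}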
EDIT_1: I added a picture of the result. It is supposed to be a railway, and the reason it's fairly dark is that there are some tall objects in the scene, so ground-level objects come out darker.
depth image of railway
I have been using this GitHub repo: https://github.com/aim-uofa/AdelaiDepth/blob/main/LeReS/Minist_Test/tools/test_shape.py
to figure out how this piece of code can be used to get x, y, z coordinates:
def reconstruct_3D(depth, f):
    """
    Reconstruct depth to 3D pointcloud with the provided focal length.
    Return:
        pcd: N X 3 array, point cloud
    """
    cu = depth.shape[1] / 2
    cv = depth.shape[0] / 2
    width = depth.shape[1]
    height = depth.shape[0]
    row = np.arange(0, width, 1)
    u = np.array([row for i in np.arange(height)])
    col = np.arange(0, height, 1)
    v = np.array([col for i in np.arange(width)])
    v = v.transpose(1, 0)
I want to use these coordinates to find the distance between 2 people in 3D for an object detection model. Does anyone have any advice?
I know how to use 2D images with YOLO to figure out the distance between 2 people, based on this link: Compute the centroid of a rectangle in python
My thinking is that I can use the bounding boxes to get the corners, find the centroid of each of the 2 people's bounding boxes, and then treat the line between the 2 centroids as the hypotenuse of a right triangle, whose length is their distance.
However, I am having a tricky time figuring out how to use a set of 3D coordinates to find the distance between 2 people. I can get the relative distance from my 2D model.
By having a 2D depth image and the camera's intrinsic matrix, you can convert each pixel to a 3D point as:
z = d
x = (u - cx) * z / f
y = (v - cy) * z / f
// where (cx, cy) is the principal point and f is the focal length.
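As a rough illustration (plain C++ rather than the repo's Python), back-projecting two pixels with their depths and then measuring the distance between the resulting points could look like this, assuming metric depth values and a single focal length f:

#include <cmath>

struct Point3 { float x, y, z; };

// Back-project pixel (u, v) with depth d using principal point (cx, cy) and focal length f.
Point3 pixel_to_3d(float u, float v, float d, float cx, float cy, float f)
{
    Point3 p;
    p.z = d;
    p.x = (u - cx) * d / f;
    p.y = (v - cy) * d / f;
    return p;
}

// Euclidean distance between two 3D points (e.g. the back-projected centroids of two person boxes).
float distance_3d(const Point3& a, const Point3& b)
{
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}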
Alternatively, you can use a third-party library like open3d to do the same:
xyz = open3d.geometry.create_point_cloud_from_depth_image(depth, intrinsic)
I just wrote a small netpbm parser and I am having fun with it, drawing mostly parametric equations. They look OK for a first attempt, but how can I expand upon this and have something that looks legit? This picture is how my method recreated the Arctic Monkeys logo, which is just
0.5[cos(19t) - cos(21t)]
(I was trying to plot both cosines first before superposing them)
It obviously looks very "crispy" and sharp. I used as small a step size as I could without it taking forever to finish (0.0005, takes < 5 sec).
The only idea I had was that when drawing a white pixel, I should also draw its immediate neighbors with some slightly lighter gray. And then draw the neighbors of THOSE pixels with even lighter gray. Almost like the white color is "dissolving" or "dissipating".
I didn't try to implement this because it felt like a really bad way to do it and I am not even sure it'd produce anything near the desirable effect so I thought I'd ask first.
EDIT: here's some sample code that draws just a small spiral
the draw loop:
for (double t = 0; t < 6 * M_PI; t += 0.0005)
{
    double r = t;
    double new_x = 10 * r * cosf(0.1 * M_PI * t);
    double new_y = -10 * r * sinf(0.1 * M_PI * t);
    img.SetPixel(new_x + img.img_width / 2, new_y + img.img_height / 2, 255);
}
//img is a PPM image with magic number P5 (binary grayscale)
SetPixel:
void PPMimage::SetPixel(const uint16_t x, const uint16_t y, const uint16_t pixelVal)
{
    assert(pixelVal <= max_greys && "pixelVal larger than image's maximum max_grey\n");
    assert(x < img_width && "X value larger than image width\n");
    assert(y < img_height && "Y value larger than image height\n");
    img_raster[y * img_width + x] = pixelVal;
}
This is what this code produces
A very basic form of antialiasing for a scatter plot (made of points rather than lines) can be achieved with a coverage computation similar in spirit to stochastic rounding: treat the brush as a pixel-sized square (but note the severe limitations of this model), centered at the non-integer coordinates of the plotted point, and compute its overlap with the four pixels that share the corner closest to that point. Treat each overlap fraction as a greyscale fraction: set each pixel to the largest value seen so far when plotting a large number of points approximating a line, or do alpha blending for a small number of discrete points.
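For illustration, a minimal sketch of that idea reusing the SetPixel-style interface from the question (a matching GetPixel is assumed to exist, pixel centers are taken to lie at integer coordinates, and bounds checks are omitted):

#include <cmath>

// Split a unit-square "brush" centered at (x, y) across the four pixels that share
// the corner nearest to the point, and keep the largest value per pixel.
void PlotPointAA(PPMimage& img, double x, double y, unsigned int maxVal)
{
    int x0 = static_cast<int>(std::floor(x));
    int y0 = static_cast<int>(std::floor(y));
    double fx = x - x0;                       // fractional offsets inside pixel (x0, y0)
    double fy = y - y0;

    // Overlap (coverage) of the brush with each of the four touched pixels.
    double w[2][2] = {
        { (1.0 - fx) * (1.0 - fy), fx * (1.0 - fy) },
        { (1.0 - fx) * fy,         fx * fy         }
    };
    for (int dy = 0; dy < 2; ++dy)
        for (int dx = 0; dx < 2; ++dx)
        {
            unsigned int v = static_cast<unsigned int>(w[dy][dx] * maxVal + 0.5);
            if (v > img.GetPixel(x0 + dx, y0 + dy))   // "largest value wins" for dense curves
                img.SetPixel(x0 + dx, y0 + dy, v);
        }
}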
I am trying to produce random equilateral triangles on the console screen.
The method I am using is: create a center point for the triangle (randomly positioned), move the center point to the origin (0,0), and then create 3 points from the center (adding the radius (a random number) of the triangle to the Y axis of each point). Then I rotate 2 of the points, one by 120 degrees and the other by 240, making an equilateral triangle, and draw lines between the points. Then I bring the points back to their original plot relative to the centroid.
This works most of the time and I get an equilateral triangle, however other times I don't quite get an equilateral triangle and I am at a complete loss as to why.
I am using Bresenham's line algorithm to draw the lines between points.
Image of working triangle: http://imgur.com/GpF406O
Image of broken triangle: http://imgur.com/Oa2BYun
Here is the code that plots the coords for the triangle:
void Triangle::createVertex(Vertex cent)
{
    // angle of 120 degrees in radians
    double s120 = sin(2.0943951024);
    double c120 = cos(2.0943951024);
    // angle of 240 degrees in radians
    double s240 = sin(4.1887902048);
    double c240 = cos(4.1887902048);
    // bringing centroid to the origin and saving old pos to move later on
    int x = cent.getX();
    int y = cent.getY();
    cent.setX(0);
    cent.setY(0);
    // creating the points all equal distance from the centroid
    Vertex v1(cent.getX(), cent.getY() + radius);
    Vertex v2(cent.getX(), cent.getY() + radius);
    Vertex v3(cent.getX(), cent.getY() + radius);
    // rotate points
    double newx = v1.getX() * c120 - v1.getY() * s120;
    double newy = v1.getY() * c120 + v1.getX() * s120;
    double xnew = v2.getX() * c240 - v2.getY() * s240;
    double ynew = v2.getY() * c240 + v2.getX() * s240;
    // giving the points the actual location in relation to the old pos of the centroid
    v1.setX(newx + x);
    v1.setY(newy + y);
    v2.setX(xnew + x);
    v2.setY(ynew + y);
    v3.setX(x);
    v3.setY(y + radius);
    // adding them to a list (the list is used in a function to draw the lines)
    vertices.push_back(v1);
    vertices.push_back(v2);
    vertices.push_back(v3);
}
Looking at the images of your two triangles (and looking at the line drawing algorithm) you are drawing lines as a series of discrete pixels. That means a vertex must fall in a pixel (it can't be on a boundary) like in this image.
So what happens if your vertex falls on* a border between pixels? Your line drawing algorithm has to make a decision on which pixel to put the vertex in.
Looking at the algorithm description on Wikipedia and the C++ implementation on a page at www.cs.helsinki.fi
I see that both listed implementations use integer arithmetic**, which in this case is not unreasonable given that you have discrete rows of pixels. This means that if your floating point calculations put one vertex above the threshold of the integer label for the next row of pixels when the floor (conversion from float to int) is done, but the other vertex is below that threshold, then the two vertices will be placed on different rows.
Think v1.y = 5.00000000000000000001 and v2.y = 4.99999999999999999999, which leads to v1 being placed on row 5 and v2 being placed on row 4.
This explains why you only see the issue occurring occasionally: you only occasionally have your vertices land on a boundary like this.
In order to fix this, a couple of things come to mind:
Fix it when you assign values to your vertices; the y values are the same anyway.
given:
v1.getX() = v2.getX() = 0 (defined by your code)
v1.getY() = v2.getY() = radius (defined by your code)
cos(120 degrees) = cos(240 degrees) ('tis true)
This reduces your two y values to
double newy = v1.getY() * c120
double ynew = v1.getY() * c120
ergo:
v1.setY(newy + y);
v2.setY(newy + y);
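In code, the rotation-and-placement part of createVertex could then be reduced to something like this (a sketch of the idea rather than a tested drop-in replacement):

// Both rotated points start at (0, radius) and cos(120°) == cos(240°),
// so compute the shared y once; only the x values differ (by sign).
double newy = radius * c120;   // y of both rotated vertices
double newx = -radius * s120;  // x of the 120° rotation: 0*c120 - radius*s120
double xnew = -radius * s240;  // x of the 240° rotation: 0*c240 - radius*s240

v1.setX(newx + x);
v1.setY(newy + y);
v2.setX(xnew + x);
v2.setY(newy + y);             // identical y for v1 and v2, so they always land on the same row
v3.setX(x);
v3.setY(y + radius);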
If you wrote your own Bresenham's algorithm implementation you could add a check in that code to make sure your vertices are at the same height, but that seems like a really bad place to put that kind of check, since the height of the endpoints is specific to your problem and not to drawing lines in general.
*Or not exactly on, but close enough you can't tell the difference after accounting for floating point error
**The algorithm is not restricted to integer arithmetic, but I suspect that given the irregularity of your problem, the way the algorithm has been presented, and the fact that you are using discrete characters for the lines in your images, the integer arithmetic is the issue.
I wanted to detect ellipse in an image. Since I was learning Mathematica at that time, I asked a question here and got a satisfactory result from the answer below, which used the RANSAC algorithm to detect ellipse.
However, recently I need to port it to OpenCV, but there are some functions that only exist in Mathematica. One of the key functions is the "GradientOrientationFilter" function.
Since there are five parameters for a general ellipse, I need to sample five points to determine one. However, more sampling points means a lower chance of a good guess, which leads to a lower success rate in ellipse detection. Therefore, the Mathematica answer adds another condition: the gradient of the image must be parallel to the gradient of the ellipse equation. With that, only three points are needed to determine an ellipse using least squares in the Mathematica approach. The result is quite good.
However, when I try to find the image gradient using the Sobel or Scharr operator in OpenCV, it is not accurate enough, which always leads to a bad result.
How can I calculate the gradient or the tangent of an image accurately? Thanks!
Result with gradient, three points
Result without gradient, five points
----------updated----------
I did some edge detection and median blurring beforehand and drew the result on the edge image. My original test image is like this:
In general, my final goal is to detect the ellipse in a scene or on an object. Something like this:
That's why I choose to use RANSAC to fit the ellipse from edge points.
As for your final goal, you may try
findContours and fitEllipse in OpenCV
The pseudo code will be
1) some image processing
2) find all contours
3) fit each contour with fitEllipse
here is part of the code I used before
[... image process ....you get a bwimage ]
vector<vector<Point> > contours;
findContours(bwimage, contours, CV_RETR_LIST, CV_CHAIN_APPROX_NONE);
for(size_t i = 0; i < contours.size(); i++)
{
    size_t count = contours[i].size();
    Mat pointsf;
    Mat(contours[i]).convertTo(pointsf, CV_32F);
    RotatedRect box = fitEllipse(pointsf);
    /* You can put some limitation about size and aspect ratio here */
    if( box.size.width > 20 &&
        box.size.height > 20 &&
        box.size.width < 80 &&
        box.size.height < 80 )
    {
        if( MAX(box.size.width, box.size.height) > MIN(box.size.width, box.size.height)*30 )
            continue;
        //drawContours(SrcImage, contours, (int)i, Scalar::all(255), 1, 8);
        ellipse(SrcImage, box, Scalar(0,0,255), 1, CV_AA);
        ellipse(SrcImage, box.center, box.size*0.5f, box.angle, 0, 360, Scalar(200,255,255), 1, CV_AA);
    }
}
imshow("result", SrcImage);
If you focus on ellipses (no other shapes), you can treat the values of the pixels of the ellipse as the masses of points.
Then you can calculate the moments of inertia Ixx, Iyy, Ixy to find the angle theta which can rotate a general ellipse back to the canonical form (X-Xc)^2/a^2 + (Y-Yc)^2/b^2 = 1.
Then you can find Xc and Yc from the center of mass.
Then you can find a and b from the min X and min Y.
--------------- update -----------
This method can apply to filled ellipse too.
More than one ellipse on a single image will fail unless you segment them first.
Let me explain more,
I will use C to represent cos(theta) and S to represent sin(theta)
After rotation to canonical form, the new X is [eq0] X=xC-yS and Y is Y=xS+yC where x and y are original positions.
The rotation will give you min IYY.
[eq1]
IYY = Sum(m*Y*Y) = Sum{m*(xS+yC)*(xS+yC)} = Sum{m*(xxSS + yyCC + 2xySC)} = Ixx*S^2 + Iyy*C^2 + 2*Ixy*S*C
(here Ixx = Sum(m*x*x), Iyy = Sum(m*y*y) and Ixy = Sum(m*x*y))
For min IYY, d(IYY)/d(theta) = 0, that is
2*Ixx*S*C - 2*Iyy*S*C + 2*Ixy*(CC - SS) = 0
(Ixx - Iyy)/Ixy = (SS - CC)/(SC) = S/C - C/S = Z - 1/Z, where Z = tan(theta)
While programming, the LHS is just a number, let's call it N
Z^2 - N*Z - 1 = 0
So there are two roots for Z, and hence for theta, say Z1 and Z2; one will minimize IYY and the other will maximize it.
----------- pseudo code --------
Compute Ixx, Iyy, Ixy for the hollow or filled ellipse.
Compute theta1 = atan(Z1) and theta2 = atan(Z2).
Put these two thetas into eq1 and find which one gives the smaller IYY. That is your theta.
Go back to those non-zero pixels and transform them to new X and Y with the theta you found.
Find the center of mass Xc, Yc and the min X and min Y by sort().
-------------- by hand -----------
If you need the original equation of the ellipse, just substitute [eq0] into the canonical form.
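Putting the recipe above together, a rough OpenCV-flavoured sketch of the orientation step might look like this (it takes the second moments about the center of mass, which keeps the angle independent of where the ellipse sits in the image; everything here is illustrative rather than tested code):

#include <opencv2/opencv.hpp>
#include <cmath>

// Moment-based orientation of a single (hollow or filled) ellipse, treating pixel
// intensities as masses. Input is a CV_8U image with background = 0.
double ellipse_orientation(const cv::Mat& gray)
{
    double m = 0, mx = 0, my = 0;
    for (int v = 0; v < gray.rows; ++v)
        for (int u = 0; u < gray.cols; ++u)
        {
            double w = gray.at<uchar>(v, u);
            m += w; mx += w * u; my += w * v;
        }
    double xc = mx / m, yc = my / m;              // center of mass (Xc, Yc)

    double Ixx = 0, Iyy = 0, Ixy = 0;             // Sum(m*x*x), Sum(m*y*y), Sum(m*x*y)
    for (int v = 0; v < gray.rows; ++v)
        for (int u = 0; u < gray.cols; ++u)
        {
            double w = gray.at<uchar>(v, u);
            double x = u - xc, y = v - yc;
            Ixx += w * x * x; Iyy += w * y * y; Ixy += w * x * y;
        }

    // Z^2 - N*Z - 1 = 0 with N = (Ixx - Iyy)/Ixy gives the two axis directions.
    // (If Ixy is ~0 the ellipse is already axis-aligned and needs no rotation step.)
    double N  = (Ixx - Iyy) / Ixy;
    double Z1 = (N + std::sqrt(N * N + 4.0)) / 2.0;
    double Z2 = (N - std::sqrt(N * N + 4.0)) / 2.0;

    auto IYY = [&](double t) {                    // eq1: spread along the rotated Y axis
        double s = std::sin(t), c = std::cos(t);
        return Ixx * s * s + Iyy * c * c + 2.0 * Ixy * s * c;
    };
    double t1 = std::atan(Z1), t2 = std::atan(Z2);
    return IYY(t1) < IYY(t2) ? t1 : t2;           // keep the theta that minimizes IYY
}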
You're using terms in an unusual way.
Normally for images, the term "gradient" is interpreted as if the image is a mathematical function f(x,y). This gives us a (df/dx, df/dy) vector in each point.
Yet you're looking at the image as if it's a function y = f(x), and the gradient would be df(x)/dx.
Now, if you look at your image, you'll see that the two interpretations are definitely related. Your ellipse is drawn as a set of contrasting pixels, and as a result there are two sharp gradients in the image: the inner and the outer. These of course correspond to the two normal vectors, and therefore point in opposite directions.
Also note that your image has pixels. The gradient is also pixelated. The way your ellipse is drawn, with a single-pixel width, means that your local gradient takes on only values that are a multiple of 45 degrees:
▄▄ ▄▀ ▌ ▀▄
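For the gradient computation itself, a common way to get per-pixel gradient orientation in OpenCV under the df/dx, df/dy interpretation (a rough stand-in for Mathematica's GradientOrientationFilter; the blur size and any magnitude thresholding are assumptions to tune) is something like:

#include <opencv2/opencv.hpp>

// Per-pixel gradient orientation and magnitude from Sobel derivatives.
// Smoothing first reduces the 45-degree quantisation effect described above.
void gradient_orientation(const cv::Mat& gray, cv::Mat& angle, cv::Mat& mag)
{
    cv::Mat blurred, gx, gy;
    cv::GaussianBlur(gray, blurred, cv::Size(5, 5), 1.5);
    cv::Sobel(blurred, gx, CV_32F, 1, 0, 3);   // df/dx
    cv::Sobel(blurred, gy, CV_32F, 0, 1, 3);   // df/dy
    cv::phase(gx, gy, angle, true);            // orientation per pixel, in degrees
    cv::magnitude(gx, gy, mag);                // use this to keep only strong-edge pixels
}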
I'm making a software rasterizer, and I've run into a bit of a snag: I can't seem to get perspective-correct texture mapping to work.
My algorithm is to first sort the coordinates to plot by y. This returns a highest, lowest and center point. I then walk across the scanlines using the deltas:
// ordering by y is put here
order[0] = &a_Triangle.p[v_order[0]];
order[1] = &a_Triangle.p[v_order[1]];
order[2] = &a_Triangle.p[v_order[2]];
float height1, height2, height3;
height1 = (float)((int)(order[2]->y + 1) - (int)(order[0]->y));
height2 = (float)((int)(order[1]->y + 1) - (int)(order[0]->y));
height3 = (float)((int)(order[2]->y + 1) - (int)(order[1]->y));
// x
float x_start, x_end;
float x[3];
float x_delta[3];
x_delta[0] = (order[2]->x - order[0]->x) / height1;
x_delta[1] = (order[1]->x - order[0]->x) / height2;
x_delta[2] = (order[2]->x - order[1]->x) / height3;
x[0] = order[0]->x;
x[1] = order[0]->x;
x[2] = order[1]->x;
And then we render from order[0]->y to order[2]->y, increasing x_start and x_end by a delta on each scanline. When rendering the top part, the deltas are x_delta[0] and x_delta[1]. When rendering the bottom part, the deltas are x_delta[0] and x_delta[2]. Then we linearly interpolate between x_start and x_end on our scanline. UV coordinates are interpolated in the same way, ordered by y, starting at begin and end, with deltas applied at each step.
This works fine except when I try to do perspective-correct UV mapping. The basic algorithm is to take UV/z and 1/z for each vertex and interpolate between them. For each pixel, the UV coordinate becomes UV_current * z_current. However, this is the result:
The inversed part tells you where the deltas are flipped. As you can see, the two triangles both seem to be heading towards different points on the horizon.
Here's what I use to calculate the Z at a point in space:
float GetZToPoint(Vec3 a_Point)
{
Vec3 projected = m_Rotation * (a_Point - m_Position);
// #define FOV_ANGLE 60.f
// static const float FOCAL_LENGTH = 1 / tanf(_RadToDeg(FOV_ANGLE) / 2);
// static const float DEPTH = HALFHEIGHT * FOCAL_LENGTH;
float zcamera = DEPTH / projected.z;
return zcamera;
}
Am I right, is it a z buffer issue?
ZBuffer has nothing to do with it.
The ZBuffer is only useful when triangles are overlapping and you want to make sure that they are drawn correctly (e.g. correctly ordered in Z). The ZBuffer will, for every pixel of the triangle, determine if a previously placed pixel is nearer to the camera, and if so, not draw the pixel of your triangle.
Since you are drawing 2 triangles which don't overlap, this cannot be the issue.
I've made a software rasterizer in fixed point once (for a mobile phone), but I don't have the sources on my laptop. So let me check tonight how I did it. In essence what you've got is not bad! A thing like this could be caused by a very small error.
A general tip for debugging this is to have a few test triangles (slope on the left side, slope on the right side, 90 degree angles, etc.) and step through them with the debugger to see how your logic deals with the cases.
EDIT:
pseudocode of my rasterizer (only U, V and Z are taken into account; if you also want to do Gouraud shading you also have to do everything for R, G and B similar to what you are doing for U, V and Z):
The idea is that a triangle can be broken down in 2 parts. The top part and the bottom part. The top is from y[0] to y[1] and the bottom part is from y[1] to y[2]. For both sets you need to calculate the step variables with which you are interpolating. The below example shows you how to do the top part. If needed I can supply the bottom part too.
Please note that I already calculate the needed interpolation offsets for the bottom part in the 'pseudocode' fragment below.
first order the coords(x,y,z,u,v) in the order so that coord[0].y < coord[1].y < coord[2].y
next check if any 2 sets of coordinates are identical (only check x and y). If so don't draw
exception: does the triangle have a flat top? if so, the first slope will be infinite
exception2: does the triangle have a flat bottom (yes triangles can have these too ;^) ) then the last slope too will be infinite
calculate 2 slopes (left side and right side)
leftDeltaX = (x[1] - x[0]) / (y[1]-y[0]) and rightDeltaX = (x[2] - x[0]) / (y[2]-y[0])
the second part of the triangle is calculated depending on whether the left side of the triangle is really on the left side (or needs swapping)
code fragment:
if (leftDeltaX < rightDeltaX)
{
    leftDeltaX2 = (x[2]-x[1]) / (y[2]-y[1])
    rightDeltaX2 = rightDeltaX
    leftDeltaU = (u[1]-u[0]) / (y[1]-y[0]) //for texture mapping
    leftDeltaU2 = (u[2]-u[1]) / (y[2]-y[1])
    leftDeltaV = (v[1]-v[0]) / (y[1]-y[0]) //for texture mapping
    leftDeltaV2 = (v[2]-v[1]) / (y[2]-y[1])
    leftDeltaZ = (z[1]-z[0]) / (y[1]-y[0]) //for texture mapping
    leftDeltaZ2 = (z[2]-z[1]) / (y[2]-y[1])
}
else
{
    swap(leftDeltaX, rightDeltaX);
    leftDeltaX2 = leftDeltaX;
    rightDeltaX2 = (x[2]-x[1]) / (y[2]-y[1])
    leftDeltaU = (u[2]-u[0]) / (y[2]-y[0]) //for texture mapping
    leftDeltaU2 = leftDeltaU
    leftDeltaV = (v[2]-v[0]) / (y[2]-y[0]) //for texture mapping
    leftDeltaV2 = leftDeltaV
    leftDeltaZ = (z[2]-z[0]) / (y[2]-y[0]) //for texture mapping
    leftDeltaZ2 = leftDeltaZ
}
set the currentLeftX and currentRightX both on x[0]
set currentLeftU on leftDeltaU, currentLeftV on leftDeltaV and currentLeftZ on leftDeltaZ
calc start and endpoint for first Y range: startY = ceil(y[0]); endY = ceil(y[1])
prestep x,u,v and z for the fractional part of y for subpixel accuracy (I guess this is also needed for floats)
For my fixed-point algorithm this was needed to make the lines and textures give the illusion of moving in much finer steps than the resolution of the display.
calculate where x should be at y[1]: halfwayX = (x[2]-x[0]) * (y[1]-y[0]) / (y[2]-y[0]) + x[0]
and same for U and V and z: halfwayU = (u[2]-u[0]) * (y[1]-y[0]) / (y[2]-y[0]) + u[0]
and using the halfwayX calculate the stepper for the U and V and z:
if(halfwayX - x[1] == 0){ slopeU=0, slopeV=0, slopeZ=0 } else { slopeU = (halfwayU - U[1]) / (halfwayX - x[1])} //(and same for v and z)
do clipping for the Y top (so calculate where we are going to start to draw in case the top of the triangle is off screen (or off the clipping rectangle))
for y=startY; y < endY; y++)
{
is Y past bottom of screen? stop rendering!
calc startX and endX for the first horizontal line
leftCurX = ceil(startx); leftCurY = ceil(endy);
clip the line to be drawn to the left horizontal border of the screen (or clipping region)
prepare a pointer to the destination buffer (doing it through array indexes everytime is too slow)
unsigned int *buf = destbuf + (y * pitch) + startX; (unsigned int in case you are doing 24 bit or 32 bits rendering)
also prepare your ZBuffer pointer here (if you are using this)
for(x=startX; x < endX; x++)
{
now for perspective texture mapping (using no bilinear interpolation) you do the following:
code fragment:
float tv = startV / startZ
float tu = startU / startZ;
tv %= texturePitch; //make sure the texture coordinates stay on the texture if they are too wide/high
tu %= texturePitch; //I'm assuming square textures here. With fixed point you could have used &=
unsigned int *textPtr = textureBuf+tu + (tv*texturePitch); //in case of fixedpoints one could have shifted the tv. Now we have to multiply everytime.
int destColTm = *(textPtr); //this is the color (if we only use texture mapping) we'll be needing for the pixel
optional: check the zbuffer if the previously plotted pixel at this coordinate is higher or lower then ours.
plot the pixel
startZ += slopeZ; startU+=slopeU; startV += slopeV; //update all interpolators
} end of x loop
leftCurX += leftDeltaX; rightCurX += rightDeltaX; leftCurU += leftDeltaU; leftCurV += leftDeltaV; leftCurZ += leftDeltaZ; //update Y interpolators
} end of y loop
//this is the end of the first part. We now have drawn half the triangle. from the top, to the middle Y coordinate.
// we now basically do the exact same thing but now for the bottom half of the triangle (using the other set of interpolators)
let me know if this helps you solve the problem you are facing!
I don't know that I can help with your question, but one of the best books on software rendering that I had read at the time is available online Graphics Programming Black Book by Michael Abrash.
If you are interpolating 1/z, you need to multiply UV/z by z, not 1/z. Assuming you have this:
UV = UV_current * z_current
and z_current is interpolating 1/z, you should change it to:
UV = UV_current / z_current
And then you might want to rename z_current to something like one_over_z_current.
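In other words, interpolate u/z, v/z and 1/z linearly across the scanline and divide back per pixel. A self-contained sketch of what the inner span loop could look like (all names here are illustrative placeholders, not code from the answers above; a square texture of width texPitch is assumed):

// Perspective-correct span drawing: u/z, v/z and 1/z are linear in screen space,
// so step those per pixel and divide back to recover the true u and v.
void DrawSpanPerspective(int startX, int endX,
                         float u_over_z, float v_over_z, float one_over_z,
                         float du_over_z, float dv_over_z, float done_over_z,
                         unsigned int* destRow, const unsigned int* texture,
                         int texPitch)
{
    for (int x = startX; x < endX; ++x)
    {
        float z = 1.0f / one_over_z;                        // true z for this pixel
        int tu = static_cast<int>(u_over_z * z) % texPitch; // true u, wrapped to the texture
        int tv = static_cast<int>(v_over_z * z) % texPitch; // true v
        destRow[x] = texture[tv * texPitch + tu];

        u_over_z   += du_over_z;                            // all three stepped linearly
        v_over_z   += dv_over_z;
        one_over_z += done_over_z;
    }
}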