Kinect 3D to 2D bias - c++

I am struggling with the interpretation of Kinect depth data.
To obtain the real-world distance from the Kinect, I used the following formula:
if (i < 2047) {
    depthToMeterTable[i] = i * -0.0030711016 + 3.3309495161;
}
else {
    depthToMeterTable[i] = 0;
}
This formula gives a pretty good distance estimate.
However, I get strange output when visualizing a 90° wall corner.
The following image shows two things. First, the violet lines represent the wall as I SHOULD see it: a 90° corner. The red dots represent the wall as seen by the Kinect. As you can see, the angle between the two planes is now larger than 90°.
http://img843.imageshack.us/img843/4061/kinectbias.jpg
Do you have any idea where this bias comes from and how I could correct it?
Thank you for reading,
Al_th

I'm not familiar with that conversion formula (also not sure how your depthToMeterTable gets filled - what formula is used there).
There's a built-in function in libfreenect for that though: freenect_camera_to_world
Before that utility function was added, I used Matt Fischer's conversion functions (RawDepthToMeters and DepthToWorld).
HTH

Related

Project point from point cloud to Image in OpenCV

I'm trying to project a point from 3D to 2D in OpenCV with C++. At the moment, I'm using cv::projectPoints(), but it's just not working out.
But first things first. I'm trying to write a program that finds an intersection between a point cloud and a line in space. So I calibrated two cameras, did rectification and matching using SGBM. Finally I projected the disparity map to 3D using reprojectImageTo3D(). That all works very well, and in MeshLab I can visualize my point cloud.
After that I wrote an algorithm to find the intersection between the point cloud and a line which I coded manually. That works fine, too. I found a point in the point cloud about 1.5 mm away from the line, which is good enough for the beginning. So I took this point and tried to project it back to the image, so I could mark it. But here is the problem.
Now the point is not inside the image anymore. As I took an intersection in the middle of the image, this is not possible. I think the problem could be in the coordinate systems, as I don't know in which coordinate system the point cloud is written (left camera, right camera or something else).
My projectPoints function looks like:
projectPoints(intersectionPoint3D, R, T, cameraMatrixLeft, distortionCoeffsLeft, intersectionPoint2D, noArray(), 0);
R and T are the rotation and translation from one camera to the other (I got them from stereoCalibrate). Here might be my mistake, but how can I fix it? I also tried setting these to (0,0,0), but that doesn't work either. I also tried converting the R matrix to a vector using Rodrigues. Still the same problem.
I'm sorry if this has been asked before, but I'm not sure how to search for this problem. I hope my text is clear enough for you to help me. If you need more information, I will gladly provide it.
Many thanks in advance.
You have a 3D point and you want to get its corresponding 2D location, right? If you have the camera calibration matrix (a 3x3 matrix) and the point is expressed in that camera's coordinate frame, you can project the point into the image:
cv::Point2d get2DFrom3D(cv::Point3d p, cv::Mat1d CameraMat)
{
    cv::Point2d pix;
    pix.x = (p.x * CameraMat(0, 0)) / p.z + CameraMat(0, 2);
    pix.y = (p.y * CameraMat(1, 1)) / p.z + CameraMat(1, 2);
    return pix;
}

Disparity Map Block Matching

I am writing a disparity matching algorithm using block matching, but I am not sure how to find the corresponding pixel values in the secondary image.
Given a square window of some size, what techniques exist to find the corresponding pixels? Do I need to use feature matching algorithms or is there a simpler method, such as summing the pixel values and determining whether they are within some threshold, or perhaps converting the pixel values to binary strings where the values are either greater than or less than the center pixel?
I'm going to assume you're talking about Stereo Disparity, in which case you will likely want to use a simple Sum of Absolute Differences (read that wiki article before you continue here). You should also read this tutorial by Chris McCormick before you read more here.
Side note: SAD is not the only method, but it's really common and should solve your problem.
You already have the right idea. Make windows, move windows, sum pixels, find minimums. So I'll give you what I think might help:
To start:
If you have color images, you will first want to convert them to grayscale. In Python you might use a simple function like this per pixel, where x is a pixel that contains RGB values:
def rgb_to_bw(x):
    return int(x[0]*0.299 + x[1]*0.587 + x[2]*0.114)
You will want the images in grayscale to make the SAD easier to compute. The weights are the standard luma coefficients, which favor green because the eye is most sensitive to it. If you're wondering why you don't lose significant information from this, you might be interested in learning what a Bayer Filter is. The Bayer Filter, which is typically RGGB, favors green for the same reason.
Calculating the SAD:
You already mentioned that you have a window of some size, which is exactly what you want. Let's say this window is n x n in size. You would also have some window in your left image, WL, and some window in your right image, WR. The idea is to find the pair that has the smallest SAD.
So, for each left window pixel pl at some location (x,y) in the window, you would take the absolute value of its difference from the right window pixel pr, also located at (x,y). You would also want a running value, which is the sum of these absolute differences. In pseudocode:
SAD = 0
for x = 0 to n-1:
    for y = 0 to n-1:
        SAD = SAD + |pl(x,y) - pr(x,y)|
After you calculate the SAD for this pair of windows, WL and WR you will want to "slide" WR to a new location and calculate another SAD. You want to find the pair of WL and WR with the smallest SAD - which you can think of as being the most similar windows. In other words, the WL and WR with the smallest SAD are "matched". When you have the minimum SAD for the current WL you will "slide" WL and repeat.
Disparity is calculated by the distance between the matched WL and WR. For visualization, you can scale this distance to be between 0-255 and output that to another image. I posted 3 images below to show you this.
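The window sliding described above can be sketched in C++ roughly as follows. This is a minimal illustration, not a tuned implementation: GrayImage, windowSad and bestDisparity are names invented for the example, the images are assumed to be rectified (so matches lie on the same row), and the caller is assumed to keep the window inside both images.

```cpp
#include <cstdlib>
#include <limits>
#include <vector>

// Row-major 8-bit grayscale image.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> pixels;
    int at(int x, int y) const { return pixels[y * width + x]; }
};

// Sum of absolute differences between two (2*half+1) x (2*half+1) windows,
// centred at (xl, y) in the left image and (xr, y) in the right image.
inline long windowSad(const GrayImage& left, const GrayImage& right,
                      int xl, int xr, int y, int half) {
    long sad = 0;
    for (int dy = -half; dy <= half; ++dy) {
        for (int dx = -half; dx <= half; ++dx) {
            sad += std::abs(left.at(xl + dx, y + dy) - right.at(xr + dx, y + dy));
        }
    }
    return sad;
}

// Disparity of the left pixel (x, y): slide the right window leftwards along
// the same row and keep the shift with the smallest SAD.
// (x, y) must be at least `half` pixels away from every image border.
inline int bestDisparity(const GrayImage& left, const GrayImage& right,
                         int x, int y, int half, int maxDisparity) {
    int best = 0;
    long bestSad = std::numeric_limits<long>::max();
    for (int d = 0; d <= maxDisparity && x - d - half >= 0; ++d) {
        const long sad = windowSad(left, right, x, x - d, y, half);
        if (sad < bestSad) {
            bestSad = sad;
            best = d;
        }
    }
    return best;
}
```

Calling bestDisparity for every pixel of the left image and scaling the result to 0-255 gives the kind of disparity image described above. Ties are resolved in favor of the smaller shift, which is an arbitrary choice made for this sketch.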
Typical Results:
Left Image:
Right Image:
Calculated Disparity (from the left image):
You can get test images here: http://vision.middlebury.edu/stereo/data/scenes2003/

About generalized hough transform code

I was looking for an implementation of the Generalized Hough Transform, and I found this website, which showed me a complete implementation of GHT.
I understand how the algorithm works, except for this line:
Vec2i referenceP = Vec2i(id_max[0]*rangeXY+(rangeXY+1)/2, id_max[1]*rangeXY+(rangeXY+1)/2);
It calculates the reference point of the object from the position of the maximum value in the Hough space, multiplied by rangeXY to get back to the corresponding position in the original image. (rangeXY is the dimension in pixels of the squares into which the image is divided.)
I edited the code to
Vec2i referenceP = Vec2i(id_max[0]*rangeXY, id_max[1]*rangeXY);
and I got a different reference point. When I then drew all the edge points in the image, they apparently did not fit the shape.
I just cannot figure out what the term (rangeXY+1)/2 means.
Can anyone who has implemented this code, or who is familiar with the rationale of GHT, tell me what the term (rangeXY+1)/2 means? Thanks~
I am familiar with the classic Hough Transform, though not with the generalised one. However, I believe you give enough information in your question for me to answer it without being familiar with the algorithm in question.
(rangeXY+1)/2 is simply integer division by 2 with rounding. For instance (4+1)/2 gives 2 while (5+1)/2 gives 3 (2.5 rounds up). Now, since rangeXY is the side of a square block of pixels and id_max is the position (index) of such a block, then id_max[dim]*rangeXY+(rangeXY+1)/2 gives the position of the central pixel in that block.
On the other hand, when you simplified the expression to id_max[dim]*rangeXY, you were getting the position of the top-left rather than the central pixel.
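To make the arithmetic concrete, here is a small sketch of the two expressions. The helper names blockCentre and blockOrigin are invented for this example; they are not part of the GHT code from the question.

```cpp
// Coordinate of the (approximately) central pixel of block `blockIndex`
// when the image is divided into square blocks of `blockSize` pixels.
// (blockSize + 1) / 2 is integer division, i.e. blockSize / 2 rounded up.
constexpr int blockCentre(int blockIndex, int blockSize) {
    return blockIndex * blockSize + (blockSize + 1) / 2;
}

// Coordinate of the first (top-left) pixel of the same block.
constexpr int blockOrigin(int blockIndex, int blockSize) {
    return blockIndex * blockSize;
}
```

For block index 3 and a block size of 4, the block covers pixels 12 to 15; blockOrigin returns 12 and blockCentre returns 14. Dropping the offset therefore shifts the reference point by about half a block in each dimension.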

Scalable Ambient Obscurance rendering issue

I am trying to implement this SAO algorithm.
I am getting the following result :
I can't figure out why I have the nose on top of the walls, it seems to be a z-buffer issue.
Here are my input values :
const float projScale = 100.0;
const float radius = 0.9;
const float bias = 0.0005;
const float intensityDivR6 = pow(radius, 6);
I am using the original shader without modifications, except that I disable the usage of mipmaps of the depth buffer.
My depth buffer (on different scene, sorry) :
This is probably an issue with the z-buffer linearization, or the depth is not between -1 and 1.
Thank you Bruno, I finally figured out what the issues were.
The first was that I didn't transform my Z correctly: they use a specific pre-pass to make Z linear and put it between -1 and 1. I was using an incompatible method to do it.
I also had to negate my near and far plane values directly in the projection matrix so that some uniforms were computed correctly.
Result :
I had a similar problem, with visually wrong occlusion linked to the near/far planes, so I decided to share what I did to fix it.
The problem I had is described in a previous comment. I was getting self-occlusion when the camera was close to an object or when the radius was much too big.
If you take a closer look at the conversion from depth buffer value to camera-space value (the reconstructCSZ function from the G3D engine), you will see that replacing the depth by 0 gives you the near plane if you work with positive near/far values. This means that every time a tap falls outside the model, you get a z component equal to near, which gives wrong occlusion for fragments with a z close to 0.
You basically have to discard every tap that is located on the near plane, so that it is not taken into account when computing the full contribution.
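The effect can be reproduced on the CPU with a small sketch. I am assuming here, from memory of the G3D sources, that the clip information is packed as (near * far, near - far, far) for a finite far plane; please check this against the shader you actually use. ClipInfo, makeClipInfo, reconstructCameraZ and isNearPlaneTap are names made up for this example.

```cpp
#include <cmath>

// Assumed packing of the clip constants: (near * far, near - far, far).
struct ClipInfo { float a, b, c; };

inline ClipInfo makeClipInfo(float zNear, float zFar) {
    return {zNear * zFar, zNear - zFar, zFar};
}

// Map a [0, 1] depth-buffer value to a camera-space distance.
// With positive near/far, depth 0 maps to near and depth 1 maps to far.
inline float reconstructCameraZ(float depth, const ClipInfo& clip) {
    return clip.a / (clip.b * depth + clip.c);
}

// A tap whose reconstructed z sits on the near plane is what an empty
// (zero) depth sample produces, so it should be skipped when accumulating
// occlusion.
inline bool isNearPlaneTap(float depth, const ClipInfo& clip, float zNear,
                           float eps = 1e-4f) {
    return std::fabs(reconstructCameraZ(depth, clip) - zNear) <= eps;
}
```

In a shader, the equivalent check would sit inside the sampling loop, skipping the tap before it contributes to the sum.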

How to create a depth map from PointGrey BumbleBee2 stereo camera using Triclops and FlyCapture SDKs?

I've got the BumbleBee 2 stereo camera and two mentioned SDKs.
I've managed to capture video from it in my program, rectify the stereo images and get a disparity map. The next thing I'd like to have is a depth map similar to the one the Kinect gives.
The Triclops' documentation is rather short, it only references functions, without typical workflow description. The workflow is described in examples.
Up to now I've found 2 relevant functions: family of triclopsRCDxxToXYZ() functions and triclopsExtractImage3d() function.
Functions from the first family calculate the x, y and z coordinates for a single pixel. The z coordinate corresponds perfectly to the depth in meters. However, to use these functions I have to write two nested loops, as shown in the stereo3dpoints example. That adds too much overhead, because each call also returns two coordinates I don't need.
The second function, triclopsExtractImage3d(), always returns the error TriclopsErrorInvalidParameter. The documentation says only that "there is a geometry mismatch between the context and the TriclopsImage3d", which is not clear to me.
The examples in the Triclops 3.3.1 SDK do not show how to use it. Google brings up an example from Triclops SDK 3.2, which is absent in 3.3.1.
I've tried adding lines 253-273 from the link above to current stereo3dpoints - got that error.
Does anyone have experience with it?
Is it valid to use triclopsExtractImage3d() or is it obsolete?
I also tried plotting values of disparity vs. z, obtained from triclopsRCDxxToXYZ().
The plot shows almost exact inverse proportionality.
That is, z = k / disparity. But k is not constant across the image: it varies from approximately 2.5e-5 to 1.4e-3, which is two orders of magnitude. Therefore, it is incorrect to calculate this value once and reuse it everywhere.
Maybe it is a bit too late and you have figured it out by yourself, but:
To use triclopsExtractImage3d you have to create a TriclopsImage3d first.
TriclopsImage3d *depthImage;
triclopsCreateImage3d(triclopsContext, &depthImage);
triclopsExtractImage3d(triclopsContext, depthImage);
triclopsDestroyImage3d(&depthImage);