Converting Kinect depth image to Real world coordinate - c++

I'm working with the kinect, using OpenNI 2.x, c++, OpenCV.
I am able to get the kinect depth streaming and obtain a grey-scale cv::Mat. just to show how it is defined:
cv::Mat m_depthImage;
m_depthImage= cvCreateImage(cvSize(640, 480), 8, 1);
I suppose that the closest value is represented by "0" and the farthest by "255".
After that, I make a conversion between depth coordinates to World coordinates. I do it element by element of the cv::Mat grey-scale matrix, and i collect data in PointsWorld[640*480].
In order to display these data, I adjust the scale in order to collect value in a 2000x2000x2000 matrix.
cv::Point3f depthPoint;
cv::Point3f PointsWorld[640*480];
for (int j=0;j<m_depthImage.rows;j++)
{
for(int i=0;i<m_depthImage.cols; i++)
{
depthPoint.x = (float) i;
depthPoint.y = (float) j;
depthPoint.z = (float) m_depthImage.at<unsigned char>(j, i);
if (depthPoint.z!=255)
{
openni::CoordinateConverter::convertDepthToWorld(*m_depth,depthPoint.x,depthPoint.y,depthPoint.z, &wx,&wy,&wz);
wx = wx*7,2464; //138->1000
if (wx<-999) wx = -999;
if (wx>999) wx = 999;
wy = wy*7,2464; //111->1000 with 9,009
if (wy<-999) wy = -999;
if (wy>999) wy = 999;
wz=wz*7,8431; //255->2000
if (wz>1999) wy = 1999;
Xsp = P-floor(wx);
Ysp = P+floor(wy);
Zsp = 2*P-floor(wz);
PointsWorld[k].x = Xsp;
PointsWorld[k].y = Ysp;
PointsWorld[k].z = Zsp;
k++;
}
}
}
But i'm sure that doing that do not allow me to have a comprehension of the real distance between points. An x,y,z coordinate what will mean?
There is a way to know the real distance between points, to know how much far is, for example, the grey value of the matrix "255"? and the wx,wy,wz what they are for?

If you have OpenCV built with OpenNI support you should be able to do something like:
int ptcnt;
cv::Mat real;
cv::Point3f PointsWorld[640*480];
if( capture.retrieve(real, CV_CAP_OPENNI_POINT_CLOUD_MAP)){
for (int j=0;j<m_depthImage.rows;j++)
{
for(int i=0;i<m_depthImage.cols; i++){
PointsWorld[ptcnt] = real.at<cv::Vec3f>(i,j);
ptcnt++;
}
}
}

Related

Object detection with every pixel information OpenCV

Input and Output Images of my code are here.
I want output as complete object detection with every pixel. Here I get with some shadows as well as other background pixels and missing some object points.
So can anybody have idea how can I get complete object detection (foreground detection) with this input images (object image and background image)?
Below is the code I have tried.
cv::Mat ImgObject, ImgBck;
ImgObject = imread("Object.jpg");
ImgBck = imread("Background.jpg");
imwrite("ImgObject.jpg", ImgObject);
imwrite("ImgBck.jpg", ImgBck);
cv::Mat diffImage;
ImgBck = ImgBck + Scalar(-20, -20 - 20);/* decrease brightness of background
because of brightness changes after putting object */
cv::absdiff(ImgObject, ImgBck, diffImage);
float threshold = (float)50;
float dist = 0.0f;
for (int j = 0; j < diffImage.rows; ++j)
{
for (int i = 0; i<diffImage.cols; ++i)
{
cv::Vec3b pix = diffImage.at<cv::Vec3b>(j, i);
dist = (pix[0] * pix[0] + pix[1] * pix[1] + pix[2] * pix[2]);
dist = sqrt(dist);
cv::Point3_<uchar>* pFinal = ImgObject.ptr<Point3_<uchar> >(j, i);
if (dist <= threshold)
{
pFinal->x = 255; // fill blue as background
pFinal->y = 0;
pFinal->z = 0;
}
}
}
imwrite("Obj.jpg", ImgObject);
ImgObject.release();
ImgBck.release();
Do not use direct light on the object(To reduce Shadow and Reflection).
Firstly, I need to say that this is not an object detection task, but a saliency detection or segmentation task.
Second, as #Kartik Maheshwari said, you are facing a lightning issue which is not a solved problem in Computer Vision.
As an alternative answer, take a look at this.

3D reconstruction from multiple images with one camera

So, I've been trying to get a 3D cloud point from a sequence of images of an object. I have successfully obtained a decent point cloud with two images. I got that from matching features on both images, finding the fundamental matrix and from that, extracting P' (the camera matrix for the second view). For the first view, I set P = K(I | 0), where K is the matrix for the camera intrinsics. But I haven't been able to extend this approach to several images. My idea was to do this sliding the two image window through the sequence of images(e.g. match image1 with image2, find 3d points, match image2 with image3 and then find the more 3d points, and so on). For the following image pairs, P would be made of a cumulative rotation matrix and a cumulative translation vector (this would allow me to keep bringing the points to the first camera coordinate system). But this is not working at all. I'm using OpenCV. What I wanna know is if this approach makes sense at all.
In the code, P_prev is P and Pl is P'. This is just the part that I think it's relevant.
Mat combinedPointCloud;
Mat P_prev;
P_prev = (Mat_<double>(3,4) << cameraMatrix.at<double>(0,0), cameraMatrix.at<double>(0,1), cameraMatrix.at<double>(0,2), 0,
cameraMatrix.at<double>(1,0), cameraMatrix.at<double>(1,1), cameraMatrix.at<double>(1,2), 0,
cameraMatrix.at<double>(2,0), cameraMatrix.at<double>(2,1), cameraMatrix.at<double>(2,2), 0);
for(int i = 1; i < images.size(); i++) {
Mat points3D;
image1 = images[i-1];
image2 = images[i];
matchTwoImages(image1, image2, imgpts1, imgpts2);
P = findSecondProjectionMatrix(cameraMatrix, imgpts1, imgpts2);
P.col(0).copyTo(R.col(0));
P.col(1).copyTo(R.col(1));
P.col(2).copyTo(R.col(2));
P.col(3).copyTo(t.col(0));
if(i == 1) {
Pl = P;
triangulatePoints(P_prev, Pl, imgpts1, imgpts2, points3D); //points3D is 4xN
//Transforming to euclidean by hand, because couldn't make
// opencv's convertFromHomogeneous work
aux.create(3, points3D.cols, CV_64F);// aux is 3xN
for(int i = 0; i < points3D.cols; i++) {
aux.at<float>(0, i) = points3D.at<float>(0, i)/points3D.at<float>(3, i);
aux.at<float>(1, i) = points3D.at<float>(1, i)/points3D.at<float>(3, i);
aux.at<float>(2, i) = points3D.at<float>(2, i)/points3D.at<float>(3, i);
}
points3D.create(3, points3D.cols, CV_64F);
aux.copyTo(points3D);
}
else {
R_aux = R_prev * R;
t_aux = t_prev + t;
R_aux.col(0).copyTo(Pl.col(0));
R_aux.col(1).copyTo(Pl.col(1));
R_aux.col(2).copyTo(Pl.col(2));
t_aux.col(0).copyTo(Pl.col(3));
triangulatePoints(P_prev, Pl, imgpts1, imgpts2, points3D);
//Transforming to euclidean by hand, because couldn't make
// opencv's convertFromHomogeneous work
aux.create(3, points3D.cols, CV_64F);// aux is 3xN
for(int i = 0; i < points3D.cols; i++) {
aux.at<float>(0, i) = points3D.at<float>(0, i)/points3D.at<float>(3, i);
aux.at<float>(1, i) = points3D.at<float>(1, i)/points3D.at<float>(3, i);
aux.at<float>(2, i) = points3D.at<float>(2, i)/points3D.at<float>(3, i);
}
points3D.create(3, points3D.cols, CV_64F);
aux.copyTo(points3D);
}
Pl.col(0).copyTo(R_prev.col(0));
Pl.col(1).copyTo(R_prev.col(1));
Pl.col(2).copyTo(R_prev.col(2));
Pl.col(3).copyTo(t_prev.col(0));
P_prev = Pl;
if(i==1) {
points3D.copyTo(combinedPointCloud);
} else {
hconcat(combinedPointCloud, points3D, combinedPointCloud);
}
}
show3DCloud(comninedPointCloud);

Affine transformation (shear) opencv c++

So I have an image which I want to apply a shear transformation on using OpenCV.
Here's a function I wrote,
Mat shearMat(Mat img){
Mat temp;
img.copyTo(temp);
Mat M(2,3,CV_32F);
M.at<uint16_t>(0,0) = 1;
M.at<uint16_t>(0,1) = 0;
M.at<uint16_t>(0,2) = 0;
M.at<uint16_t>(1,0) = 0;
M.at<uint16_t>(1,1) = 1;
M.at<uint16_t>(1,2) = 0;
warpAffine(temp, temp, M , Size(temp.cols,temp.rows));
imshow("Sheared_Image", temp);
M is the inputArray for warpAffine(<#InputArray src#>, <#OutputArray dst#>, <#InputArray M#>, <#Size dsize#>);
I need to know how M should look like if I want to apply a shear transformation. It says it should be a 2x3 Matrix. When I try to show the image now, nothing appears. I know that M.at<uint16_t>(1,0) should be the shear factor, but its not working with warpAffine.
Thanks!
You are using wrong data type.
M.at<float>(0,0) = 1;
M.at<float>(0,1) = 0;
M.at<float>(0,2) = 0;
M.at<float>(1,0) = 0.5;
M.at<float>(1,1) = 1;
M.at<float>(1,2) = 0;

OpenCV templates in 2D point data set

I was wandering what the best approach would be for detecting 'figures' in an array of 2D points.
In this example I have two 'templates'. Figure 1 is a template and figure 2 is a template.
Each of these templates exists only as a vector of points with an x,y coordinate.
Let's say we have a third vector with points with x,y coordinate
What would be the best way to find out and isolate points matching one of the first two arrays in the third one. (including scaling, rotation)?
I have been trying nearest neigbours(FlannBasedMatcehr) or even SVM implementation but it doesn't seem to get me any result, template matching doesn't seem to be the way to go either, I think. I am not working on images but only on 2D points in memory...
Especially because the input vector always has more points than the original data set to be compared with.
All it needs to do is find points in array that match a template.
I am not a 'specialist' in machine learning or opencv. I guess I am overlooking something from the beginning...
Thank you very much for your help/suggestions.
just for fun I tried this:
Choose two points of the point dataset and compute the transformation mapping the first two pattern points to those points.
Test whether all transformed pattern points can be found in the data set.
This approach is very naive and has a complexity of O(m*n²) with n data points and a single pattern of size m (points). This complexity might be increased for some nearest neighbor search methods. So you have to consider whether it's not efficient enough for your appplication.
Some improvements could include some heuristic to not choose all n² combinations of points but, but you need background information of maximal pattern scaling or something like that.
For evaluation I first created a pattern:
Then I create random points and add the pattern somewhere within (scaled, rotated and translated):
After some computation this method recognizes the pattern. The red line shows the chosen points for transformation computation.
Here's the code:
// draw a set of points on a given destination image
void drawPoints(cv::Mat & image, std::vector<cv::Point2f> points, cv::Scalar color = cv::Scalar(255,255,255), float size=10)
{
for(unsigned int i=0; i<points.size(); ++i)
{
cv::circle(image, points[i], 0, color, size);
}
}
// assumes a 2x3 (affine) transformation (CV_32FC1). does not change the input points
std::vector<cv::Point2f> applyTransformation(std::vector<cv::Point2f> points, cv::Mat transformation)
{
for(unsigned int i=0; i<points.size(); ++i)
{
const cv::Point2f tmp = points[i];
points[i].x = tmp.x * transformation.at<float>(0,0) + tmp.y * transformation.at<float>(0,1) + transformation.at<float>(0,2) ;
points[i].y = tmp.x * transformation.at<float>(1,0) + tmp.y * transformation.at<float>(1,1) + transformation.at<float>(1,2) ;
}
return points;
}
const float PI = 3.14159265359;
// similarity transformation uses same scaling along both axes, rotation and a translation part
cv::Mat composeSimilarityTransformation(float s, float r, float tx, float ty)
{
cv::Mat transformation = cv::Mat::zeros(2,3,CV_32FC1);
// compute rotation matrix and scale entries
float rRad = PI*r/180.0f;
transformation.at<float>(0,0) = s*cosf(rRad);
transformation.at<float>(0,1) = s*sinf(rRad);
transformation.at<float>(1,0) = -s*sinf(rRad);
transformation.at<float>(1,1) = s*cosf(rRad);
// translation
transformation.at<float>(0,2) = tx;
transformation.at<float>(1,2) = ty;
return transformation;
}
// create random points
std::vector<cv::Point2f> createPointSet(cv::Size2i imageSize, std::vector<cv::Point2f> pointPattern, unsigned int nRandomDots = 50)
{
// subtract center of gravity to allow more intuitive rotation
cv::Point2f centerOfGravity(0,0);
for(unsigned int i=0; i<pointPattern.size(); ++i)
{
centerOfGravity.x += pointPattern[i].x;
centerOfGravity.y += pointPattern[i].y;
}
centerOfGravity.x /= (float)pointPattern.size();
centerOfGravity.y /= (float)pointPattern.size();
pointPattern = applyTransformation(pointPattern, composeSimilarityTransformation(1,0,-centerOfGravity.x, -centerOfGravity.y));
// create random points
//unsigned int nRandomDots = 0;
std::vector<cv::Point2f> pointset;
srand (time(NULL));
for(unsigned int i =0; i<nRandomDots; ++i)
{
pointset.push_back( cv::Point2f(rand()%imageSize.width, rand()%imageSize.height) );
}
cv::Mat image = cv::Mat::ones(imageSize,CV_8UC3);
image = cv::Scalar(255,255,255);
drawPoints(image, pointset, cv::Scalar(0,0,0));
cv::namedWindow("pointset"); cv::imshow("pointset", image);
// add point pattern to a random location
float scaleFactor = rand()%30 + 10.0f;
float translationX = rand()%(imageSize.width/2)+ imageSize.width/4;
float translationY = rand()%(imageSize.height/2)+ imageSize.height/4;
float rotationAngle = rand()%360;
std::cout << "s: " << scaleFactor << " r: " << rotationAngle << " t: " << translationX << "/" << translationY << std::endl;
std::vector<cv::Point2f> transformedPattern = applyTransformation(pointPattern,composeSimilarityTransformation(scaleFactor,rotationAngle,translationX,translationY));
//std::vector<cv::Point2f> transformedPattern = applyTransformation(pointPattern,trans);
drawPoints(image, transformedPattern, cv::Scalar(0,0,0));
drawPoints(image, transformedPattern, cv::Scalar(0,255,0),3);
cv::imwrite("dataPoints.png", image);
cv::namedWindow("pointset + pattern"); cv::imshow("pointset + pattern", image);
for(unsigned int i=0; i<transformedPattern.size(); ++i)
pointset.push_back(transformedPattern[i]);
return pointset;
}
void programDetectPointPattern()
{
cv::Size2i imageSize(640,480);
// create a point pattern, this can be in any scale and any relative location
std::vector<cv::Point2f> pointPattern;
pointPattern.push_back(cv::Point2f(0,0));
pointPattern.push_back(cv::Point2f(2,0));
pointPattern.push_back(cv::Point2f(4,0));
pointPattern.push_back(cv::Point2f(1,2));
pointPattern.push_back(cv::Point2f(3,2));
pointPattern.push_back(cv::Point2f(2,4));
// transform the pattern so it can be drawn
cv::Mat trans = cv::Mat::ones(2,3,CV_32FC1);
trans.at<float>(0,0) = 20.0f; // scale x
trans.at<float>(1,1) = 20.0f; // scale y
trans.at<float>(0,2) = 20.0f; // translation x
trans.at<float>(1,2) = 20.0f; // translation y
// draw the pattern
cv::Mat drawnPattern = cv::Mat::ones(cv::Size2i(128,128),CV_8U);
drawnPattern *= 255;
drawPoints(drawnPattern,applyTransformation(pointPattern, trans), cv::Scalar(0),5);
// display and save pattern
cv::imwrite("patternToDetect.png", drawnPattern);
cv::namedWindow("pattern"); cv::imshow("pattern", drawnPattern);
// draw the points and the included pattern
std::vector<cv::Point2f> pointset = createPointSet(imageSize, pointPattern);
cv::Mat image = cv::Mat(imageSize, CV_8UC3);
image = cv::Scalar(255,255,255);
drawPoints(image,pointset, cv::Scalar(0,0,0));
// normally we would have to use some nearest neighbor distance computation, but to make it easier here,
// we create a small area around every point, which allows to test for point existence in a small neighborhood very efficiently (for small images)
// in the real application this "inlier" check should be performed by k-nearest neighbor search and threshold the distance,
// efficiently evaluated by a kd-tree
cv::Mat pointImage = cv::Mat::zeros(imageSize,CV_8U);
float maxDist = 3.0f; // how exact must the pattern be recognized, can there be some "noise" in the position of the data points?
drawPoints(pointImage, pointset, cv::Scalar(255),maxDist);
cv::namedWindow("pointImage"); cv::imshow("pointImage", pointImage);
// choose two points from the pattern (can be arbitrary so just take the first two)
cv::Point2f referencePoint1 = pointPattern[0];
cv::Point2f referencePoint2 = pointPattern[1];
cv::Point2f diff1; // difference vector
diff1.x = referencePoint2.x - referencePoint1.x;
diff1.y = referencePoint2.y - referencePoint1.y;
float referenceLength = sqrt(diff1.x*diff1.x + diff1.y*diff1.y);
diff1.x = diff1.x/referenceLength; diff1.y = diff1.y/referenceLength;
std::cout << "reference: " << std::endl;
std::cout << referencePoint1 << std::endl;
// now try to find the pattern
for(unsigned int j=0; j<pointset.size(); ++j)
{
cv::Point2f targetPoint1 = pointset[j];
for(unsigned int i=0; i<pointset.size(); ++i)
{
cv::Point2f targetPoint2 = pointset[i];
cv::Point2f diff2;
diff2.x = targetPoint2.x - targetPoint1.x;
diff2.y = targetPoint2.y - targetPoint1.y;
float targetLength = sqrt(diff2.x*diff2.x + diff2.y*diff2.y);
diff2.x = diff2.x/targetLength; diff2.y = diff2.y/targetLength;
// with nearest-neighborhood search this line will be similar or the maximal neighbor distance must be relative to targetLength!
if(targetLength < maxDist) continue;
// scale:
float s = targetLength/referenceLength;
// rotation:
float r = -180.0f/PI*(atan2(diff2.y,diff2.x) + atan2(diff1.y,diff1.x));
// scale and rotate the reference point to compute the translation needed
std::vector<cv::Point2f> origin;
origin.push_back(referencePoint1);
origin = applyTransformation(origin, composeSimilarityTransformation(s,r,0,0));
// compute the translation which maps the two reference points on the two target points
float tx = targetPoint1.x - origin[0].x;
float ty = targetPoint1.y - origin[0].y;
std::vector<cv::Point2f> transformedPattern = applyTransformation(pointPattern,composeSimilarityTransformation(s,r,tx,ty));
// now test if all transformed pattern points can be found in the dataset
bool found = true;
for(unsigned int i=0; i<transformedPattern.size(); ++i)
{
cv::Point2f curr = transformedPattern[i];
// here we check whether there is a point drawn in the image. If you have no image you will have to perform a nearest neighbor search.
// this can be done with a balanced kd-tree in O(log n) time
// building such a balanced kd-tree has to be done once for the whole dataset and needs O(n*(log n)) afair
if((curr.x >= 0)&&(curr.x <= pointImage.cols-1)&&(curr.y>=0)&&(curr.y <= pointImage.rows-1))
{
if(pointImage.at<unsigned char>(curr.y, curr.x) == 0) found = false;
// if working with kd-tree: if nearest neighbor distance > maxDist => found = false;
}
else found = false;
}
if(found)
{
std::cout << composeSimilarityTransformation(s,r,tx,ty) << std::endl;
cv::Mat currentIteration;
image.copyTo(currentIteration);
cv::circle(currentIteration,targetPoint1,5, cv::Scalar(255,0,0),1);
cv::circle(currentIteration,targetPoint2,5, cv::Scalar(255,0,255),1);
cv::line(currentIteration,targetPoint1,targetPoint2,cv::Scalar(0,0,255));
drawPoints(currentIteration, transformedPattern, cv::Scalar(0,0,255),4);
cv::imwrite("detectedPattern.png", currentIteration);
cv::namedWindow("iteration"); cv::imshow("iteration", currentIteration); cv::waitKey(-1);
}
}
}
}

Conversion from PointCloud to Mat

Let's say I initialize a point-cloud. I want to store its RGB channels in opencv's Mat data-type. How can I do that?
pcl::PointCloud<pcl::PointXYZRGBA>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZRGBA>); //Create a new cloud
pcl::io::loadPCDFile<pcl::PointXYZRGBA> ("cloud.pcd", *cloud);
Do I understand it right, that you are only interested in the RGB-values of the point-cloud and don't care about its XYZ-values?
Then you can do:
pcl::PointCloud<pcl::PointXYZRGBA>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZRGBA>);
//Create a new cloud
pcl::io::loadPCDFile<pcl::PointXYZRGBA> ("cloud.pcd", *cloud);
cv::Mat result;
if (cloud->isOrganized()) {
result = cv::Mat(cloud->height, cloud->width, CV_8UC3);
if (!cloud->empty()) {
for (int h=0; h<result.rows; h++) {
for (int w=0; w<result.cols; w++) {
pcl::PointXYZRGBA point = cloud->at(w, h);
Eigen::Vector3i rgb = point.getRGBVector3i();
result.at<cv::Vec3b>(h,w)[0] = rgb[2];
result.at<cv::Vec3b>(h,w)[1] = rgb[1];
result.at<cv::Vec3b>(h,w)[2] = rgb[0];
}
}
}
}
I think it's enough to show the basic idea.
BUT this only works, if your point-cloud is organized:
An organized point cloud dataset is the name given to point clouds
that resemble an organized image (or matrix) like structure, where the
data is split into rows and columns. Examples of such point clouds
include data coming from stereo cameras or Time Of Flight cameras. The
advantages of a organized dataset is that by knowing the relationship
between adjacent points (e.g. pixels), nearest neighbor operations are
much more efficient, thus speeding up the computation and lowering the
costs of certain algorithms in PCL. (Source)
I know how to convert from Mat(3D Image) to XYZRGB. I think you can figure out the other way. Here Q is disparity to depth Matrix.
pcl::PointCloud<pcl::PointXYZRGB>::Ptr point_cloud_ptr (new pcl::PointCloud<pcl::PointXYZRGB>);
double px, py, pz;
uchar pr, pg, pb;
for (int i = 0; i < img_rgb.rows; i++)
{
uchar* rgb_ptr = img_rgb.ptr<uchar>(i);
uchar* disp_ptr = img_disparity.ptr<uchar>(i);
double* recons_ptr = recons3D.ptr<double>(i);
for (int j = 0; j < img_rgb.cols; j++)
{
//Get 3D coordinates
uchar d = disp_ptr[j];
if ( d == 0 ) continue; //Discard bad pixels
double pw = -1.0 * static_cast<double>(d) * Q32 + Q33;
px = static_cast<double>(j) + Q03;
py = static_cast<double>(i) + Q13;
pz = Q23;
// Normalize the points
px = px/pw;
py = py/pw;
pz = pz/pw;
//Get RGB info
pb = rgb_ptr[3*j];
pg = rgb_ptr[3*j+1];
pr = rgb_ptr[3*j+2];
//Insert info into point cloud structure
pcl::PointXYZRGB point;
point.x = px;
point.y = py;
point.z = pz;
uint32_t rgb = (static_cast<uint32_t>(pr) << 16 |
static_cast<uint32_t>(pg) << 8 | static_cast<uint32_t>(pb));
point.rgb = *reinterpret_cast<float*>(&rgb);
point_cloud_ptr->points.push_back (point);
}
}
point_cloud_ptr->width = (int) point_cloud_ptr->points.size();
point_cloud_ptr->height = 1;
I have the same problem and I succeed to solve it!
You should firstly transform the coordinate so that your 'ground plane' is the X-O-Y plane.
The core api is pcl::getTransformationFromTwoUnitVectorsAndOrigin
You can have a look at my question:
good luck!