Calculate 3D gradients with openCv - c++

I'm using OpenCV and C++ to make a 3D HOG function (I already made the 2D one). I have a 3 dimensional Mat (x,y,z) where the z value is the depth info of the pixel (x,y) (a combination of the 2D color Image and the depthImage from the Kinect). The next step is to calculate gradients of x, y and z, in the 2D HOG function I used Sobel to do that :
Mat gx, gy;
Sobel(img, gx, CV_32F, 1, 0, 1);
Sobel(img, gy, CV_32F, 0, 1, 1);
But when it comes to the 3D gradients, I did not find a way to do it. Is there any simple way for that???

Related

How to get the difference of Aruco Markers' axis-angel in the world coordinate?

I am new to opencv and Aruco. I attempted to find out the differences in axis-angles of two different Aruco markers. For example, the angle difference b/w the two (1, 0, 0) vectors of markers in the real world. In my understanding, the transform happens in the following order: local coordinate-> camera coordinate -> world coordinate. And then I get angle difference of two aruco markers because they are now both in the same world coordinate. Can someone explain how this process is done? Or is there any better way to find out the angle difference? I am coding with python and opencv
Known:
Translation and rotation vectors(1x3) from estimatePoseSingleMarkers() function in aruco module. (rotation and translation vectors can be transformed to matrix(3x3) with Rodrigues() )
camera Matrix(3x3) and dist_coefs matrix(1x5) from camera calibration.
Using 6x6_250 aruco markers
Upate:
world coordinate = camera coordinate
The following is the function to draw the aruco's XYZ components. It looks like the origin or the camera is (0, 0, 0). Apply the translation and rotation matrix of aruco marker to the x, y, z components of origin gets me the x, y, z components of aruco marker. Is that right?
/**
*/
void drawAxis(InputOutputArray _image, InputArray _cameraMatrix, InputArray _distCoeffs,
InputArray _rvec, InputArray _tvec, float length) {
CV_Assert(_image.getMat().total() != 0 &&
(_image.getMat().channels() == 1 || _image.getMat().channels() == 3));
CV_Assert(length > 0);
// project axis points
vector< Point3f > axisPoints;
axisPoints.push_back(Point3f(0, 0, 0));
axisPoints.push_back(Point3f(length, 0, 0));
axisPoints.push_back(Point3f(0, length, 0));
axisPoints.push_back(Point3f(0, 0, length));
vector< Point2f > imagePoints;
projectPoints(axisPoints, _rvec, _tvec, _cameraMatrix, _distCoeffs, imagePoints);
// draw axis lines
line(_image, imagePoints[0], imagePoints[1], Scalar(0, 0, 255), 3);
line(_image, imagePoints[0], imagePoints[2], Scalar(0, 255, 0), 3);
line(_image, imagePoints[0], imagePoints[3], Scalar(255, 0, 0), 3);
}
Technically it's a work around to my question. So I figured ARUCO library has this function called projectPoints which maps the 3D objectPoints into 2D imagePoints.
imagePoints[0] and imagePoints[1] gives you the 2D projection of (length, 0, 0). By using arccos(np.dot(v1,v2)), you can get the difference in x-axis angle of the two markers.

How to get the z coordinate (distance) from a Kinect v2 RGB coordinate?

I am trying to convert libfreenect RGB coordinates into DEPTH coordinates, in order to find the z coordinate (distance of object). I tried so many things but I was not able to get the resolutions right. Any ideas?
I am using rgbmat in order to apply Kalman filter. So I only have the RGB center coordinates of an object:
libfreenect2::Frame undistorted(512, 424, 4), registered(512, 424, 4), depth2rgb(1920, 1080 + 2, 4);
libfreenect2::Frame *rgb = frames[libfreenect2::Frame::Color];
Mat(rgb->height, rgb->width, CV_8UC4, rgb->data).copyTo(rgbmat);
registration->apply(rgb, depth, &undistorted, &registered, true, &depth2rgb);
Finding xyz coordinates from undistorted frame, with input centerX and centerY from rgb frame:
float centerX, centerY, centerZ;
registration->getPointXYZ(&undistorted, center.y, center.x, centerX, centerY, centerZ);
With this code the output is always wrong. Most of the time "inf" or "-inf" values. When I perform my tracking algorithm on registered frames, the output (distance) is correct, although I cannot use registered frames because I need to analyse RGB frames.
I also tried changing "undistorted" in getPointXYZ() to "depth2rgb", but no success :(
Is there a way to get RGB values out of registered frames?

OpenCV triangulatePoints handedness

I have two cameras attached rigidly side by side looking in parallel directions.
Projection matrix for left camera
Projection matrix for right camera
When I perform triangulatePoints on the two vectors of corresponding points, I get the collection of points in 3D space. All of the points in 3D space have a negative Z coordinate.
So, to get to the bottom of this...
My assumption was that OpenCV uses Right Hand coordinate system.
The handedness reminder:
I assume that the initial orientation of each camera is directed in the positive Z axis direction.
So, by using the projection matrices like the ones I presented in the beginning, I would assume that the cameras are positioned in space like this:
This assumption is in conflict with what I observe when I get negative Z corrdinates for the traingulated points. The only explanation I can think of is that OpenCV in fact uses the Left Hand coordinate system. So, with the projection matrices I stated in the beginning, this is how the cameras are positioned in space:
This would indicate that my left camera in this situation is not on the left side. And this is why I am getting negative depth for points.
Moreover, if I try to combine triangulatePoints with solvePnP I run into problems.
I use the output of triangulatePoints as an input to solvePnP. I expect to get the camera coordinates near the origin of 3D coordinate system. I expect for calculated camera position to match the projection matrices used in the beginning. But this is not happening. I get some completely wild results, missing the expected values by over 10 times the baseline length.
Example
This example is more complete representation of the problem than what is stated above.
points3D
Here is the code for generating these points.
Movin on, setting up camera A and camera D...
Mat cameraMatrix = (Mat_<double>(3, 3) <<
716.731, 0, 660.749,
0, 716.731, 360.754,
0, 0, 1);
Mat distCoeffs = (Mat_<double>(5, 1) << 0, 0, 0, 0, 0);
Mat rotation_a = Mat::eye(3, 3, CV_64F); // no rotation
Mat translation_a = (Mat_<double>(3, 1) << 0, 0, 0); // no translation
Mat rt_a;
hconcat(rotation_a, translation_a, rt_a);
Mat projectionMatrix_a = cameraMatrix * rt_a;
Mat rotation_d = (Mat_<double>(3, 1) << 0, CV_PI / 6.0, 0); // 30° rotation about Y axis
Rodrigues(rotation_d, rotation_d); // convert to 3x3 matrix
Mat translation_d = (Mat_<double>(3, 1) << 100, 0, 0);
Mat rt_d;
hconcat(rotation_d, translation_d, rt_d);
Mat projectionMatrix_d = cameraMatrix * rt_d;
What are the pixel coordinates of points3D when observed by projections A and D.
Mat points2D_a = projectionMatrix_a * points3D;
Mat points2D_d = projectionMatrix_d * points3D;
I put them in vectors:
vector<Point2f> points2Dvector_a, points2Dvector_d;
After that, I generate the 3D points again.
Mat points3DHomogeneous;
triangulatePoints(projectionMatrix_a, projectionMatrix_d, points2Dvector_a, points2Dvector_d, points3DHomogeneous);
Mat triangulatedPoints3D;
transpose(points3DHomogeneous, triangulatedPoints3D);
convertPointsFromHomogeneous(triangulatedPoints3D, triangulatedPoints3D);
Now, triangulatedPoints3D start out like this:
and they are identical to points3D.
And then the last step.
Mat rvec, tvec;
solvePnP(triangulatedPoints3D, points2Dvector_d, cameraMatrix, distCoeffs, rvec, tvec);
Resulting rvec and tvec:
I had hoped to get something more similar to the transformations used in creation of projectionMatrix_d, i.e. translation of (100, 0, 0) and rotation of 30° about Y axis.
If I use inverted transformations when creating projection matrix, like this:
Mat rotation_d = (Mat_<double>(3, 1) << 0, CV_PI / 6.0, 0); // 30° rotation about Y axis
Rodrigues(-rotation_d, rotation_d); // NEGATIVE ROTATION
Mat translation_d = (Mat_<double>(3, 1) << 100, 0, 0);
Mat rt_d;
hconcat(rotation_d, -translation_d, rt_d); // NEGATIVE TRANSLATION
Mat projectionMatrix_d = cameraMatrix * rt_d;
then I get rvec and tvec:
And that makes much more sense. But then I change the starting transformation so that the rotation is negative CV_PI / 6.0 -> -CV_PI / 6.0 and the resulting rvec and tvec are:
I would like to find an explanation to why this is happening. Why am I getting such weird results from solvePnP.
OpenCV coordinate system is right-handed, the answer here gives an illustrative example about OpenCV camera system. I guess the confusion is about rvec and tvec, the latter does not give the translation of the camera but it points to the world origin. The first answer here explains it based on an example. You can get the actual projection matrix from the output of solvePnP by a simple matrix multiplication, the details are here in the first answer.

How to Compute the Structure Tensor of an Image using OpenCV

I am trying to implement an application that uses images to search for similar images in a large image database. I am developing an image descriptor to use for this search and I would like to combine color information with some gradient information. I have seen structure tensors used in this domain to find the main gradient direction in images or sub-images.
I would like to take an image, divide it into grid of sub-images, for example, 4x4 grid (in total 16 sub-images) and then find the leading gradient direction of each cell. To find the leading gradient direction I want to see if computing the structure tensor for each cell can give good representation of the image gradient and lead to improved image matching. Is this a good idea or a bad idea? The idea was to get a feature vector similar to the idea in section 3.2 in this paper http://cybertron.cg.tu-berlin.de/eitz/pdf/2009_sbim.pdf
Dividing the image into sub-images(cells) is trivial and with opencv I can compute the partial derivatives using the Sobel function.
Mat dx, dy;
Sobel(im, dx, CV_32F, 1, 0, 3, 1, 0, BORDER_DEFAULT);
Sobel(im, dy, CV_32F, 0, 1, 3, 1, 0, BORDER_DEFAULT);
Computing dx^2, dy^2 and dxy should not be a problem, but I am not sure how I can compute the structure tensor matrix and use the tensor matrix to find the main gradient direction for an image or sub-image. How can I implement this with OpenCV?
EDIT
Okay, this is what I have done.
Mat _im; // Image to compute main gradient direction for.
cvtColor(im, _im, CV_BGR2GRAY);
GaussianBlur(_im, _im, Size(3, 3), 0, 0, BORDER_DEFAULT); //Blur the image to remove unnecessary details.
GaussianBlur(_im, _im, Size(5, 5), 0, 0, BORDER_DEFAULT);
GaussianBlur(_im, _im, Size(7, 7), 0, 0, BORDER_DEFAULT);
// Calculate image derivatives
Mat dx2, dy2, dxy;
Sobel(_im, dx2, CV_32F, 2, 0, 3, 1, 0, BORDER_DEFAULT);
Sobel(_im, dy2, CV_32F, 0, 2, 3, 1, 0, BORDER_DEFAULT);
Sobel(_im, dxy, CV_32F, 1, 1, 3, 1, 0, BORDER_DEFAULT);
Mat t(2, 2, CV_32F); // tensor matrix
// Insert values to the tensor matrix.
t.at<float>(0, 0) = sum(dx2)[0];
t.at<float>(0, 1) = sum(dxy)[0];
t.at<float>(1, 0) = sum(dxy)[0];
t.at<float>(1, 1) = sum(dy2)[0];
// eigen decomposition to get the main gradient direction.
Mat eigVal, eigVec;
eigen(t, eigVal, eigVec);
// This should compute the angle of the gradient direction based on the first eigenvector.
float* eVec1 = eigVec.ptr<float>(0);
float* eVec2 = eigVec.ptr<float>(1);
cout << fastAtan2(eVec1[0], eVec1[1]) << endl;
cout << fastAtan2(eVec2[0], eVec2[1]) << endl;
Is this approach correct?
Using this image the application outputs 44.9905, 135.01.
This gives 0, 90.
When I use a part of a real image I get 342.743, 72.7425, which I find odd. I expected to get an angle along the color change (90ish).
After testing I am not sure if my implementation is correct, so any feedback or comments on this are welcomed.
I believe your problem is that you are computing second order derivatives instead of squaring first order derivatives. It should be something like this instead:
// Calculate image derivatives
Mat dx, dy;
Mat dx2, dy2, dxy;
Sobel(_im, dx, CV_32F, 1, 0);
Sobel(_im, dy, CV_32F, 0, 1);
multiply(dx, dx, dx2);
multiply(dy, dy, dy2);
multiply(dx, dy, dxy);
P.S.
Oh, by the way, there is no need to do Gaussian blurring over and over again. Just use a bigger kernel and blur once.
D.S.

How to find correspondence of 3d points and 2d points

I have a set of 3d points in world coordinates and respective correspondences with 2d points in an image. I want to find a matrix that gives me the transformation between these set of points. How can I do this in OpenCV?
cv::solvePnP() is what you are looking for, it finds an object pose from 3D-2D point correspondences and results a rotation vector (rvec), that, together with translation vector (tvec), brings points from the model coordinate system to the camera coordinate system.
you can use solvePnP for this:
// camMatrix based on img size
int max_d = std::max(img.rows,img.cols);
Mat camMatrix = (Mat_<double>(3,3) <<
max_d, 0, img.cols/2.0,
0, max_d, img.rows/2.0,
0, 0, 1.0);
// 2d -> 3d correspondence
vector<Point2d> pts2d = ...
vector<Point3d> pts3d = ...
Mat rvec,tvec;
solvePnP(pts3d, pts2d, camMatrix, Mat(1,4,CV_64F,0.0), rvec, tvec, false, SOLVEPNP_EPNP);
// get 3d rot mat
Mat rotM(3, 3, CV_64F);
Rodrigues(rvec, rotM);
// push tvec to transposed Mat
Mat rotMT = rotM.t();
rotMT.push_back(tvec.reshape(1, 1));
// transpose back, and multiply
return camMatrix * rotMT.t();