How to Compute the Structure Tensor of an Image using OpenCV

How to Compute the Structure Tensor of an Image using OpenCV - c++

I am trying to implement an application that uses images to search for similar images in a large image database. I am developing an image descriptor to use for this search and I would like to combine color information with some gradient information. I have seen structure tensors used in this domain to find the main gradient direction in images or sub-images.
I would like to take an image, divide it into grid of sub-images, for example, 4x4 grid (in total 16 sub-images) and then find the leading gradient direction of each cell. To find the leading gradient direction I want to see if computing the structure tensor for each cell can give good representation of the image gradient and lead to improved image matching. Is this a good idea or a bad idea? The idea was to get a feature vector similar to the idea in section 3.2 in this paper http://cybertron.cg.tu-berlin.de/eitz/pdf/2009_sbim.pdf
Dividing the image into sub-images(cells) is trivial and with opencv I can compute the partial derivatives using the Sobel function.
Mat dx, dy;
Sobel(im, dx, CV_32F, 1, 0, 3, 1, 0, BORDER_DEFAULT);
Sobel(im, dy, CV_32F, 0, 1, 3, 1, 0, BORDER_DEFAULT);
Computing dx^2, dy^2 and dxy should not be a problem, but I am not sure how I can compute the structure tensor matrix and use the tensor matrix to find the main gradient direction for an image or sub-image. How can I implement this with OpenCV?
EDIT
Okay, this is what I have done.
Mat _im; // Image to compute main gradient direction for.
cvtColor(im, _im, CV_BGR2GRAY);
GaussianBlur(_im, _im, Size(3, 3), 0, 0, BORDER_DEFAULT); //Blur the image to remove unnecessary details.
GaussianBlur(_im, _im, Size(5, 5), 0, 0, BORDER_DEFAULT);
GaussianBlur(_im, _im, Size(7, 7), 0, 0, BORDER_DEFAULT);
// Calculate image derivatives
Mat dx2, dy2, dxy;
Sobel(_im, dx2, CV_32F, 2, 0, 3, 1, 0, BORDER_DEFAULT);
Sobel(_im, dy2, CV_32F, 0, 2, 3, 1, 0, BORDER_DEFAULT);
Sobel(_im, dxy, CV_32F, 1, 1, 3, 1, 0, BORDER_DEFAULT);
Mat t(2, 2, CV_32F); // tensor matrix
// Insert values to the tensor matrix.
t.at<float>(0, 0) = sum(dx2)[0];
t.at<float>(0, 1) = sum(dxy)[0];
t.at<float>(1, 0) = sum(dxy)[0];
t.at<float>(1, 1) = sum(dy2)[0];
// eigen decomposition to get the main gradient direction.
Mat eigVal, eigVec;
eigen(t, eigVal, eigVec);
// This should compute the angle of the gradient direction based on the first eigenvector.
float* eVec1 = eigVec.ptr<float>(0);
float* eVec2 = eigVec.ptr<float>(1);
cout << fastAtan2(eVec1[0], eVec1[1]) << endl;
cout << fastAtan2(eVec2[0], eVec2[1]) << endl;
Is this approach correct?
Using this image the application outputs 44.9905, 135.01.
This gives 0, 90.
When I use a part of a real image I get 342.743, 72.7425, which I find odd. I expected to get an angle along the color change (90ish).
After testing I am not sure if my implementation is correct, so any feedback or comments on this are welcomed.

I believe your problem is that you are computing second order derivatives instead of squaring first order derivatives. It should be something like this instead:
// Calculate image derivatives
Mat dx, dy;
Mat dx2, dy2, dxy;
Sobel(_im, dx, CV_32F, 1, 0);
Sobel(_im, dy, CV_32F, 0, 1);
multiply(dx, dx, dx2);
multiply(dy, dy, dy2);
multiply(dx, dy, dxy);
P.S.
Oh, by the way, there is no need to do Gaussian blurring over and over again. Just use a bigger kernel and blur once.
D.S.

Related

Calculate 3D gradients with openCv

I'm using OpenCV and C++ to make a 3D HOG function (I already made the 2D one). I have a 3 dimensional Mat (x,y,z) where the z value is the depth info of the pixel (x,y) (a combination of the 2D color Image and the depthImage from the Kinect). The next step is to calculate gradients of x, y and z, in the 2D HOG function I used Sobel to do that :
Mat gx, gy;
Sobel(img, gx, CV_32F, 1, 0, 1);
Sobel(img, gy, CV_32F, 0, 1, 1);
But when it comes to the 3D gradients, I did not find a way to do it. Is there any simple way for that???

How to get the difference of Aruco Markers' axis-angel in the world coordinate?

I am new to opencv and Aruco. I attempted to find out the differences in axis-angles of two different Aruco markers. For example, the angle difference b/w the two (1, 0, 0) vectors of markers in the real world. In my understanding, the transform happens in the following order: local coordinate-> camera coordinate -> world coordinate. And then I get angle difference of two aruco markers because they are now both in the same world coordinate. Can someone explain how this process is done? Or is there any better way to find out the angle difference? I am coding with python and opencv
Known:
Translation and rotation vectors(1x3) from estimatePoseSingleMarkers() function in aruco module. (rotation and translation vectors can be transformed to matrix(3x3) with Rodrigues() )
camera Matrix(3x3) and dist_coefs matrix(1x5) from camera calibration.
Using 6x6_250 aruco markers
Upate:
world coordinate = camera coordinate
The following is the function to draw the aruco's XYZ components. It looks like the origin or the camera is (0, 0, 0). Apply the translation and rotation matrix of aruco marker to the x, y, z components of origin gets me the x, y, z components of aruco marker. Is that right?
/**
*/
void drawAxis(InputOutputArray _image, InputArray _cameraMatrix, InputArray _distCoeffs,
InputArray _rvec, InputArray _tvec, float length) {
CV_Assert(_image.getMat().total() != 0 &&
(_image.getMat().channels() == 1 || _image.getMat().channels() == 3));
CV_Assert(length > 0);
// project axis points
vector< Point3f > axisPoints;
axisPoints.push_back(Point3f(0, 0, 0));
axisPoints.push_back(Point3f(length, 0, 0));
axisPoints.push_back(Point3f(0, length, 0));
axisPoints.push_back(Point3f(0, 0, length));
vector< Point2f > imagePoints;
projectPoints(axisPoints, _rvec, _tvec, _cameraMatrix, _distCoeffs, imagePoints);
// draw axis lines
line(_image, imagePoints[0], imagePoints[1], Scalar(0, 0, 255), 3);
line(_image, imagePoints[0], imagePoints[2], Scalar(0, 255, 0), 3);
line(_image, imagePoints[0], imagePoints[3], Scalar(255, 0, 0), 3);
}

Technically it's a work around to my question. So I figured ARUCO library has this function called projectPoints which maps the 3D objectPoints into 2D imagePoints.
imagePoints[0] and imagePoints[1] gives you the 2D projection of (length, 0, 0). By using arccos(np.dot(v1,v2)), you can get the difference in x-axis angle of the two markers.

Transform a frame to be as if it was taken from above using OpenCV

I am working on a project for estimating a UAV (quadcopter) location using optical-flow technique. I currently have a code that is using farneback algorithm from OpenCV. The current code is working fine when the camera is always pointing to the ground.
Now, I want to add support to the case when the camera is not pointing straight down - meaning that the quadcopter now has a pitch / roll / yaw (Euler angles). The quadcopters Euler angles are known and I am searching for a method to compute and apply the transformation needed based on the known current Euler angles. So that the result image will be as if it was taken from above (see image below).
I found methods that calculates the transformation when having 2 sets (source and destination) of 4 corners via findHomography or getPerspectiveTransform functions from OpenCV. But I couldn't find any methods that can do it with knowing only Euler angle (because I don't know the destination image corenrs).
So my question is what method can I use and how in order to transform a frame to be as if it was taken from above using only Euler angles and camera height from ground if necessary?
In order to demonstrate what I need:
The relevant part of my current code is below:
for(;;)
{
Mat m, disp, warp;
vector<Point2f> corners;
// take out frame- still distorted
cap >> origFrame;
// undistort the frame using the calibration parameters
cv::undistort(origFrame, undistortFrame, cameraMatrix, distCoeffs, noArray());
// lower the process effort by transforming the picture to gray
cvtColor(undistortFrame, gray, COLOR_BGR2GRAY);
if( !prevgray.empty() )
{
// calculate flow
calcOpticalFlowFarneback(prevgray, gray, uflow, 0.5, 3/*def 3 */, 10/* def 15*/, 3, 3, 1.2 /* def 1.2*/, 0);
uflow.copyTo(flow);
// get average
calcAvgOpticalFlow(flow, 16, corners);
// calculate range of view - 2*tan(fov/2)*distance
rovX = 2*0.44523*distanceSonar*100; // 2 * tan(48/2) * dist(cm)
rovY = 2*0.32492*distanceSonar*100; // 2 * tan(36/2) * dist(cm)
// calculate final x, y location
location[0] += (currLocation.x/WIDTH_RES)*rovX;
location[1] += (currLocation.y/HEIGHT_RES)*rovY;
}
//break conditions
if(waitKey(1)>=0)
break;
if(end_run)
break;
std::swap(prevgray, gray);
}
UPDATE:
After successfully adding the rotation, I still need my image to be centered (and not to go outside of the frame window as shown below). I guess I need some kind of translation. I want the center of the source image to be at the center of the destination image. How can I add this as well?
The rotation function that works:
void rotateFrame(const Mat &input, Mat &output, Mat &A , double roll, double pitch, double yaw){
Mat Rx = (Mat_<double>(3, 3) <<
1, 0, 0,
0, cos(roll), -sin(roll),
0, sin(roll), cos(roll));
Mat Ry = (Mat_<double>(3, 3) <<
cos(pitch), 0, sin(pitch),
0, 1, 0,
-sin(pitch), 0, cos(pitch));
Mat Rz = (Mat_<double>(3, 3) <<
cos(yaw), -sin(yaw), 0,
sin(yaw), cos(yaw), 0,
0, 0, 1);
Mat R = Rx*Ry*Rz;
Mat trans = A*R*A.inv();
warpPerspective(input, output, trans, input.size());
}
When I run it with rotateFrame(origFrame, processedFrame, cameraMatrix, 0, 0, 0); I get image as expected:
But when I run it with 10 degrees in roll rotateFrame(origFrame, processedFrame, cameraMatrix, 20*(M_PI/180), 0, 0);. The image is getting out of the frame window:

If you have a calibration intrinsics matrix A (3x3), and there is no translation between camara poses, all you need to find homography H (3x3) is to construct rotation matrix R (3x3) from euler angles and apply the following formula:
H = A * R * A.inv()
Where .inv() is matrix invertion.
UPDATED:
If you want to center the image, you should just add translation this way:
(this is finding the warped position of the center and translation of this point back to the center)
|dx| | 320 / 2 |
|dy| = H * | 240 / 2 |
|1 | | 1 |
| 1 0 (320/2-dx) |
W = | 0 1 (240/2-dy) | * H
| 0 0 1 |
W is your final transformation.

I came to a conclusion that I had to use the 4x4 Homography matrix in order to be able to get what I wanted. In order to find the right homography matrix we need:
3D Rotation matrix R.
Camera calibration intrinsic matrix A1 and its inverted matrix A2.
Translation matrix T.
We can compose the 3D rotation matrix R by multiplying the rotation matrices around axes X,Y,Z:
Mat R = RZ * RY * RX
In order to apply the transformation on the image and keep it centered we need to add translation given by a 4x4 matrix, where dx=0; dy=0; dz=1 :
Mat T = (Mat_<double>(4, 4) <<
1, 0, 0, dx,
0, 1, 0, dy,
0, 0, 1, dz,
0, 0, 0, 1);
Given all these matrices we can compose our homography matrix H:
Mat H = A2 * (T * (R * A1))
With this homography matrix we can then use warpPerspective function from OpenCV to apply the transformation.
warpPerspective(input, output, H, input.size(), INTER_LANCZOS4);
For conclusion and completeness of this solution here is the full code:
void rotateImage(const Mat &input, UMat &output, double roll, double pitch, double yaw,
double dx, double dy, double dz, double f, double cx, double cy)
{
// Camera Calibration Intrinsics Matrix
Mat A2 = (Mat_<double>(3,4) <<
f, 0, cx, 0,
0, f, cy, 0,
0, 0, 1, 0);
// Inverted Camera Calibration Intrinsics Matrix
Mat A1 = (Mat_<double>(4,3) <<
1/f, 0, -cx/f,
0, 1/f, -cy/f,
0, 0, 0,
0, 0, 1);
// Rotation matrices around the X, Y, and Z axis
Mat RX = (Mat_<double>(4, 4) <<
1, 0, 0, 0,
0, cos(roll), -sin(roll), 0,
0, sin(roll), cos(roll), 0,
0, 0, 0, 1);
Mat RY = (Mat_<double>(4, 4) <<
cos(pitch), 0, sin(pitch), 0,
0, 1, 0, 0,
-sin(pitch), 0, cos(pitch), 0,
0, 0, 0, 1);
Mat RZ = (Mat_<double>(4, 4) <<
cos(yaw), -sin(yaw), 0, 0,
sin(yaw), cos(yaw), 0, 0,
0, 0, 1, 0,
0, 0, 0, 1);
// Translation matrix
Mat T = (Mat_<double>(4, 4) <<
1, 0, 0, dx,
0, 1, 0, dy,
0, 0, 1, dz,
0, 0, 0, 1);
// Compose rotation matrix with (RX, RY, RZ)
Mat R = RZ * RY * RX;
// Final transformation matrix
Mat H = A2 * (T * (R * A1));
// Apply matrix transformation
warpPerspective(input, output, H, input.size(), INTER_LANCZOS4);
}
Result:

This how I do it in Eigen and using 4 corners:
// Desired four corners
std::vector<Eigen::Vector2f> Normalized_Reference_Pattern = { Eigen::Vector2f(0, 0), Eigen::Vector2f(0, 2), Eigen::Vector2f(2, 0), Eigen::Vector2f(2, 2) };
// Current four points
std::vector<Eigen::Vector2f> CurrentCentroids = { /* Whatever four corners you want, but relative sueqnece to above */ };
// Transform for current to desired
auto Master_Transform = get_perspective_transform(CurrentCentroids, Normalized_Reference_Pattern);
// abilitu to use the same tranformation for other points (other than the corners) in the image
Eigen::Vector2f Master_Transform_Centroid = Master_Transform * Eigen::Vector2f(currentX, currentY);
And here is my black box:
Eigen::Matrix3f get_perspective_transform(const std::vector<Eigen::Vector2f>& points_from,const std::vector<Eigen::Vector2f>& points_to)
{
cv::Mat transform_cv = cv::getPerspectiveTransform(
convert::to_cv(points_from),
convert::to_cv(points_to));
Eigen::Matrix3f transform_eigen;
cv::cv2eigen(transform_cv, transform_eigen);
return transform_eigen;
}

OpenCV triangulatePoints handedness

I have two cameras attached rigidly side by side looking in parallel directions.
Projection matrix for left camera
Projection matrix for right camera
When I perform triangulatePoints on the two vectors of corresponding points, I get the collection of points in 3D space. All of the points in 3D space have a negative Z coordinate.
So, to get to the bottom of this...
My assumption was that OpenCV uses Right Hand coordinate system.
The handedness reminder:
I assume that the initial orientation of each camera is directed in the positive Z axis direction.
So, by using the projection matrices like the ones I presented in the beginning, I would assume that the cameras are positioned in space like this:
This assumption is in conflict with what I observe when I get negative Z corrdinates for the traingulated points. The only explanation I can think of is that OpenCV in fact uses the Left Hand coordinate system. So, with the projection matrices I stated in the beginning, this is how the cameras are positioned in space:
This would indicate that my left camera in this situation is not on the left side. And this is why I am getting negative depth for points.
Moreover, if I try to combine triangulatePoints with solvePnP I run into problems.
I use the output of triangulatePoints as an input to solvePnP. I expect to get the camera coordinates near the origin of 3D coordinate system. I expect for calculated camera position to match the projection matrices used in the beginning. But this is not happening. I get some completely wild results, missing the expected values by over 10 times the baseline length.
Example
This example is more complete representation of the problem than what is stated above.
points3D
Here is the code for generating these points.
Movin on, setting up camera A and camera D...
Mat cameraMatrix = (Mat_<double>(3, 3) <<
716.731, 0, 660.749,
0, 716.731, 360.754,
0, 0, 1);
Mat distCoeffs = (Mat_<double>(5, 1) << 0, 0, 0, 0, 0);
Mat rotation_a = Mat::eye(3, 3, CV_64F); // no rotation
Mat translation_a = (Mat_<double>(3, 1) << 0, 0, 0); // no translation
Mat rt_a;
hconcat(rotation_a, translation_a, rt_a);
Mat projectionMatrix_a = cameraMatrix * rt_a;
Mat rotation_d = (Mat_<double>(3, 1) << 0, CV_PI / 6.0, 0); // 30° rotation about Y axis
Rodrigues(rotation_d, rotation_d); // convert to 3x3 matrix
Mat translation_d = (Mat_<double>(3, 1) << 100, 0, 0);
Mat rt_d;
hconcat(rotation_d, translation_d, rt_d);
Mat projectionMatrix_d = cameraMatrix * rt_d;
What are the pixel coordinates of points3D when observed by projections A and D.
Mat points2D_a = projectionMatrix_a * points3D;
Mat points2D_d = projectionMatrix_d * points3D;
I put them in vectors:
vector<Point2f> points2Dvector_a, points2Dvector_d;
After that, I generate the 3D points again.
Mat points3DHomogeneous;
triangulatePoints(projectionMatrix_a, projectionMatrix_d, points2Dvector_a, points2Dvector_d, points3DHomogeneous);
Mat triangulatedPoints3D;
transpose(points3DHomogeneous, triangulatedPoints3D);
convertPointsFromHomogeneous(triangulatedPoints3D, triangulatedPoints3D);
Now, triangulatedPoints3D start out like this:
and they are identical to points3D.
And then the last step.
Mat rvec, tvec;
solvePnP(triangulatedPoints3D, points2Dvector_d, cameraMatrix, distCoeffs, rvec, tvec);
Resulting rvec and tvec:
I had hoped to get something more similar to the transformations used in creation of projectionMatrix_d, i.e. translation of (100, 0, 0) and rotation of 30° about Y axis.
If I use inverted transformations when creating projection matrix, like this:
Mat rotation_d = (Mat_<double>(3, 1) << 0, CV_PI / 6.0, 0); // 30° rotation about Y axis
Rodrigues(-rotation_d, rotation_d); // NEGATIVE ROTATION
Mat translation_d = (Mat_<double>(3, 1) << 100, 0, 0);
Mat rt_d;
hconcat(rotation_d, -translation_d, rt_d); // NEGATIVE TRANSLATION
Mat projectionMatrix_d = cameraMatrix * rt_d;
then I get rvec and tvec:
And that makes much more sense. But then I change the starting transformation so that the rotation is negative CV_PI / 6.0 -> -CV_PI / 6.0 and the resulting rvec and tvec are:
I would like to find an explanation to why this is happening. Why am I getting such weird results from solvePnP.

OpenCV coordinate system is right-handed, the answer here gives an illustrative example about OpenCV camera system. I guess the confusion is about rvec and tvec, the latter does not give the translation of the camera but it points to the world origin. The first answer here explains it based on an example. You can get the actual projection matrix from the output of solvePnP by a simple matrix multiplication, the details are here in the first answer.

How to find correspondence of 3d points and 2d points

I have a set of 3d points in world coordinates and respective correspondences with 2d points in an image. I want to find a matrix that gives me the transformation between these set of points. How can I do this in OpenCV?

cv::solvePnP() is what you are looking for, it finds an object pose from 3D-2D point correspondences and results a rotation vector (rvec), that, together with translation vector (tvec), brings points from the model coordinate system to the camera coordinate system.

you can use solvePnP for this:
// camMatrix based on img size
int max_d = std::max(img.rows,img.cols);
Mat camMatrix = (Mat_<double>(3,3) <<
max_d, 0, img.cols/2.0,
0, max_d, img.rows/2.0,
0, 0, 1.0);
// 2d -> 3d correspondence
vector<Point2d> pts2d = ...
vector<Point3d> pts3d = ...
Mat rvec,tvec;
solvePnP(pts3d, pts2d, camMatrix, Mat(1,4,CV_64F,0.0), rvec, tvec, false, SOLVEPNP_EPNP);
// get 3d rot mat
Mat rotM(3, 3, CV_64F);
Rodrigues(rvec, rotM);
// push tvec to transposed Mat
Mat rotMT = rotM.t();
rotMT.push_back(tvec.reshape(1, 1));
// transpose back, and multiply
return camMatrix * rotMT.t();

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to Compute the Structure Tensor of an Image using OpenCV - c++

Related

Calculate 3D gradients with openCv

How to get the difference of Aruco Markers' axis-angel in the world coordinate?

Transform a frame to be as if it was taken from above using OpenCV

OpenCV triangulatePoints handedness

How to find correspondence of 3d points and 2d points

Categories

Resources