Getting a transformation matrix that gives results similar to a MATLAB camera setup - C++

So, I have a MATLAB script that does some work with data and then shows it on screen. The camera and axis box are set up with these commands:
Xlim = [-33 33];
Ylim = [-2 60];
Zlim = [-1 60];
Cam_Pos = [-0.5 -1.2 -0];
Cam_Tar = [-1.7 100 -3.5];
Cam_Ang = 30;
axis('off');
grid 'on'
xlim(Xlim)
ylim(Ylim)
zlim(Zlim)
campos(Cam_Pos);
camtarget(Cam_Tar);
camproj('perspective');
ax0.XColor = [1 1 1];
ax0.YColor = [1 1 1];
ax0.ZColor = [1 1 1];
set(gca,'CameraViewAngle',Cam_Ang, 'Clipping', 'on', 'ClippingStyle', '3dbox')
Now the code that was previously written in MATLAB is being rewritten in C++, using OpenCV for image processing. In the end I have an array of points in 3D space that should be drawn over the image, and the result should look similar to the one produced by the MATLAB script.
Is it possible to derive an object transformation matrix from this data, such that multiplying the matrix of point coordinates by it produces the required result?
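In principle, yes: campos/camtarget define a look-at view transform and CameraViewAngle defines the field of view, so together they give a 3x4 projection matrix P = K [R | t] that maps homogeneous world points to pixels. Below is a minimal sketch of that construction, not MATLAB's exact rendering pipeline: the function name, the image size, treating Cam_Ang as the vertical field of view, and the default +Z camera-up vector are all assumptions.
import numpy as np

def matlab_camera_to_projection(cam_pos, cam_tar, cam_ang_deg, img_w, img_h, cam_up=(0.0, 0.0, 1.0)):
    pos = np.asarray(cam_pos, float)
    tar = np.asarray(cam_tar, float)
    up = np.asarray(cam_up, float)
    # Camera axes in the OpenCV convention: x right, y down, z forward (towards the target).
    z = tar - pos
    z /= np.linalg.norm(z)
    x = np.cross(z, up)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    R = np.vstack([x, y, z])            # world -> camera rotation
    t = -R @ pos                        # world -> camera translation
    # Treat CameraViewAngle as the vertical field of view (an assumption).
    f = (img_h / 2.0) / np.tan(np.radians(cam_ang_deg) / 2.0)
    K = np.array([[f, 0, img_w / 2.0],
                  [0, f, img_h / 2.0],
                  [0, 0, 1.0]])
    return K @ np.hstack([R, t.reshape(3, 1)])

# Values from the question; the 640x480 image size is a placeholder.
P = matlab_camera_to_projection([-0.5, -1.2, 0], [-1.7, 100, -3.5], 30, 640, 480)
pt = np.array([0.0, 10.0, 0.0, 1.0])    # a world point in homogeneous coordinates
u, v, w = P @ pt
print(u / w, v / w)                      # pixel coordinates of the projected point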

Related

opencv how to get middle line of a contour

I have an image from which I get the contour using findContours. This produces something that looks like the following (showing the "inner and outer contour").
Is there a way for me to get the "midpoint" of these two contours? I.e. some kind of polyline that fits exactly in between the two lines seen in the image, such that the distance from any point on the resultant line to the top contour is the same as the distance to the bottom contour?
A more complicated example would be something like the following:
Please note that it doesn't matter too much what happens at intersections, so long as nothing traces back on itself; the result of the more complicated example would therefore need multiple lines.
There is a way to get the "midpoint" of the two contours, but I don't think there is an existing OpenCV solution.
You may use the following stages:
Convert the image to grayscale and apply a binary threshold.
You may use the cvtColor(..., COLOR_BGR2GRAY) and threshold(...) OpenCV functions.
Fill the pixels outside the area between the lines with white.
You may use the floodFill OpenCV function.
Apply a "distance transform" to the binary image.
You may use the distanceTransform OpenCV function.
Use CV_DIST_L2 for Euclidean distance.
Apply Dijkstra's algorithm to find the shortest path between the leftmost and rightmost nodes.
Representing the "distance transform" result (an image) as a weighted graph and applying Dijkstra's algorithm is the most challenging stage.
I implemented the solution in MATLAB.
The MATLAB implementation is used as a "proof of concept".
I know you were expecting a C++ implementation, but that requires a lot of work.
The MATLAB implementation uses the im2graph function, which I downloaded from here.
Here is the MATLAB implementation:
origI = imread('two_contours.png'); % Read input image
I = rgb2gray(origI); % Convert RGB to Grayscale.
BW = imbinarize(I); % Convert from Grayscale to binary image.
% Fill pixels outside the area between the lines.
BW2 = imfill(BW, ([1, size(I,2)/2; size(I,1), size(I,2)/2]));
% Apply "distance transform" (find compute euclidean distance from closest white pixel)
D = bwdist(BW2);
% Mark all pixels outside the area between the lines with zero.
D(BW2 == 1) = 0;
figure;imshow(D, []);impixelinfo % Display D matrix as image
[M, N] = size(D);
% Find starting point and end point - assume we need to find a path from left side to right side.
x0 = 1;
[~, y0] = max(D(:, x0));
x1 = N;
[~, y1] = max(D(:, x1));
% https://www.mathworks.com/matlabcentral/fileexchange/46088-dijkstra-lowest-cost-for-images
StartNode = y0;
EndNode = M*N - (M-y1-1);
conn = 8; % 4- or 8-connected neighborhood for linking pixels
% Use 100 - D, because graphshortestpath searches for minimum weight (and we are looking for maximum weight path).
CostMat = 100 - D;
G = im2graph(CostMat, conn);
%Find "shortest" path from StartNode to EndNode
[dist, path, pred] = graphshortestpath(G, StartNode, EndNode);
% Mark the path in white in image J
J = origI;R = J(:,:,1);G = J(:,:,2);B = J(:,:,3);
R(path) = 255;G(path) = 255;B(path) = 255;
J = cat(3, R, G, B);
figure;imshow(J);impixelinfo % Display J image
Result:
D - Result of distance transform:
J - Original image with the "path" marked in white:
Update:
For the new example you can define three paths.
The solution becomes more complicated.
The example is not generalized to solve all cases.
There must be a simpler solution; I just can't think of one.
tmpI = imread('three_contours.png'); % Read input image
origI = permute(tmpI, [2, 1, 3]); % Transpose image
I = rgb2gray(origI); % Convert RGB to Grayscale.
BW = imbinarize(I); % Convert from Grayscale to binary image.
% Fill pixels outside the area between the lines.
%BW2 = imfill(BW, ([1, size(I,2)/2; size(I,1), size(I,2)/2]));
BW2 = imfill(BW, ([1, 1; size(I,1), size(I,2); size(I,2)/2, 1]));
% Apply "distance transform" (find compute euclidean distance from closest white pixel)
D = bwdist(BW2);
% Mark all pixels outside the area between the lines with zero.
D(BW2 == 1) = 0;
figure;imshow(D, []);impixelinfo % Display D matrix as image
[M, N] = size(D);
% Find starting point and end point - assume we need to find a path from left side to right side.
x0 = 1;
[~, y0a] = max(D(1:M/2, x0));
% Y coordinate of second point
[~, y0b] = max(D(M/2:M, x0));
y0b = y0b + M/2;
x1 = N;
[~, y1] = max(D(:, x1));
% https://www.mathworks.com/matlabcentral/fileexchange/46088-dijkstra-lowest-cost-for-images
StartNodeA = y0a;
StartNodeB = y0b;
EndNode = M*N - (M-y1-1);
conn = 8; % 4- or 8-connected neighborhood for linking pixels
% Use 1000 - D, because graphshortestpath searches for the minimum weight (and we are looking for the maximum-weight path).
D(D==0) = -10000; % Increase the "cost" where D is zero
CostMat = 1000 - D;
G = im2graph(CostMat, conn);
%Find "shortest" path from StartNode to EndNode
[dist, pathA, pred] = graphshortestpath(G, StartNodeA, EndNode);
[dist, pathB, pred] = graphshortestpath(G, StartNodeB, EndNode);
[dist, pathC, pred] = graphshortestpath(G, StartNodeA, StartNodeB);
% Mark the path in white in image J
J = origI;R = J(:,:,1);G = J(:,:,2);B = J(:,:,3);
R(pathA) = 255;
G(pathB) = 255;
B(pathC) = 255;
J = cat(3, R, G, B);
J = permute(J, [2, 1, 3]); % Transpose image
figure;imshow(J);impixelinfo % Display J image
Three lines:

Python cv2 Image Pyramids

I am trying to implement the famous orange/apple pyramid blending (cv2 Image Pyramids).
Note: both images have a shape of 307x307.
However, since the result image is blurred due to clipping values in cv2.subtract and cv2.add (as stated in cv2 vs numpy Matrix Arithmetics), I have used numpy arithmetic instead, as suggested in StackOverflow: Reconstructed Image after Laplacian Pyramid Not the same as original image.
I have tested this by building a pyramid from a single image and reconstructing it; unlike with cv2 arithmetic, the reconstructed image has the same max, min and average pixel values as the original.
However, at pyramid level 7 the result image gets a 'noise' of a red dot, and at level 9 it gets a lot of green pixel noise. Images of levels 6, 7, 9 - Imgur Album.
Any ideas why this happens? I would say the level 9 green noise appears because the image went below a 1x1 shape, but what about the red dot at the level 7 pyramid?
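A quick way to check that guess (this sketch is mine, not part of the original code): cv2.pyrDown halves each dimension and rounds up, so for a 307x307 input the Gaussian levels shrink to a single pixel by the ninth downsampling.
# cv2.pyrDown's default output size is ((n + 1) // 2) per dimension.
size = 307
for level in range(1, 10):
    size = (size + 1) // 2
    print(level, size)
# prints: 1 154, 2 77, 3 39, 4 20, 5 10, 6 5, 7 3, 8 2, 9 1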
EDIT : Code Added
numberOfPyramids = 9
# generate Gaussian pyramids for A and B Images
GA = A.copy()
GB = B.copy()
gpA = [GA]
gpB = [GB]
for i in xrange(numberOfPyramids):
    GA = cv2.pyrDown(GA)
    GB = cv2.pyrDown(GB)
    gpA.append(GA)
    gpB.append(GB)
# generate Laplacian Pyramids for A and B Images
lpA = [gpA[numberOfPyramids - 1]]
lpB = [gpB[numberOfPyramids - 1]]
for i in xrange(numberOfPyramids - 1, 0, -1):
    geA = cv2.pyrUp(gpA[i], dstsize = np.shape(gpA[i-1])[:2])
    geB = cv2.pyrUp(gpB[i], dstsize = np.shape(gpB[i-1])[:2])
    laplacianA = gpA[i - 1] - geA if i != 1 else cv2.subtract(gpA[i-1], geA)
    laplacianB = gpB[i - 1] - geB if i != 1 else cv2.subtract(gpB[i-1], geB)
    lpA.append(laplacianA)
    lpB.append(laplacianB)
# Now add left and right halves of images in each level
LS = []
for la, lb in zip(lpA, lpB):
    _, cols, _ = la.shape
    ls = np.hstack((la[:, : cols / 2], lb[:, cols / 2 :]))
    LS.append(ls)
# now reconstruct
ls_ = LS[0]
for i in xrange(1, numberOfPyramids):
    ls_ = cv2.pyrUp(ls_, dstsize = np.shape(LS[i])[:2])
    ls_ = ls_ + LS[i] if i != numberOfPyramids - 1 else cv2.add(ls_, LS[i])
cv2.imshow(namedWindowName, ls_)
cv2.waitKey()
After reading the original article about the Laplacian pyramid, I realize I had misunderstood the method: we can fully reconstruct the original image without blur because we keep the additional pixel information, and it is true that clipping values leads to the blurring. Well, now we are back at the beginning again :)
So the code you posted is still clipping values. I advise you to use int16 to store the Laplacian pyramid and not use cv2.subtract. Hope it works.
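A minimal sketch of that advice (the helper names and level handling are my assumptions): keep every Laplacian level in signed 16-bit so negative differences survive, and clip back to uint8 only once, after the pyramid has been collapsed.
import cv2
import numpy as np

def laplacian_pyramid_int16(img, levels):
    # Gaussian pyramid, then signed differences so nothing is clipped.
    gp = [img.copy()]
    for _ in range(levels):
        gp.append(cv2.pyrDown(gp[-1]))
    lp = [gp[-1].astype(np.int16)]               # smallest Gaussian level as the base
    for i in range(levels, 0, -1):
        h, w = gp[i - 1].shape[:2]
        up = cv2.pyrUp(gp[i], dstsize=(w, h))    # dstsize is (width, height)
        lp.append(gp[i - 1].astype(np.int16) - up.astype(np.int16))
    return lp

def collapse(lp):
    # Rebuild the image; clip to [0, 255] only at the very end.
    img = lp[0]
    for lap in lp[1:]:
        h, w = lap.shape[:2]
        img = cv2.pyrUp(img, dstsize=(w, h)).astype(np.int16) + lap
    return np.clip(img, 0, 255).astype(np.uint8)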

How to merge 2 gray-scale images in Python with OpenCV

I want to merge two one-channel, gray-scale images with the OpenCV merge method. Here is the code:
...
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
zeros = numpy.zeros(img_gray.shape)
merged = cv2.merge([img_gray, zeros])
...
The problem is that the gray-scale image doesn't have a depth attribute (which should be 1), and the merge function requires images of the same size and the same depth. I get this error:
error: /build/buildd/opencv-2.4.8+dfsg1/modules/core/src/convert.cpp:296: error: (-215) mv[i].size == mv[0].size && mv[i].depth() == depth in function merge
How can I merge these arrays?
Solved: I had to change the dtype of img_gray from uint8 to float64:
img_gray = numpy.float64(img_gray)
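The underlying mismatch is that numpy.zeros defaults to float64 while the grayscale image is uint8, so the two planes have different depths. Casting the image to float64, as above, makes them match; an equivalent sketch that keeps everything in uint8 instead (the file name is a placeholder):
import cv2
import numpy as np

img = cv2.imread('input.png')                          # placeholder file name
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Give the zero plane the image's own dtype so sizes and depths match.
zeros = np.zeros(img_gray.shape, dtype=img_gray.dtype)
merged = cv2.merge([img_gray, zeros])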
OpenCV Version 2.4.11
import cv2
import numpy as np
# Load the image
img1 = cv2.imread(paths[0], cv2.IMREAD_UNCHANGED)
# could also use cv2.split() but per the docs (link below) it's time consuming
# split the channels using Numpy indexing, notice it's a zero based index unlike MATLAB
b = img1[:, :, 0]
g = img1[:, :, 1]
r = img1[:, :, 2]
# to avoid overflows and truncation later, scale the channel into the [0.0, 1.0] range
b = b.astype(np.float)
b /= 255
# manipulate the channels ... in my case, adding Gaussian noise to the blue channel (b => b1)
b1 = b1.astype(np.float)
g = g.astype(np.float)
r = r.astype(np.float)
# gotcha : notice the parameter is an array of channels
noisy_blue = cv2.merge((b1, g, r))
# store the outcome to disk
cv2.imwrite('output/NoisyBlue.png', noisy_blue)
N.B.:
Alternatively, you may also use np.double instead of np.float in astype for type casting.
OpenCV documentation link

Align depth image to RGB image

I am trying to generate a point cloud from images captured by a Kinect with Python and libfreenect, but I can't align the depth data to the RGB data taken by the Kinect.
I applied Nicolas Burrus's equations, but the two images end up even further apart. Is there something wrong with my code?
cx_d = 3.3930780975300314e+02
cy_d = 2.4273913761751615e+02
fx_d = 5.9421434211923247e+02
fy_d = 5.9104053696870778e+02
fx_rgb = 5.2921508098293293e+02
fy_rgb = 5.2556393630057437e+02
cx_rgb = 3.2894272028759258e+02
cy_rgb = 2.6748068171871557e+02
RR = np.array([
[0.999985794494467, -0.003429138557773, 0.00408066391266],
[0.003420377768765,0.999991835033557, 0.002151948451469],
[-0.004088009930192, -0.002137960469802, 0.999989358593300 ]
])
TT = np.array([ 1.9985242312092553e-02, -7.4423738761617583e-04,-1.0916736334336222e-02 ])
# uu, vv are indices in depth image
def depth_to_xyz_and_rgb(uu, vv):
    # get z value in meters
    pcz = depthLookUp[depths[vv, uu]]
    # compute x,y values in meters
    pcx = (uu - cx_d) * pcz / fx_d
    pcy = (vv - cy_d) * pcz / fy_d
    # apply extrinsic calibration
    P3D = np.array([pcx, pcy, pcz])
    P3Dp = np.dot(RR, P3D) - TT
    # rgb indexes that P3D should match
    uup = P3Dp[0] * fx_rgb / P3Dp[2] + cx_rgb
    vvp = P3Dp[1] * fy_rgb / P3Dp[2] + cy_rgb
    # return a point in point cloud and its corresponding color indices
    return P3D, uup, vvp
Is there anything I did wrong? Any help is appreciated
First, check your calibration numbers. Your rotation matrix is approximately the identity and, assuming your calibration frame is metric, your translation vector says that the second camera is 2 centimeters to the side and one centimeter displaced in depth. Does that approximately match your setup? If not, you may be working with the wrong scaling (likely using a wrong number for the characteristic size of your calibration target - a checkerboard?).
Your code looks correct - you are re-projecting a pixel of the depth camera at a known depth, and then projecting it back into the second camera to get the corresponding RGB value.
One thing I would check is whether you are using your coordinate transform in the right direction. IIRC, OpenCV produces it as [R | t], but you are using it as [R | -t], which looks suspicious. Perhaps you meant to use its inverse, which would be [R' | -R'*t], where I use the apostrophe to mean transposition.
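To make the three candidates concrete, here is a small sketch (RR and TT are the calibration arrays from the question; the test point is arbitrary) comparing the form the question's code applies with the [R | t] form and its inverse:
import numpy as np

RR = np.array([
    [0.999985794494467, -0.003429138557773, 0.00408066391266],
    [0.003420377768765, 0.999991835033557, 0.002151948451469],
    [-0.004088009930192, -0.002137960469802, 0.999989358593300]
])
TT = np.array([1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02])

P3D = np.array([0.1, 0.2, 1.0])        # an arbitrary depth-camera point, in meters

forward = np.dot(RR, P3D) + TT         # [R | t]   : depth -> RGB, as the calibration usually reports it
inverse = np.dot(RR.T, P3D - TT)       # [R'| -R't]: the inverse of that transform
current = np.dot(RR, P3D) - TT         # [R | -t]  : the form used in the question's code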

OpenCV estimateAffine3D breaks for coplanar points

I am trying to use OpenCV's estimateAffine3D() function to get the affine transformation between two sets of coplanar points in 3D. If I hold one variable constant, I find there is a constant error in the translation component of that variable.
My test code is:
std::vector<cv::Point3f> first, second;
std::vector<uchar> inliers;
cv::Mat aff(3,4,CV_64F);
for (int i = 0; i < 6; i++)
{
    first.push_back(cv::Point3f(i, i % 3, 1));
    second.push_back(cv::Point3f(i, i % 3, 1));
}
int ret = cv::estimateAffine3D(first, second, aff, inliers);
std::cout << aff << std::endl;
The output I expect is:
[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
Edit: My expectation is incorrect. The matrix does not decompose into [R|t] for the case of constant z-coordinates.
but what I get (with some rounding for readability) is:
[1 0 0 0]
[0 1 0 0]
[0 0 0.5 0.5]
Is there a way to fix this behavior? Is there a function which does the same on sets of 2D points?
No matter how I run your code I get fine output. For example, when I run it exactly as you posted it, I get:
[1,0,0 ,0]
[0,1,0 ,0]
[0,0,.5,.5]
which is correct because the 4th element of a homogeneous coordinate is assumed to be 1. When I run it with 2 as the z value, I get:
[1,0,0 ,0]
[0,1,0 ,0]
[0,0,.8,.4]
which also works (.8*2+.4 = 2). Are you sure you didn't just read aff(2,2) wrong?
The key problem is:
Your purpose is to estimate the rotation and translation between two sets of 3D points, but the OpenCV function estimateAffine3D() is not meant for that. As its name suggests, this function computes the affine transformation between two sets of 3D points; when computing an affine transformation, the constraints on the rotation matrix are not enforced, so the result is not the rigid transform you expect. To obtain the rotation and translation, you need to implement the SVD-based algorithm. You may search for "absolute orientation" on Google; this is a classic, closed-form algorithm.
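A minimal sketch of that SVD-based (absolute orientation / Kabsch) estimate, applied to the coplanar points from the question; the helper name is mine:
import numpy as np

def rigid_transform_3d(A, B):
    # Least-squares R, t such that R @ A_i + t ~ B_i, for (N, 3) point arrays.
    centroid_A = A.mean(axis=0)
    centroid_B = B.mean(axis=0)
    H = (A - centroid_A).T @ (B - centroid_B)    # 3x3 cross-covariance
    U, S, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # guard against a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = centroid_B - R @ centroid_A
    return R, t

# The coplanar points from the question: expect the identity rotation and zero translation.
pts = np.array([[i, i % 3, 1] for i in range(6)], dtype=float)
R, t = rigid_transform_3d(pts, pts)
print(np.round(R, 6))    # ~identity
print(np.round(t, 6))    # ~zeros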