I am trying to implement the well-known Orange/Apple pyramid blending (cv2 Image Pyramids).
Note: both images are 307x307.
Because the result image is blurred by the value clipping in cv2.subtract and cv2.add (as stated in cv2 vs numpy Matrix Arithmetics), I used numpy arithmetic instead, as suggested in StackOverflow: Reconstructed Image after Laplacian Pyramid Not the same as original image.
I tested this by building the pyramids for a single image: the image reconstructed from the pyramids has the same max, min, and average pixel values as the original, which is not the case when using cv2 arithmetic.
However, with 7 pyramid levels the result image gets a 'noise' artifact in the form of a red dot, and with 9 levels it gets a lot of green noise pixels. Images of levels 6, 7, 9 - Imgur Album.
Any ideas why this happens? I would guess that the green noise at level 9 appears because the image went below a 1x1 shape. But what about the red dot with 7 levels?
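As a quick check of the "below 1x1" guess, the sketch below lists the side length at each pyramid level for a 307x307 image. It assumes the default output size documented for cv2.pyrDown, (n + 1) // 2 per side; the helper name pyramid_sizes is made up for this illustration.

```python
def pyramid_sizes(side, levels):
    """Side length after each pyrDown step, assuming the default (n + 1) // 2 rule."""
    sizes = [side]
    for _ in range(levels):
        side = (side + 1) // 2
        sizes.append(side)
    return sizes

sizes = pyramid_sizes(307, 9)
print(sizes)  # [307, 154, 77, 39, 20, 10, 5, 3, 2, 1]
```

Under that assumption the ninth reduction reaches 1x1 but does not go below it, so the size alone may not explain the noise.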
EDIT : Code Added
numberOfPyramids = 9
# generate Gaussian pyramids for A and B Images
GA = A.copy()
GB = B.copy()
gpA = [GA]
gpB = [GB]
for i in xrange(numberOfPyramids):
    GA = cv2.pyrDown(GA)
    GB = cv2.pyrDown(GB)
    gpA.append(GA)
    gpB.append(GB)
# generate Laplacian Pyramids for A and B Images
lpA = [gpA[numberOfPyramids - 1]]
lpB = [gpB[numberOfPyramids - 1]]
for i in xrange(numberOfPyramids - 1, 0, -1):
    geA = cv2.pyrUp(gpA[i], dstsize = np.shape(gpA[i-1])[:2])
    geB = cv2.pyrUp(gpB[i], dstsize = np.shape(gpB[i-1])[:2])
    laplacianA = gpA[i - 1] - geA if i != 1 else cv2.subtract(gpA[i-1], geA)
    laplacianB = gpB[i - 1] - geB if i != 1 else cv2.subtract(gpB[i-1], geB)
    lpA.append(laplacianA)
    lpB.append(laplacianB)
# Now add left and right halves of images in each level
LS = []
for la, lb in zip(lpA, lpB):
    _, cols, _ = la.shape
    ls = np.hstack((la[:, : cols / 2], lb[:, cols / 2 :]))
    LS.append(ls)
# now reconstruct
ls_ = LS[0]
for i in xrange(1, numberOfPyramids):
    ls_ = cv2.pyrUp(ls_, dstsize = np.shape(LS[i])[:2])
    ls_ = ls_ + LS[i] if i != numberOfPyramids - 1 else cv2.add(ls_, LS[i])
cv2.imshow(namedWindowName, ls_)
cv2.waitKey()
After reading the original article about the Laplacian pyramid, I found that I had misunderstood the method: we can fully reconstruct the original image without blur, because we use the additional pixel information. And it is true that clipping values leads to blurring. Well, now we come back to the beginning again :)
So the code you posted is still clipping values. I advise you to use int16 to store the Laplacian pyramid, and not to use cv2.subtract. Hope it works.
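To illustrate why a signed type matters, here is a minimal numpy-only sketch. The two small arrays are made-up stand-ins for one Gaussian level and the upsampled next level, and np.clip is used only to mimic the saturating behaviour described for cv2.subtract; no OpenCV call is made here.

```python
import numpy as np

# Hypothetical stand-ins for gpA[i-1] and pyrUp(gpA[i]); both uint8 like loaded images.
fine = np.array([[10, 200], [0, 255]], dtype=np.uint8)
expanded = np.array([[20, 100], [5, 250]], dtype=np.uint8)

# Plain numpy subtraction on uint8 wraps around modulo 256 (10 - 20 -> 246).
wrapped = fine - expanded

# Saturating subtraction (what clipping does) turns negative differences into 0.
saturated = np.clip(fine.astype(np.int16) - expanded.astype(np.int16), 0, 255).astype(np.uint8)

# A signed int16 Laplacian keeps the negative differences intact.
signed = fine.astype(np.int16) - expanded.astype(np.int16)

# Reconstruction from the signed Laplacian recovers the original exactly.
restored = np.clip(expanded.astype(np.int16) + signed, 0, 255).astype(np.uint8)

print(wrapped[0, 0], saturated[0, 0], signed[0, 0])  # 246 0 -10
print(np.array_equal(restored, fine))                # True
```

Wrapped values such as 246 where -10 was intended may be what shows up as isolated bright colored pixels once the levels are summed, so converting to int16 (or a float type) before subtracting and adding, and converting back to uint8 only at the very end, is probably worth trying.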
Related
I have an image from which I extract contours using findContours. This produces something that looks like the following (showing the "inner and outer contour"):
Is there a way for me to get the "midpoint" of these two contours? That is, some kind of polyline that fits exactly in between the two lines seen in the image, such that at any point on the resulting line the distance to the top contour is the same as the distance to the bottom contour.
A more complicated example would be something like the following:
Please note that it doesn't matter too much what happens at intersections, as long as nothing traces back on itself, so the result for the more complicated example would need multiple lines.
There is a way to get the "midpoint" of the two contours, but I don't think there is a ready-made OpenCV solution.
You may use the following stages:
Convert the image to grayscale, and apply a binary threshold.
You may use the cvtColor(... COLOR_BGR2GRAY) and threshold(...) OpenCV functions.
Fill the pixels outside the area between the lines with white.
You may use the floodFill OpenCV function.
Apply a "distance transform" to the binary image.
You may use the distanceTransform OpenCV function.
Use CV_DIST_L2 (named DIST_L2 in OpenCV 3 and later) for Euclidean distance.
Apply Dijkstra's algorithm to find the shortest path between the leftmost and rightmost nodes.
Representing the "distance transform" result (an image) as a weighted graph and applying Dijkstra's algorithm is the most challenging stage.
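The last stage can be sketched in Python without any graph library. This is only an illustration under stated assumptions: the distance map is a synthetic corridor built by hand rather than the output of distanceTransform, the function name max_clearance_path is hypothetical, and the cost of entering a pixel is taken as max(dist) + 1 - dist so that pixels far from both contours are cheap.

```python
import heapq
import numpy as np

def max_clearance_path(dist, start, goal):
    """Dijkstra on an 8-connected pixel grid; pixels with dist <= 0 are impassable."""
    rows, cols = dist.shape
    cost = dist.max() + 1.0 - dist      # large distance -> small cost
    cost[dist <= 0] = np.inf            # never step onto or outside the contours
    best = np.full(dist.shape, np.inf)
    best[start] = 0.0
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            break
        if d > best[r, c]:
            continue                    # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    nd = d + cost[nr, nc]
                    if nd < best[nr, nc]:
                        best[nr, nc] = nd
                        prev[(nr, nc)] = (r, c)
                        heapq.heappush(heap, (nd, (nr, nc)))
    path = [goal]
    while path[-1] != start:            # walk the predecessor chain back to start
        path.append(prev[path[-1]])
    return path[::-1]

# Synthetic corridor: contours on rows 0 and 6, so the midline is row 3.
r = np.arange(7)
dist = np.tile(np.minimum(r, 6 - r).astype(float)[:, None], (1, 10))
path = max_clearance_path(dist, (3, 0), (3, 9))
print(path)
```

On a real image, dist would come from the distance transform of the filled binary image, and the start and goal pixels would be picked as the maxima of the first and last columns, as in the MATLAB code.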
I implemented the solution in MATLAB.
The MATLAB implementation serves as a "proof of concept".
I know you were expecting a C++ implementation, but that requires a lot of work.
The MATLAB implementation uses the im2graph function, which I downloaded from the MATLAB File Exchange (the link is in the code comments below).
Here is the MATLAB implementation:
origI = imread('two_contours.png'); % Read input image
I = rgb2gray(origI); % Convert RGB to Grayscale.
BW = imbinarize(I); % Convert from Grayscale to binary image.
% Fill pixels outside the area between lines.
BW2 = imfill(BW, ([1, size(I,2)/2; size(I,1), size(I,2)/2]));
% Apply "distance transform" (compute Euclidean distance from the closest white pixel)
D = bwdist(BW2);
% Mark all pixels outside the area between lines with zero.
D(BW2 == 1) = 0;
figure;imshow(D, []);impixelinfo % Display D matrix as image
[M, N] = size(D);
% Find starting point and end point - assume we need to find a path from left side to right side.
x0 = 1;
[~, y0] = max(D(:, x0));
x1 = N;
[~, y1] = max(D(:, x1));
% https://www.mathworks.com/matlabcentral/fileexchange/46088-dijkstra-lowest-cost-for-images
StartNode = y0;
EndNode = sub2ind([M, N], y1, x1); % Linear index of pixel (y1, x1); the manual formula was off by one row
conn = 8;%4 or 8 - connected neighborhood for linking pixels
% Use 100 - D, because graphshortestpath searches for minimum weight (and we are looking for maximum weight path).
CostMat = 100 - D;
G = im2graph(CostMat, conn);
%Find "shortest" path from StartNode to EndNode
[dist, path, pred] = graphshortestpath(G, StartNode, EndNode);
% Mark the path in white in image J
J = origI;R = J(:,:,1);G = J(:,:,2);B = J(:,:,3);
R(path) = 255;G(path) = 255;B(path) = 255;
J = cat(3, R, G, B);
figure;imshow(J);impixelinfo % Display J image
Result:
D - Result of distance transform:
J - Original image with "path" marked with white color:
Update:
For the new example you can define three paths.
The solution becomes more complicated.
The example is not generalized to solve all the cases.
There must be a simpler solution, I just can't think of one.
tmpI = imread('three_contours.png'); % Read input image
origI = permute(tmpI, [2, 1, 3]); % Transpose image
I = rgb2gray(origI); % Convert RGB to Grayscale.
BW = imbinarize(I); % Convert from Grayscale to binary image.
% Fill pixels outside the area between lines.
%BW2 = imfill(BW, ([1, size(I,2)/2; size(I,1), size(I,2)/2]));
BW2 = imfill(BW, ([1, 1; size(I,1), size(I,2); size(I,2)/2, 1]));
% Apply "distance transform" (compute Euclidean distance from the closest white pixel)
D = bwdist(BW2);
% Mark all pixels outside the area between lines with zero.
D(BW2 == 1) = 0;
figure;imshow(D, []);impixelinfo % Display D matrix as image
[M, N] = size(D);
% Find starting points and end point - assume we need to find paths from left side to right side.
x0 = 1;
half = floor(M/2); % Integer split row, so the index ranges stay valid when M is odd
[~, y0a] = max(D(1:half, x0));
% Y coordinate of second point
[~, y0b] = max(D(half+1:M, x0));
y0b = y0b + half;
x1 = N;
[~, y1] = max(D(:, x1));
% https://www.mathworks.com/matlabcentral/fileexchange/46088-dijkstra-lowest-cost-for-images
StartNodeA = y0a;
StartNodeB = y0b;
EndNode = sub2ind([M, N], y1, x1); % Linear index of pixel (y1, x1); the manual formula was off by one row
conn = 8;%4 or 8 - connected neighborhood for linking pixels
% Use 1000 - D, because graphshortestpath searches for minimum weight (and we are looking for maximum weight path).
D(D==0) = -10000; % Increase the "cost" where D is zero
CostMat = 1000 - D;
G = im2graph(CostMat, conn);
%Find "shortest" path from StartNode to EndNode
[dist, pathA, pred] = graphshortestpath(G, StartNodeA, EndNode);
[dist, pathB, pred] = graphshortestpath(G, StartNodeB, EndNode);
[dist, pathC, pred] = graphshortestpath(G, StartNodeA, StartNodeB);
% Mark the three paths in image J, one color channel per path
J = origI;R = J(:,:,1);G = J(:,:,2);B = J(:,:,3);
R(pathA) = 255;
G(pathB) = 255;
B(pathC) = 255;
J = cat(3, R, G, B);
J = permute(J, [2, 1, 3]); % Transpose image
figure;imshow(J);impixelinfo % Display J image
Three lines:
So, I have a Matlab script that does some work with data and then shows it on the screen. The camera and box are set using these commands:
Xlim = [-33 33];
Ylim = [-2 60];
Zlim = [-1 60];
Cam_Pos = [-0.5 -1.2 -0];
Cam_Tar = [-1.7 100 -3.5];
Cam_Ang = 30;
axis('off');
grid 'on'
xlim(Xlim)
ylim(Ylim)
zlim(Zlim)
campos(Cam_Pos);
camtarget(Cam_Tar);
camproj('perspective');
ax0.XColor = [1 1 1];
ax0.YColor = [1 1 1];
ax0.ZColor = [1 1 1];
set(gca,'CameraViewAngle',Cam_Ang, 'Clipping', 'on', 'ClippingStyle', '3dbox')
The code that was previously written in Matlab is now being rewritten in C++, using OpenCV for image processing. In the end I have an array of points in 3D space that should be drawn over the image, and the result should look similar to the one produced by the Matlab script.
Is it possible to derive a transformation matrix from this data such that multiplying the matrix of point coordinates by it produces the required result?
I want to merge two single-channel grayscale images with OpenCV's merge method. The code is below:
...
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
zeros = numpy.zeros(img_gray.shape)
merged = cv2.merge([img_gray, zeros])
...
The problem is that the grayscale image doesn't have a depth attribute, which should be 1, and the merge function requires images of the same size and the same depth. I get this error:
error: /build/buildd/opencv-2.4.8+dfsg1/modules/core/src/convert.cpp:296: error: (-215) mv[i].size == mv[0].size && mv[i].depth() == depth in function merge
How can I merge these arrays?
Solved: I had to change the dtype of img_gray from uint8 to float64.
img_gray = numpy.float64(img_gray)
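The mismatch comes from numpy.zeros defaulting to float64 while the grayscale image is uint8. A possible alternative to converting the image is to create the zero channel with the image's own dtype. The sketch below uses only numpy, with a made-up array standing in for the cvtColor output and np.dstack standing in for the channel merge, so the assumption is that cv2.merge would accept the same pair of arrays.

```python
import numpy as np

# Made-up stand-in for the uint8 result of cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).
img_gray = np.arange(12, dtype=np.uint8).reshape(3, 4)

zeros_default = np.zeros(img_gray.shape)  # float64: depth differs from img_gray
zeros_matched = np.zeros_like(img_gray)   # uint8: same shape and dtype as img_gray

# Stack the two single-channel arrays along a new last (channel) axis.
merged = np.dstack([img_gray, zeros_matched])

print(zeros_default.dtype, zeros_matched.dtype)  # float64 uint8
print(merged.shape, merged.dtype)                # (3, 4, 2) uint8
```

Keeping everything in uint8 avoids the extra memory of float64 and keeps the result directly writable as an 8-bit image.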
OpenCV Version 2.4.11
import cv2
import numpy as np
# Load the image
img1 = cv2.imread(paths[0], cv2.IMREAD_UNCHANGED)
# could also use cv2.split() but per the docs (link below) it's time consuming
# split the channels using Numpy indexing, notice it's a zero based index unlike MATLAB
b = img1[:, :, 0]
g = img1[:, :, 1]
r = img1[:, :, 2]
# to avoid overflow and truncation, convert to float and scale into the [0.0, 1.0] range
b = b.astype(np.float64)
b /= 255
# manipulate the channels ... in my case, adding Gaussian noise to blue channel ( b => b1 )
b1 = b1.astype(np.float64)
g = g.astype(np.float64)
r = r.astype(np.float64)
# gotcha: the parameter is a sequence of channels, which must all share the same size and depth
noisy_blue = cv2.merge((b1, g, r))
# store the outcome to disk
cv2.imwrite('output/NoisyBlue.png', noisy_blue)
N.B.:
Alternatively, you may also use np.double instead of np.float64 in astype for type casting (the old np.float alias has been removed from recent NumPy releases)
Open CV Documentation Link
I am trying to generate a point cloud from images captured by a Kinect with Python and libfreenect, but I couldn't align the depth data with the RGB data taken by the Kinect.
I applied Nicolas Burrus's equation, but the two images ended up even further apart. Is there something wrong with my code?
cx_d = 3.3930780975300314e+02
cy_d = 2.4273913761751615e+02
fx_d = 5.9421434211923247e+02
fy_d = 5.9104053696870778e+02
fx_rgb = 5.2921508098293293e+02
fy_rgb = 5.2556393630057437e+02
cx_rgb = 3.2894272028759258e+02
cy_rgb = 2.6748068171871557e+02
RR = np.array([
    [0.999985794494467, -0.003429138557773, 0.00408066391266],
    [0.003420377768765, 0.999991835033557, 0.002151948451469],
    [-0.004088009930192, -0.002137960469802, 0.999989358593300]
])
TT = np.array([ 1.9985242312092553e-02, -7.4423738761617583e-04,-1.0916736334336222e-02 ])
# uu, vv are indices in depth image
def depth_to_xyz_and_rgb(uu , vv):
    # get z value in meters
    pcz = depthLookUp[depths[vv , uu]]
    # compute x,y values in meters
    pcx = (uu - cx_d) * pcz / fx_d
    pcy = (vv - cy_d) * pcz / fy_d
    # apply extrinsic calibration
    P3D = np.array( [pcx , pcy , pcz] )
    P3Dp = np.dot(RR , P3D) - TT
    # rgb indexes that P3D should match
    uup = P3Dp[0] * fx_rgb / P3Dp[2] + cx_rgb
    vvp = P3Dp[1] * fy_rgb / P3Dp[2] + cy_rgb
    # return a point in point cloud and its corresponding color indices
    return P3D , uup , vvp
Is there anything I did wrong? Any help is appreciated.
First, check your calibration numbers. Your rotation matrix is approximately the identity and, assuming your calibration frame is metric, your translation vector says that the second camera is 2 centimeters to the side and one centimeter displaced in depth. Does that approximately match your setup? If not, you may be working with the wrong scaling (likely using a wrong number for the characteristic size of your calibration target - a checkerboard?).
Your code looks correct - you are re-projecting a pixel of the depth camera at a known depth, and then projecting it back into the second camera to get the corresponding RGB value.
One thing I would check is whether you're using your coordinate transform in the right direction. IIRC, OpenCV produces it as [R | t], but you are using it as [R | -t], which looks suspicious. Perhaps you meant to use its inverse, which would be [R' | -R'*t ], where I use the apostrophe to mean transposition.
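The difference between [R | -t] and the true inverse [R' | -R'*t] can be checked numerically. The sketch below uses a made-up rotation of 10 degrees about the z axis and an arbitrary translation (not the calibration values from the question) and only numpy.

```python
import numpy as np

# Hypothetical rigid transform: rotation about z by 10 degrees plus a small translation.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.02, -0.001, -0.011])

def forward(p):
    """Apply [R | t]."""
    return R @ p + t

def inverse(q):
    """Apply the inverse [R' | -R' t], written as R' (q - t)."""
    return R.T @ (q - t)

p = np.array([0.3, -0.2, 1.5])
q = forward(p)

p_back = inverse(q)    # recovers p
p_wrong = R @ q - t    # the [R | -t] form does not undo the transform

print(np.allclose(p_back, p))   # True
print(np.allclose(p_wrong, p))  # False
```

Which direction the calibration describes (depth to RGB or RGB to depth) depends on how it was produced, so it may be worth trying both forward and inverse and comparing the alignment.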
I have a picture taken from a front view, and I want to turn it into a bird's-eye view.
For each point (x, y) in the rectangle, I want to calculate the transformed (x, y) in the trapezoid.
There must be a formula for this transformation, given x and y and the angle of the trapezoid (a).
I am programming in C and using opencv.
Thanks a lot in advance.
Did you consider a homography transform? It is used to create or correct perspective in an image, and I think it is exactly what you want.
With OpenCV, you can use cv::findHomography(). Its arguments are the 4 initial points (the vertices of your rectangle) and the 4 final points (the vertices of the trapezoid). You get a transformation matrix that you can then use with cv::warpPerspective() (for whole images) or cv::perspectiveTransform() (for individual points).
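As for the per-point formula: with a 3x3 homography H, a point (x, y) maps to ((h11 x + h12 y + h13) / w, (h21 x + h22 y + h23) / w), where w = h31 x + h32 y + h33. The sketch below solves for H from four correspondences with plain numpy, fixing h33 to 1, and applies it to single points. The rectangle and trapezoid corners are made-up values and the function names are hypothetical; this is a stand-in for what findHomography computes, not its actual implementation.

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Solve the 8 unknowns of H (with H[2, 2] fixed to 1) from 4 point pairs."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u * (h31 x + h32 y + 1) = h11 x + h12 y + h13, and likewise for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def map_point(H, x, y):
    """Apply H to one point and divide by the homogeneous coordinate."""
    xh, yh, w = H @ np.array([x, y, 1.0])
    return xh / w, yh / w

# Made-up example: a 4x3 rectangle mapped to a trapezoid with a narrower top edge.
rect = [(0, 0), (4, 0), (4, 3), (0, 3)]
trap = [(1, 0), (3, 0), (4, 3), (0, 3)]

H = homography_from_4_points(rect, trap)
mapped = [map_point(H, x, y) for x, y in rect]
print(np.round(mapped, 6))
```

Any interior point of the rectangle can be passed to map_point in the same way, which gives the transformed coordinates without warping a whole image.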
I found a way to solve your problem.
Here is the code I used:
Importing the required packages:
import cv2
import numpy as np
Reading the image to be used:
filename = '1.jpg'
img = cv2.imread(filename)
cv2.imwrite('img.jpg',img)
Storing the height and width of the image in separate variables:
ih, iw, _ = img.shape
Creating a black window whose size is bigger than that of the image and storing its height and width in separate variables:
black = np.zeros((ih + 300, iw + 300, 3), np.uint8)
cv2.imwrite('black.jpg',black)
bh, bw, _ = black.shape
Storing the 4 corner points of the image in an array:
pts_src = np.array([[0.0, 0.0],[float(iw), 0.0],[float(iw), float(ih)],[0.0,float(ih)]])
Storing the 4 corner points of the trapezoid to be obtained:
pts_dst = np.array([[bw * 0.25, 0],[bw * 0.75, 0.0],[float(bw), float(bh)],[0.0,float(bh)]])
Calculating the homography matrix using pts_src and pts_dst:
h, status = cv2.findHomography(pts_src, pts_dst)
Warping the given rectangular image into the trapezoid:
im_out = cv2.warpPerspective(img, h, (black.shape[1],black.shape[0]))
cv2.imwrite("im_outImage.jpg", im_out)
cv2.waitKey(0)
cv2.destroyAllWindows()
If you alter the values in the array pts_dst you will be able to get different kinds of quadrilaterals.