I'm changing an image from a front perspective to a bird's-eye view by using findHomography and warpPerspective.
It works in that the image warps to the desired perspective, but the crop is off: most of the warped image ends up outside the image box. I assume the reason is that the operation results in negative coordinates.
I calculated the points for the transformation matrix manually rather than with any of OpenCV's functions for doing that, since e.g. the chessboard functions failed to detect the proper points.
I guess this can be fixed by doing additional changes to the transformation matrix. But how is that done? Also, is there a way to make sure the transformed image is centered along the x-axis and then let the y-axis be adjusted to a desired position?
Code snippet that does the job now:
cv::Mat image; // image is loaded with the original image
cv::Mat warpPers; // The container for the resulting image
cv::Mat H;
std::vector<cv::Point2f> src;
std::vector<cv::Point2f> dst;
// In reality several more points.
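H = cv::findHomography(src, dst, CV_RANSAC);
// The warp call was not shown; the output size is assumed to match the input here.
cv::warpPerspective(image, warpPers, H, image.size());
cv::namedWindow("Warped persp", cv::WINDOW_AUTOSIZE );
cv::imshow( "Warped persp", warpPers);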
OpenCV gives a very convenient way to do a perspective transform. The only thing you have to take care of is the homography returned by findHomography.
Indeed, some points of the image you provide may land in the negative part of the x or y axis.
So you have to do some checks before warping the image.
step 1: find the homography H with findHomography
you will get a classic structure for homography
H = [ h00, h01, h02;
h10, h11, h12;
h20, h21, 1];
step 2: find the positions of the image's corners after warping
Let me define the order of the corners, written as (x, y) for an image of width w and height h:
(0,0) ________ (w,0)
|              |
(0,h) ________ (w,h)
To do that, just create a matrix like that:
P = [0, w, w, 0;
0, 0, h, h;
1, 1, 1, 1]
Make the product with H and get the warped coordinates:
P' = H * P
step 3: find the minimum and maximum x and y of these 4 new points and get the size of the warped image
After the product you will have something like this:
P' = [s1*x1, s2*x2, s3*x3, s4*x4;
s1*y1, s2*y2, s3*y3, s4*y4;
s1 , s2 , s3 , s4]
To obtain valid coordinates, divide rows 1 and 2 element-wise by row 3.
Then take the minimum over the columns of the first row (minx) and of the second row (miny); cvReduce can do this.
To find the bounding box that will contain the image (i.e. the dimensions of the dst matrix for the warpPerspective function), take the maximum of each row the same way: maxx for the first row and maxy for the second.
So the size of the warped image should be cvSize(maxx-minx, maxy-miny)
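step 4: add a correction to the homography
Check whether minx and/or miny are negative. If so, translate the warped result by -minx and/or -miny, which means left-multiplying H by a translation matrix T:
T = [ 1, 0, -minx; //use 0 instead of -minx if minx >= 0
0, 1, -miny; //use 0 instead of -miny if miny >= 0
0, 0, 1];
H = T * H
This subtracts minx from h02 and miny from h12. When h20 or h21 is non-zero it also changes the first two columns (for example h00 becomes h00 - minx*h20), so editing h02 and h12 alone is exact only for an affine H.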
step 5: warp the image
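Steps 2 to 4 can be sketched in plain C++ without any OpenCV types. This is only an illustration under stated assumptions: Mat3, Pt, Box and the function names are hypothetical helpers, H is a row-major 3x3 array, and the correction is applied as a left-multiplication by a translation so that it stays exact when h20 or h21 is non-zero.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;  // row-major 3x3 homography
struct Pt { double x, y; };
struct Box { double minx, miny, maxx, maxy; };

// Apply H to (x, y) and divide by the third homogeneous coordinate.
Pt project(const Mat3& H, double x, double y) {
    const double s = H[2][0] * x + H[2][1] * y + H[2][2];
    assert(s != 0.0);  // a point mapped to infinity cannot be normalised
    return { (H[0][0] * x + H[0][1] * y + H[0][2]) / s,
             (H[1][0] * x + H[1][1] * y + H[1][2]) / s };
}

// Bounding box of the four warped corners of a w x h image.
Box warped_bounds(const Mat3& H, double w, double h) {
    const Pt c[4] = { project(H, 0, 0), project(H, w, 0),
                      project(H, w, h), project(H, 0, h) };
    Box b{ c[0].x, c[0].y, c[0].x, c[0].y };
    for (const Pt& p : c) {
        b.minx = std::min(b.minx, p.x);  b.maxx = std::max(b.maxx, p.x);
        b.miny = std::min(b.miny, p.y);  b.maxy = std::max(b.maxy, p.y);
    }
    return b;
}

// T * H for a translation T by (tx, ty): row0 += tx*row2, row1 += ty*row2.
Mat3 translate(const Mat3& H, double tx, double ty) {
    Mat3 R = H;
    for (int j = 0; j < 3; ++j) {
        R[0][j] += tx * H[2][j];
        R[1][j] += ty * H[2][j];
    }
    return R;
}

// Homography whose warped bounding box starts at (0, 0).
Mat3 shift_into_view(const Mat3& H, double w, double h) {
    const Box b = warped_bounds(H, w, h);
    return translate(H, -b.minx, -b.miny);
}
```

The width and height of the box returned by warped_bounds give the output size for the warp. Unlike step 4, this sketch shifts unconditionally, so a positive minimum is also moved to 0.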
I think the question "OpenCV warpperspective" is similar to this one, "cv::warpPerspective only shows part of warped image", so I give my answer here as well.
Try the homography_warp below.
void homography_warp(const cv::Mat& src, const cv::Mat& H, cv::Mat& dst);
src is the source image.
H is your homography.
dst is the warped image.
homography_warp adjusts your homography as described in another answer.
// Convert a vector of non-homogeneous 2D points to a vector of homogeneous 2D points.
void to_homogeneous(const std::vector< cv::Point2f >& non_homogeneous, std::vector< cv::Point3f >& homogeneous)
{
    homogeneous.resize(non_homogeneous.size());
    for (size_t i = 0; i < non_homogeneous.size(); i++) {
        homogeneous[i].x = non_homogeneous[i].x;
        homogeneous[i].y = non_homogeneous[i].y;
        homogeneous[i].z = 1.0;
    }
}
// Convert a vector of homogeneous 2D points to a vector of non-homogeneous 2D points.
void from_homogeneous(const std::vector< cv::Point3f >& homogeneous, std::vector< cv::Point2f >& non_homogeneous)
{
    non_homogeneous.resize(homogeneous.size());
    for (size_t i = 0; i < non_homogeneous.size(); i++) {
        non_homogeneous[i].x = homogeneous[i].x / homogeneous[i].z;
        non_homogeneous[i].y = homogeneous[i].y / homogeneous[i].z;
    }
}
// Transform a vector of 2D non-homogeneous points via a homography.
std::vector<cv::Point2f> transform_via_homography(const std::vector<cv::Point2f>& points, const cv::Matx33f& homography)
{
    std::vector<cv::Point3f> ph;
    to_homogeneous(points, ph);
    for (size_t i = 0; i < ph.size(); i++) {
        ph[i] = homography*ph[i];
    }
    std::vector<cv::Point2f> r;
    from_homogeneous(ph, r);
    return r;
}
// Find the bounding box of a vector of 2D non-homogeneous points.
cv::Rect_<float> bounding_box(const std::vector<cv::Point2f>& p)
{
    float x_min = std::min_element(p.begin(), p.end(), [](const cv::Point2f& lhs, const cv::Point2f& rhs) {return lhs.x < rhs.x; })->x;
    float x_max = std::max_element(p.begin(), p.end(), [](const cv::Point2f& lhs, const cv::Point2f& rhs) {return lhs.x < rhs.x; })->x;
    float y_min = std::min_element(p.begin(), p.end(), [](const cv::Point2f& lhs, const cv::Point2f& rhs) {return lhs.y < rhs.y; })->y;
    float y_max = std::max_element(p.begin(), p.end(), [](const cv::Point2f& lhs, const cv::Point2f& rhs) {return lhs.y < rhs.y; })->y;
    return cv::Rect_<float>(x_min, y_min, x_max - x_min, y_max - y_min);
}
// Warp the image src into the image dst through the homography H.
// The resulting dst image contains the entire warped image; this is
// the same behaviour as Octave's imperspectivewarp (in the 'image'
// package) when the argument bbox is equal to 'loose'.
void homography_warp(const cv::Mat& src, const cv::Mat& H, cv::Mat& dst)
{
    std::vector< cv::Point2f > corners;
    corners.push_back(cv::Point2f(0, 0));
    corners.push_back(cv::Point2f(src.cols, 0));
    corners.push_back(cv::Point2f(0, src.rows));
    corners.push_back(cv::Point2f(src.cols, src.rows));
    std::vector< cv::Point2f > projected = transform_via_homography(corners, H);
    cv::Rect_<float> bb = bounding_box(projected);
    // Translate so that the top-left corner of the bounding box lands on (0, 0).
    cv::Mat_<double> translation = (cv::Mat_<double>(3, 3) << 1, 0, -bb.tl().x, 0, 1, -bb.tl().y, 0, 0, 1);
    cv::warpPerspective(src, dst, translation*H, bb.size());
}
If I understood correctly, the question basically asks for a method to calculate the correct offset for translating the warped image. I will explain how to get the right offset. The idea is that matching features in the two images should have the same coordinates in the final stitched image.
Let's say we refer images as follows:
'source image' (si): the image which needs to be warped
'destination image' (di): the image to whose perspective 'source image' will be warped
'warped source image' (wsi): the source image after warping it to the destination image's perspective
Here is what you need to do to calculate the offset for translation:
After you have sampled the good matches and found the mask from the homography, store the best match's keypoint (the one with the minimum distance that is also an inlier, i.e. has the value 1 in the mask obtained from the homography calculation) in si and di. Let's say the best match's keypoints in si and di are bm_si and bm_di respectively.
bm_si = [x1, y1, 1]
bm_di = [x2, y2, 1]
Find the position of bm_si in wsi by multiplying it with the homography matrix (H) and normalizing.
bm_wsi = np.dot(H, bm_si)
bm_wsi = [x/bm_wsi[2] for x in bm_wsi]
Depending on where you will be placing di on the output of the si warping (= wsi), adjust bm_di.
For example, if you are warping from the left image to the right image (so that the left image is si and the right image is di), you will be placing di on the right side of wsi, hence bm_di[0] += si.shape[1] (the width of si).
Now, after the above steps:
x_offset = bm_di[0] - bm_wsi[0]
y_offset = bm_di[1] - bm_wsi[1]
Using the calculated offset, find the new homography matrix and warp si.
T = np.array([[1, 0, x_offset], [0, 1, y_offset], [0, 0, 1]])
translated_H = np.dot(T, H)
wsi_frame_size = (2*si.shape[1], 2*si.shape[0])  # (width, height)
stitched = cv2.warpPerspective(si, translated_H, wsi_frame_size)
stitched[0:si.shape[0],si.shape[1]:] = di
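The offset idea can be checked numerically with a small C++ sketch. It is an illustration only: Mat3, Pt, project and align_match are hypothetical names, and the offset is measured from the warped position of the keypoint (bm_wsi), so that T*H sends bm_si exactly onto the adjusted bm_di.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;  // row-major 3x3 homography
struct Pt { double x, y; };

// Apply H to p and divide by the third homogeneous coordinate.
Pt project(const Mat3& H, Pt p) {
    const double s = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
    assert(s != 0.0);  // a point mapped to infinity cannot be normalised
    return { (H[0][0] * p.x + H[0][1] * p.y + H[0][2]) / s,
             (H[1][0] * p.x + H[1][1] * p.y + H[1][2]) / s };
}

// Return T * H, where T translates the warped keypoint onto bm_di.
Mat3 align_match(const Mat3& H, Pt bm_si, Pt bm_di) {
    const Pt bm_wsi = project(H, bm_si);     // keypoint position after warping
    const double x_offset = bm_di.x - bm_wsi.x;
    const double y_offset = bm_di.y - bm_wsi.y;
    Mat3 R = H;
    for (int j = 0; j < 3; ++j) {            // left-multiply by the translation
        R[0][j] += x_offset * H[2][j];
        R[1][j] += y_offset * H[2][j];
    }
    return R;
}
```

Projecting bm_si through the returned matrix should give bm_di again, which is a convenient sanity check before warping the whole image.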
I'm currently having trouble understanding what is necessary to transform a cv::RotatedRect after rotating an image without cropping, using the following code by Lars Schillingmann from this question.
Here's the code he provided as answer:
#include "opencv2/opencv.hpp"
int main()
{
    cv::Mat src = cv::imread("im.png", CV_LOAD_IMAGE_UNCHANGED);
    double angle = -45;
    // get rotation matrix for rotating the image around its center in pixel coordinates
    cv::Point2f center((src.cols-1)/2.0, (src.rows-1)/2.0);
    cv::Mat rot = cv::getRotationMatrix2D(center, angle, 1.0);
    // determine bounding rectangle, center not relevant
    cv::Rect2f bbox = cv::RotatedRect(cv::Point2f(), src.size(), angle).boundingRect2f();
    // adjust transformation matrix
    rot.at<double>(0,2) += bbox.width/2.0 - src.cols/2.0;
    rot.at<double>(1,2) += bbox.height/2.0 - src.rows/2.0;
    cv::Mat dst;
    cv::warpAffine(src, dst, rot, bbox.size());
    cv::imwrite("rotated_im.png", dst);
    return 0;
}
In my case, I have a cv::RotatedRect which matches a certain position in the src image. This cv::RotatedRect should match the same position after the transformation/rotation has been applied to the src mat. Currently, I'm struggling to do it the right way.
From what I know, to rotate a cv::RotatedRect it is only necessary to directly modify the members of the structure, e.g. angle. I'm quite sure that I only have to modify the center, but the new position is always a bit off from the expected location. I initially expected that I only had to add the difference between the bbox and src dimensions to get what I'm looking for, but that turns out not to be the case (including the rotation, of course).
connected_components[i].center.x += ...
connected_components[i].center.y += ...
cv::RotatedRect newRect(connected_components[i].center, connected_components[i].size, connected_components[i].angle- median);
The answer is quite simple. We can reuse the transformation matrix (the adjusted rot from the question) to transform points with cv::transform. Sample code is below:
cv::Point2f points[4];
connected_components[i].points(points); // corners of the cv::RotatedRect; the variable name is taken from the question
std::vector<cv::Point2f> old_points;
old_points.insert(old_points.begin(), std::begin(points), std::end(points));
std::vector<cv::Point2f> new_points;
cv::transform(old_points, new_points, rotation_matrix);
for (unsigned int j = 0; j < 4; ++j) {
    cv::line(dest, new_points[j], new_points[(j + 1) % 4], cv::Scalar(0, 255, 0));
}
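To see what applying the 2x3 matrix to a point amounts to, here is a minimal sketch in plain C++. The names Pt, Affine, rotation_matrix and transform_points are hypothetical stand-ins, not OpenCV API. The matrix layout follows the formula documented for cv::getRotationMatrix2D (angle in degrees, positive meaning counter-clockwise with the origin at the top-left).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct Pt { double x, y; };
using Affine = std::array<std::array<double, 3>, 2>;  // 2x3, row-major

// Rotation about `center` by angle_deg with isotropic scale.
Affine rotation_matrix(Pt center, double angle_deg, double scale) {
    assert(scale > 0.0);  // a non-positive scale would flip or collapse the image
    const double pi = std::acos(-1.0);
    const double a = scale * std::cos(angle_deg * pi / 180.0);
    const double b = scale * std::sin(angle_deg * pi / 180.0);
    return {{ { a, b, (1.0 - a) * center.x - b * center.y },
              { -b, a, b * center.x + (1.0 - a) * center.y } }};
}

// Apply the affine map to every point: [x'; y'] = M * [x; y; 1].
std::vector<Pt> transform_points(const std::vector<Pt>& pts, const Affine& M) {
    std::vector<Pt> out;
    out.reserve(pts.size());
    for (const Pt& p : pts) {
        out.push_back({ M[0][0] * p.x + M[0][1] * p.y + M[0][2],
                        M[1][0] * p.x + M[1][1] * p.y + M[1][2] });
    }
    return out;
}
```

If the image was warped with a matrix whose translation column was adjusted for the enlarged bounding box, the same offsets have to be added to M[0][2] and M[1][2] before transforming the rectangle's corners. Otherwise the points land where they would in the cropped rotation.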
I am writing my thesis, and one part of the task is to interpolate between images to create intermediate images. The work has to be done in C++ using OpenCV 2.4.13.
The best solution I've found so far is computing optical flow and remapping. But this solution has two problems that I am unable to solve on my own:
There are pixels that should go out of view (bottom of image for example), but they do not.
Some pixels do not move, creating a distorted result (upper right part of the couch)
What has made the flow&remap approach better:
Equalizing the intensity. This I'm allowed to do. You can check the result by comparing the couch form (centre of the remapped image and the original).
Reducing the size of the image. This I'm NOT allowed to do, as I need the same size output. Is there a way to rescale the optical flow result to get the bigger remapped image?
Other approaches tried and failed:
cuda::interpolateFrames. Creates incredible ghosting.
blending images with cv::addWeighted. Even worse ghosting.
Below is the code I am using at the moment. And images: dropbox link with input and result images
int main(){
cv::Mat second, second_gray, cutout, cutout_gray, flow_n;
second = cv::imread( "/home/zuze/Desktop/forstack/second_L.jpg", 1 );
cutout = cv::imread("/home/zuze/Desktop/forstack/cutout_L.png", 1);
cvtColor(second, second_gray, CV_BGR2GRAY);
cvtColor(cutout, cutout_gray, CV_BGR2GRAY ); // imread returns BGR, not RGB
///----------COMPUTE OPTICAL FLOW AND REMAP -----------///
cv::calcOpticalFlowFarneback( second_gray, cutout_gray, flow_n, 0.5, 3, 15, 3, 5, 1.2, 0 );
cv::Mat remap_n; //looks like it's drunk.
createNewFrame(remap_n, flow_n, 1, second, cutout );
cv::Mat cflow_n;
cflow_n = cutout_gray;
cvtColor(cflow_n, cflow_n, CV_GRAY2BGR);
drawOptFlowMap(flow_n, cflow_n, 10, CV_RGB(0,255,0));
cv::Mat cutout_eq, second_eq;
cutout_eq= equalizeIntensity(cutout);
second_eq= equalizeIntensity(second);
cv::Mat flow_eq, cutout_eq_gray, second_eq_gray, cflow_eq;
cvtColor( cutout_eq, cutout_eq_gray, CV_BGR2GRAY );
cvtColor( second_eq, second_eq_gray, CV_BGR2GRAY );
cv::calcOpticalFlowFarneback( second_eq_gray, cutout_eq_gray, flow_eq, 0.5, 3, 15, 3, 5, 1.2, 0 );
cv::Mat remap_eq;
createNewFrame(remap_eq, flow_eq, 1, second, cutout_eq );
cflow_eq = cutout_eq_gray;
cvtColor(cflow_eq, cflow_eq, CV_GRAY2BGR);
drawOptFlowMap(flow_eq, cflow_eq, 10, CV_RGB(0,255,0));
cv::imshow("remap_n", remap_n);
cv::imshow("remap_eq", remap_eq);
cv::imshow("cflow_eq", cflow_eq);
cv::imshow("cflow_n", cflow_n);
cv::imshow("sec_eq", second_eq);
cv::imshow("cutout_eq", cutout_eq);
cv::imshow("cutout", cutout);
cv::imshow("second", second);
return 0;
}
Function for remapping, to be used for intermediate image creation:
void createNewFrame(cv::Mat & frame, const cv::Mat & flow, float shift, cv::Mat & prev, cv::Mat &next){
    cv::Mat mapX(flow.size(), CV_32FC1);
    cv::Mat mapY(flow.size(), CV_32FC1);
    cv::Mat newFrame;
    for (int y = 0; y < mapX.rows; y++){
        for (int x = 0; x < mapX.cols; x++){
            cv::Point2f f = flow.at<cv::Point2f>(y, x);
            mapX.at<float>(y, x) = x + f.x*shift;
            mapY.at<float>(y, x) = y + f.y*shift;
        }
    }
    remap(next, newFrame, mapX, mapY, cv::INTER_LANCZOS4);
    frame = newFrame;
}
Function to display optical flow in vector form:
void drawOptFlowMap (const cv::Mat& flow, cv::Mat& cflowmap, int step, const cv::Scalar& color) {
    cv::Point2f sum; //zz
    std::vector<float> all_angles;
    int count=0; //zz
    float angle, sum_angle=0; //zz
    for(int y = 0; y < cflowmap.rows; y += step)
        for(int x = 0; x < cflowmap.cols; x += step)
        {
            const cv::Point2f& fxy = flow.at<cv::Point2f>(y, x);
            if((fxy.x != fxy.x)||(fxy.y != fxy.y)){ //zz, NaN check for SimpleFlow
                //std::cout<<"meh"; //do nothing
            }
            else{
                line(cflowmap, cv::Point(x,y), cv::Point(cvRound(x+fxy.x), cvRound(y+fxy.y)),color);
                circle(cflowmap, cv::Point(cvRound(x+fxy.x), cvRound(y+fxy.y)), 1, color, -1);
                sum +=fxy;//zz
                angle = atan2(fxy.y,fxy.x);
                sum_angle +=angle;
                count++; //zz
            }
        }
}
Function to equalize intensity of images, for better results:
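cv::Mat equalizeIntensity(const cv::Mat& inputImage){
    if(inputImage.channels() >= 3){
        cv::Mat ycrcb;
        cv::cvtColor(inputImage, ycrcb, CV_BGR2YCrCb); // equalize the luma channel only
        std::vector<cv::Mat> channels;
        cv::split(ycrcb, channels);
        cv::equalizeHist(channels[0], channels[0]);
        cv::Mat result;
        cv::merge(channels, ycrcb);
        cv::cvtColor(ycrcb, result, CV_YCrCb2BGR);
        return result;
    }
    return cv::Mat();
}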
So to recap, my questions:
Is it possible to resize the Farneback optical flow so that it applies to a 2x bigger image?
How to deal with pixels that go out of view like at the bottom of my images (the brown wooden part should disappear).
How to deal with distortion that is created because optical flow wasn't computed for those pixels, while many pixels around there have motion? (couch upper right, & lion figurine has a ghost hand in the remapped image).
With OpenCV's Farneback optical flow you only get a rough estimate of pixel displacement, hence the distortions that appear in the result images.
IMHO, optical flow is not the way to go for what you are trying to achieve. Instead, I'd recommend having a look at image/pixel registration.
Image/pixel registration is the science of matching the pixels of two images. It is a complex, non-trivial subject that is still under active research and is not yet accurately solved.
I have the following problem. I'm searching for eyes within an image using HaarClassifiers. Due to the rotation of the head I'm trying to find eyes within different angles. For that, I rotate the image by different angles. For rotating the frame, I use the code (written in C++):
Point2i rotCenter;
rotCenter.x = scaledFrame.cols / 2;
rotCenter.y = scaledFrame.rows / 2;
Mat rotationMatrix = getRotationMatrix2D(rotCenter, angle, 1);
warpAffine(scaledFrame, scaledFrame, rotationMatrix, Size(scaledFrame.cols, scaledFrame.rows));
This works fine and I am able to extract two ROI Rectangles for the eyes. So, I have the top/left coordinates of each ROI as well as their width and height. However, these coordinates are the coordinates in the rotated image. I don't know how I can backproject this rectangle onto the original frame.
Assume I have obtained the eye-pair ROIs for the unscaled frame (full_image), but still rotated:
eye0_roi and eye1_roi
How can I rotate them back, such that they map their correct position?
You can use invertAffineTransform to get the inverse matrix and use it to rotate a point back:
Mat RotateImg(const Mat& img, double angle, Mat& invertMat)
{
    Point center = Point( img.cols/2, img.rows/2);
    double scale = 1;
    Mat warpMat = getRotationMatrix2D( center, angle, scale );
    Mat dst = Mat(img.size(), CV_8U, Scalar(128));
    warpAffine( img, dst, warpMat, img.size(), 1, 0, Scalar(255, 255, 255));
    invertAffineTransform(warpMat, invertMat);
    return dst;
}
Point RotateBackPoint(const Point& dstPoint, const Mat& invertMat)
{
    cv::Point orgPoint;
    orgPoint.x = invertMat.at<double>(0,0)*dstPoint.x + invertMat.at<double>(0,1)*dstPoint.y + invertMat.at<double>(0,2);
    orgPoint.y = invertMat.at<double>(1,0)*dstPoint.x + invertMat.at<double>(1,1)*dstPoint.y + invertMat.at<double>(1,2);
    return orgPoint;
}
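For readers who want to see the arithmetic behind the inverse, here is a minimal plain C++ sketch of inverting a 2x3 affine matrix and mapping a point back. Pt, Affine, apply and invert_affine are hypothetical names used for illustration; this is not OpenCV's implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Pt { double x, y; };
using Affine = std::array<std::array<double, 3>, 2>;  // 2x3 matrix [A | t]

// [x'; y'] = A * [x; y] + t
Pt apply(const Affine& M, Pt p) {
    return { M[0][0] * p.x + M[0][1] * p.y + M[0][2],
             M[1][0] * p.x + M[1][1] * p.y + M[1][2] };
}

// Inverse of [A | t] is [A^-1 | -A^-1 * t].
Affine invert_affine(const Affine& M) {
    const double det = M[0][0] * M[1][1] - M[0][1] * M[1][0];
    assert(det != 0.0);  // a singular matrix has no inverse
    const double i00 =  M[1][1] / det, i01 = -M[0][1] / det;
    const double i10 = -M[1][0] / det, i11 =  M[0][0] / det;
    return {{ { i00, i01, -(i00 * M[0][2] + i01 * M[1][2]) },
              { i10, i11, -(i10 * M[0][2] + i11 * M[1][2]) } }};
}
```

For an axis-aligned ROI found in the rotated image, all four corners would be mapped back this way. The result in the original frame is in general a rotated quadrilateral, not an upright rectangle.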
I have the gradients from the Sobel operator for each pixel (in my case the image is 320x480). But how can I relate them to the orientation? For example, I'm planning to draw an orientation map for fingerprints. So, how do I start?
Is it by dividing the gradients into blocks (for example 16x24), then adding the gradients together and dividing the sum by 384 to get the average gradient? And then drawing a line from the center of the block using the average gradient?
Correct me if I'm wrong. Thank you.
Here is the code that I used to find the gradients:
cv::Mat original_Mat=cv::imread("original.bmp", 1);
cv::Mat grad = cv::Mat::zeros(original_Mat.size(), CV_64F);
cv::Mat grad_x = cv::Mat::zeros(original_Mat.size(), CV_64F);
cv::Mat grad_y = cv::Mat::zeros(original_Mat.size(), CV_64F);
/// Gradient X
cv::Sobel(original_Mat, grad_x, CV_16S, 1, 0, 3);
/// Gradient Y
cv::Sobel(original_Mat, grad_y, CV_16S, 0, 1, 3);
short* pixelX = grad_x.ptr<short>(0);
short* pixelY = grad_y.ptr<short>(0);
int count = 0;
int min = 999999;
int max = -1;
int a=0,b=0;
for(int i = 0; i < grad_x.rows * grad_x.cols; i++)
{
    double directionRAD = atan2(pixelY[i], pixelX[i]);
    int directionDEG = (int)(180 + directionRAD / CV_PI * 180);
    //printf("%d ",directionDEG);
    if(directionDEG < min){min = directionDEG;}
    if(directionDEG > max){max = directionDEG;}
    if(directionDEG < 0 || directionDEG > 360)
    {
        cout<<"Weird gradient direction given in method: getGradients.";
    }
}
There are several ways to visualize an orientation map:
As you suggested, you could draw it block-wise, but then you would have to be careful about "averaging" the directions. For example, what happens if you average the directions 0° and 180°?
More commonly, the direction is simply mapped to a grey value. This would visualize the gradient per pixel. For example as:
int v = (int)(128+directionRAD / CV_PI * 128);
(Disclaimer: not 100% sure about the 128, one of them might actually have to be a 127...)
Or you could map the x and y gradient magnitudes to the r and g components, respectively, ideally after normalizing the gradient vector to length 1. Assuming normX to be the normalized gradient in the x direction with values between -1 and 1:
int red = (int)((normX + 1)*127.5);
int green= (int)((normY + 1)*127.5);
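On the block-wise option and the question of what happens when 0° and 180° are averaged: one common way around it, sketched here as an assumption and not taken from the answer above, is to double the angle of each gradient before summing. Opposite gradients then reinforce each other and no longer cancel. The function name block_orientation is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Average orientation of a block, in radians within [0, pi).
// For a gradient g = (gx, gy) with angle t, the pair
// (gx*gx - gy*gy, 2*gx*gy) equals |g|^2 * (cos 2t, sin 2t),
// so g and -g contribute the same vector to the sum.
double block_orientation(const std::vector<double>& gx,
                         const std::vector<double>& gy) {
    assert(gx.size() == gy.size());
    double sum_cos2 = 0.0, sum_sin2 = 0.0;
    for (std::size_t i = 0; i < gx.size(); ++i) {
        sum_cos2 += gx[i] * gx[i] - gy[i] * gy[i];
        sum_sin2 += 2.0 * gx[i] * gy[i];
    }
    const double pi = std::acos(-1.0);
    double theta = 0.5 * std::atan2(sum_sin2, sum_cos2);  // within [-pi/2, pi/2]
    if (theta < 0.0) theta += pi;                          // fold into [0, pi)
    return theta;
}
```

The returned angle is the dominant gradient direction modulo 180°. For a fingerprint map the ridge direction would be drawn perpendicular to it.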
Averaging depends on the Sobel kernel size.
It is better to use CV_32F or CV_64F instead of CV_16S for the results.
You can also speed up your code using the cv::phase method.
See my answer here: Sobel operator for gradient angle
I'm looking to undistort an image using the distortion coefficients that I've computed for my camera, without changing the camera matrix. This is exactly what undistort() does, but I wanted to draw the output to a larger canvas image.
When I tried this:
Mat drawtransform = getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, size, 1.0, size * 2);
undistort(inputimage, undistorted, cameraMatrix, distCoeffs, drawtransform);
It still wrote out the same sized image, but only the top left quarter of the scaled-up-by-two undistorted result. Like the documentation says, undistort writes into a target image of the same size.
It's pretty obvious that I can just go copy out and reimplement a slightly tweaked version of undistort() but I am having some trouble understanding what it is doing. Here's the source:
void cv::undistort( InputArray _src, OutputArray _dst, InputArray _cameraMatrix,
                    InputArray _distCoeffs, InputArray _newCameraMatrix )
{
    Mat src = _src.getMat(), cameraMatrix = _cameraMatrix.getMat();
    Mat distCoeffs = _distCoeffs.getMat(), newCameraMatrix = _newCameraMatrix.getMat();
    _dst.create( src.size(), src.type() );
    Mat dst = _dst.getMat();
    CV_Assert( dst.data != src.data );
    int stripe_size0 = std::min(std::max(1, (1 << 12) / std::max(src.cols, 1)), src.rows);
    Mat map1(stripe_size0, src.cols, CV_16SC2), map2(stripe_size0, src.cols, CV_16UC1);
    Mat_<double> A, Ar, I = Mat_<double>::eye(3,3);
    cameraMatrix.convertTo(A, CV_64F);
    if( distCoeffs.data )
        distCoeffs = Mat_<double>(distCoeffs);
    else
    {
        distCoeffs.create(5, 1, CV_64F);
        distCoeffs = 0.;
    }
    if( newCameraMatrix.data )
        newCameraMatrix.convertTo(Ar, CV_64F);
    else
        A.copyTo(Ar);
    double v0 = Ar(1, 2);
    for( int y = 0; y < src.rows; y += stripe_size0 )
    {
        int stripe_size = std::min( stripe_size0, src.rows - y );
        Ar(1, 2) = v0 - y;
        Mat map1_part = map1.rowRange(0, stripe_size),
            map2_part = map2.rowRange(0, stripe_size),
            dst_part = dst.rowRange(y, y + stripe_size);
        initUndistortRectifyMap( A, distCoeffs, I, Ar, Size(src.cols, stripe_size),
                                 map1_part.type(), map1_part, map2_part );
        remap( src, dst_part, map1_part, map2_part, INTER_LINEAR, BORDER_CONSTANT );
    }
}
About half of the lines here are for sanity checking and initializing input parameters. What I'm confused about is what's going on with map1 and map2. These names are sadly less descriptive than most. I must be missing some explanation, maybe it's tucked away in some introduction page, or under the doc for another function.
map1 is a two-channel signed short integer matrix and map2 is an unsigned short integer matrix. Both have stripe_size0 rows and src.cols columns, where stripe_size0 = min(max(1, 4096/width), height). The question is, why? What will these maps contain? What is the significance and purpose of this striping? What is the significance and purpose of the strange dimension of the stripes?
Use initUndistortRectifyMap to obtain the transformation at the scale you desire, then apply its output (the two matrices you mention) with remap.
Together the two maps give, for every pixel in the destination image, the (x, y) location in the source image to sample from.
You might want to read the description of the function remap. When both maps are CV_32FC1, map1 holds the x location in the source and map2 the y location. In the CV_16SC2 / CV_16UC1 form used by undistort, map1 holds the rounded (x, y) pair and map2 holds an index into a table of interpolation coefficients.
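The lookup semantics can be illustrated with a tiny nearest-neighbour version in plain C++. This is a sketch under assumptions, not OpenCV's remap. Image and remap_nearest are hypothetical names, the maps are the separate floating-point x and y form, and the destination may be larger than the source, which is what drawing onto a bigger canvas requires.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal single-channel, row-major image (a stand-in for cv::Mat).
struct Image {
    int rows, cols;
    std::vector<unsigned char> px;
    unsigned char at(int r, int c) const {
        return px[static_cast<std::size_t>(r) * cols + c];
    }
};

// For each destination pixel (x, y), read the source position from
// (map_x, map_y) and copy the nearest source pixel. Positions that fall
// outside the source are filled with `border`.
Image remap_nearest(const Image& src, int rows, int cols,
                    const std::vector<float>& map_x,
                    const std::vector<float>& map_y,
                    unsigned char border) {
    const std::size_t n = static_cast<std::size_t>(rows) * cols;
    assert(map_x.size() == n && map_y.size() == n);
    Image dst{ rows, cols, std::vector<unsigned char>(n, border) };
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * cols + x;
            const long sx = std::lround(map_x[i]);
            const long sy = std::lround(map_y[i]);
            if (sx >= 0 && sx < src.cols && sy >= 0 && sy < src.rows)
                dst.px[i] = src.at(static_cast<int>(sy), static_cast<int>(sx));
        }
    }
    return dst;
}
```

Filling the maps with x - 1 and y - 1 for a 4x4 destination, for instance, places a 2x2 source one pixel in from the top-left corner of the larger canvas. Everything else takes the border value.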
Without reading into it much, the striping could be a method of speeding up the transformation process.
Also, if you are looking to just scale your image to a larger dimension you could just re-size the output image.
double scaleX = 2.0;
double scaleY = 2.0;
cv::Mat undistortedScaled;
cv::resize(undistorted, undistortedScaled, cv::Size(0,0), scaleX, scaleY);