OpenCV Center Homography - c++

I am trying to create a stitching algorithm. I have been successful in creating it with a few tweaks needed. The photos below are examples of my stitching program so far. I am able to provide it with an unordered list of image (so long as image is in flight path or side by side it will work regardless of their orientation to one another.
The issue is if the images are reversed some of the image doesn't make it into the final product. Here is the code for the actual stitching. Assume that finding keypoints, matching, and homography is done correctly.
By altering this code is there a way to centre the first image to the destination blank image and still stitch to it. Also, I got this code on stack overflow (Opencv Image Stitching or Panorama ) and am not fully sure how it works and would love if someone could explain it.
Thanks for any help in advance!
Mat stitchMatches(Mat image1,Mat image2, Mat homography){
Mat result;
vector<Point2f> fourPoint;
//-Get the four corners of the first image (master)
fourPoint.push_back(Point2f (0,0));
fourPoint.push_back(Point2f (image1.size().width,0));
fourPoint.push_back(Point2f (0, image1.size().height));
fourPoint.push_back(Point2f (image1.size().width, image1.size().height));
Mat destination;
perspectiveTransform(Mat(fourPoint), destination, homography);
double min_x, min_y, tam_x, tam_y;
float min_x1, min_x2, min_y1, min_y2, max_x1, max_x2, max_y1, max_y2;
min_x1 = min(fourPoint.at(0).x, fourPoint.at(1).x);
min_x2 = min(fourPoint.at(2).x, fourPoint.at(3).x);
min_y1 = min(fourPoint.at(0).y, fourPoint.at(1).y);
min_y2 = min(fourPoint.at(2).y, fourPoint.at(3).y);
max_x1 = max(fourPoint.at(0).x, fourPoint.at(1).x);
max_x2 = max(fourPoint.at(2).x, fourPoint.at(3).x);
max_y1 = max(fourPoint.at(0).y, fourPoint.at(1).y);
max_y2 = max(fourPoint.at(2).y, fourPoint.at(3).y);
min_x = min(min_x1, min_x2);
min_y = min(min_y1, min_y2);
tam_x = max(max_x1, max_x2);
tam_y = max(max_y1, max_y2);
Mat Htr = Mat::eye(3,3,CV_64F);
if (min_x < 0){
tam_x = image2.size().width - min_x;
Htr.at<double>(0,2)= -min_x;
}
if (min_y < 0){
tam_y = image2.size().height - min_y;
Htr.at<double>(1,2)= -min_y;
}
result = Mat(Size(tam_x*2,tam_y*2), CV_32F);
warpPerspective(image2, result, Htr, result.size(), INTER_LINEAR, BORDER_CONSTANT, 0);
warpPerspective(image1, result, (Htr*homography), result.size(), INTER_LINEAR, BORDER_TRANSPARENT,0);
return result;`

It's normally easy to center an image; you simply create a bigger matrix padded with zeros (or whatever color you want), and define an ROI in the center with the same size of your image, and place it in there. However, you cannot in general do this with your two images. The problem is that if an image is shifted, or rotated, so that parts of it are outside your destination image bounds, then your returned warped image from warpPerspective is cut off at those bounds. What you need to do is create the padded image, insert the image that is not being warped wherever you like, and modify the transformation (homography, in this case) by adding in the translation to those pixels.
For example, if your centered image has it's top-left point at (400,500) in the padded image, then you need to add a translation of (400, 500) to your homography so the pixels get mapped to the correct space, and as long as your padded image is large enough, none of it will be cut off.
You will need to create a translational homography and compose it with your original homography to add the translation in. For example, suppose your anchor point for the non-warped image inside the padded image is at (x,y). Translation in an homography is given by the last two columns; if your homography is a 3x3 matrix H then (using normal mathematical indexing) H(1,3) is your translation in x and H(2,3) is the translation in y given by your homography. So we need to create a new identity homography H_t and add those translations in:
1 0 x
H_t = 0 1 y
0 0 1
Then you can compose this with your original homography H (using matrix multiplication): H_n = H_t * H. Using the new homography H_n we can warp the image into this padded space with that added translation to move it to the correct spot using warpPerspective as usual.
You can also automate this to pad the image precisely as much as it needs, so that you don't have excess padding and the padding will stretch only as needed. See my answer here for a detailed explanation of how to calculate that and warp your images into the padded space.

Related

3D Reconstruction Of Planar Markers usin OpenCV

I am trying to perform 3D Reconstruction(Structure From Motion) from Multiple Images of Planar Markers. I very new to MVG and openCV.
As far I have understood I have to do the following steps:
Identify corresponding 2D corner points in the one images.
Calculate the Camera Pose of the first image us cv::solvePNP(assuming the
origin to be center of the marker).
Repeat 1 and 2 for the second image.
Estimate the relative motion of the camera by Rot_relative = R2 - R1,
Trans_relative = T2-T1.
Now assume the first camera to be the origin construct the 3x4 Projection
Matrix for both views, P1 =[I|0]*CameraMatrix(known by Calibration) and P2 =
[Rot_relative |Trans_relative ].
Use the created projection matrices and 2D corner points to triangulate the
3D coordinate using cv::triangulatePoints(P1,P2,point1,point2,OutMat)
The 3D coordinate can be found by dividing the each rows of OutMat by the 4th
row.
I was hoping to keep my "First View" as my origin and iterate
through n views repeating steps from 1-7(I suppose its called Global SFM).
I was hoping to get (n-1)3D points of the corners with "The first View as origin" which we could optimize using Bundle Adjustment.
But the result I get is very disappointing the 3D points calculated are displaced by a huge factor.
These are questions:
1.Is there something wrong with the steps I followed?
2.Should I use cv::findHomography() and cv::decomposeHomographyMat() to find the
relative motion of the camera?
3.Should point1 and point2 in cv::triangulatePoints(P1,P2,point1,point2,OutMat)
be normalized and undistorted? If yes, how should the "Outmat" be interpreted?
Please anyone who has insights towards the topic, can you point out my mistake?
P.S. I have come to above understanding after reading "MultiView Geometry in Computer Vision"
Please find the code snippet below:
cv::Mat Reconstruction::Triangulate(std::vector<cv::Point2f>
ImagePointsFirstView, std::vector<cv::Point2f>ImagePointsSecondView)
{
cv::Mat rVectFirstView, tVecFristView;
cv::Mat rVectSecondView, tVecSecondView;
cv::Mat RotMatFirstView = cv::Mat(3, 3, CV_64F);
cv::Mat RotMatSecondView = cv::Mat(3, 3, CV_64F);
cv::solvePnP(RealWorldPoints, ImagePointsFirstView, cameraMatrix, distortionMatrix, rVectFirstView, tVecFristView);
cv::solvePnP(RealWorldPoints, ImagePointsSecondView, cameraMatrix, distortionMatrix, rVectSecondView, tVecSecondView);
cv::Rodrigues(rVectFirstView, RotMatFirstView);
cv::Rodrigues(rVectSecondView, RotMatSecondView);
cv::Mat RelativeRot = RotMatFirstView-RotMatSecondView ;
cv::Mat RelativeTrans = tVecFristView-tVecSecondView ;
cv::Mat RelativePose;
cv::hconcat(RelativeRot, RelativeTrans, RelativePose);
cv::Mat ProjectionMatrix_0 = cameraMatrix*cv::Mat::eye(3, 4, CV_64F);
cv::Mat ProjectionMatrix_1 = cameraMatrix* RelativePose;
cv::Mat X;
cv::undistortPoints(ImagePointsFirstView, ImagePointsFirstView, cameraMatrix, distortionMatrix, cameraMatrix);
cv::undistortPoints(ImagePointsSecondView, ImagePointsSecondView, cameraMatrix, distortionMatrix, cameraMatrix);
cv::triangulatePoints(ProjectionMatrix_0, ProjectionMatrix_1, ImagePointsFirstView, ImagePointsSecondView, X);
X.row(0) = X.row(0) / X.row(3);
X.row(1) = X.row(1) / X.row(3);
X.row(2) = X.row(2) / X.row(3);
return X;
}

Hausdorff Distance Object Detection

I have been struggling trying to implement the outlining algorithm described here and here.
The general idea of the paper is determining the Hausdorff distance of binary images and using it to find the template image from a test image.
For template matching, it is recommended to construct image pyramids along with sliding windows which you'll use to slide over your test image for detection. I was able to do both of these as well.
I am stuck on how to move forward from here on. Do I slide my template over the test image from different pyramid layers? Or is it the test image over the template? And with regards to the sliding window, is/are they meant to be a ROI of the test or template image?
In a nutshell, I have pieces to the puzzle but no idea of which direction to take to solve the puzzle
int distance(vector<Point>const& image, vector<Point>const& tempImage)
{
int maxDistance = 0;
for(Point imagePoint: image)
{
int minDistance = numeric_limits<int>::max();
for(Point tempPoint: tempImage)
{
Point diff = imagePoint - tempPoint;
int length = (diff.x * diff.x) + (diff.y * diff.y);
if(length < minDistance) minDistance = length;
if(length == 0) break;
}
maxDistance += minDistance;
}
return maxDistance;
}
double hausdorffDistance(vector<Point>const& image, vector<Point>const& tempImage)
{
double maxDistImage = distance(image, tempImage);
double maxDistTemp = distance(tempImage, image);
return sqrt(max(maxDistImage, maxDistTemp));
}
vector<Mat> buildPyramids(Mat& frame)
{
vector<Mat> pyramids;
int count = 6;
Mat prevFrame = frame, nextFrame;
while(count > 0)
{
resize(prevFrame, nextFrame, Size(), .85, .85);
prevFrame = nextFrame;
pyramids.push_back(nextFrame);
--count;
}
return pyramids;
}
vector<Rect> slidingWindows(Mat& image, int stepSize, int width, int height)
{
vector<Rect> windows;
for(size_t row = 0; row < image.rows; row += stepSize)
{
if((row + height) > image.rows) break;
for(size_t col = 0; col < image.cols; col += stepSize)
{
if((col + width) > image.cols) break;
windows.push_back(Rect(col, row, width, height));
}
}
return windows;
}
Edit I: More analysis on my solution can be found here
This is a bi-directional task.
Forward Direction
1. Translation
For each contour, calculate its moment. Then for each point in that contour, translate it about the moment i.e. contour.point[i] = contour.point[i] - contour.moment[i]. This moves all of the contour points to the origin.
PS: You need to keep track of each contour's produced moment because it will be used in the next section
2. Rotation
With the newly translated points, calculate their rotated rect. This will give you the angle of rotation. Depending on this angle, you would want to calculate the new angle which you want to rotate this contour by; this answer would be helpful.
After attaining the new angle, calculate the rotation matrix. Remember that your center here will be the origin i.e. (0, 0). I did not take scaling into account (that's where the pyramids come into play) when calculating the rotation matrix hence I passed 1.
PS: You need to keep track of each contour's produced matrix because it will be used in the next section
Using this matrix, you can go ahead and rotate each point in the contour by it as shown here*.
Once all of this is done, you can go ahead and calculate the Hausdorff distance and find contours which pass your set threshold.
Back Direction
Everything done in the first section, has to be undone in order for us to draw the valid contours onto our camera feed.
1. Rotation
Recall that each detected contour produced a rotation matrix. You want to undo the rotation of the valid contours. Just perform the same rotation but using the inverse matrix.
For each valid contour and corresponding matrix
inverse_matrix = matrix[i].inv(cv2.DECOMP_SVD)
Use * to rotate the points but with inverse_matrix as parameter
PS: When calculating the inverse, if the produced matrix was not a square one, it would fail. cv2.DECOMP_SVD will produce an inverse matrix even if the original matrix was a non-square.
2. Translation
With the valid contours' points rotated back, you just have to undo the previously performed translation. Instead of subtracting, just add the moment to each point.
You can now go ahead and draw these contours to your camera feed.
Scaling
This is were image pyramids come into play.
All you have to do is resize your template image by a fixed size/ratio upto your desired number of times (called layers). The tutorial found here does a good job of explaining how to do this in OpenCV.
It goes without saying that the values you choose to resize your image by and number of layers will and do play a huge role in how robust your program will be.
Put it all together
Template Image Operations
Create a pyramid consisting of n layers
For each layer in n
Find contours
Translate the contour points
Rotate the contour points
This operation should only be performed once and only store the results of the rotated points.
Camera Feed Operations
Assumptions
Let the rotated contours of the template image at each level be stored in templ_contours. So if I say templ_contours[0], this is going to give me the rotated contours at pyramid level 0.
Let the image's translated, rotated contours and moments be stored in transCont, rotCont and moment respectively.
image_contours = Find Contours
for each contour detected in image
moment = calculate moment
for each point in image_contours
transCont.thisPoint = forward_translate(image_contours.thisPoint)
rotCont.thisPoint = forward_rotate(transCont.thisPoint)
for each contour_layer in templ_contours
for each contour in rotCont
calculate Hausdorff Distance
valid_contours = contours_passing_distance_threshold
for each point in valid_contours
valid_point = backward_rotate(valid_point)
for each point in valid_contours
valid_point = backward_translate(valid_point)
drawContours(valid_contours, image)

Matching small grayscale images

I want to test whether two images match. Partial matches also interest me.
The problem is that the images suffer from strong noise. Another problem is that the images might be rotated with an unknown angle. The objects shown in the images will roughly always have the same scale!
The images show area scans from a top-shot perspective. "Lines" are mostly walls and other objects are mostly trees and different kinds of plants.
Another problem was, that the left image was very blurry and the right one's lines were very thin.
To compensate for this difference I used dilation. The resulting images are the ones I uploaded.
Although It can easily be seen that these images match almost perfectly I cannot convince my algorithm of this fact.
My first idea was a feature based matching, but the matches are horrible. It only worked for a rotation angle of -90°, 0° and 90°. Although most descriptors are rotation invariant (in past projects they really were), the rotation invariance seems to fail for this example.
My second idea was to split the images into several smaller segments and to use template matching. So I segmented the images and, again, for the human eye they are pretty easy to match. The goal of this step was to segment the different walls and trees/plants.
The upper row are parts of the left, and the lower are parts of the right image. After the segmentation the segments were dilated again.
As already mentioned: Template matching failed, as did contour based template matching and contour matching.
I think the dilation of the images was very important, because it was nearly impossible for the human eye to match the segments without dilation before the segmentation. Another dilation after the segmentation made this even less difficult.
Your first job should be to fix the orientation. I am not sure what is the best algorithm to do that but here is an approach I would use: fix one of the images and start rotating the other. For each rotation compute a histogram for the color intense on each of the rows/columns. Compute some distance between the resulting vectors(e.g. use cross product). Choose the rotation that results in smallest cross product. It may be good idea to combine this approach with hill climbing.
Once you have the images aligned in approximately the same direction, I believe matching should be easier. As the two images are supposed to be at the same scale, compute something analogous to the geometrical center for both images: compute weighted sum of all pixels - a completely white pixel would have a weight of 1, and a completely black - weight 0, the sum should be a vector of size 2(x and y coordinate). After that divide those values by the dimensions of the image and call this "geometrical center of the image". Overlay the two images in a way that the two centers coincide and then once more compute cross product for the difference between the images. I would say this should be their difference.
You can also try following methods to find rotation and similarity.
Use image moments to get the rotation as shown here.
Once you rotate the image, use cross-correlation to evaluate the similarity.
EDIT
I tried this with OpenCV and C++ for the two sample images. I'm posting the code and results below as it seems to work well at least for the given samples.
Here's the function to calculate the orientation vector using image moments:
Mat orientVec(Mat& im)
{
Moments m = moments(im);
double cov[4] = {m.mu20/m.m00, m.mu11/m.m00, m.mu11/m.m00, m.mu02/m.m00};
Mat covMat(2, 2, CV_64F, cov);
Mat evals, evecs;
eigen(covMat, evals, evecs);
return evecs.row(0);
}
Rotate and match sample images:
Mat im1 = imread(INPUT_FOLDER_PATH + string("WojUi.png"), 0);
Mat im2 = imread(INPUT_FOLDER_PATH + string("XbrsV.png"), 0);
// get the orientation vector
Mat v1 = orientVec(im1);
Mat v2 = orientVec(im2);
double angle = acos(v1.dot(v2))*180/CV_PI;
// rotate im2. try rotating with -angle and +angle. here using -angle
Mat rot = getRotationMatrix2D(Point(im2.cols/2, im2.rows/2), -angle, 1.0);
Mat im2Rot;
warpAffine(im2, im2Rot, rot, Size(im2.rows, im2.cols));
// add a border to rotated image
int borderSize = im1.rows > im2.cols ? im1.rows/2 + 1 : im1.cols/2 + 1;
Mat im2RotBorder;
copyMakeBorder(im2Rot, im2RotBorder, borderSize, borderSize, borderSize, borderSize,
BORDER_CONSTANT, Scalar(0, 0, 0));
// normalized cross-correlation
Mat& image = im2RotBorder;
Mat& templ = im1;
Mat nxcor;
matchTemplate(image, templ, nxcor, CV_TM_CCOEFF_NORMED);
// take the max
double max;
Point maxPt;
minMaxLoc(nxcor, NULL, &max, NULL, &maxPt);
// draw the match
Mat rgb;
cvtColor(image, rgb, CV_GRAY2BGR);
rectangle(rgb, maxPt, Point(maxPt.x+templ.cols-1, maxPt.y+templ.rows-1), Scalar(0, 255, 255), 2);
cout << "max: " << max << endl;
With -angle rotation in code, I get max = 0.758. Below is the rotated image in this case with the matching region.
Otherwise max = 0.293

Cumulative Homography Wrongly Scaling

I'm to build a panorama image of the ground covered by a downward facing camera (at a fixed height, around 1 metre above ground). This could potentially run to thousands of frames, so the Stitcher class' built in panorama method isn't really suitable - it's far too slow and memory hungry.
Instead I'm assuming the floor and motion is planar (not unreasonable here) and trying to build up a cumulative homography as I see each frame. That is, for each frame, I calculate the homography from the previous one to the new one. I then get the cumulative homography by multiplying that with the product of all previous homographies.
Let's say I get H01 between frames 0 and 1, then H12 between frames 1 and 2. To get the transformation to place frame 2 onto the mosaic, I need to get H01*H12. This continues as the frame count increases, such that I get H01*H12*H23*H34*H45*....
In code, this is something akin to:
cv::Mat previous, current;
// Init cumulative homography
cv::Mat cumulative_homography = cv::Mat::eye(3);
video_stream >> previous;
for(;;) {
video_stream >> current;
// Here I do some checking of the frame, etc
// Get the homography using my DenseMosaic class (using Farneback to get OF)
cv::Mat tmp_H = DenseMosaic::get_homography(previous,current);
// Now normalise the homography by its bottom right corner
tmp_H /= tmp_H.at<double>(2, 2);
cumulative_homography *= tmp_H;
previous = current.clone( );
}
It works pretty well, except that as the camera moves "up" in the viewpoint, the homography scale decreases. As it moves down, the scale increases again. This gives my panoramas a perspective type effect that I really don't want.
For example, this is taken on a few seconds of video moving forward then backward. The first frame looks ok:
The problem comes as we move forward a few frames:
Then when we come back again, you can see the frame gets bigger again:
I'm at a loss as to where this is coming from.
I'm using Farneback dense optical flow to calculate pixel-pixel correspondences as below (sparse feature matching doesn't work well on this data) and I've checked my flow vectors - they're generally very good, so it's not a tracking problem. I also tried switching the order of the inputs to find homography (in case I'd mixed up the frame numbers), still no better.
cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6,50, 5, 7, 1.5, flags);
// Using the flow_mat optical flow map, populate grid point correspondences between images
std::vector<cv::Point2f> points_1, points_2;
median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
Another thing I thought it could be was the translation I include in the transformation to ensure my panorama is centred within the scene:
cv::warpPerspective(init.clone(), warped, translation*homography, init.size());
But having checked the values in the homography before the translation is applied, the scaling issue I mention is still present.
Any hints are gratefully received. There's a lot of code I could put in but it seems irrelevant, please do let me know if there's something missing
UPDATE
I've tried switching out the *= operator for the full multiplication and tried reversing the order the homographies are multiplied in, but no luck. Below is my code for calculating the homography:
/**
\brief Calculates the homography between the current and previous frames
*/
cv::Mat DenseMosaic::get_homography()
{
cv::Mat grey_1, grey_2; // Grayscale versions of frames
cv::cvtColor(prev, grey_1, CV_BGR2GRAY);
cv::cvtColor(cur, grey_2, CV_BGR2GRAY);
// Calculate the dense flow
int flags = cv::OPTFLOW_FARNEBACK_GAUSSIAN;
if (frame_number > 2) {
flags = flags | cv::OPTFLOW_USE_INITIAL_FLOW;
}
cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6,50, 5, 7, 1.5, flags);
// Convert the flow map to point correspondences
std::vector<cv::Point2f> points_1, points_2;
median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
// Use the correspondences to get the homography
cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
return H;
}
And this is the function I use to find the correspondences from the flow map:
/**
\brief Calculate pixel->pixel correspondences given a map of the optical flow across the image
\param[in] flow_mat Map of the optical flow across the image
\param[out] points_1 The set of points from #cur
\param[out] points_2 The set of points from #prev
\param[in] step_size The size of spaces between the grid lines
\return The median motion as a point
Uses a dense flow map (such as that created by cv::calcOpticalFlowFarneback) to obtain a set of point correspondences across a grid.
*/
cv::Point2f DenseMosaic::dense_flow_to_corresp(const cv::Mat &flow_mat, std::vector<cv::Point2f> &points_1, std::vector<cv::Point2f> &points_2, int step_size)
{
std::vector<double> tx, ty;
for (int y = 0; y < flow_mat.rows; y += step_size) {
for (int x = 0; x < flow_mat.cols; x += step_size) {
/* Flow is basically the delta between left and right points */
cv::Point2f flow = flow_mat.at<cv::Point2f>(y, x);
tx.push_back(flow.x);
ty.push_back(flow.y);
/* There's no need to calculate for every single point,
if there's not much change, just ignore it
*/
if (fabs(flow.x) < 0.1 && fabs(flow.y) < 0.1)
continue;
points_1.push_back(cv::Point2f(x, y));
points_2.push_back(cv::Point2f(x + flow.x, y + flow.y));
}
}
// I know this should be median, not mean, but it's only used for plotting the
// general motion direction so it's unimportant.
cv::Point2f t_median;
cv::Scalar mtx = cv::mean(tx);
t_median.x = mtx[0];
cv::Scalar mty = cv::mean(ty);
t_median.y = mty[0];
return t_median;
}
It turns out this was because my viewpoint was close to the features, meaning that the non-planarity of the tracked features was causing skew to the homography. I managed to prevent this (it's more of a hack than a method...) by using estimateRigidTransform instead of findHomography, as this does not estimate for perspective variations.
In this particular case, it makes sense to do so, as the view does only ever undergo rigid transformations.

OpenCV: Shift/Align face image relative to reference Image (Image Registration)

I am new to OpenCV2 and working on a project in emotion recognition and would like to align a facial image in relation to a reference facial image. I would like to get the image translation working before moving to rotation. Current idea is to run a search within a limited range on both x and y coordinates and use the sum of squared differences as error metric to select the optimal x/y parameters to align the image. I'm using the OpenCV face_cascade function to detect the face images, all images are resized to a fixed (128x128). Question: Which parameters of the Mat image do I need to modify to shift the image in a positive/negative direction on both x and y axis? I believe setImageROI is no longer supported by Mat datatypes? I have the ROIs for both faces available however I am unsure how to use them.
void alignImage(vector<Rect> faceROIstore, vector<Mat> faceIMGstore)
{
Mat refimg = faceIMGstore[1]; //reference image
Mat dispimg = faceIMGstore[52]; // "displaced" version of reference image
//Rect refROI = faceROIstore[1]; //Bounding box for face in reference image
//Rect dispROI = faceROIstore[52]; //Bounding box for face in displaced image
Mat aligned;
matchTemplate(dispimg, refimg, aligned, CV_TM_SQDIFF_NORMED);
imshow("Aligned image", aligned);
}
The idea for this approach is based on Image Alignment Tutorial by Richard Szeliski Working on Windows with OpenCV 2.4. Any suggestions are much appreciated.
cv::Mat does support ROI. (But it does not support COI - channel-of-interest.)
To apply ROI you can use operator() or special constructor:
Mat refimgROI = faceIMGstore[1](faceROIstore[1]); //reference image ROI
Mat dispimgROI(faceIMGstore[52], faceROIstore[52]); // "displaced" version of reference image ROI
And to find the best position inside a displaced image you can utilize matchTemplate function.
Based on your comments I can suggest the following code which will find the best position of reference patch nearby the second (displaced) patch:
Mat ref = faceIMGstore[1](faceROIstore[1]);
Mat disp = faceIMGstore[52](faceROIstore[52]);
disp = disp.adjustROI(5,5,5,5); //allow 5 pixel max adjustment in any direction
if(disp.cols < ref.cols || disp.rows < ref.rows)
return 0;
Mat map;
cv::matchTemplate( disp, ref, map, CV_TM_SQDIFF_NORMED );
Point minLoc;
cv::minMaxLoc( map, 0, &minLoc );
Mat adjusted = disp(Rect(minLoc.x, minLoc.y, ref.cols, ref.rows));