So I need help with OpenCV in C++.
Basically I have a camera that has some radial distortion and I am able to undistort it using the provided examples/samples in OpenCV.
Currently I have to recalibrate the camera each time the program is run. But the example generates an XML file for a reason, right? To make use of those values later...
My problem is that I'm not sure which values from the XML file to use, or how to use them, to undistort the camera without having to go through the entire calibration again.
I tried to find examples of this online, but for some reason nothing related to my problem came up...
Supposedly you can take the values from the output XML file and use them directly in the program so that you don't have to recalibrate the camera each time.
But recalibrating on every run is exactly what my program is doing right now :/
I really hope someone can help me with this
Thanks a lot :)
First, you have to create the camera matrix from the camera_matrix values in the XML file (this answer uses the OpenCV Java bindings, but the same functions exist in the C++ API).
Mat cameraMatrix = new Mat(new Size(3,3), CvType.CV_64FC1);
cameraMatrix.put(0,0,3275.907);
cameraMatrix.put(0,1,0);
cameraMatrix.put(0,2,2069.153);
cameraMatrix.put(1,0,0);
cameraMatrix.put(1,1,3270.752);
cameraMatrix.put(1,2,1139.271);
cameraMatrix.put(2,0,0);
cameraMatrix.put(2,1,0);
cameraMatrix.put(2,2,1);
Second, create the distortion coefficients matrix from the distortion_coefficients values in the XML file.
Mat distortionMatrix = new Mat(new Size(4,1), CvType.CV_64FC1);
distortionMatrix.put(0,0,-0.006934);
distortionMatrix.put(0,1,-0.047680);
distortionMatrix.put(0,2,0.002173);
distortionMatrix.put(0,3,0.002580);
Finally, use the OpenCV method that builds the undistortion maps.
Mat map1 = new Mat();
Mat map2 = new Mat();
Mat temp = new Mat();
Imgproc.initUndistortRectifyMap(cameraMatrix, distortionMatrix, temp, cameraMatrix, src.size(), CvType.CV_32FC1, map1, map2);
This gives you two maps, map1 and map2, which are used for undistortion.
Once you have these two maps, you don't have to recalibrate every time.
Just use remap and the undistortion is done.
Imgproc.remap(mat, undistortPicture, map1, map2, Imgproc.INTER_LINEAR);
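For reference, since the question is about C++: below is a minimal sketch of the same approach using the C++ API. The numbers are just the example calibration values from the answer above; in practice you would read your own values from the XML file.

#include <opencv2/opencv.hpp>

int main()
{
    // Example calibration values -- replace with the ones from your own XML file.
    cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) <<
        3275.907, 0.0,      2069.153,
        0.0,      3270.752, 1139.271,
        0.0,      0.0,      1.0);
    cv::Mat distCoeffs = (cv::Mat_<double>(1, 4) <<
        -0.006934, -0.047680, 0.002173, 0.002580);

    cv::Mat src = cv::imread("input.jpg");   // any distorted image from the camera
    cv::Mat map1, map2, undistorted;

    // Build the undistortion maps once...
    cv::initUndistortRectifyMap(cameraMatrix, distCoeffs, cv::Mat(),
                                cameraMatrix, src.size(), CV_32FC1, map1, map2);
    // ...then remap each frame with them.
    cv::remap(src, undistorted, map1, map2, cv::INTER_LINEAR);

    cv::imwrite("undistorted.jpg", undistorted);
    return 0;
}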
Alright, so I was able to extract the four things that I think were necessary from the output XML file. Essentially I made a new class that I named CalibSet and extracted the data from the XML file via the tfs["..."] >> xxx; calls at the bottom of the code.
class CalibSet
{
public:
    Size Boardsize;          // The size of the board -> Number of items by width and height
    Size image;              // image size
    String calibtime;
    Mat CamMat;              // camera matrix
    Mat DistCoeff;           // distortion coefficient
    Mat PViewReprojErr;      // per view reprojection error
    float SqSize;            // The size of a square in your defined unit (point, millimeter, etc).
    float avg_reproj_error;
    int NrFrames;            // The number of frames to use from the input for calibration
    int Flags;
    bool imagePoints;        // Write detected feature points
    bool ExtrinsicParams;    // Write extrinsic parameters
    bool GridPoints;         // Write refined 3D target grid points
    bool fisheyemodel;       // use fisheye camera model for calibration

    void write(FileStorage& fs) const   // Write serialization for this class
    {
        fs << "{"
           << "nr_of_frames" << NrFrames
           << "image_width" << image.width
           << "image_height" << image.height
           << "board_width" << Boardsize.width
           << "board_height" << Boardsize.height
           << "square_size" << SqSize
           << "flags" << Flags
           << "fisheye_model" << fisheyemodel
           << "camera_matrix" << CamMat
           << "distortion_coefficients" << DistCoeff
           << "avg_reprojection_error" << avg_reproj_error
           << "per_view_reprojection_errors" << PViewReprojErr
           << "extrinsic_parameters" << ExtrinsicParams
           << "}";
    }

    void read(const FileNode& node)     // Read serialization for this class
    {
        node["calibration_time"] >> calibtime;
        node["nr_of_frames"] >> NrFrames;
        node["image_width"] >> image.width;
        node["image_height"] >> image.height;
        node["board_width"] >> Boardsize.width;
        node["board_height"] >> Boardsize.height;
        node["square_size"] >> SqSize;
        node["flags"] >> Flags;
        node["fisheye_model"] >> fisheyemodel;
        node["camera_matrix"] >> CamMat;
        node["distortion_coefficients"] >> DistCoeff;
        node["avg_reprojection_error"] >> avg_reproj_error;
        node["per_view_reprojection_errors"] >> PViewReprojErr;
        node["extrinsic_parameters"] >> ExtrinsicParams;
    }
};
CalibSet CS;
FileStorage tfs(inputCalibFile, FileStorage::READ); // Read the settings
if (!tfs.isOpened())
{
    cout << "Could not open the calibration file: \"" << inputCalibFile << "\"" << endl;
    return -1;
}
tfs["camera_matrix"] >> CS.CamMat;
tfs["distortion_coefficients"] >> CS.DistCoeff;
tfs["image_width"] >> CS.image.width;
tfs["image_height"] >> CS.image.height;
tfs.release(); // close Settings file
After this I use the function undistort to correct the live camera frames, which I store in frame, putting the corrected image in rframe.
flip(frame, frame, -1); // flip around both axes (180 degrees) so that the image is not upside down
cv::undistort(frame, rframe, CS.CamMat, CS.DistCoeff);
flip(rframe, rframe, +1); // flip image horizontally
It's important to make sure that the orientation of the photos taken for calibration is exactly the same as the orientation used later on (including any vertical or horizontal mirroring), otherwise the image will still look distorted after using undistort.
After this I get an undistorted image as intended, BUT the frame rate is extremely low (around 10-20 FPS). I'd appreciate any help in optimising the process, if possible, to allow a higher frame rate from the live camera feed.
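One likely speed-up, following the map-based answer above: cv::undistort rebuilds the undistortion maps internally on every call, so computing the maps once with initUndistortRectifyMap before the capture loop and then calling cv::remap per frame is usually much cheaper. A minimal sketch reusing the CalibSet values read above ('cap' stands for the existing cv::VideoCapture, which is not shown in the snippets):

// Assumed: CS.CamMat and CS.DistCoeff were filled from the calibration XML as above.
cv::Mat map1, map2;
cv::Size frameSize(CS.image.width, CS.image.height);

// Build the undistortion maps once, before the capture loop.
cv::initUndistortRectifyMap(CS.CamMat, CS.DistCoeff, cv::Mat(),
                            CS.CamMat, frameSize, CV_16SC2, map1, map2);

cv::Mat frame, rframe;
while (cap.read(frame))                      // 'cap' is the existing VideoCapture
{
    cv::flip(frame, frame, -1);              // same 180-degree flip as before
    cv::remap(frame, rframe, map1, map2, cv::INTER_LINEAR);  // replaces cv::undistort
    cv::flip(rframe, rframe, +1);
    // ... display / further processing of rframe ...
}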
Related
I'm currently using the OpenCV library with C++, and my goal is to cancel a fisheye effect on an image ("make it planar").
I'm using the function undistortImage to cancel the effect, but first I need to perform camera calibration in order to find the parameters K, Knew, and D, and I didn't exactly understand the documentation (link: http://docs.opencv.org/master/db/d58/group__calib3d__fisheye.html#gga37375a2741e88052ce346884dfc9c6a0a0899eaa2f96d6eed9927c4b4f4464e05).
From my understanding, I should give two lists of points and the function calibrate is supposed to return the arrays I need. So my question is the following: given a fisheye image, how am I supposed to pick the two lists of points to get the result? For the moment my code is very basic: it just takes the picture, displays it, performs the undistortion and displays the new image. The elements in the matrices are random, so currently the result is not as expected. Thanks for the answers.
#include "opencv2\core\core.hpp"
#include "opencv2\highgui\highgui.hpp"
#include "opencv2\calib3d\calib3d.hpp"
#include <stdio.h>
#include <iostream>
using namespace std;
using namespace cv;
int main(){
    cout << " Usage: display_image ImageToLoadAndDisplay" << endl;

    Mat image;
    image = imread("C:/Users/Administrator/Downloads/eiffel.jpg", CV_LOAD_IMAGE_COLOR); // Read the file
    if (!image.data) // Check for invalid input
    {
        cout << "Could not open or find the image" << endl;
        return -1;
    }
    cout << "Input image depth: " << image.depth() << endl;

    namedWindow("Display window", WINDOW_AUTOSIZE); // Create a window for display.
    imshow("Display window", image);                // Show our image inside it.

    Mat Ka = Mat::eye(3, 3, CV_64F); // Creating camera matrix
    Mat Da = Mat::ones(1, 4, CV_64F);
    Mat dstImage(image.rows, image.cols, CV_32F);
    cout << "K matrix depth: " << Ka.depth() << endl;
    cout << "D matrix depth: " << Da.depth() << endl;

    Mat Knew = Mat::eye(3, 3, CV_64F);
    std::vector<cv::Vec3d> rvec;
    std::vector<cv::Vec3d> tvec;
    int flag = 0;

    std::vector<Point3d> objectPoints1 = { Point3d(0,0,0), Point3d(1,1,0), Point3d(2,2,0), Point3d(3,3,0), Point3d(4,4,0), Point3d(5,5,0),
        Point3d(6,6,0), Point3d(7,7,0), Point3d(3,0,0), Point3d(4,1,0), Point3d(5,2,0), Point3d(6,3,0), Point3d(7,4,0), Point3d(8,5,0), Point3d(5,4,0), Point3d(0,7,0), Point3d(9,7,0), Point3d(9,0,0), Point3d(4,3,0), Point3d(7,2,0) };
    std::vector<Point2d> imagePoints1 = { Point(107,84), Point(110,90), Point(116,96), Point(126,107), Point(142,123), Point(168,147),
        Point(202,173), Point(232,192), Point(135,69), Point(148,73), Point(165,81), Point(189,93), Point(219,112), Point(248,133), Point(166,119), Point(96,183), Point(270,174), Point(226,56), Point(144,102), Point(206,75) };

    std::vector<std::vector<cv::Point2d> > imagePoints(1);
    imagePoints[0] = imagePoints1;
    std::vector<std::vector<cv::Point3d> > objectPoints(1);
    objectPoints[0] = objectPoints1;

    fisheye::calibrate(objectPoints, imagePoints, image.size(), Ka, Da, rvec, tvec, flag); // Calibration
    cout << Ka << endl;
    cout << Da << endl;

    fisheye::undistortImage(image, dstImage, Ka, Da, Knew); // Performing undistortion

    namedWindow("Display window 2", WINDOW_AUTOSIZE); // Create a window for display.
    imshow("Display window 2", dstImage);             // Show our image inside it.
    waitKey(0);                                       // Wait for a keystroke in the window
    return 0;
}
For calibration with cv::fisheye::calibrate you must provide
objectPoints vector of vectors of calibration pattern points in the calibration pattern coordinate space.
This means providing the KNOWN real-world coordinates of the points (they must correspond to the points in imagePoints), but you can choose the position of the coordinate system arbitrarily (as long as it is Cartesian), so you must know your object - e.g. a planar test pattern.
imagePoints vector of vectors of the projections of calibration pattern points
These must be the same points as in objectPoints, but given in image coordinates, so where the projection of the object points hit your image (read/extract the coordinates from your image).
For example, if your camera captured an image of a planar chessboard test pattern, you must know the dimensions of that pattern (up to scale). You could, for example, choose the top-left corner of the top-left square to be position (0,0,0), the top-right corner of the top-left square to be (1,0,0), and the bottom-left corner of the top-left square to be (0,1,0), so your whole test pattern would be placed on the xy-plane.
Then you could extract these correspondences:
pixel real-world
(144,103) (4,3,0)
(206,75) (7,2,0)
(109,151) (2,5,0)
(253,159) (8,6,0)
for these points (marked in red on the example image).
The pixel position could be your imagePoints list while the real-world positions could be your objectPoints list.
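If your test pattern is a chessboard, you don't have to pick the correspondences by hand. Below is a minimal sketch of building the two lists automatically with findChessboardCorners; the 9x6 board size and the file name are assumptions, and in practice you would collect corners from several views and push one entry per view into objectPoints/imagePoints.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // Assumption: a chessboard with 9x6 inner corners photographed by the fisheye camera.
    cv::Size boardSize(9, 6);
    cv::Mat img = cv::imread("fisheye_chessboard.jpg", cv::IMREAD_GRAYSCALE);

    // Image points: where the corners actually appear in the picture.
    std::vector<cv::Point2f> corners;
    if (!cv::findChessboardCorners(img, boardSize, corners))
        return -1;
    std::vector<cv::Point2d> imagePts;
    for (size_t i = 0; i < corners.size(); ++i)
        imagePts.push_back(cv::Point2d(corners[i].x, corners[i].y));

    // Object points: the known grid positions of the same corners in board
    // coordinates (z = 0, unit = one square), in the same order as 'corners'.
    std::vector<cv::Point3d> objectPts;
    for (int y = 0; y < boardSize.height; ++y)
        for (int x = 0; x < boardSize.width; ++x)
            objectPts.push_back(cv::Point3d(x, y, 0));

    std::vector<std::vector<cv::Point3d> > objectPoints(1, objectPts);
    std::vector<std::vector<cv::Point2d> > imagePoints(1, imagePts);

    cv::Mat K, D;
    std::vector<cv::Vec3d> rvecs, tvecs;
    cv::fisheye::calibrate(objectPoints, imagePoints, img.size(), K, D, rvecs, tvecs);
    std::cout << K << std::endl << D << std::endl;
    return 0;
}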
Does this answer your question?
I am trying to write each frame from a camera into a video. Up to here it is fine. However, I want my video to also include the shape_predictor output at each frame, so that when the video is played back the landmarks appear on the image. So far I have got this... Any ideas? Thank you
cap >> frame;
cv::VideoWriter oVideoWriter;
// . . .
cv_image<bgr_pixel> cimg(frame); //Mat to something dlib can deal with
frontal_face_detector detector = get_frontal_face_detector();
std::vector<rectangle> faces = detector(cimg);
pose_model(cimg, faces[0]);
oVideoWriter.write(dlib::toMat(cimg)); //Turn it into an Opencv Mat
The shape predictor is not the face detector. You have to first call the face detector, then the shape predictor.
See this example program: http://dlib.net/face_landmark_detection_ex.cpp.html
You initialized the face detector properly. Then you have to initialize the shape predictor, something like this:
shape_predictor sp;
deserialize("shape_predictor_68_face_landmarks.dat") >> sp;
The model can be found here: http://sourceforge.net/projects/dclib/files/dlib/v18.10/shape_predictor_68_face_landmarks.dat.bz2
For the rest of the way, you can just follow the example program I linked above. Here's the portion where the shape predictor is run. You have to pass it the output (bounding box) returned by the detector for it to work. The code below iterates through all the boxes returned by the detector.
// Now tell the face detector to give us a list of bounding boxes
// around all the faces in the image.
std::vector<rectangle> dets = detector(img);
cout << "Number of faces detected: " << dets.size() << endl;

// Now we will go ask the shape_predictor to tell us the pose of
// each face we detected.
std::vector<full_object_detection> shapes;
for (unsigned long j = 0; j < dets.size(); ++j)
{
    full_object_detection shape = sp(img, dets[j]);
    cout << "number of parts: " << shape.num_parts() << endl;
    cout << "pixel position of first part: " << shape.part(0) << endl;
    cout << "pixel position of second part: " << shape.part(1) << endl;
    // You get the idea, you can get all the face part locations if
    // you want them. Here we just store them in shapes so we can
    // put them on the screen.
    shapes.push_back(shape);
}
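To get the landmarks into the written video, one option is to draw each part onto the OpenCV frame before handing it to the VideoWriter. A minimal sketch, assuming frame, detector, sp and an opened oVideoWriter as in the snippets above:

// Assumes: cv::Mat frame from the capture, dlib 'detector' and 'sp', and an opened 'oVideoWriter'.
dlib::cv_image<dlib::bgr_pixel> cimg(frame);          // wrap the Mat for dlib (no copy)
std::vector<dlib::rectangle> faces = detector(cimg);

// Run the shape predictor on every detected face first...
std::vector<dlib::full_object_detection> shapes;
for (size_t i = 0; i < faces.size(); ++i)
    shapes.push_back(sp(cimg, faces[i]));

// ...then draw every landmark directly on the cv::Mat that gets written out.
for (size_t i = 0; i < shapes.size(); ++i)
    for (unsigned long p = 0; p < shapes[i].num_parts(); ++p)
        cv::circle(frame,
                   cv::Point(shapes[i].part(p).x(), shapes[i].part(p).y()),
                   2, cv::Scalar(0, 255, 0), -1);

oVideoWriter.write(frame);   // the landmarks are now baked into the frame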
I'm working on a project where I'm trying to detect multiple objects from multiple videos at the same time, and I am also correcting the distortion in the videos so I can get an accurate reading of the bearing of each detection relative to the camera.
I'm working with OpenCV in C++, using Visual Studio 2010.
The code works, but it is very slow, not real time. I am hoping someone may have suggestions as to how it may be sped up, if it can be. I'm not much of a coder at present; I'm still learning about image processing and don't know many tricks.
What the code does, in a general sense, is:
- Opens a video file
- Applies distortion correction to each image frame (for the bearings)
- Crops the image (improves detections; the time stamps give false positives)
- Applies a Haar-type cascade to the cropped image to detect objects
- Draws a bounding box around the detections
- Displays the images
- Calculates the angle of each detected object and prints it to the terminal
- Draws a radar-like image and displays the angle of each detection relative to the camera in each frame; the idea is to have a single source mapping out the surrounding area from the detections of every camera
I've included the main code that runs the video and detections for a single video; this is still pretty slow and takes approx. 18 seconds for each second of video. And when I have 4 videos attempting to run, it's about 3 times longer.
Video dimensions are 704x576.
Any help or advice would be much appreciated, or even just knowing that it can only be sped up with purpose-designed hardware.
Cheers,
Dave
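The Angle and DisplayPoints helpers called in the code below are not shown; a purely hypothetical sketch of what they might look like follows (the 60-degree horizontal field of view and the radar geometry are assumptions, not values from the question):

#include <opencv2/opencv.hpp>
#include <cmath>

// Hypothetical helpers -- the real implementations are not shown in the question.
const float HFOV_DEG = 60.0f;   // assumed horizontal field of view of the camera

// Bearing of a detection, in degrees, from the horizontal pixel position of its centre.
float Angle(int centreX, int /*fromCent*/, const cv::Mat& img)
{
    float offset = centreX - img.cols * 0.5f;   // pixels from the image centre
    return offset / img.cols * HFOV_DEG;        // linear approximation
}

// Point on a radar-style display (assumed 400x400 canvas, camera at the bottom centre).
cv::Point DisplayPoints(float angleDeg)
{
    const cv::Point origin(200, 380);
    const float r = 300.0f;
    float a = angleDeg * static_cast<float>(CV_PI) / 180.0f;
    return cv::Point(origin.x + static_cast<int>(r * std::sin(a)),
                     origin.y - static_cast<int>(r * std::cos(a)));
}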
int main(){
    /////////////////////////////
    //**Distortion Correction**//
    /////////////////////////////
    std::cout << endl << "Reading:" << endl;
    //stores a file
    FileStorage fs;
    //reads and stores the xml file with the camera calibration
    fs.open("cal2.xml", FileStorage::READ);
    if (fs.isOpened())
    {
        cout << "File is opened\n";
    }
    //Mat objects to store the camera matrix and distortion coefficients
    Mat CamMat, DistCoeff;
    FileNode n = fs.root();
    //takes the parameters from the xml file and stores them in the Mat objects
    fs["Camera_Matrix"] >> CamMat;
    fs["Distortion_Coefficients"] >> DistCoeff;

    /////////////////////
    //**Video Display**//
    /////////////////////
    //Mat objects to store the images
    Mat Original, Vid1, Vid1Crop;
    //Cropping Image to exclude time/camera stamps to improve detections
    Rect roi(0, 35, 704, 490);
    //for reading video or webcam
    VideoCapture cap;
    //for opening video file, give the location and name of the file, separating folders with "\\" instead of with a single "\"
    cap.open("C:\\Users\\Desktop\\Run_01_005 two containers\\Video\\ch04_20140219124355.mp4");
    //Windows for the images
    namedWindow("New", CV_WINDOW_NORMAL);
    namedWindow("Display", CV_WINDOW_NORMAL);

    ///////////////////////
    //**Detection Setup**//
    ///////////////////////
    // Cascade Classifier object
    CascadeClassifier boat_cascade;
    //loads the xml file for the classifier, put the address and name of the xml file in the brackets
    boat_cascade.load( "boatcascadeAttp3.xml" );

    /////////////////////////////////////
    //**Single Source Display Image**////
    /////////////////////////////////////
    Mat Output;

    //loop to continually capture/update images
    while (1){
        Output = Display();
        cap >> Original;
        //stop when no more frames are available
        if (Original.empty()) break;
        //applies the distortion correction to input image and outputs to New image
        undistort(Original, Vid1, CamMat, DistCoeff, noArray());
        //Image excluding the time/camera stamps in video which caused a lot of false positives
        Vid1Crop = Vid1(roi);
        //Set.NewCrop(New, roi);
        // Detect boats
        std::vector<Rect> boats;
        //Parameters may need some further adjustment, currently seems to work well
        //Detection performed on Region of Interest that excludes the time stamp which caused a number of False Positives
        boat_cascade.detectMultiScale( Vid1Crop, boats, 1.1, 15, 0|CV_HAAR_SCALE_IMAGE, Size(25, 25), Size(75,75) );
        // Draw circles on the detected boats
        for( int i = 0; i < boats.size(); i++ )
        {
            //Draws a box around the detected object
            rectangle( Vid1Crop, Point(boats[i].x, boats[i].y), Point(boats[i].x+boats[i].width, boats[i].y+boats[i].height), Scalar( 0, 255, 0), 2, 8);
            //finds the position of the detection along the X axis
            int centreX = boats[i].x + boats[i].width*0.5;
            int fromCent = Vid1Crop.cols - centreX;
            float angle;
            //calls Angle function
            angle = Angle(centreX, fromCent, Vid1Crop);
            //calls DisplayPoints function
            Point XYpoints = DisplayPoints(angle);
            //prints out the result, angle for cam ranges
            cout << angle;
            cout << " degrees" << endl;
            //Draws red circles on the single source display corresponding to the detections
            circle( Output, XYpoints, 5.0, Scalar( 0, 0, 255 ), 4, 8 );
        }
        //shows the New output image after correction
        imshow("New", Vid1);
        imshow("Display", Output);
        //delay for 1ms between frames - Note 25 fps in video
        waitKey(1);
    }
    fs.release();
    return (0);
}
I've been trying to rectify and build the disparity mapping for a pair of images using OpenCV's stereoRectifyUncalibrated, but I'm not getting very good results. My code is:
template<class T>
T convertNumber(string& number)
{
    istringstream ss(number);
    T t;
    ss >> t;
    return t;
}

void readPoints(vector<Point2f>& points, string filename)
{
    fstream filest(filename.c_str(), ios::in);
    string line;
    assert(filest != NULL);
    getline(filest, line);
    do{
        int posEsp = line.find_first_of(' ');
        string posX = line.substr(0, posEsp);
        string posY = line.substr(posEsp+1, line.size() - posEsp);

        float X = convertNumber<float>(posX);
        float Y = convertNumber<float>(posY);

        Point2f pnt = Point2f(X, Y);
        points.push_back(pnt);
        getline(filest, line);
    } while(!filest.eof());
    filest.close();
}

void drawKeypointSequence(Mat lFrame, Mat rFrame, vector<KeyPoint>& lKeyp, vector<KeyPoint>& rKeyp)
{
    namedWindow("prevFrame", WINDOW_AUTOSIZE);
    namedWindow("currFrame", WINDOW_AUTOSIZE);
    moveWindow("prevFrame", 0, 300);
    moveWindow("currFrame", 650, 300);

    Mat rFrameAux;
    rFrame.copyTo(rFrameAux);
    Mat lFrameAux;
    lFrame.copyTo(lFrameAux);

    int size = rKeyp.size();
    for(int i=0; i<size; i++)
    {
        vector<KeyPoint> drawRightKeyp;
        vector<KeyPoint> drawleftKeyp;
        drawRightKeyp.push_back(rKeyp[i]);
        drawleftKeyp.push_back(lKeyp[i]);
        cout << rKeyp[i].pt << " <<<>>> " << lKeyp[i].pt << endl;

        drawKeypoints(rFrameAux, drawRightKeyp, rFrameAux, Scalar::all(255), DrawMatchesFlags::DRAW_OVER_OUTIMG);
        drawKeypoints(lFrameAux, drawleftKeyp, lFrameAux, Scalar::all(255), DrawMatchesFlags::DRAW_OVER_OUTIMG);

        imshow("currFrame", rFrameAux);
        imshow("prevFrame", lFrameAux);
        waitKey(0);
    }
    imwrite("RightKeypFrame.jpg", rFrameAux);
    imwrite("LeftKeypFrame.jpg", lFrameAux);
}
int main(int argc, char* argv[])
{
    StereoBM stereo(StereoBM::BASIC_PRESET, 16*5, 21);
    double ndisp = 16*4;

    assert(argc == 5);
    string rightImgFilename(argv[1]);    // Right image (current frame)
    string leftImgFilename(argv[2]);     // Left image (previous frame)
    string rightPointsFilename(argv[3]); // Right image points file
    string leftPointsFilename(argv[4]);  // Left image points file

    Mat rightFrame = imread(rightImgFilename.c_str(), 0);
    Mat leftFrame = imread(leftImgFilename.c_str(), 0);

    vector<Point2f> rightPoints;
    vector<Point2f> leftPoints;
    vector<KeyPoint> rightKeyp;
    vector<KeyPoint> leftKeyp;

    readPoints(rightPoints, rightPointsFilename);
    readPoints(leftPoints, leftPointsFilename);
    assert(rightPoints.size() == leftPoints.size());
    KeyPoint::convert(rightPoints, rightKeyp);
    KeyPoint::convert(leftPoints, leftKeyp);

    // Draw the keypoints sequentially, in order to test the consistency of the matching
    drawKeypointSequence(leftFrame, rightFrame, leftKeyp, rightKeyp);

    Mat fundMatrix = findFundamentalMat(leftPoints, rightPoints, CV_FM_8POINT);
    Mat homRight;
    Mat homLeft;
    Mat disp16 = Mat(rightFrame.rows, leftFrame.cols, CV_16S);
    Mat disp8 = Mat(rightFrame.rows, leftFrame.cols, CV_8UC1);

    stereoRectifyUncalibrated(leftPoints, rightPoints, fundMatrix, rightFrame.size(), homLeft, homRight);
    warpPerspective(rightFrame, rightFrame, homRight, rightFrame.size());
    warpPerspective(leftFrame, leftFrame, homLeft, leftFrame.size());

    namedWindow("currFrame", WINDOW_AUTOSIZE);
    namedWindow("prevFrame", WINDOW_AUTOSIZE);
    moveWindow("currFrame", 650, 300);
    moveWindow("prevFrame", 0, 300);
    imshow("currFrame", rightFrame);
    imshow("prevFrame", leftFrame);
    imwrite("RectfRight.jpg", rightFrame);
    imwrite("RectfLeft.jpg", leftFrame);
    waitKey(0);

    stereo(rightFrame, leftFrame, disp16, CV_16S);
    disp16.convertTo(disp8, CV_8UC1, 255/ndisp);

    FileStorage file("disp_map.xml", FileStorage::WRITE);
    file << "disparity" << disp8;
    file.release();

    imshow("disparity", disp8);
    imwrite("disparity.jpg", disp8);
    moveWindow("disparity", 0, 0);
    waitKey(0);
}
drawKeypointSequence is the way I visually check the consistency of the points I have for both images. By drawing each of their keypoints in sequence, I can be sure that keypoint i on image A is keypoint i on image B.
I've also tried playing with the ndisp parameter, but it didn't help much.
I tried it for the following pair of images: LeftImage, RightImage
got the following rectified pair: RectifiedLeft, RectifiedRight
and finally, the following disparity map: DisparityMap
Which, as you can see, is quite bad. I've also tried the same pair of images with the following stereoRectifyUncalibrated example: http://programmingexamples.net/wiki/OpenCV/WishList/StereoRectifyUncalibrated and with SBM_Sample.cpp from the OpenCV tutorial code samples to build the disparity map, and got a very similar result.
I'm using OpenCV 2.4.
Thanks in advance!
Besides possible calibration problems, your images clearly lack some texture for the stereo block matching to work.
This algorithm will see many ambiguities and too-large disparities on flat (non-textured) parts.
Note however that the keypoints seem to match well, so even if the rectification output seems weird it is probably correct.
You can test your code against standard images from the Middlebury stereo page for sanity checks.
I would suggest doing a stereo calibration using a chessboard: take multiple pictures of a chessboard with both cameras and use stereocalibrate.cpp on your computer. I say this because you are using stereoRectifyUncalibrated; while that algorithm does not need to know the intrinsic parameters of the cameras, it heavily depends on the epipolar geometry. Therefore, if the camera lenses have significant distortion, it is better to correct it before computing the fundamental matrix and calling this function. For example, the distortion coefficients can be estimated for each head of the stereo camera separately by using calibrateCamera(). Then the images can be corrected using undistort(), or just the point coordinates can be corrected with undistortPoints().
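A rough sketch of the calibrated route described above, assuming objectPoints, imagePointsLeft and imagePointsRight have already been filled from chessboard detections in several image pairs (imageSize, leftFrame and rightFrame are likewise assumed):

// Sketch only: objectPoints / imagePointsLeft / imagePointsRight are assumed to come
// from findChessboardCorners on several stereo pairs; imageSize is the frame size.
cv::Mat K1, D1, K2, D2, R, T, E, F;
std::vector<cv::Mat> rvecs, tvecs;

// Intrinsics and distortion for each head separately.
cv::calibrateCamera(objectPoints, imagePointsLeft,  imageSize, K1, D1, rvecs, tvecs);
cv::calibrateCamera(objectPoints, imagePointsRight, imageSize, K2, D2, rvecs, tvecs);

// Relative pose between the two cameras.
cv::stereoCalibrate(objectPoints, imagePointsLeft, imagePointsRight,
                    K1, D1, K2, D2, imageSize, R, T, E, F);

// Rectification transforms, then per-camera undistort+rectify maps.
cv::Mat R1, R2, P1, P2, Q;
cv::stereoRectify(K1, D1, K2, D2, imageSize, R, T, R1, R2, P1, P2, Q);

cv::Mat map1L, map2L, map1R, map2R;
cv::initUndistortRectifyMap(K1, D1, R1, P1, imageSize, CV_16SC2, map1L, map2L);
cv::initUndistortRectifyMap(K2, D2, R2, P2, imageSize, CV_16SC2, map1R, map2R);

cv::Mat rectLeft, rectRight;
cv::remap(leftFrame,  rectLeft,  map1L, map2L, cv::INTER_LINEAR);
cv::remap(rightFrame, rectRight, map1R, map2R, cv::INTER_LINEAR);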
I'm converting code from Matlab to C++, and one of the functions that I don't understand is imtransform. I need to "register" an image, which basically means stretching, skewing, and rotating my image so that it overlaps correctly with another image.
Matlab's imtransform does the registration for you, but as I'm programming this in C++ I need to know what's been abstracted. What is the normal math involved in image registration? How can I go from 2 arrays of data (which make up images) to 1 array, which is the combined image overlapped?
I recommend using OpenCV with C++; it has a lot of image processing tools and functions you can call.
The Registration module implements parametric image registration. The implemented method is direct alignment; that is, it uses the pixel values directly to calculate the registration between a pair of images, as opposed to feature-based registration.
The OpenCV constants that represent these motion models have the prefix MOTION_ and are shown in brackets below; a short usage sketch follows the list.
Translation ( MOTION_TRANSLATION ): The first image can be shifted (translated) by (x, y) to obtain the second image. There are only two parameters, x and y, that we need to estimate.
Euclidean ( MOTION_EUCLIDEAN ): The first image is a rotated and shifted version of the second image, so there are three parameters: x, y and angle. When a square undergoes a Euclidean transformation, its size does not change, parallel lines remain parallel, and right angles remain unchanged after the transformation.
Affine ( MOTION_AFFINE ): An affine transform is a combination of rotation, translation (shift), scale, and shear. This transform has six parameters. When a square undergoes an affine transformation, parallel lines remain parallel, but lines meeting at right angles no longer remain orthogonal.
Homography ( MOTION_HOMOGRAPHY ): All the transforms described above are 2D transforms; they do not account for 3D effects. A homography transform, on the other hand, can account for some 3D effects (but not all). This transform has 8 parameters. A square, when transformed using a homography, can change to any quadrilateral.
Reference: https://docs.opencv.org/3.4.2/db/d61/group__reg.html
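As a concrete illustration of these motion models, here is a minimal sketch using cv::findTransformECC (the function that takes these MOTION_ constants) with MOTION_EUCLIDEAN; the file names are placeholders, and for MOTION_HOMOGRAPHY the warp matrix would be 3x3 and warpPerspective would replace warpAffine:

#include <opencv2/opencv.hpp>

int main()
{
    // Placeholder file names -- two grayscale views of the same scene.
    cv::Mat ref = cv::imread("reference.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat mov = cv::imread("moving.jpg",    cv::IMREAD_GRAYSCALE);

    // 2x3 warp matrix initialised to the identity (3x3 for MOTION_HOMOGRAPHY).
    cv::Mat warp = cv::Mat::eye(2, 3, CV_32F);

    // Direct (pixel-based) estimation of the Euclidean motion between the images.
    cv::findTransformECC(ref, mov, warp, cv::MOTION_EUCLIDEAN,
                         cv::TermCriteria(cv::TermCriteria::COUNT | cv::TermCriteria::EPS,
                                          100, 1e-6));

    // Warp the moving image back onto the reference frame.
    cv::Mat aligned;
    cv::warpAffine(mov, aligned, warp, ref.size(),
                   cv::INTER_LINEAR | cv::WARP_INVERSE_MAP);

    cv::imwrite("aligned.jpg", aligned);
    return 0;
}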
This is an example I found very useful for image registration:
#include <opencv2/opencv.hpp>
#include "opencv2/xfeatures2d.hpp"
#include "opencv2/features2d.hpp"
#include <iostream>
using namespace std;
using namespace cv;
using namespace cv::xfeatures2d;
const int MAX_FEATURES = 500;
const float GOOD_MATCH_PERCENT = 0.15f;
void alignImages(Mat &im1, Mat &im2, Mat &im1Reg, Mat &h)
{
    Mat im1Gray, im2Gray;
    cvtColor(im1, im1Gray, CV_BGR2GRAY);
    cvtColor(im2, im2Gray, CV_BGR2GRAY);

    // Variables to store keypoints and descriptors
    std::vector<KeyPoint> keypoints1, keypoints2;
    Mat descriptors1, descriptors2;

    // Detect ORB features and compute descriptors.
    Ptr<Feature2D> orb = ORB::create(MAX_FEATURES);
    orb->detectAndCompute(im1Gray, Mat(), keypoints1, descriptors1);
    orb->detectAndCompute(im2Gray, Mat(), keypoints2, descriptors2);

    // Match features.
    std::vector<DMatch> matches;
    Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");
    matcher->match(descriptors1, descriptors2, matches, Mat());

    // Sort matches by score
    std::sort(matches.begin(), matches.end());

    // Remove not so good matches
    const int numGoodMatches = matches.size() * GOOD_MATCH_PERCENT;
    matches.erase(matches.begin() + numGoodMatches, matches.end());

    // Draw top matches
    Mat imMatches;
    drawMatches(im1, keypoints1, im2, keypoints2, matches, imMatches);
    imwrite("matches.jpg", imMatches);

    // Extract location of good matches
    std::vector<Point2f> points1, points2;
    for( size_t i = 0; i < matches.size(); i++ )
    {
        points1.push_back( keypoints1[ matches[i].queryIdx ].pt );
        points2.push_back( keypoints2[ matches[i].trainIdx ].pt );
    }

    // Find homography
    h = findHomography( points1, points2, RANSAC );

    // Use homography to warp image
    warpPerspective(im1, im1Reg, h, im2.size());
}
int main(int argc, char **argv)
{
    // Read reference image
    string refFilename("form.jpg");
    cout << "Reading reference image : " << refFilename << endl;
    Mat imReference = imread(refFilename);

    // Read image to be aligned
    string imFilename("scanned-form.jpg");
    cout << "Reading image to align : " << imFilename << endl;
    Mat im = imread(imFilename);

    // The registered image will be stored in imReg.
    // The estimated homography will be stored in h.
    Mat imReg, h;

    // Align images
    cout << "Aligning images ..." << endl;
    alignImages(im, imReference, imReg, h);

    // Write aligned image to disk.
    string outFilename("aligned.jpg");
    cout << "Saving aligned image : " << outFilename << endl;
    imwrite(outFilename, imReg);

    // Print estimated homography
    cout << "Estimated homography : \n" << h << endl;
}
Raw C++ does not have any of the concepts you refer to built into it. However, there are many image processing libraries for C++ you can use that can do various transforms. DevIL and FreeImage should be able to do layering, as well as some transforms.