creating a bounding box around a field of optical flow paths

creating a bounding box around a field of optical flow paths - c++

I have used cv::calcOpticalFlowFarneback to calculate the optical flow in the current and previous frames of video with ofxOpenCv in openFrameworks.
I then draw the video with the optical flow field on top and then draw vectors showing the flow of motion in areas that are above a certain threshold.
What I want to do now is create a bounding box of those areas of motion and get the centroid and store that x,y position in a variable for tracking.
This is how I'm drawing my flow field if that helps.
if (calculatedFlow){
ofSetColor( 255, 255, 255 );
video.draw( 0, 0);
int w = gray1.width;
int h = gray1.height;
//1. Input images + optical flow
ofPushMatrix();
ofScale( 4, 4 );
//Optical flow
float *flowXPixels = flowX.getPixelsAsFloats();
float *flowYPixels = flowY.getPixelsAsFloats();
ofSetColor( 0, 0, 255 );
for (int y=0; y<h; y+=5) {
for (int x=0; x<w; x+=5) {
float fx = flowXPixels[ x + w * y ];
float fy = flowYPixels[ x + w * y ];
//Draw only long vectors
if ( fabs( fx ) + fabs( fy ) > .5 ) {
ofDrawRectangle( x-0.5, y-0.5, 1, 1 );
ofDrawLine( x, y, x + fx, y + fy );
}
}
}
}

For what you are asking, there is no simple answer. Here is a suggested solution. It involves multiple steps, but if your domain is simple enough, you could simplify this.
For each frame,
Calculate flow as two images flow_x,flow_y comparing current frame with previous frame using farneback method.(you seem to be doing this, in your code)
Translate the flow images into an hsv image, where the hue component of each pixel denotes the angle of the flow atan2(flow_y/flow_x) and value component of each pixel denotes the magnitude of the flow sqrt(flow_x\*\*2 + flow_y\*\*2)
In the above step, use your thresholding mechanism to suppress flow- pixels (make them black) whose magnitude falls below a certain threshold.
Segment the HSV image based on color ranges. You could use apriori information about your domain, or you could take histogram of hue components and identify prominent ranges of hues to classify pixels. As a result of this step, you can assign a class to each pixel.
Separate the pixels belonging to each class into multiple images. All pixels belonging to segmented class-1 will goto image-1, all pixels belonging to segmented class-2 will go to image-2 etc. Now each segmented image contains pixels in the HSV image, in a particular color range.
Transform each segmented image as a black and white image, and using opencv's morphological operations split into multiple regions using connectivity. (connected components).
Find the centroid of each connected component.
I found this reference to be helpful in this context.

I resolved my problem by creating new image from my flowX, and flowY. This was done by adding flowX and flowY to a new CV FloatImage.
flowX +=flowY;
flowXY = flowX;
Then I was able to do the contour finding from the pixels of the newly created image and then I could store all the centroids of all the blobs of movement.
Like so:
contourFinder.findContours( mask, 10, 10000, 20, false );
//Storing the objects centers with contour finder.
vector<ofxCvBlob> &blobs = contourFinder.blobs;
int n = blobs.size(); //Get number of blobs
obj.resize( n ); //Resize obj array
for (int i=0; i<n; i++) {
obj[i] = blobs[i].centroid; //Fill obj array
}
I initially noticed that movement was only being tracked in one direction in the x-axis and y-axis because of negative values. I resolved this by changing the calculation for my optical flow by calling the abs() function in cv::Mat.
Mat img1( gray1.getCvImage() ); //Create OpenCV images
Mat img2( gray2.getCvImage() );
Mat flow;
calcOpticalFlowFarneback( img1, img2, flow, 0.7, 3, 11, 5, 5, 1.1, 0 );
//Split flow into separate images
vector<Mat> flowPlanes;
Mat newFlow;
newFlow = abs(flow); //abs flow so values are absolute. Allows tracking in both directions.
split( newFlow, flowPlanes );
//Copy float planes to ofxCv images flowX and flowY
IplImage iplX( flowPlanes[0] );
flowX = &iplX;
IplImage iplY( flowPlanes[1] );
flowY = &iplY;

Related

Change lightness to pixels which are outside of the area

I have one point set to position (x,y) and two angles from this point. I draw in example bellow two lines for demonstration, how it should look.
Now what I want is change lightness to all pixels outside from this lines.
Here is original image.
And here is example, what I want.
How can I easy change pixels with Opencv(C++), if I have and know input image, point, and two angles? I know many of solution, but I want easiest one, how can detect which pixels need change and which not.

One way would be to:
Make a binary mask of the size of the original image, based on your points and angle (i.e draw filled polygon).
Make a clone of the original image. Apply brightness changes to the whole of cloned image.
Copy cloned image back to original image based on the mask.

I write code bellow from #Zindarod steps. Hope to help someone.
Angles are in degress.
void view(cv::Mat& frame, double angle_left, double angle_right, cv::Point center){
int length = 1500;
cv::Point left_view;
left_view.x = (int)round(center.x + length * cos((angle_left * (CV_PI / 180))));
left_view.y = (int)round(center.y + length * sin((angle_left * (CV_PI / 180))));
cv::Point right_view;
right_view.x = (int)round(center.x + length * cos((angle_right * (CV_PI / 180))));
right_view.y = (int)round(center.y + length * sin((angle_right * (CV_PI / 180))));
cv::Point pts[4] = { position_of_eyes, left_view, right_view, position_of_eyes };
Mat mask = Mat(frame.size(), CV_32FC3, cv::Scalar(1.0, 1.0, 0.3));
cv::fillConvexPoly(mask, pts, 3, cv::Scalar(1.0,1.0,1.0));
cv::cvtColor(frame, frame, CV_BGR2HSV);
frame.convertTo(frame, CV_32FC3);
cv::multiply(frame, mask, frame);
frame.convertTo(frame, CV_8UC3);
cv::cvtColor(frame, frame, CV_HSV2BGR);
}

Given an origin point and two angles, you can calculate 2 unit vectors for you two lines, let these be unitA and unitB.
For each pixel of the image do these steps:
1. get a vector (called vec) from the origin to the pixel.
2. find the angle (ang) between vec and a reference vector (refVec).
3. if ang is greater than the angle between refVec and unitA, but smaller than the angle between the refVec and unitB recolor the pixel.

Farneback optical flow - dealing with pixels out of view, pixels with wrong flow result, different size image

I am writing my thesis and one part of the task is to interpolate between images to create intermediate images. The work has to be done in c++ using openCV 2.4.13.
The best solution I've found so far is computing optical flow and remapping. But this solution has two problems that I am unable to solve on my own:
There are pixels that should go out of view (bottom of image for example), but they do not.
Some pixels do not move, creating a distorted result (upper right part of the couch)
What has made the flow&remap approach better:
Equalizing the intensity. This i'm allowed to do. You can check the result by comparing the couch form (centre of remapped image and original).
Reducing size of image. This i'm NOT allowed to do, as I need the same size output. Is there a way to rescale the optical flow result to get the bigger remapped image?
Other approaches tried and failed:
cuda::interpolateFrames. Creates incredible ghosting.
blending images with cv::addWeighted. Even worse ghosting.
Below is the code I am using at the moment. And images: dropbox link with input and result images
int main(){
cv::Mat second, second_gray, cutout, cutout_gray, flow_n;
second = cv::imread( "/home/zuze/Desktop/forstack/second_L.jpg", 1 );
cutout = cv::imread("/home/zuze/Desktop/forstack/cutout_L.png", 1);
cvtColor(second, second_gray, CV_BGR2GRAY);
cvtColor(cutout, cutout_gray, CV_RGB2GRAY );
///----------COMPUTE OPTICAL FLOW AND REMAP -----------///
cv::calcOpticalFlowFarneback( second_gray, cutout_gray, flow_n, 0.5, 3, 15, 3, 5, 1.2, 0 );
cv::Mat remap_n; //looks like it's drunk.
createNewFrame(remap_n, flow_n, 1, second, cutout );
cv::Mat cflow_n;
cflow_n = cutout_gray;
cvtColor(cflow_n, cflow_n, CV_GRAY2BGR);
drawOptFlowMap(flow_n, cflow_n, 10, CV_RGB(0,255,0));
///--------EQUALIZE INTENSITY, COMPUTE OPTICAL FLOW AND REMAP ----///
cv::Mat cutout_eq, second_eq;
cutout_eq= equalizeIntensity(cutout);
second_eq= equalizeIntensity(second);
cv::Mat flow_eq, cutout_eq_gray, second_eq_gray, cflow_eq;
cvtColor( cutout_eq, cutout_eq_gray, CV_RGB2GRAY );
cvtColor( second_eq, second_eq_gray, CV_RGB2GRAY );
cv::calcOpticalFlowFarneback( second_eq_gray, cutout_eq_gray, flow_eq, 0.5, 3, 15, 3, 5, 1.2, 0 );
cv::Mat remap_eq;
createNewFrame(remap_eq, flow_eq, 1, second, cutout_eq );
cflow_eq = cutout_eq_gray;
cvtColor(cflow_eq, cflow_eq, CV_GRAY2BGR);
drawOptFlowMap(flow_eq, cflow_eq, 10, CV_RGB(0,255,0));
cv::imshow("remap_n", remap_n);
cv::imshow("remap_eq", remap_eq);
cv::imshow("cflow_eq", cflow_eq);
cv::imshow("cflow_n", cflow_n);
cv::imshow("sec_eq", second_eq);
cv::imshow("cutout_eq", cutout_eq);
cv::imshow("cutout", cutout);
cv::imshow("second", second);
cv::waitKey();
return 0;
}
Function for remapping, to be used for intermediate image creation:
void createNewFrame(cv::Mat & frame, const cv::Mat & flow, float shift, cv::Mat & prev, cv::Mat &next){
cv::Mat mapX(flow.size(), CV_32FC1);
cv::Mat mapY(flow.size(), CV_32FC1);
cv::Mat newFrame;
for (int y = 0; y < mapX.rows; y++){
for (int x = 0; x < mapX.cols; x++){
cv::Point2f f = flow.at<cv::Point2f>(y, x);
mapX.at<float>(y, x) = x + f.x*shift;
mapY.at<float>(y, x) = y + f.y*shift;
}
}
remap(next, newFrame, mapX, mapY, cv::INTER_LANCZOS4);
frame = newFrame;
cv::waitKey();
}
Function to display optical flow in vector form:
void drawOptFlowMap (const cv::Mat& flow, cv::Mat& cflowmap, int step, const cv::Scalar& color) {
cv::Point2f sum; //zz
std::vector<float> all_angles;
int count=0; //zz
float angle, sum_angle=0; //zz
for(int y = 0; y < cflowmap.rows; y += step)
for(int x = 0; x < cflowmap.cols; x += step)
{
const cv::Point2f& fxy = flow.at< cv::Point2f>(y, x);
if((fxy.x != fxy.x)||(fxy.y != fxy.y)){ //zz, for SimpleFlow
//std::cout<<"meh"; //do nothing
}
else{
line(cflowmap, cv::Point(x,y), cv::Point(cvRound(x+fxy.x), cvRound(y+fxy.y)),color);
circle(cflowmap, cv::Point(cvRound(x+fxy.x), cvRound(y+fxy.y)), 1, color, -1);
sum +=fxy;//zz
angle = atan2(fxy.y,fxy.x);
sum_angle +=angle;
all_angles.push_back(angle*180/M_PI);
count++; //zz
}
}
}
Function to equalize intensity of images, for better results:
cv::Mat equalizeIntensity(const cv::Mat& inputImage){
if(inputImage.channels() >= 3){
cv::Mat ycrcb;
cvtColor(inputImage,ycrcb,CV_BGR2YCrCb);
std::vector<cv::Mat> channels;
cv::split(ycrcb,channels);
cv::equalizeHist(channels[0], channels[0]);
cv::Mat result;
cv::merge(channels,ycrcb);
cvtColor(ycrcb,result,CV_YCrCb2BGR);
return result;
}
return cv::Mat();
}
So to recap, my questions:
Is it possible to resize Farneback optical flow to apply to 2xbigger image?
How to deal with pixels that go out of view like at the bottom of my images (the brown wooden part should disappear).
How to deal with distortion that is created because optical flow wasn't computed for those pixels, while many pixels around there have motion? (couch upper right, & lion figurine has a ghost hand in the remapped image).

With OpenCV's Farneback optical flow, you will only get a rough estimation of pixel displacement, hence the distortions that appear on the result images.
I don't think optical flow is the way to go for what you are trying to achieve IMHO. Instead I'd recommend you to have a look at Image / Pixel Registration for instace here : http://docs.opencv.org/trunk/db/d61/group__reg.html
Image / Pixel Registration is the science of matching pixels of two images. Active research is ongoing about this complex non-trivial subject that is not yet accurately resolved.

How to combine two remap() operations into one?

I have a tight loop, where I get a camera image, undistort it and also transform it according to some transformation (e.g. a perspective transform). I already figured out to use cv::remap(...) for each operation, which is already much more efficient than using plain matrix operations.
In my understanding it should be possible to combine the lookup maps into one and call remap just once in every loop iteration. Is there a canonical way to do this? I would prefer not to implement all the interpolation stuff myself.
Note: The procedure should work with differently sized maps. In my particular case the undistortion preserves the image dimensions, while the other transformation scales the image to a different size.
Code for illustration:
// input arguments
const cv::Mat_<math::flt> intrinsic = getIntrinsic();
const cv::Mat_<math::flt> distortion = getDistortion();
const cv::Mat mNewCameraMatrix = cv::getOptimalNewCameraMatrix(intrinsic, distortion, myImageSize, 0);
// output arguments
cv::Mat undistortMapX;
cv::Mat undistortMapY;
// computes undistortion maps
cv::initUndistortRectifyMap(intrinsic, distortion, cv::Mat(),
newCameraMatrix, myImageSize, CV_16SC2,
undistortMapX, undistortMapY);
// computes undistortion maps
// ...computation of mapX and mapY omitted
cv::convertMaps(mapX, mapY, skewMapX, skewMapY, CV_16SC2);
for(;;) {
cv::Mat originalImage = getNewImage();
cv::Mat undistortedImage;
cv::remap(originalImage, undistortedImage, undistortMapX, undistortMapY, cv::INTER_LINEAR);
cv::Mat skewedImage;
cv::remap(undistortedImage, skewedImage, skewMapX, skewMapY, cv::INTER_LINEAR);
outputImage(skewedImage);
}

You can apply remap on undistortMapX and undistortMapY.
cv::remap(undistortMapX, undistrtSkewX, skewMapX, skewMapY, cv::INTER_LINEAR);
cv::remap(undistortMapY, undistrtSkewY, skewMapX, skewMapY, cv::INTER_LINEAR);
Than you can use:
cv::remap(originalImage , skewedImage, undistrtSkewX, undistrtSkewY, cv::INTER_LINEAR);
It works because skewMaps and undistortMaps are arrays of coordinates in image, so it should be similar to taking location of location...
Edit (answer to comments):
I think I need to make some clarification. remap() function calculates pixels in new image from pixels of old image. In case of linear interpolation each pixel in new image is a weighted average of 4 pixels from the old image. The weights differ from pixel to pixel according to values from provided maps. If the value is more or less integer, then most of the weight is taken from single pixel. As a result new image will be as sharp is original image. On the other hand, if the value is far from being integer (i.e. integer + 0.5) then the weights are similar. This will create smoothing effect. To get a feeling of what I am talking about, look at the undistorted image. You will see that some parts of the image are sharper/smoother than other parts.
Now back to the explanation about what happened when you combined two remap operations into one. The coordinates in combined maps are correct, i.e. pixel in skewedImage is calculated from correct 4 pixels of originalImage with correct weights. But it is not identical to result of two remap operations. Each pixel in undistortedImage is a weighted average of 4 pixels from originalImage. This means that each pixel of skewedImage would be a weighted average of 9-16 pixels from orginalImage. Conclusion: using single remap() can NOT possibly give result that is identical to two usages of remap().
Discussion about which of the two possible images (single remap() vs double remap()) is better is quite complicated. Normally it is good to make as little interpolations as possible, because each interpolation introduces different artifacts. Especially if the artifacts are not uniform in the image (some regions became more smooth than others). In some cases those artifacts may have good visual effect on the image - like reducing some of the jitter. But if this is what you want, you can achieve this in cheaper and more consistent ways. For example by smoothing original image prior to remaping.

In the case of two general mappings, there is no choice but to use the approach suggested by #MichaelBurdinov.
However, in the special case of two mappings with known inverse mappings, an alternative approach is to compute the maps manually. This manual approach is more accurate than the double remap one, since it does not involve interpolation of coordinate maps.
In practice, most of the interesting applications match this special case. It does too in your case because your first map corresponds to image undistortion (whose inverse operation is image distortion, which is associated to a well known analytical model) and your second map corresponds to a perspective transform (whose inverse can be expressed analytically).
Computing the maps manually is actually quite easy. As stated in the documentation (link) these maps contain, for each pixel in the destination image, the (x,y) coordinates where to find the appropriate intensity in the source image. The following code snippet shows how to compute the maps manually in your case:
int dst_width=...,dst_height=...; // Initialize the size of the output image
cv::Mat Hinv=H.inv(), Kinv=K.inv(); // Precompute the inverse perspective matrix and the inverse camera matrix
cv::Mat map_undist_warped_x32f(dst_height,dst_width,CV_32F); // Allocate the x map to the correct size (n.b. the data type used is float)
cv::Mat map_undist_warped_y32f(dst_height,dst_width,CV_32F); // Allocate the y map to the correct size (n.b. the data type used is float)
// Loop on the rows of the output image
for(int y=0; y<dst_height; ++y) {
std::vector<cv::Point3f> pts_undist_norm(dst_width);
// For each pixel on the current row, first use the inverse perspective mapping, then multiply by the
// inverse camera matrix (i.e. map from pixels to normalized coordinates to prepare use of projectPoints function)
for(int x=0; x<dst_width; ++x) {
cv::Mat_<float> pt(3,1); pt << x,y,1;
pt = Kinv*Hinv*pt;
pts_undist_norm[x].x = pt(0)/pt(2);
pts_undist_norm[x].y = pt(1)/pt(2);
pts_undist_norm[x].z = 1;
}
// For each pixel on the current row, compose with the inverse undistortion mapping (i.e. the distortion
// mapping) using projectPoints function
std::vector<cv::Point2f> pts_dist;
cv::projectPoints(pts_undist_norm,cv::Mat::zeros(3,1,CV_32F),cv::Mat::zeros(3,1,CV_32F),intrinsic,distortion,pts_dist);
// Store the result in the appropriate pixel of the output maps
for(int x=0; x<dst_width; ++x) {
map_undist_warped_x32f.at<float>(y,x) = pts_dist[x].x;
map_undist_warped_y32f.at<float>(y,x) = pts_dist[x].y;
}
}
// Finally, convert the float maps to signed-integer maps for best efficiency of the remap function
cv::Mat map_undist_warped_x16s,map_undist_warped_y16s;
cv::convertMaps(map_undist_warped_x32f,map_undist_warped_y32f,map_undist_warped_x16s,map_undist_warped_y16s,CV_16SC2);
Note: H above is your perspective transform while Kshould be the camera matrix associated with the undistorted image, so it should be what in your code is called newCameraMatrix (which BTW is not an output argument of initUndistortRectifyMap). Depending on your specific data, there might also be some additional cases to handle (e.g. division by pt(2) when it might be zero, etc).

I found this question when looking to combine dewarping (undistortion) and projection tranforms in python, but there is no direct python answer.
Here is an direct conversion of BConic's answer in python
import numpy as np
import cv2
dst_width = ...
dst_height = ...
h_inv = np.linalg.inv(h)
k_inv = np.linalg.inv(new_camera_matrix)
map_x = np.zeros((dst_height, dst_width), dtype=np.float32)
map_y = np.zeros((dst_height, dst_width), dtype=np.float32)
for y in range(dst_height):
pts_undist_norm = np.zeros((dst_width, 3, 1))
for x in range(dst_width):
pt = np.array([x, y, 1]).reshape(3,1)
pt2 = k_inv # h_inv # pt
pts_undist_norm[x][0] = pt2[0]/pt2[2]
pts_undist_norm[x][1] = pt2[1]/pt2[2]
pts_undist_norm[x][2] = 1
r_vec = np.zeros((3,1))
t_vec = np.zeros((3,1))
pts_dist, _ = cv2.projectPoints(pts_undist_norm, r_vec, t_vec, intrinsic, distortion)
pts_dist = pts_dist.squeeze()
for x2 in range(dst_width):
map_x[y][x2] = pts_dist[x2][0]
map_y[y][x2] = pts_dist[x2][1]
# using CV_16SC2 introduced substantial image artifacts for me
map_x_final, map_y_final = cv2.convertMaps(map_x, map_y, cv2.CV_32FC1, cv2.CV_32FC1)
This is obviously really slow since it is using a double for loop and iterating through every pixel, so you can do it much faster using numpy. You should be able to do something similar in C++ to eliminate the for loops and do a single matrix multiplication.
import numpy as np
import cv2
dst_width = ...
dst_height = ...
h_inv = np.linalg.inv(h)
k_inv = np.linalg.inv(new_camera_matrix)
m_grid = np.mgrid[0:dst_width, 0:dst_height].reshape(2, dst_height*dst_width)
m_grid = np.insert(m_grid, 2, 1, axis=0)
m_grid_result = k_inv # h_inv # m_grid
pts_undist_norm = m_grid_result[:2, :] / m_grid_result[2, :]
pts_undist_norm = np.insert(pts_undist_norm, 2, 1, axis=0)
r_vec = np.zeros((3,1))
t_vec = np.zeros((3,1))
pts_dist = cv2.projectPoints(pts_undist_norm, r_vec, t_vec, intrinsic, distortion)
pts_dist = pts_dist.squeeze().astype(np.float32)
map_x = pts_dist[:, 0].reshape(dst_width, dst_height).swapaxes(0,1)
map_y = pts_dist[:, 1].reshape(dst_width, dst_height).swapaxes(0,1)
# using CV_16SC2 introduced substantial image artifacts for me
map_x_final, map_y_final = cv2.convertMaps(map_x, map_y, cv2.CV_32FC1, cv2.CV_32FC1)
This numpy implementation is roughly 25-75x faster than the first method.

I came across the same problem. I tried to implement AldurDisciple's answer. Instead of calculating transformation in a loop. I'm having a mat with mat.at <Vec2f>(x,y)=Vec2f(x,y) and applying perspectiveTransform to this mat. Add a 3rd channel of "1" to the result mat and apply projectPoints.
Here is my code
Mat xy(2000, 2500, CV_32FC2);
float *pxy = (float*)xy.data;
for (int y = 0; y < 2000; y++)
for (int x = 0; x < 2500; x++)
{
*pxy++ = x;
*pxy++ = y;
}
// perspective transformation of coordinates of destination image,
// which generates the map from destination image to norm points
Mat pts_undist_norm(2000, 2500, CV_32FC2);
Mat matPerspective =transRot3x3;
perspectiveTransform(xy, pts_undist_norm, matPerspective);
//add 3rd channel of 1
vector<Mat> channels;
split(pts_undist_norm, channels);
Mat channel3(2000, 2500, CV_32FC1, cv::Scalar(float(1.0)));
channels.push_back(channel3);
Mat pts_undist_norm_3D(2000, 2500, CV_32FC3);
merge(channels, pts_undist_norm_3D);
//projectPoints to extend the map from norm points back to the original captured image
pts_undist_norm_3D = pts_undist_norm_3D.reshape(0, 5000000);
Mat pts_dist(5000000, 1, CV_32FC2);
projectPoints(pts_undist_norm_3D, Mat::zeros(3, 1, CV_64F), Mat::zeros(3, 1, CV_64F), intrinsic, distCoeffs, pts_dist);
Mat maps[2];
pts_dist = pts_dist.reshape(0, 2000);
split(pts_dist, maps);
// apply map
remap(originalImage, skewedImage, maps[0], maps[1], INTER_LINEAR);
The transformation matrix used to map to norm points is a bit different from the one used in AldurDisciple's answer. transRot3x3 is composed from tvec and rvec generated by calibrateCamera.
double transData[] = { 0, 0, tvecs[0].at<double>(0), 0, 0,
tvecs[0].at<double>(1), 0, 0, tvecs[0].at<double>(2) };
Mat translate3x3(3, 3, CV_64F, transData);
Mat rotation3x3;
Rodrigues(rvecs[0], rotation3x3);
Mat transRot3x3(3, 3, CV_64F);
rotation3x3.col(0).copyTo(transRot3x3.col(0));
rotation3x3.col(1).copyTo(transRot3x3.col(1));
translate3x3.col(2).copyTo(transRot3x3.col(2));
Added:
I realized if the only needed map is the final map why not just use projectPoints to a mat with mat.at(x,y)=Vec2f(x,y,0) .
//generate a 3-channel mat with each entry containing it's own coordinates
Mat xyz(2000, 2500, CV_32FC3);
float *pxyz = (float*)xyz.data;
for (int y = 0; y < 2000; y++)
for (int x = 0; x < 2500; x++)
{
*pxyz++ = x;
*pxyz++ = y;
*pxyz++ = 0;
}
// project coordinates of destination image,
// which generates the map from destination image to source image directly
xyz=xyz.reshape(0, 5000000);
Mat pts_dist(5000000, 1, CV_32FC2);
projectPoints(xyz, rvecs[0], tvecs[0], intrinsic, distCoeffs, pts_dist);
Mat maps[2];
pts_dist = pts_dist.reshape(0, 2000);
split(pts_dist, maps);
//apply map
remap(originalImage, skewedImage, maps[0], maps[1], INTER_LINEAR);

OpenCV Edge/Border detection based on color

I'm fairly new to OpenCV, and very excited to learn more. I've been toying with the idea of outlining edges, shapes.
I've come across this code (running on an iOS device), which uses Canny. I'd like to be able to render this in color, and circle each shape. Can someone point me in the right direction?
Thanks!
IplImage *grayImage = cvCreateImage(cvGetSize(iplImage), IPL_DEPTH_8U, 1);
cvCvtColor(iplImage, grayImage, CV_BGRA2GRAY);
cvReleaseImage(&iplImage);
IplImage* img_blur = cvCreateImage( cvGetSize( grayImage ), grayImage->depth, 1);
cvSmooth(grayImage, img_blur, CV_BLUR, 3, 0, 0, 0);
cvReleaseImage(&grayImage);
IplImage* img_canny = cvCreateImage( cvGetSize( img_blur ), img_blur->depth, 1);
cvCanny( img_blur, img_canny, 10, 100, 3 );
cvReleaseImage(&img_blur);
cvNot(img_canny, img_canny);
And example might be these burger patties. OpenCV would detect the patty, and outline it.
Original Image:

Color information is often handled by conversion to HSV color space which handles "color" directly instead of dividing color into R/G/B components which makes it easier to handle same colors with different brightness etc.
if you convert your image to HSV you'll get this:
cv::Mat hsv;
cv::cvtColor(input,hsv,CV_BGR2HSV);
std::vector<cv::Mat> channels;
cv::split(hsv, channels);
cv::Mat H = channels[0];
cv::Mat S = channels[1];
cv::Mat V = channels[2];
Hue channel:
Saturation channel:
Value channel:
typically, the hue channel is the first one to look at if you are interested in segmenting "color" (e.g. all red objects). One problem is, that hue is a circular/angular value which means that the highest values are very similar to the lowest values, which results in the bright artifacts at the border of the patties. To overcome this for a particular value, you can shift the whole hue space. If shifted by 50° you'll get something like this instead:
cv::Mat shiftedH = H.clone();
int shift = 25; // in openCV hue values go from 0 to 180 (so have to be doubled to get to 0 .. 360) because of byte range from 0 to 255
for(int j=0; j<shiftedH.rows; ++j)
for(int i=0; i<shiftedH.cols; ++i)
{
shiftedH.at<unsigned char>(j,i) = (shiftedH.at<unsigned char>(j,i) + shift)%180;
}
now you can use a simple canny edge detection to find edges in the hue channel:
cv::Mat cannyH;
cv::Canny(shiftedH, cannyH, 100, 50);
You can see that the regions are a little bigger than the real patties, that might be because of the tiny reflections on the ground around the patties, but I'm not sure about that. Maybe it's just because of jpeg compression artifacts ;)
If you instead use the saturation channel to extract edges, you'll end up with something like this:
cv::Mat cannyS;
cv::Canny(S, cannyS, 200, 100);
where the contours aren't completely closed. Maybe you can combine hue and saturation within preprocessing to extract edges in the hue channel but only where saturation is high enough.
At this stage you have edges. Regard that edges aren't contours yet. If you directly extract contours from edges they might not be closed/separated etc:
// extract contours of the canny image:
std::vector<std::vector<cv::Point> > contoursH;
std::vector<cv::Vec4i> hierarchyH;
cv::findContours(cannyH,contoursH, hierarchyH, CV_RETR_TREE , CV_CHAIN_APPROX_SIMPLE);
// draw the contours to a copy of the input image:
cv::Mat outputH = input.clone();
for( int i = 0; i< contoursH.size(); i++ )
{
cv::drawContours( outputH, contoursH, i, cv::Scalar(0,0,255), 2, 8, hierarchyH, 0);
}
you can remove those small contours by checking cv::contourArea(contoursH[i]) > someThreshold before drawing. But you see the two patties on the left to be connected? Here comes the hardest part... use some heuristics to "improve" your result.
cv::dilate(cannyH, cannyH, cv::Mat());
cv::dilate(cannyH, cannyH, cv::Mat());
cv::dilate(cannyH, cannyH, cv::Mat());
Dilation before contour extraction will "close" the gaps between different objects but increase the object size too.
if you extract contours from that it will look like this:
If you instead choose only the "inner" contours it is exactly what you like:
cv::Mat outputH = input.clone();
for( int i = 0; i< contoursH.size(); i++ )
{
if(cv::contourArea(contoursH[i]) < 20) continue; // ignore contours that are too small to be a patty
if(hierarchyH[i][3] < 0) continue; // ignore "outer" contours
cv::drawContours( outputH, contoursH, i, cv::Scalar(0,0,255), 2, 8, hierarchyH, 0);
}
mind that the dilation and inner contour stuff is a little fuzzy, so it might not work for different images and if the initial edges are placed better around the object border it might 1. not be necessary to do the dilate and inner contour thing and 2. if it is still necessary, the dilate will make the object smaller in this scenario (which luckily is great for the given sample image.).
EDIT: Some important information about HSV: The hue channel will give every pixel a color of the spectrum, even if the saturation is very low ( = gray/white) or if the color is very low (value) so often it is desired to threshold the saturation and value channels to find some specific color! This might be much easier and much more stavle to handle than the dilation I've used in my code.

Cumulative Homography Wrongly Scaling

I'm to build a panorama image of the ground covered by a downward facing camera (at a fixed height, around 1 metre above ground). This could potentially run to thousands of frames, so the Stitcher class' built in panorama method isn't really suitable - it's far too slow and memory hungry.
Instead I'm assuming the floor and motion is planar (not unreasonable here) and trying to build up a cumulative homography as I see each frame. That is, for each frame, I calculate the homography from the previous one to the new one. I then get the cumulative homography by multiplying that with the product of all previous homographies.
Let's say I get H01 between frames 0 and 1, then H12 between frames 1 and 2. To get the transformation to place frame 2 onto the mosaic, I need to get H01*H12. This continues as the frame count increases, such that I get H01*H12*H23*H34*H45*....
In code, this is something akin to:
cv::Mat previous, current;
// Init cumulative homography
cv::Mat cumulative_homography = cv::Mat::eye(3);
video_stream >> previous;
for(;;) {
video_stream >> current;
// Here I do some checking of the frame, etc
// Get the homography using my DenseMosaic class (using Farneback to get OF)
cv::Mat tmp_H = DenseMosaic::get_homography(previous,current);
// Now normalise the homography by its bottom right corner
tmp_H /= tmp_H.at<double>(2, 2);
cumulative_homography *= tmp_H;
previous = current.clone( );
}
It works pretty well, except that as the camera moves "up" in the viewpoint, the homography scale decreases. As it moves down, the scale increases again. This gives my panoramas a perspective type effect that I really don't want.
For example, this is taken on a few seconds of video moving forward then backward. The first frame looks ok:
The problem comes as we move forward a few frames:
Then when we come back again, you can see the frame gets bigger again:
I'm at a loss as to where this is coming from.
I'm using Farneback dense optical flow to calculate pixel-pixel correspondences as below (sparse feature matching doesn't work well on this data) and I've checked my flow vectors - they're generally very good, so it's not a tracking problem. I also tried switching the order of the inputs to find homography (in case I'd mixed up the frame numbers), still no better.
cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6,50, 5, 7, 1.5, flags);
// Using the flow_mat optical flow map, populate grid point correspondences between images
std::vector<cv::Point2f> points_1, points_2;
median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
Another thing I thought it could be was the translation I include in the transformation to ensure my panorama is centred within the scene:
cv::warpPerspective(init.clone(), warped, translation*homography, init.size());
But having checked the values in the homography before the translation is applied, the scaling issue I mention is still present.
Any hints are gratefully received. There's a lot of code I could put in but it seems irrelevant, please do let me know if there's something missing
UPDATE
I've tried switching out the *= operator for the full multiplication and tried reversing the order the homographies are multiplied in, but no luck. Below is my code for calculating the homography:
/**
\brief Calculates the homography between the current and previous frames
*/
cv::Mat DenseMosaic::get_homography()
{
cv::Mat grey_1, grey_2; // Grayscale versions of frames
cv::cvtColor(prev, grey_1, CV_BGR2GRAY);
cv::cvtColor(cur, grey_2, CV_BGR2GRAY);
// Calculate the dense flow
int flags = cv::OPTFLOW_FARNEBACK_GAUSSIAN;
if (frame_number > 2) {
flags = flags | cv::OPTFLOW_USE_INITIAL_FLOW;
}
cv::calcOpticalFlowFarneback(grey_1, grey_2, flow_mat, 0.5, 6,50, 5, 7, 1.5, flags);
// Convert the flow map to point correspondences
std::vector<cv::Point2f> points_1, points_2;
median_motion = DenseMosaic::dense_flow_to_corresp(flow_mat, points_1, points_2);
// Use the correspondences to get the homography
cv::Mat H = cv::findHomography(cv::Mat(points_2), cv::Mat(points_1), CV_RANSAC, 1);
return H;
}
And this is the function I use to find the correspondences from the flow map:
/**
\brief Calculate pixel->pixel correspondences given a map of the optical flow across the image
\param[in] flow_mat Map of the optical flow across the image
\param[out] points_1 The set of points from #cur
\param[out] points_2 The set of points from #prev
\param[in] step_size The size of spaces between the grid lines
\return The median motion as a point
Uses a dense flow map (such as that created by cv::calcOpticalFlowFarneback) to obtain a set of point correspondences across a grid.
*/
cv::Point2f DenseMosaic::dense_flow_to_corresp(const cv::Mat &flow_mat, std::vector<cv::Point2f> &points_1, std::vector<cv::Point2f> &points_2, int step_size)
{
std::vector<double> tx, ty;
for (int y = 0; y < flow_mat.rows; y += step_size) {
for (int x = 0; x < flow_mat.cols; x += step_size) {
/* Flow is basically the delta between left and right points */
cv::Point2f flow = flow_mat.at<cv::Point2f>(y, x);
tx.push_back(flow.x);
ty.push_back(flow.y);
/* There's no need to calculate for every single point,
if there's not much change, just ignore it
*/
if (fabs(flow.x) < 0.1 && fabs(flow.y) < 0.1)
continue;
points_1.push_back(cv::Point2f(x, y));
points_2.push_back(cv::Point2f(x + flow.x, y + flow.y));
}
}
// I know this should be median, not mean, but it's only used for plotting the
// general motion direction so it's unimportant.
cv::Point2f t_median;
cv::Scalar mtx = cv::mean(tx);
t_median.x = mtx[0];
cv::Scalar mty = cv::mean(ty);
t_median.y = mty[0];
return t_median;
}

It turns out this was because my viewpoint was close to the features, meaning that the non-planarity of the tracked features was causing skew to the homography. I managed to prevent this (it's more of a hack than a method...) by using estimateRigidTransform instead of findHomography, as this does not estimate for perspective variations.
In this particular case, it makes sense to do so, as the view does only ever undergo rigid transformations.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js