OpenCV: can projectPoints return negative values? - c++

I'm using cv::projectPoints to get the corresponding pixels of a vector of 3D points.
The points are all near each other.
The problem is that for some points I get correct pixel coordinates, but for others I get strange negative values like -22599...
Is it normal for cv::projectPoints to return negative values, or is it a bug in my code?
void SingleCameraTriangulator::projectPointsToImage2(const std::vector< cv::Vec3d >& pointsGroup, const double scale, std::vector< Pixel >& pixels)
{
    cv::Vec3d t2, r2;
    decomposeTransformation(*g_12_, r2, t2);

    cv::Mat imagePoints2;
    cv::projectPoints(pointsGroup, r2, t2, *camera_matrix_, *distortion_coefficients_, imagePoints2);

    for (std::size_t i = 0; i < imagePoints2.rows; i++)
    {
        cv::Vec2d pixel = imagePoints2.at<cv::Vec2d>(i);
        Pixel p;
        p.x_ = pixel[0];
        p.y_ = pixel[1];

        if ( (p.x_ < 0) || (p.x_ > ((1 / scale) * img_1_->cols)) || (p.y_ < 0) || (p.y_ > ((1/scale) * img_1_->rows)))
        {
            cv::Vec3d point = pointsGroup[i];
            std::cout << point << " - " << pixel << " - " << pixel*scale << "problema" << std::endl;
        }

        p.i_ = getBilinearInterpPix32f(*img_2_, scale * p.x_, scale * p.y_);
        pixels.push_back(p);
    }
}
Thank you in advance for any suggestions.

reprojectImageTo3D (are you using it to get the 3D points?) gives large z coordinates (10000) for outlier points, so I think your problem is there.
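If that is the case, one simple guard is to drop points with implausible depths before projecting. A minimal sketch, reusing the names from the question's code (the maxZ threshold is a hypothetical value you would tune for your scene):
// Keep only points in front of the camera with a plausible depth before projecting.
const double maxZ = 1000.0;  // hypothetical cutoff, depends on your scene scale
std::vector<cv::Vec3d> filtered;
for (const cv::Vec3d& p : pointsGroup)
{
    if (p[2] > 0 && p[2] < maxZ)
        filtered.push_back(p);
}

cv::Mat imagePoints2;
if (!filtered.empty())
    cv::projectPoints(filtered, r2, t2, *camera_matrix_, *distortion_coefficients_, imagePoints2);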


Why doesn't my Gradient descent algorithm converge? (For Logistic Regression)

I have an assignment which says to implement logistic regression in c++ using gradient descent. Part of the assignment is to make the gradient descent stop when the magnitude of the gradient is below 10e-07.
I have to minimize: $L(w) = \frac{1}{N}\sum_i \log(1 + \exp(-y_i w^T x_i))$
However, my gradient descent keeps stopping because the maximum number of iterations is surpassed. I have tried various max-iteration thresholds, and they all max out. I think there is something wrong with my code: logistic regression is supposedly an easy task for gradient descent, since its cost function is convex, so gradient descent should easily find the minimum.
I am using the armadillo library for matrices and vectors.
#include "armadillo.hpp"
using namespace arma;

double Log_Likelihood(Mat<double>& x, Mat<int>& y, Mat<double>& w)
{
    Mat<double> L;
    double L_sum = 0;
    for (int i = 0; i < x.n_rows; i++)
    {
        L = log(1 + exp(-y[i] * w * x.row(i).t() ));
        L_sum += as_scalar(L);
    }
    return L_sum / x.n_rows;
}

Mat<double> Gradient(Mat<double>& x, Mat<int>& y, Mat<double>& w)
{
    Mat<double> grad(1, x.n_cols);
    for (int i = 0; i < x.n_rows; i++)
    {
        grad = grad + (y[i] * (1 / (1 + exp(y[i] * w * x.row(i).t()))) * x.row(i));
    }
    return -grad / x.n_rows;
}

// w is the weight vector, declared elsewhere (presumably a class member in the original code)
void fit(Mat<double>& x, Mat<int>& y, double alpha = 0.05, double threshold = pow(10, -7), int maxiter = 10000)
{
    w.set_size(1, x.n_cols);
    w = x.row(0);
    int iter = 0;
    double log_like = 0;
    while (true)
    {
        log_like = Log_Likelihood(x, y, w);
        if (iter % 1000 == 0)
        {
            std::cout << "Iter: " << iter << " -Log likelihood = " << log_like << " ||dL/dw|| = " << norm( Gradient(x, y, w), 2) << std::endl;
        }
        iter++;
        if ( norm( Gradient(x, y, w), 2) < threshold)
        {
            std::cout << "Magnitude of gradient below threshold." << std::endl;
            break;
        }
        if (iter == maxiter)
        {
            std::cout << "Max iterations surpassed." << std::endl;
            break;
        }
        w = w - (alpha * Gradient(x, y, w));
    }
}
I want the gradient descent to stop because the magnitude of the gradient falls below 10e-07.
My labels are {1, -1}.
Verify that your loglikelihood is increasing towards convergence by recording or plotting the values at every iteration, and also check that the norm of the gradient is going towards 0. You should be doing gradient ascent, so add the gradient instead of subtracting it. If the norm of the gradient consistently increases it means you are not going in a direction towards the optimum. If on the other hand, the norm of the gradient "jumps around" but doesn't go to 0, then you should reduce your stepsize/learning rate alpha and try again.
Plotting and analyzing these values will be helpful to debug and analyze your algorithm.
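As a minimal sketch of that kind of instrumentation, the update loop in fit() could be rewritten along these lines (the step-size halving heuristic is an addition suggested by the advice above, not part of the original code):
// Sketch of an instrumented update loop, replacing the body of fit()'s while loop.
Mat<double> g;
double prev_norm = 1e300;  // effectively "infinity" as an initial value
for (int iter = 0; iter < maxiter; ++iter) {
    g = Gradient(x, y, w);                 // compute the gradient once per iteration and reuse it
    double gnorm = norm(g, 2);
    std::cout << iter << "\t" << Log_Likelihood(x, y, w) << "\t" << gnorm << std::endl;  // record every iteration
    if (gnorm < threshold) {               // stop on a small gradient, not only on max iterations
        std::cout << "Magnitude of gradient below threshold." << std::endl;
        break;
    }
    if (gnorm > prev_norm) alpha *= 0.5;   // gradient norm "jumping around": shrink the step size
    prev_norm = gnorm;
    w = w - alpha * g;                     // descent on the loss; use '+' instead if you treat the objective as a likelihood to maximize
}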

Rectangle Intersection. print message for empty intersection

Each rectangle is given by four values: x, y, width = w, height = h. I have two rectangles with the following coordinates:
r1.x=2, r1.y=3, r1.w=5, r1.h=6;
r2.x=0, r2.y=7, r2.w=-4, r2.h=2
As you can observe, this intersection is empty.
What I have done so far is:
rectangle intersection (rectangle r1, rectangle r2){
    r1.x=max(r1.x,r2.x);
    r1.y=max(r1.y,r2.y);
    r1.w=min(r1.w,r2.w);
    r1.h=min(r1.h,r2.h);
    return r1;
}
I think the above code works when there is an intersection, but I do not know what to do when the intersection is empty. Also, I would like to print the message "empty" when there is no intersection.
Thanks!
The method you are using for rectangle intersection does NOT work when rectangles are represented with their width and height.
It could work if you store the rectangles' two opposite corners (instead of one corner and the dimensions) and make sure that the first corner's coordinates are always less than or equal to the second corner, effectively storing min_x, min_y, max_x, and max_y for your rectangles.
I would suggest that you adopt the convention of making sure the rectangles always include their min coordinates and always exclude their max coords.
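For illustration, a minimal sketch of that corner-based convention might look like this (the struct and function names are hypothetical):
#include <algorithm>

// (min_x, min_y) is included in the rectangle, (max_x, max_y) is excluded.
struct rect_mm { int min_x, min_y, max_x, max_y; };

// Component-wise max of the mins and min of the maxes; empty as soon as a min reaches its max.
bool intersect(const rect_mm& a, const rect_mm& b, rect_mm& out) {
    out.min_x = std::max(a.min_x, b.min_x);
    out.min_y = std::max(a.min_y, b.min_y);
    out.max_x = std::min(a.max_x, b.max_x);
    out.max_y = std::min(a.max_y, b.max_y);
    return out.min_x < out.max_x && out.min_y < out.max_y;  // false means "empty"
}
That said, you can keep the x/y/w/h representation and convert to corners inside the intersection function, as the code below does.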
Assuming you have something not very different from:
struct rectangle {
int x;
int y;
int w;
int h;
};
(or the same using float or double instead of int)
I will assume here that w and h are always positive; if they may be negative, you should first normalize the input rectangles to ensure that they are.
You find the intersection by finding its opposite corners, and ensuring that lower left come before upper right:
rectangle intersection(const rectangle& r1, const rectangle& r2) {
    // optionally validate the arguments:
    if (r1.w < 0 || r1.h < 0 || r2.w < 0 || r2.h < 0) {
        throw std::domain_error("Unnormalized rectangles on input");
    }
    int lowx = std::max(r1.x, r2.x); // Ok, x coordinate of lower left corner
    int lowy = std::max(r1.y, r2.y); // same for y coordinate
    int upx = std::min(r1.x + r1.w, r2.x + r2.w); // x for upper right corner
    int upy = std::min(r1.y + r1.h, r2.y + r2.h); // y for upper right corner
    if (upx < lowx || upy < lowy) { // empty intersection
        throw std::domain_error("Empty intersection");
    }
    return rectangle{lowx, lowy, upx - lowx, upy - lowy};
}
You can normalize a rectangle by forcing positive values for width and height:
rectangle& normalize(rectangle& r) {
    if (r.w < 0) {
        r.x += r.w;
        r.w = -r.w;
    }
    if (r.h < 0) {
        r.y += r.h;
        r.h = -r.h;
    }
    return r;
}
You can then use that in a second function to display the intersection result:
void display_intersection(std::ostream& out, rectangle r1, rectangle r2) {
    try {
        rectangle inter = intersection(normalize(r1), normalize(r2));
        out << "(" << inter.x << ", " << inter.y << ") to (";
        out << inter.x + inter.w << ", " << inter.y + inter.h << ")" << std::endl;
    }
    catch (std::domain_error& e) {
        out << "empty" << std::endl;
    }
}
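As a usage sketch with the two rectangles from the question (assuming the definitions above and <iostream>), this should print "empty":
int main() {
    rectangle r1{2, 3, 5, 6};
    rectangle r2{0, 7, -4, 2};               // negative width, normalized inside display_intersection
    display_intersection(std::cout, r1, r2);  // prints "empty": the x ranges [2,7) and [-4,0) do not overlap
    return 0;
}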

How to fill OpenVDB voxels that are inside a given plane?

I have a quad defined by 4 (x,y,z) points (like a plane that has edges). I have an OpenVDB grid. I want to fill with the value 1 all the voxels that are inside my quad (including the edges). Is such a thing possible without setting each voxel of the quad (bounded plane) manually?
If the four points form a rectangle, it could be possible using the
void fill(const CoordBBox& bbox, const ValueType& value, bool active = true);
function that exists in the Grid class. It is not possible to transform the CoordBBox for rotations; instead, you would have to do that by changing the transformation of the grid. In pseudo-code it could look like
CoordBBox plane; // created from your points
Transform old = grid.transform();
grid.setTransform(...); // Some transformation that places the grid correctly with respect to the plane
grid.fill(plane, 1);
grid.setTransform(old);
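For the axis-aligned case, a minimal sketch of building that CoordBBox from two opposite world-space corners could look like this (the helper function and its arguments are assumptions, and it presumes the quad is axis-aligned in the grid's index space):
#include <openvdb/openvdb.h>

// Fill every voxel of 'grid' inside the axis-aligned box spanned by two world-space corners.
void fillAxisAlignedQuad(openvdb::FloatGrid& grid,
                         const openvdb::Vec3d& worldMin,
                         const openvdb::Vec3d& worldMax)
{
    const openvdb::math::Transform& xform = grid.transform();
    // Convert the world-space corners to index-space voxel coordinates.
    const openvdb::Coord ijkMin = xform.worldToIndexCellCentered(worldMin);
    const openvdb::Coord ijkMax = xform.worldToIndexCellCentered(worldMax);
    // Set every voxel in the box to 1 and mark it active.
    grid.fill(openvdb::CoordBBox(ijkMin, ijkMax), 1.0f);
}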
If this is not the case, you would have to set the values yourself.
Here is an unoptimized method for arbitrarily shaped planar quadrilaterals: you only need to input the four vertices of the plane in 3D space, and the output is the filled planar voxels.
Take the vertex with the largest sum of adjacent side lengths as A; the vertices adjacent to A are B and D. Obtain the voxel coordinates and voxel counts along A-B and A-D with a ray cast, and then sample the same number of voxels along C-B and C-D, starting from the opposite vertex C.
Finally, put the voxels sampled along the two opposite sides into one-to-one correspondence and fill the plane area between each pair, again based on ray casting.
void VDBVolume::fillPlaneVoxel(const PlanarEquation& planar) {
    auto accessor = volume_->getUnsafeAccessor();
    const auto transform = volume_->transform();
    const openvdb::Vec3f vdbnormal(planar.normal.x(), planar.normal.y(), planar.normal.z());

    int longside_vtx[2];
    calVtxLongSide(planar.vtx, longside_vtx);

    // 1. ray cast along the long sides
    std::vector<Eigen::Vector3f> longsidepoints;
    int neiborvtx[2];
    {
        const Eigen::Vector3f origin = planar.vtx[longside_vtx[0]];
        const openvdb::Vec3R eye(origin.x(), origin.y(), origin.z());
        GetRectNeiborVtx(longside_vtx[0], neiborvtx);
        for(int i = 0; i < 2; ++i) {
            Eigen::Vector3f direction = planar.vtx[neiborvtx[i]] - origin;
            openvdb::Vec3R dir(direction.x(), direction.y(), direction.z());
            dir.normalize();
            const float length = static_cast<float>(direction.norm());
            if(length > 50.f) {
                std::cout << "GetRectNeiborVtx length too large, something wrong: "<< i << "\n" << origin.transpose() << "\n"
                          << planar.vtx[neiborvtx[i]].transpose() << std::endl;
                continue;
            }
            const float t0 = -voxel_size_/2;
            const float t1 = length + voxel_size_/2;
            const auto ray = openvdb::math::Ray<float>(eye, dir, t0, t1).worldToIndex(*volume_);
            openvdb::math::DDA<decltype(ray)> dda(ray);
            do {
                const auto voxel = dda.voxel();
                const auto voxel_center_world = GetVoxelCenter(voxel, transform);
                longsidepoints.emplace_back(voxel_center_world);
            } while (dda.step());
        }
    }

    // 2. uniformly sample the same number of points along the short side
    const int longsidepointnum = longsidepoints.size();
    std::vector<Eigen::Vector3f> shortsidepoints;
    shortsidepoints.resize(longsidepointnum);
    {
        // input: a vertex and its two neighbours (vtxs); output: the sampled point coordinates, sampled according to distance
        GenerateShortSidePoints(longside_vtx[1], neiborvtx, planar.vtx, shortsidepoints);
    }

    // 3. ray cast from longsidepoints to shortsidepoints
    // std::cout << "longsidepointnum: " << longsidepointnum << std::endl;
    for(int pid = 0; pid < longsidepointnum; ++pid) {
        const Eigen::Vector3f origin = longsidepoints[pid];
        const openvdb::Vec3R eye(origin.x(), origin.y(), origin.z());
        const Eigen::Vector3f direction = shortsidepoints[pid] - origin;
        openvdb::Vec3R dir(direction.x(), direction.y(), direction.z());
        dir.normalize();
        const float length = direction.norm();
        if(length > 50.f) {
            std::cout << "length too large, something wrong: "<< pid << "\n" << origin.transpose() << "\n"
                      << shortsidepoints[pid].transpose() << std::endl;
            continue;
        }
        const float t0 = -voxel_size_/2;
        const float t1 = length + voxel_size_/2;
        const auto ray = openvdb::math::Ray<float>(eye, dir, t0, t1).worldToIndex(*volume_);
        openvdb::math::DDA<decltype(ray)> dda(ray);
        do {
            const auto voxel = dda.voxel();
            accessor.setValue(voxel, vdbnormal);
        } while (dda.step());
    }
}

Recommended way to compose rotations

I have 2 rotations represented as yaw, pitch, roll (Tait-Brian intrinsic right-handed). What is the recommended way to construct a single rotation that is equivalent to both of them?
EDIT: if I understand correctly from the answers, I must first convert yaw, pitch, roll to either matrix or quaternion, compose them and then transform the result back to yaw, pitch, roll representation.
Also, my first priority is simplicity, then numerical stability and efficiency.
Thanks :)
As a general answer, if you make a rotation matrix for each of the two rotations, you can then make a single matrix which is the product of the two (order is important!) to represent the effect of applying both rotations.
It is possible to conceive of instances where "gimbal lock" could make this numerically unstable for certain angles (typically involving angles very close to 90 degrees).
It is faster and more stable to use quaternions. You can see a nice treatment at http://www.genesis3d.com/~kdtop/Quaternions-UsingToRepresentRotation.htm - in summary, every rotation can be represented by a quaternion and multiple rotations are just represented by the product of the quaternions. They tend to have better stability properties.
Formulas for doing this can be found at http://en.m.wikipedia.org/wiki/Conversion_between_quaternions_and_Euler_angles
UPDATE: using the formulas provided at http://planning.cs.uiuc.edu/node102.html , you can adapt the following code to do a sequence of rotations. While the code is written in (and compiles as) C++, I am not taking advantage of certain built-in C++ types and methods that might make this code more elegant - showing my C roots here. The point is really to show how the rotation equations work, and how you can concatenate multiple rotations.
The two key functions are calcRot which computes the rotation matrix for given yaw, pitch and roll; and mMult which multiplies two matrices together. When you have two successive rotations, the product of their rotation matrices is the "composite" rotation - you do have to watch out for the order in which you do things. The example that I used shows this. First I rotate a vector by two separate rotations; then I compute a single matrix that combines both rotations and get the same result; finally I reverse the order of the rotations, and get a different result. All of which should help you solve your problem.
Make sure that the conventions I used make sense for you.
#include <iostream>
#include <cmath>

#define PI (2.0*acos(0.0))
//#define DEBUG

void calcRot(double ypr[3], double M[3][3]) {
    // extrinsic rotations: using the world frame of reference
    // ypr: yaw, pitch, roll in radians
    double cy, sy, cp, sp, cr, sr;
    // compute sin and cos of each just once:
    cy = cos(ypr[0]);
    sy = sin(ypr[0]);
    cp = cos(ypr[1]);
    sp = sin(ypr[1]);
    cr = cos(ypr[2]);
    sr = sin(ypr[2]);
    // compute this rotation matrix:
    // source: http://planning.cs.uiuc.edu/node102.html
    M[0][0] = cy*cp;
    M[0][1] = cy*sp*sr - sy*cr;
    M[0][2] = cy*sp*cr + sy*sr;
    M[1][0] = sy*cp;
    M[1][1] = sy*sp*sr + cy*cr;
    M[1][2] = sy*sp*cr - cy*sr;
    M[2][0] = -sp;
    M[2][1] = cp*sr;
    M[2][2] = cp*cr;
}

void mMult(double M[3][3], double R[3][3]) {
    // multiply M * R, returning result in M
    double T[3][3] = {0};
    for(int ii = 0; ii < 3; ii++) {
        for(int jj = 0; jj < 3; jj++) {
            for(int kk = 0; kk < 3; kk++ ) {
                T[ii][jj] += M[ii][kk] * R[kk][jj];
            }
        }
    }
    // copy the result:
    for(int ii = 0; ii < 3; ii++) {
        for(int jj = 0; jj < 3; jj++ ) {
            M[ii][jj] = T[ii][jj];
        }
    }
}

void printRotMat(double M[3][3]) {
    // print 3x3 matrix - for debug purposes
#ifdef DEBUG
    std::cout << "rotation matrix is: " << std::endl;
    for(int ii = 0; ii < 3; ii++) {
        for(int jj = 0; jj < 3; jj++ ) {
            std::cout << M[ii][jj] << " ";
        }
        std::cout << std::endl;
    }
    std::cout << std::endl;
#endif
}

void applyRot(double before[3], double after[3], double M[3][3]) {
    // apply rotation matrix M to vector 'before'
    // returning result in vector 'after'
    double sumBefore = 0, sumAfter = 0;
    std::cout << "Result of rotation:" << std::endl;
    for(int ii = 0; ii < 3; ii++) {
        std::cout << before[ii] << " -> ";
        sumBefore += before[ii] * before[ii];
        after[ii] = 0;
        for( int jj = 0; jj < 3; jj++) {
            after[ii] += M[ii][jj]*before[jj];
        }
        sumAfter += after[ii] * after[ii];
        std::cout << after[ii] << std::endl;
    }
    std::cout << std::endl;
#ifdef DEBUG
    std::cout << "length before: " << sqrt(sumBefore) << "; after: " << sqrt(sumAfter) << std::endl;
#endif
}
int main(void) {
    double r1[3] = {0, 0, PI/2}; // order: yaw, pitch, roll
    double r2[3] = {0, PI/2, 0};
    double initPoint[3] = {3,4,5}; // initial point before rotation
    double rotPoint[3], rotPoint2[3];
    double R[3][3];
    double R2[3][3];
    // compute first rotation matrix in-place:
    calcRot(r1, R);
    printRotMat(R);
    applyRot(initPoint, rotPoint, R);
    // apply second rotation on top of first:
    calcRot(r2, R2);
    std::cout << std::endl << "second rotation matrix: " << std::endl;
    printRotMat(R2);
    // applying second matrix to result of first rotation:
    std::cout << std::endl << "applying just the second matrix to result of first: " << std::endl;
    applyRot(rotPoint, rotPoint2, R2);
    mMult(R2, R);
    std::cout << "after multiplication: " << std::endl;
    printRotMat(R2);
    std::cout << "Applying the combined matrix to the initial vector: " << std::endl;
    applyRot(initPoint, rotPoint2, R2);
    // now in the opposite order:
    double S[3][3] = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
    calcRot(r2, S);
    printRotMat(S);
    calcRot(r1, R2);
    mMult(R2, S);
    std::cout << "applying rotation in the opposite order: " << std::endl;
    printRotMat(R2);
    applyRot(initPoint, rotPoint, R2);
}
Output (with DEBUG not defined - the #define commented out):
Result of rotation:
3 -> 3
4 -> -5
5 -> 4
second rotation matrix:
applying just the second matrix to result of first:
Result of rotation:
3 -> 4
-5 -> -5
4 -> -3
after multiplication:
Applying the combined matrix to the initial vector:
Result of rotation:
3 -> 4
4 -> -5
5 -> -3
Note that these last two give the same result, showing that you can combine rotation matrices.
applying rotation in the opposite order:
Result of rotation:
3 -> 5
4 -> 3
5 -> 4
Now the result is different - the order is important.
If you are familiar with matrix operations, you may try Rodrigues' rotation formula. If you are familiar with quaternions, you may try the P' = q*P*q' approach.
Quaternion math is a bit more complicated to grasp, but the code is simpler and faster.
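For the quaternion route, here is a minimal sketch using Eigen (Eigen is an assumption here, not something the question requires); it builds a quaternion from each yaw/pitch/roll triple, multiplies them, and converts back to Euler angles:
#include <Eigen/Geometry>
#include <cmath>
#include <iostream>

// Build a quaternion from intrinsic yaw (Z), pitch (Y), roll (X) angles in radians.
Eigen::Quaterniond fromYPR(double yaw, double pitch, double roll) {
    return Eigen::AngleAxisd(yaw,   Eigen::Vector3d::UnitZ())
         * Eigen::AngleAxisd(pitch, Eigen::Vector3d::UnitY())
         * Eigen::AngleAxisd(roll,  Eigen::Vector3d::UnitX());
}

int main() {
    const double PI = 2.0 * std::acos(0.0);
    Eigen::Quaterniond q1 = fromYPR(0.0, 0.0, PI / 2);   // first rotation (roll)
    Eigen::Quaterniond q2 = fromYPR(0.0, PI / 2, 0.0);   // second rotation (pitch)
    Eigen::Quaterniond q  = q2 * q1;                     // apply q1 first, then q2 - order matters
    // Convert back to yaw, pitch, roll (ZYX order) if that representation is needed.
    Eigen::Vector3d ypr = q.toRotationMatrix().eulerAngles(2, 1, 0);
    std::cout << "yaw=" << ypr[0] << " pitch=" << ypr[1] << " roll=" << ypr[2] << std::endl;
}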

Camera motion compensation

I am using OpenCV to implement camera motion compensation for an application. I know I need to calculate the optical flow and then find the fundamental matrix between two frames to transform the image.
Here is what I have done so far:
void VideoStabilization::stabilize(Image *image) {
    if (image->getWidth() != width || image->getHeight() != height) reset(image->getWidth(), image->getHeight());
    IplImage *currImage = toCVImage(image);
    IplImage *currImageGray = cvCreateImage(cvSize(width, height), IPL_DEPTH_8U, 1);
    cvCvtColor(currImage, currImageGray, CV_BGRA2GRAY);
    if (baseImage) {
        CvPoint2D32f currFeatures[MAX_CORNERS];
        char featuresFound[MAX_CORNERS];
        opticalFlow(currImageGray, currFeatures, featuresFound);
        IplImage *result = transformImage(currImage, currFeatures, featuresFound);
        if (result) {
            updateImage(image, result);
            cvReleaseImage(&result);
        }
    }
    cvReleaseImage(&currImage);
    if (baseImage) cvReleaseImage(&baseImage);
    baseImage = currImageGray;
    updateGoodFeatures();
}
void VideoStabilization::updateGoodFeatures() {
    const double QUALITY_LEVEL = 0.05;
    const double MIN_DISTANCE = 5.0;
    baseFeaturesCount = MAX_CORNERS;
    cvGoodFeaturesToTrack(baseImage, eigImage,
            tempImage, baseFeatures, &baseFeaturesCount, QUALITY_LEVEL, MIN_DISTANCE);
    cvFindCornerSubPix(baseImage, baseFeatures, baseFeaturesCount,
            cvSize(10, 10), cvSize(-1,-1), TERM_CRITERIA);
}
void VideoStabilization::opticalFlow(IplImage *currImage, CvPoint2D32f *currFeatures, char *featuresFound) {
    const unsigned int WIN_SIZE = 15;
    const unsigned int PYR_LEVEL = 5;
    cvCalcOpticalFlowPyrLK(baseImage, currImage,
            NULL, NULL,
            baseFeatures,
            currFeatures,
            baseFeaturesCount,
            cvSize(WIN_SIZE, WIN_SIZE),
            PYR_LEVEL,
            featuresFound,
            NULL,
            TERM_CRITERIA,
            0);
}
IplImage *VideoStabilization::transformImage(IplImage *image, CvPoint2D32f *features, char *featuresFound) const {
    unsigned int featuresFoundCount = 0;
    for (unsigned int i = 0; i < MAX_CORNERS; ++i) {
        if (featuresFound[i]) ++featuresFoundCount;
    }
    if (featuresFoundCount < 8) {
        std::cout << "Not enough features found." << std::endl;
        return NULL;
    }
    CvMat *points1 = cvCreateMat(2, featuresFoundCount, CV_32F);
    CvMat *points2 = cvCreateMat(2, featuresFoundCount, CV_32F);
    CvMat *fundamentalMatrix = cvCreateMat(3, 3, CV_32F);
    unsigned int pos = 0;
    for (unsigned int i = 0; i < featuresFoundCount; ++i) {
        while (!featuresFound[pos]) ++pos;
        cvSetReal2D(points1, 0, i, baseFeatures[pos].x);
        cvSetReal2D(points1, 1, i, baseFeatures[pos].y);
        cvSetReal2D(points2, 0, i, features[pos].x);
        cvSetReal2D(points2, 1, i, features[pos].y);
        ++pos;
    }
    int fmCount = cvFindFundamentalMat(points1, points2, fundamentalMatrix, CV_FM_RANSAC, 1.0, 0.99);
    if (fmCount < 1) {
        std::cout << "Fundamental matrix not found." << std::endl;
        return NULL;
    }
    std::cout << fundamentalMatrix->data.fl[0] << " " << fundamentalMatrix->data.fl[1] << " " << fundamentalMatrix->data.fl[2] << "\n";
    std::cout << fundamentalMatrix->data.fl[3] << " " << fundamentalMatrix->data.fl[4] << " " << fundamentalMatrix->data.fl[5] << "\n";
    std::cout << fundamentalMatrix->data.fl[6] << " " << fundamentalMatrix->data.fl[7] << " " << fundamentalMatrix->data.fl[8] << "\n";
    cvReleaseMat(&points1);
    cvReleaseMat(&points2);
    IplImage *result = transformImage(image, *fundamentalMatrix);
    cvReleaseMat(&fundamentalMatrix);
    return result;
}
MAX_CORNERS is 100 and it usually finds around 70-90 features.
With this code, I get a weird fundamental matrix, like:
-0.000190809 -0.00114947 1.2487
0.00127824 6.57727e-05 0.326055
-1.22443 -0.338243 1
Since I just hold the camera with my hand and try not to shake it (and there weren't any moving objects), I expected the matrix to be close to the identity. What am I doing wrong?
Also, I'm not sure what to use to transform the image. cvWarpAffine needs a 2x3 matrix; should I discard the last row or use another function?
What you're looking for is not the fundamental matrix but rather an affine or perspective transform.
The fundamental matrix describes the relation of two cameras having significantly different viewpoints. It is calculated such that if you have two points x (on one image) and x' (on another) that are projections of the same point in space, then x'^T F x (the product) is zero. If x and x' are nearly identical... then the only solution is to make F nearly zero (and practically useless). That's why you've got what you have.
The matrix that should indeed be near identity is a transformation A that transforms the points x to x'= A x (the old image into the new one). Depending on what types of transformations you want to include (affine or perspective), you could (theoretically) use the functions cvGetAffineTransform or cvGetPerspectiveTransform to calculate the transform. For that, you would need 3 or 4 point pairs, respectively.
However, the best choice (I think) is cvFindHomography. It estimates a perspective transform based on all of the point pairs available, using outlier filtering algorithms (RANSAC, for example), giving you a 3x3 matrix.
Then you can use cvWarpPerspective to transform the images themselves.
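A minimal sketch of that pipeline with the C++ API (the question's code uses the old C API, so this is an adaptation rather than a drop-in replacement; basePts and currPts are assumed to hold the matched feature locations from your tracker):
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Warp the current frame so that its tracked features line up with the base frame.
cv::Mat stabilizeFrame(const cv::Mat& currFrame,
                       const std::vector<cv::Point2f>& basePts,
                       const std::vector<cv::Point2f>& currPts)
{
    // Estimate the perspective transform mapping current-frame points onto the base frame,
    // letting RANSAC reject outlier matches (3.0 is the reprojection threshold in pixels).
    cv::Mat H = cv::findHomography(currPts, basePts, cv::RANSAC, 3.0);
    cv::Mat warped;
    if (!H.empty())
        cv::warpPerspective(currFrame, warped, H, currFrame.size());
    return warped;
}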