ScalableTSDFVolume Integrate from TUM-RGBD Dataset - C++

I am using Open3D 0.15 and C++11 on Ubuntu 18.04.
The main function I'm interested in is the ScalableTSDFVolume Integrate() function, used with the TUM RGBD dataset (the xyz set, to be exact) and based on the IntegrateRGBD example from the Open3D repo.
Since the TUM-RGBD dataset does not provide an association file that matches the RGBD images with the trajectory info, I've written my own small piece of code that matches the timestamps of the TUM dataset's image data with the trajectory information and converts the 7-dimensional [x y z rx ry rz rw] trajectory entries into an Eigen::Matrix4d, using the same equation that Open3D's FileTUM.cpp uses:
do {
    // Read the timestamp first
    gt >> p_gt.timestamp;
    double poseArr[7];
    // push the remaining 7 numbers into poseArr
    for (int i = 0; i < 7; i++)
        gt >> poseArr[i];
    // copy-paste of the TUM trajectory reader
    Eigen::Matrix4d transform;
    transform.setIdentity();
    transform.topLeftCorner<3, 3>() =
            Eigen::Quaterniond(poseArr[6], poseArr[3], poseArr[4], poseArr[5])
                    .toRotationMatrix();
    transform.topRightCorner<3, 1>() =
            Eigen::Vector3d(poseArr[0], poseArr[1], poseArr[2]);
    p_gt.pose = transform.inverse();
    gtF.push_back(p_gt);
} while (std::getline(gt, line));
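For completeness, the timestamp matching mentioned above is just a nearest-timestamp lookup. A minimal sketch of that part (assuming a hypothetical GTFrame struct that mirrors p_gt, i.e. a timestamp plus an Eigen::Matrix4d pose, and the gtF vector filled by the loop above):

#include <cmath>
#include <limits>
#include <vector>
#include <Eigen/Dense>

// Hypothetical container mirroring p_gt above: one entry per groundtruth line.
struct GTFrame {
    double timestamp;
    Eigen::Matrix4d pose;
};

// Return the groundtruth entry whose timestamp is closest to the image timestamp,
// or nullptr if the best match is further apart than max_difference
// (the TUM association script uses 0.02 s as its default tolerance).
const GTFrame *FindClosestGT(const std::vector<GTFrame> &gtF,
                             double image_timestamp,
                             double max_difference = 0.02) {
    const GTFrame *best = nullptr;
    double best_diff = std::numeric_limits<double>::max();
    for (const auto &entry : gtF) {
        double diff = std::fabs(entry.timestamp - image_timestamp);
        if (diff < best_diff) {
            best_diff = diff;
            best = &entry;
        }
    }
    return (best != nullptr && best_diff <= max_difference) ? best : nullptr;
}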
The code runs fine, but the issue appears when I integrate multiple frames into the same volume and extract its point cloud or mesh.
I can tell that the RGBD information is being fed into the program correctly, because extracting the mesh at the very first frame gives:
first frame mesh extraction
But there are significant artifacts when I extract the mesh after more frames have been integrated, like this:
30 frames mesh extraction
From my previous experience, this probably has to do with the transformation matrices not being expressed in the coordinate frame (axis convention) that Open3D expects. If anyone has tried to use the TUM dataset with Open3D and run into the same problem, I would greatly appreciate any info on this.
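One sanity check that helps localise this kind of problem (a debugging sketch of my own, not part of the Open3D example; the function name DumpFrameInWorld is just for illustration) is to back-project individual frames into point clouds and overlay them in a viewer using the groundtruth poses. If the clouds only line up when the inverted matrices are used instead, the poses fed to Integrate() are in the wrong convention (Integrate() takes the extrinsic, i.e. world-to-camera, matrix).

#include <memory>
#include <string>
#include <Eigen/Dense>
#include "open3d/Open3D.h"

// Back-project one RGBD frame into a point cloud expressed in the world frame,
// using the camera-to-world pose (i.e. the TUM pose BEFORE the .inverse() above),
// and write it to disk so several frames can be overlaid in a viewer.
void DumpFrameInWorld(const open3d::geometry::RGBDImage &rgbd,
                      const open3d::camera::PinholeCameraIntrinsic &intrinsic,
                      const Eigen::Matrix4d &pose_cam_to_world,
                      const std::string &filename) {
    auto pcd = open3d::geometry::PointCloud::CreateFromRGBDImage(rgbd, intrinsic);
    pcd->Transform(pose_cam_to_world);  // camera frame -> world frame
    open3d::io::WritePointCloud(filename, *pcd);
}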
Edit:
For reference, this is the modified code I'm using for the reconstruction.
int main(int argc, char *argv[]) {
    using namespace open3d;

    std::string filebase("/home/geometry/Documents/rgbd_dataset_freiburg1_xyz");
    VirtualSensor::CameraParameters kinect{525.0, 525.0, 319.5, 239.5, 5000};
    VirtualSensor::CameraParameters camPar = kinect;
    VirtualSensor v1(filebase, camPar);

    bool save_pointcloud = true;
    bool save_mesh = true;
    bool save_voxel = false;
    int every_k_frames = 50;
    double length = 4.0;
    double uLength = 6.0;
    int resolution = 512;
    double sdf_trunc_percentage = 0.01;
    int verbose = 2;
    utility::SetVerbosityLevel((utility::VerbosityLevel)verbose);

    auto camera_intrinsic = camera::PinholeCameraIntrinsic(640, 480, 525.0, 525.0, 319.5, 239.5);
    int index = 0;
    int save_index = 0;
    int pairSize = 30;

    // initialise TSDF
    pipelines::integration::ScalableTSDFVolume volume(
            length / (double)resolution, length * sdf_trunc_percentage,
            pipelines::integration::TSDFVolumeColorType::RGB8);
    //pipelines::integration::UniformTSDFVolume uVolume(uLength, resolution, uLength*sdf_trunc_percentage, pipelines::integration::TSDFVolumeColorType::RGB8);

    utility::FPSTimer timer("Process RGBD stream", pairSize);
    geometry::Image depth, color;

    // start loop
    for (int i = 0; i < pairSize; i++) {
        utility::LogInfo("Processing frame {:d} ...", index);
        io::ReadImage(v1.GetDepthPath(i), depth);
        io::ReadImage(v1.GetColorPath(i), color);
        auto rgbd = geometry::RGBDImage::CreateFromColorAndDepth(
                color, depth, 5000.0, 6.0, false);
        if (index == 0 ||
            (every_k_frames > 0 && index % every_k_frames == 0)) {
            volume.Reset();
        }

        volume.Integrate(*rgbd,
                         camera_intrinsic,     // intrinsic never changes
                         v1.GetCounterGT(i));  // get the groundtruth pose from my class
        index++;

        // saving mesh/pc logic
        if (index == pairSize ||
            (every_k_frames > 0 && index % every_k_frames == 0)) {
            utility::LogInfo("Saving fragment {:d} ...", save_index);
            std::string save_index_str = std::to_string(save_index);
            if (save_pointcloud) {
                utility::LogInfo("Saving pointcloud {:d} ...", save_index);
                auto pcd = volume.ExtractPointCloud();
                io::WritePointCloud("pointcloud_" + save_index_str + ".ply", *pcd);
            }
            if (save_mesh) {
                utility::LogInfo("Saving mesh {:d} ...", save_index);
                auto mesh = volume.ExtractTriangleMesh();
                io::WriteTriangleMesh("mesh_" + save_index_str + ".ply", *mesh);
            }
            if (save_voxel) {
                utility::LogInfo("Saving voxel {:d} ...", save_index);
                auto voxel = volume.ExtractVoxelPointCloud();
                io::WritePointCloud("voxel_" + save_index_str + ".ply", *voxel);
            }
            save_index++;
        }
        timer.Signal();
    }
    return 0;
}

Related

How to rotate an image without using OpenCV functions? (using linear & cubic interpolation)

I am trying to rotate an image without using the built-in OpenCV rotation functions.
I want to do it pixel by pixel with interpolation (nearest neighbour, linear and cubic), and later I would like to do it with a rotation matrix.
Problems:
I can't understand how to implement the interpolations. Even one example with the cubic interpolation would help me.
For some reason, pixels from the left side of the original image end up on the right side of the rotated image, which does not seem right for this rotation (those regions should be black).
Add an extra option (not a must) to rotate the image around the center of the image, and not around (0,0), which is the top-left corner of the image by default.
The original image:
My code: (AFTER UPDATE 1)
#include <iostream>
#include <math.h>
#include "opencv2/opencv.hpp"
using namespace std;
enum interpolation_type{
INTERPOLATION_CUBIC,
INTERPOLATION_LINEAR,
INTERPOLATION_NEAREST_NEIGHBOR
};
void Interpolation_Calculator(const cv::Point& srcPixel,cv::Point2i& dstPixel, interpolation_type type){
// The origin pixels for the currPixel in the newImage depends on the interpolation type
int originX = 0;
int originY = 0;
if(type == INTERPOLATION_NEAREST_NEIGHBOR)
{
originX = (int)round(srcPixel.x);
originY = (int)round(srcPixel.y);
}
else if(type == INTERPOLATION_LINEAR){
}
else if (type == INTERPOLATION_CUBIC){
}
dstPixel.x = originX;
dstPixel.y = originY;
}
void RotationFunction(const cv::Mat& src,cv::Mat& dst, int angle, interpolation_type type){
// The pixels in the new image we want to find right origin pixel for his value.
double rotatedX;
double rotatedY;
double toRadian = 3.141592653589/180;
for(int r=0;r<dst.rows;r++)
{
for(int c=0;c<dst.cols;c++)
{
rotatedX = r*cos(angle * toRadian) - c*sin(angle * toRadian);
rotatedY = r*sin(angle * toRadian) + c*cos(angle * toRadian);
cv::Point rotatedPixel(rotatedX,rotatedY);
cv::Point2i originPixel;
Interpolation_Calculator(rotatedPixel,originPixel,type);
//cv::Vec3b vector(0,0,0);
// Checking if the Interpolation calculations crossed the boundaries
if(originPixel.x < 0 || originPixel.x > src.cols - 1 || originPixel.y < 0 || originPixel.y > src.rows - 1)
dst.at<cv::Vec3b>(cv::Point(r, c)) = 0;
else { // In case everything is good
cv::Vec3b currPixel = src.at<cv::Vec3b>(originPixel);
dst.at<cv::Vec3b>(cv::Point(r, c)) = currPixel;
}
}
}
}
int main() {
cv::Mat img = cv::imread("../lion.jpeg");
cv::Mat rotatedImage(img.rows,img.cols,CV_8UC3);
// Rotating
RotationFunction(img,rotatedImage,25,INTERPOLATION_NEAREST_NEIGHBOR);
// End of Rotating
// Show the images
cv::imshow("window1",img);
cv::imshow("window2",rotatedImage);
cv::waitKey(0);
// End of Show the images
return 0;
}
Bad Output:
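For reference, a minimal bilinear (linear) interpolation sketch for the interpolation question above, written as inverse mapping around the image centre and assuming a 3-channel 8-bit image (the function names SampleBilinear/RotateBilinear are just for illustration, not a fix of the code above):

#include <cmath>
#include "opencv2/opencv.hpp"

// Bilinear sample of a 3-channel 8-bit image at a fractional (x, y) position.
// Returns black when the sample falls outside the image.
static cv::Vec3b SampleBilinear(const cv::Mat &src, double x, double y) {
    int x0 = (int)std::floor(x), y0 = (int)std::floor(y);
    int x1 = x0 + 1, y1 = y0 + 1;
    if (x0 < 0 || y0 < 0 || x1 >= src.cols || y1 >= src.rows)
        return cv::Vec3b(0, 0, 0);
    double fx = x - x0, fy = y - y0;
    cv::Vec3b out;
    for (int ch = 0; ch < 3; ch++) {
        double top = (1.0 - fx) * src.at<cv::Vec3b>(y0, x0)[ch] + fx * src.at<cv::Vec3b>(y0, x1)[ch];
        double bot = (1.0 - fx) * src.at<cv::Vec3b>(y1, x0)[ch] + fx * src.at<cv::Vec3b>(y1, x1)[ch];
        out[ch] = cv::saturate_cast<uchar>((1.0 - fy) * top + fy * bot);
    }
    return out;
}

// Inverse mapping: for every destination pixel, rotate its coordinates back
// around the image centre into the source image and sample there.
static void RotateBilinear(const cv::Mat &src, cv::Mat &dst, double angle_deg) {
    const double a = angle_deg * CV_PI / 180.0;
    const double cx = src.cols / 2.0, cy = src.rows / 2.0;
    dst.create(src.size(), src.type());
    for (int r = 0; r < dst.rows; r++) {
        for (int c = 0; c < dst.cols; c++) {
            double dx = c - cx, dy = r - cy;
            double sx =  std::cos(a) * dx + std::sin(a) * dy + cx;  // inverse rotation
            double sy = -std::sin(a) * dx + std::cos(a) * dy + cy;
            dst.at<cv::Vec3b>(r, c) = SampleBilinear(src, sx, sy);
        }
    }
}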

GDALWarpRegionToBuffer & Tiling when Dst Frame not strictly contained in Src Frame

I'm currently working with the GDAL C/C++ API and I'm facing an issue with the warp-region-to-buffer functionality (GDALWarpOperation::WarpRegionToBuffer).
When my destination dataset is not strictly contained in the frame of my source dataset, the area where there should be no-data values is filled with random data (see out_code.tif enclosed). However, the gdalwarp command-line tool, which also uses WarpRegionToBuffer, does not seem to have this problem.
1/ Here is the code I use:
#include <iostream>
#include <string>
#include <vector>
#include "gdal.h"
#include "gdalwarper.h"
#include "cpl_conv.h"
int main(void)
{
std::string pathSrc = "in.dt1";
//these datas will be provided by command line
std::string pathDst = "out_code.tif";
double resolutionx = 0.000833333;
double resolutiony = 0.000833333;
//destination corner coordinates: top left (tl) bottom right (br)
float_t xtl = -1;
float_t ytl = 45;
float_t xbr = 2;
float_t ybr = 41;
//tile size defined by user
int tilesizex = 256;
int tilesizey = 256;
float width = std::ceil((xbr - xtl)/resolutionx);
float height = std::ceil((ytl - ybr)/resolutiony);
double adfDstGeoTransform[6] = {xtl, resolutionx, 0, ytl, 0, -resolutiony};
GDALDatasetH hSrcDS, hDstDS;
// Open input file
GDALAllRegister();
hSrcDS = GDALOpen(pathSrc.c_str(), GA_ReadOnly);
GDALDataType eDT = GDALGetRasterDataType(GDALGetRasterBand(hSrcDS,1));
// Create output file, using same spatial reference as input image, but new geotransform
GDALDriverH hDriver = GDALGetDriverByName( "GTiff" );
hDstDS = GDALCreate( hDriver, pathDst.c_str(), width, height, GDALGetRasterCount(hSrcDS), eDT, NULL );
OGRSpatialReference oSRS;
char *pszWKT = NULL;
//force geo projection
oSRS.SetWellKnownGeogCS( "WGS84" );
oSRS.exportToWkt( &pszWKT );
GDALSetProjection( hDstDS, pszWKT );
//Fetches the coefficients for transforming between pixel/line (P,L) raster space,
//and projection coordinates (Xp,Yp) space.
GDALSetGeoTransform( hDstDS, adfDstGeoTransform );
// Setup warp options
GDALWarpOptions *psWarpOptions = GDALCreateWarpOptions();
psWarpOptions->hSrcDS = hSrcDS;
psWarpOptions->hDstDS = hDstDS;
psWarpOptions->nBandCount = 1;
psWarpOptions->panSrcBands = (int *) CPLMalloc(sizeof(int) * psWarpOptions->nBandCount );
psWarpOptions->panSrcBands[0] = 1;
psWarpOptions->panDstBands = (int *) CPLMalloc(sizeof(int) * psWarpOptions->nBandCount );
psWarpOptions->panDstBands[0] = 1;
psWarpOptions->pfnProgress = GDALTermProgress;
//these datas will be calculated in order to warp tile by tile
//current tile size
int cursizex = 0;
int cursizey = 0;
double nbtilex = std::ceil(width/tilesizex);
double nbtiley = std::ceil(height/tilesizey);
int starttilex = 0;
int starttiley = 0;
// Establish reprojection transformer
psWarpOptions->pTransformerArg =
GDALCreateGenImgProjTransformer(hSrcDS,
GDALGetProjectionRef(hSrcDS),
hDstDS,
GDALGetProjectionRef(hDstDS),
FALSE, 0.0, 1);
psWarpOptions->pfnTransformer = GDALGenImgProjTransform;
// Initialize and execute the warp operation on region
GDALWarpOperation oOperation;
oOperation.Initialize(psWarpOptions);
for (int ty = 0; ty < nbtiley; ty++) {
//handle last tile size
//if it last tile change size otherwise keep tilesize
for (int tx = 0; tx < nbtilex; tx++) {
//if it last tile change size otherwise keep tilesize
starttiley = ty * tilesizey;
starttilex = tx * tilesizex;
cursizex = std::min(starttilex + tilesizex, (int)width) - starttilex;
cursizey = std::min(starttiley + tilesizey, (int)height) - starttiley;
float * buffer = new float[cursizex*cursizey];
memset(buffer, 0, cursizex*cursizey);
//warp source
CPLErr ret = oOperation.WarpRegionToBuffer(
starttilex, starttiley, cursizex, cursizey,
buffer,
eDT);
if (ret != 0) {
CEA_SIMONE_ERROR(CPLGetLastErrorMsg());
throw std::runtime_error("warp error");
}
//write the fuzed tile in dest
ret = GDALRasterIO(GDALGetRasterBand(hDstDS,1),
GF_Write,
starttilex, starttiley, cursizex, cursizey,
buffer, cursizex, cursizey,
eDT,
0, 0);
if (ret != 0) {
CEA_SIMONE_ERROR("raster io write error");
throw std::runtime_error("raster io write error");
}
delete(buffer);
}
}
// Clean memory
GDALDestroyGenImgProjTransformer( psWarpOptions->pTransformerArg );
GDALDestroyWarpOptions( psWarpOptions );
GDALClose( hDstDS );
GDALClose( hSrcDS );
return 0;
}
The result:
output image of previous sample of code (as png, as I can't enclose TIF img)
The GdalWarp command line:
gdalwarp -te -1 41 2 45 -tr 0.000833333 0.000833333 in.dt1 out_cmd_line.tif
The command line result:
output image of previous command line (as png, as I can't enclose TIF img)
Can you please help me find what is wrong with my use of the GDAL C/C++ API, so that I get the same behaviour as the gdalwarp command line? There is probably an algorithm in gdalwarp that computes a mask of useful pixels in the destination frame before calling WarpRegionToBuffer, but I didn't find it.
I would really appreciate help on this problem!
Best regards
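For what it's worth, gdalwarp initialises its destination with a no-data value before warping (the -dstnodata / INIT_DEST machinery), which is presumably what produces the clean border. A hedged sketch of the corresponding warp-option settings, to be placed before oOperation.Initialize(psWarpOptions) (the no-data value is a placeholder, cpl_string.h is needed for CSLSetNameValue, and this has not been verified against in.dt1):

// Sketch only: declare a no-data value and ask the warper to initialise the
// destination buffer with it before warping, so pixels that receive no source
// data are not left uninitialised. (The manually allocated tile buffer should
// also be fully cleared: memset counts bytes, i.e. cursizex*cursizey*sizeof(float).)
double no_data = -32767.0;  // placeholder, pick a value suited to the data

psWarpOptions->padfSrcNoDataReal = (double *)CPLMalloc(sizeof(double) * psWarpOptions->nBandCount);
psWarpOptions->padfSrcNoDataReal[0] = no_data;
psWarpOptions->padfDstNoDataReal = (double *)CPLMalloc(sizeof(double) * psWarpOptions->nBandCount);
psWarpOptions->padfDstNoDataReal[0] = no_data;

// INIT_DEST should make the warper fill the output buffer with the no-data value
psWarpOptions->papszWarpOptions = CSLSetNameValue(
        psWarpOptions->papszWarpOptions, "INIT_DEST", "NO_DATA");

// Record the no-data value on the destination band as well
GDALSetRasterNoDataValue(GDALGetRasterBand(hDstDS, 1), no_data);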

Generate image from an unorganized Point Cloud in PCL

I have an unorganized point cloud of the scene. Below is a screenshot of the point cloud-
I want to compose an image from this point cloud. Below is the code snippet-
#include <iostream>
#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>
#include <opencv2/opencv.hpp>
int main(int argc, char** argv)
{
pcl::PointCloud<pcl::PointXYZRGBA>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGBA>);
pcl::io::loadPCDFile("file.pcd", *cloud);
cv::Mat image = cv::Mat(cloud->height, cloud->width, CV_8UC3);
for (int i = 0; i < image.rows; i++)
{
for (int j = 0; j < image.cols; j++)
{
pcl::PointXYZRGBA point = cloud->at(j, i);
image.at<cv::Vec3b>(i, j)[0] = point.b;
image.at<cv::Vec3b>(i, j)[1] = point.g;
image.at<cv::Vec3b>(i, j)[2] = point.r;
}
}
cv::imwrite("image.png", image);
return (0);
}
The PCD file can be found here. The above code throws the following error at runtime:
terminate called after throwing an instance of 'pcl::IsNotDenseException'
what(): : Can't use 2D indexing with a unorganized point cloud
Since the cloud is unorganized, the HEIGHT field is 1. This leaves me confused about how to define the dimensions of the image.
Questions
How to compose an image from an unorganized point cloud?
How to convert pixels located in composed image back to point cloud (3D space)?
PS: I am using PCL 1.7 in Ubuntu 14.04 LTS OS.
An unorganized point cloud means that the points are NOT assigned to a fixed (organized) grid, therefore ->at(j, i) can't be used (height is always 1, and the width is simply the size of the cloud).
If you want to generate an image from your cloud, I suggest the following process:
Project the point cloud to a plane.
Generate a grid (organized point cloud) on that plane.
Interpolate the colors from the unorganized cloud to the grid (organized cloud).
Generate image from your organized grid (your initial attempt).
To be able to convert back to 3D:
When projecting to the plane save the "projection vectors" (vector from original point to the projected point).
Interpolate that as well to the grid.
methods for creating the grid:
Project the point cloud to a plane (unorganized cloud), and optionally save the reconstruction information in the normals:
pcl::PointCloud<pcl::PointXYZINormal>::Ptr ProjectToPlane(pcl::PointCloud<pcl::PointXYZINormal>::Ptr cloud, Eigen::Vector3f origin, Eigen::Vector3f axis_x, Eigen::Vector3f axis_y)
{
PointCloud<PointXYZINormal>::Ptr aux_cloud(new PointCloud<PointXYZINormal>);
copyPointCloud(*cloud, *aux_cloud);
auto normal = axis_x.cross(axis_y);
Eigen::Hyperplane<float, 3> plane(normal, origin);
for (auto itPoint = aux_cloud->begin(); itPoint != aux_cloud->end(); itPoint++)
{
// project point to plane
auto proj = plane.projection(itPoint->getVector3fMap());
// optional: save the reconstruction information (the residual vector) as normals in the projected cloud
// (computed before overwriting the point, otherwise the residual is always zero)
itPoint->getNormalVector3fMap() = itPoint->getVector3fMap() - proj;
itPoint->getVector3fMap() = proj;
}
return aux_cloud;
}
Generate a grid based on an origin point and two axis vectors (length and image_size can either be predetermined or calculated from your cloud):
pcl::PointCloud<pcl::PointXYZINormal>::Ptr GenerateGrid(Eigen::Vector3f origin, Eigen::Vector3f axis_x , Eigen::Vector3f axis_y, float length, int image_size)
{
auto step = length / image_size;
pcl::PointCloud<pcl::PointXYZINormal>::Ptr image_cloud(new pcl::PointCloud<pcl::PointXYZINormal>(image_size, image_size));
for (auto i = 0; i < image_size; i++)
for (auto j = 0; j < image_size; j++)
{
int x = i - int(image_size / 2);
int y = j - int(image_size / 2);
image_cloud->at(i, j).getVector3fMap() = origin + (x * step * axis_x) + (y * step * axis_y);
}
return image_cloud;
}
Interpolate to an organized grid (where the normals store reconstruction information and the curvature is used as a flag to indicate an empty pixel, i.e. no corresponding point):
void InterpolateToGrid(pcl::PointCloud<pcl::PointXYZINormal>::Ptr cloud, pcl::PointCloud<pcl::PointXYZINormal>::Ptr grid, float max_resolution, int max_nn_to_consider)
{
pcl::search::KdTree<pcl::PointXYZINormal>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZINormal>);
tree->setInputCloud(cloud);
for (auto idx = 0; idx < grid->points.size(); idx++)
{
std::vector<int> indices;
std::vector<float> distances;
if (tree->radiusSearch(grid->points[idx], max_resolution, indices, distances, max_nn_to_consider) > 0)
{
// Linear Interpolation of:
// Intensity
// Normals- residual vector to inflate(recondtruct) the surface
float intensity(0);
Eigen::Vector3f n(0, 0, 0);
float weight_factor = 1.0F / accumulate(distances.begin(), distances.end(), 0.0F);
for (auto i = 0; i < indices.size(); i++)
{
float w = weight_factor * distances[i];
intensity += w * cloud->points[indices[i]].intensity;
auto res = cloud->points[indices[i]].getVector3fMap() - grid->points[idx].getVector3fMap();
n += w * res;
}
grid->points[idx].intensity = intensity;
grid->points[idx].getNormalVector3fMap() = n;
grid->points[idx].curvature = 1;
}
else
{
grid->points[idx].intensity = 0;
grid->points[idx].curvature = 0;
grid->points[idx].getNormalVector3fMap() = Eigen::Vector3f(0, 0, 0);
}
}
}
Now you have a grid (an organized cloud), which you can easily map to an image. Any changes you make to the images, you can map back to the grid, and use the normals to project back to your original point cloud.
usage example for creating the grid:
pcl::PointCloud<pcl::PointXYZINormal>::Ptr original_cloud = ...;
// reference frame for the projection
// e.g. take XZ plane around 0,0,0 of length 100 and map to 128*128 image
Eigen::Vector3f origin = Eigen::Vector3f(0,0,0);
Eigen::Vector3f axis_x = Eigen::Vector3f(1,0,0);
Eigen::Vector3f axis_y = Eigen::Vector3f(0,0,1);
float length = 100;
int image_size = 128;
auto aux_cloud = ProjectToPlane(original_cloud, origin, axis_x, axis_y);
// aux_cloud now contains the points of original_cloud, with:
// xyz coordinates projected to XZ plane
// color (intensity) of the original_cloud (remains unchanged)
// normals - we lose the normal information, as we use this field to save the projection information. if you wish to keep the normal data, you should define a custom PointType.
// note: for the sake of projection, the origin is only used to define the plane, so any arbitrary point on the plane can be used
auto grid = GenerateGrid(origin, axis_x, axis_y, length, image_size);
// organized cloud that can be trivially mapped to an image
float max_resolution = 2 * length / image_size;
int max_nn_to_consider = 16;
InterpolateToGrid(aux_cloud, grid, max_resolution, max_nn_to_consider);
// Now you have a grid (an organized cloud), which you can easily map to an image. Any changes you make to the images, you can map back to the grid, and use the normals to project back to your original point cloud.
additional helper methods for how I use the grid:
// Convert an Organized cloud to cv::Mat (an image and a mask)
// point Intensity is used for the image
// if as_float is true => take the raw intensity (image is CV_32F)
// if as_float is false => assume intensity is in range [0, 255] and round it (image is CV_8U)
// point Curvature is used for the mask (assume 1 or 0)
std::pair<cv::Mat, cv::Mat> ConvertGridToImage(pcl::PointCloud<pcl::PointXYZINormal>::Ptr grid, bool as_float)
{
int rows = grid->height;
int cols = grid->width;
if ((rows <= 0) || (cols <= 0))
return pair<Mat, Mat>(Mat(), Mat());
// Initialize
Mat image = Mat(rows, cols, as_float? CV_32F : CV_8U);
Mat mask = Mat(rows, cols, CV_8U);
if (as_float)
{
for (int y = 0; y < image.rows; y++)
{
for (int x = 0; x < image.cols; x++)
{
image.at<float>(y, x) = grid->at(x, image.rows - y - 1).intensity;
mask.at<uchar>(y, x) = 255 * grid->at(x, image.rows - y - 1).curvature;
}
}
}
else
{
for (int y = 0; y < image.rows; y++)
{
for (int x = 0; x < image.cols; x++)
{
image.at<uchar>(y, x) = (int)round(grid->at(x, image.rows - y - 1).intensity);
mask.at<uchar>(y, x) = 255 * grid->at(x, image.rows - y - 1).curvature;
}
}
}
return pair<Mat, Mat>(image, mask);
}
// project image to cloud (using the grid data)
// organized - whether the resulting cloud should be an organized cloud
pcl::PointCloud<pcl::PointXYZI>::Ptr BackProjectImage(cv::Mat image, pcl::PointCloud<pcl::PointXYZINormal>::Ptr grid, bool organized)
{
if ((image.size().height != grid->height) || (image.size().width != grid->width))
{
assert(false);
throw;
}
PointCloud<PointXYZI>::Ptr cloud(new PointCloud<PointXYZI>);
cloud->reserve(grid->height * grid->width);
// order of iteration is critical for organized target cloud
for (auto r = image.size().height - 1; r >= 0; r--)
{
for (auto c = 0; c < image.size().width; c++)
{
PointXYZI point;
// the grid's curvature field marks valid pixels (see InterpolateToGrid above)
bool valid_pixel = grid->at(c, r).curvature > 0;
if (valid_pixel)
{
point.intensity = image.at<uchar>(image.rows - r - 1, c); // assumes a CV_8U image
point.getVector3fMap() = grid->at(c, r).getVector3fMap() + grid->at(c, r).getNormalVector3fMap();
}
else // invalid pixel
{
if (organized)
{
point.intensity = 0;
point.x = numeric_limits<float>::quiet_NaN();
point.y = numeric_limits<float>::quiet_NaN();
point.z = numeric_limits<float>::quiet_NaN();
}
else
{
continue;
}
}
cloud->push_back(point);
}
}
if (organized)
{
cloud->width = grid->width;
cloud->height = grid->height;
}
return cloud;
}
usage example for working with the grid:
// image_mask is std::pair<cv::Mat, cv::Mat>
auto image_mask = ConvertGridToImage(grid, false);
...
do some work with the image/mask
...
auto new_cloud = BackProjectImage(image_mask.first, grid, false);
For an unorganized point cloud, height and width have different meanings as you may have noticed. http://pointclouds.org/documentation/tutorials/basic_structures.php
It is not as simple to convert an unorganized point cloud to an image, as the points are represented as floats and there is no defined perspective. However, you can work around that by determining a perspective and creating discrete bins for the points. A similar question and answer can be found here: Converting a pointcloud to a depth/multi channel image
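To make the "determine a perspective and create discrete bins" idea concrete, a rough sketch (assumptions: a hypothetical pinhole model fx, fy, cx, cy, a cloud already expressed in the camera frame, and a simple z-buffer that keeps the nearest point per pixel; the function name ProjectToImage is just for illustration):

#include <cmath>
#include <limits>
#include <opencv2/opencv.hpp>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Project an unorganized, camera-frame cloud through an assumed pinhole model,
// binning points into pixels and keeping only the closest point per pixel.
cv::Mat ProjectToImage(const pcl::PointCloud<pcl::PointXYZRGBA> &cloud,
                       int width, int height,
                       float fx, float fy, float cx, float cy) {
    cv::Mat image(height, width, CV_8UC3, cv::Scalar(0, 0, 0));
    cv::Mat depth(height, width, CV_32F, cv::Scalar(std::numeric_limits<float>::max()));
    for (const auto &p : cloud.points) {
        if (!std::isfinite(p.z) || p.z <= 0.0f) continue;      // invalid or behind the camera
        int u = static_cast<int>(fx * p.x / p.z + cx + 0.5f);  // column
        int v = static_cast<int>(fy * p.y / p.z + cy + 0.5f);  // row
        if (u < 0 || u >= width || v < 0 || v >= height) continue;
        if (p.z < depth.at<float>(v, u)) {                     // keep the nearest point
            depth.at<float>(v, u) = p.z;
            image.at<cv::Vec3b>(v, u) = cv::Vec3b(p.b, p.g, p.r);
        }
    }
    return image;
}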

clustering image segments in opencv

I am working on motion detection with a non-static camera using OpenCV.
I am using a pretty basic background subtraction and thresholding approach to get a broad sense of all that's moving in a sample video. After thresholding, I enlist all separable "patches" of white pixels, store them as independent components and color them randomly with red, green or blue. The image below shows this for a football video where all such components are visible.
I create rectangles over these detected components and I get this image:
So I can see the challenge here. I want to cluster all the "similar" and close-by components into a single entity so that the rectangles in the output image show a player moving as a whole (and not his independent limbs). I tried doing K-means clustering but since ideally I would not know the number of moving entities, I could not make any progress.
Please guide me on how I can do this. Thanks
This problem can be almost perfectly solved by the DBSCAN clustering algorithm. Below, I provide the implementation and a result image. A gray blob means an outlier or noise according to DBSCAN. I simply used the bounding boxes as input data. Initially, the box centers were used for the distance function; however, for boxes that is insufficient to characterize distance correctly, so the current distance function uses the minimum pairwise distance between the corners of the two boxes (8 corners in total).
#include "opencv2/opencv.hpp"
using namespace cv;
#include <map>
#include <sstream>
template <class T>
inline std::string to_string (const T& t)
{
std::stringstream ss;
ss << t;
return ss.str();
}
class DbScan
{
public:
std::map<int, int> labels;
vector<Rect>& data;
int C;
double eps;
int mnpts;
double* dp;
//memoization table in case of complex dist functions
#define DP(i,j) dp[(data.size()*i)+j]
DbScan(vector<Rect>& _data,double _eps,int _mnpts):data(_data)
{
C=-1;
for(int i=0;i<data.size();i++)
{
labels[i]=-99;
}
eps=_eps;
mnpts=_mnpts;
}
void run()
{
dp = new double[data.size()*data.size()];
for(int i=0;i<data.size();i++)
{
for(int j=0;j<data.size();j++)
{
if(i==j)
DP(i,j)=0;
else
DP(i,j)=-1;
}
}
for(int i=0;i<data.size();i++)
{
if(!isVisited(i))
{
vector<int> neighbours = regionQuery(i);
if(neighbours.size()<mnpts)
{
labels[i]=-1;//noise
}else
{
C++;
expandCluster(i,neighbours);
}
}
}
delete [] dp;
}
void expandCluster(int p,vector<int> neighbours)
{
labels[p]=C;
for(int i=0;i<neighbours.size();i++)
{
if(!isVisited(neighbours[i]))
{
labels[neighbours[i]]=C;
vector<int> neighbours_p = regionQuery(neighbours[i]);
if (neighbours_p.size() >= mnpts)
{
expandCluster(neighbours[i],neighbours_p);
}
}
}
}
bool isVisited(int i)
{
return labels[i]!=-99;
}
vector<int> regionQuery(int p)
{
vector<int> res;
for(int i=0;i<data.size();i++)
{
if(distanceFunc(p,i)<=eps)
{
res.push_back(i);
}
}
return res;
}
double dist2d(Point2d a,Point2d b)
{
return sqrt(pow(a.x-b.x,2) + pow(a.y-b.y,2));
}
double distanceFunc(int ai,int bi)
{
if(DP(ai,bi)!=-1)
return DP(ai,bi);
Rect a = data[ai];
Rect b = data[bi];
/*
Point2d cena= Point2d(a.x+a.width/2,
a.y+a.height/2);
Point2d cenb = Point2d(b.x+b.width/2,
b.y+b.height/2);
double dist = sqrt(pow(cena.x-cenb.x,2) + pow(cena.y-cenb.y,2));
DP(ai,bi)=dist;
DP(bi,ai)=dist;*/
Point2d tla =Point2d(a.x,a.y);
Point2d tra =Point2d(a.x+a.width,a.y);
Point2d bla =Point2d(a.x,a.y+a.height);
Point2d bra =Point2d(a.x+a.width,a.y+a.height);
Point2d tlb =Point2d(b.x,b.y);
Point2d trb =Point2d(b.x+b.width,b.y);
Point2d blb =Point2d(b.x,b.y+b.height);
Point2d brb =Point2d(b.x+b.width,b.y+b.height);
double minDist = 9999999;
minDist = min(minDist,dist2d(tla,tlb));
minDist = min(minDist,dist2d(tla,trb));
minDist = min(minDist,dist2d(tla,blb));
minDist = min(minDist,dist2d(tla,brb));
minDist = min(minDist,dist2d(tra,tlb));
minDist = min(minDist,dist2d(tra,trb));
minDist = min(minDist,dist2d(tra,blb));
minDist = min(minDist,dist2d(tra,brb));
minDist = min(minDist,dist2d(bla,tlb));
minDist = min(minDist,dist2d(bla,trb));
minDist = min(minDist,dist2d(bla,blb));
minDist = min(minDist,dist2d(bla,brb));
minDist = min(minDist,dist2d(bra,tlb));
minDist = min(minDist,dist2d(bra,trb));
minDist = min(minDist,dist2d(bra,blb));
minDist = min(minDist,dist2d(bra,brb));
DP(ai,bi)=minDist;
DP(bi,ai)=minDist;
return DP(ai,bi);
}
vector<vector<Rect> > getGroups()
{
vector<vector<Rect> > ret;
for(int i=0;i<=C;i++)
{
ret.push_back(vector<Rect>());
for(int j=0;j<data.size();j++)
{
if(labels[j]==i)
{
ret[ret.size()-1].push_back(data[j]);
}
}
}
return ret;
}
};
cv::Scalar HSVtoRGBcvScalar(int H, int S, int V) {
int bH = H; // H component
int bS = S; // S component
int bV = V; // V component
double fH, fS, fV;
double fR, fG, fB;
const double double_TO_BYTE = 255.0f;
const double BYTE_TO_double = 1.0f / double_TO_BYTE;
// Convert from 8-bit integers to doubles
fH = (double)bH * BYTE_TO_double;
fS = (double)bS * BYTE_TO_double;
fV = (double)bV * BYTE_TO_double;
// Convert from HSV to RGB, using double ranges 0.0 to 1.0
int iI;
double fI, fF, p, q, t;
if( bS == 0 ) {
// achromatic (grey)
fR = fG = fB = fV;
}
else {
// If Hue == 1.0, then wrap it around the circle to 0.0
if (fH>= 1.0f)
fH = 0.0f;
fH *= 6.0; // sector 0 to 5
fI = floor( fH ); // integer part of h (0,1,2,3,4,5 or 6)
iI = (int) fH; // " " " "
fF = fH - fI; // factorial part of h (0 to 1)
p = fV * ( 1.0f - fS );
q = fV * ( 1.0f - fS * fF );
t = fV * ( 1.0f - fS * ( 1.0f - fF ) );
switch( iI ) {
case 0:
fR = fV;
fG = t;
fB = p;
break;
case 1:
fR = q;
fG = fV;
fB = p;
break;
case 2:
fR = p;
fG = fV;
fB = t;
break;
case 3:
fR = p;
fG = q;
fB = fV;
break;
case 4:
fR = t;
fG = p;
fB = fV;
break;
default: // case 5 (or 6):
fR = fV;
fG = p;
fB = q;
break;
}
}
// Convert from doubles to 8-bit integers
int bR = (int)(fR * double_TO_BYTE);
int bG = (int)(fG * double_TO_BYTE);
int bB = (int)(fB * double_TO_BYTE);
// Clip the values to make sure it fits within the 8bits.
if (bR > 255)
bR = 255;
if (bR < 0)
bR = 0;
if (bG >255)
bG = 255;
if (bG < 0)
bG = 0;
if (bB > 255)
bB = 255;
if (bB < 0)
bB = 0;
// Set the RGB cvScalar with G B R, you can use this values as you want too..
return cv::Scalar(bB,bG,bR); // R component
}
int main(int argc,char** argv )
{
Mat im = imread("c:/data/football.png",0);
std::vector<std::vector<cv::Point> > contours;
std::vector<cv::Vec4i> hierarchy;
findContours(im.clone(), contours, hierarchy, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
vector<Rect> boxes;
for(size_t i = 0; i < contours.size(); i++)
{
Rect r = boundingRect(contours[i]);
boxes.push_back(r);
}
DbScan dbscan(boxes,20,2);
dbscan.run();
//done, perform display
Mat grouped = Mat::zeros(im.size(),CV_8UC3);
vector<Scalar> colors;
RNG rng(3);
for(int i=0;i<=dbscan.C;i++)
{
colors.push_back(HSVtoRGBcvScalar(rng(255),255,255));
}
for(int i=0;i<dbscan.data.size();i++)
{
Scalar color;
if(dbscan.labels[i]==-1)
{
color=Scalar(128,128,128);
}else
{
int label=dbscan.labels[i];
color=colors[label];
}
putText(grouped,to_string(dbscan.labels[i]),dbscan.data[i].tl(), FONT_HERSHEY_COMPLEX,.5,color,1);
drawContours(grouped,contours,i,color,-1);
}
imshow("grouped",grouped);
imwrite("c:/data/grouped.jpg",grouped);
waitKey(0);
}
I agree with Sebastian Schmitz: you probably shouldn't be looking for clustering.
Don't expect an uninformed method such as k-means to work magic for you. In particular one that is as crude a heuristic as k-means, and which lives in an idealized mathematical world, not in messy, real data.
You have a good understanding of what you want. Try to put this intuition into code. In your case, you seem to be looking for connected components.
Consider downsampling your image to a lower resolution, then rerunning the same process! Or running it on the lower resolution right away (to reduce compression artifacts, and improve performance). Or adding filters, such as blurring.
I'd expect best and fastest results by looking at connected components in the downsampled/filtered image.
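A sketch of that suggestion (assuming OpenCV 3+ for cv::connectedComponentsWithStats and a binary foreground mask from the thresholding step; the function name is just for illustration):

#include <vector>
#include <opencv2/opencv.hpp>

// Downsample the binary motion mask and take its connected components as the
// moving entities, returning one bounding box per component at full resolution.
std::vector<cv::Rect> ComponentsOnDownsampled(const cv::Mat &mask, double scale = 0.25) {
    cv::Mat small_mask, labels, stats, centroids;
    cv::resize(mask, small_mask, cv::Size(), scale, scale, cv::INTER_NEAREST);
    int n = cv::connectedComponentsWithStats(small_mask, labels, stats, centroids, 8, CV_32S);
    std::vector<cv::Rect> boxes;
    for (int i = 1; i < n; i++) {  // label 0 is the background
        cv::Rect r(stats.at<int>(i, cv::CC_STAT_LEFT), stats.at<int>(i, cv::CC_STAT_TOP),
                   stats.at<int>(i, cv::CC_STAT_WIDTH), stats.at<int>(i, cv::CC_STAT_HEIGHT));
        // scale the box back up to the original resolution
        boxes.emplace_back(cvRound(r.x / scale), cvRound(r.y / scale),
                           cvRound(r.width / scale), cvRound(r.height / scale));
    }
    return boxes;
}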
I am not entirely sure if you are really looking for clustering (in the Data Mining sense).
Clustering is used to group similar objects according to a distance function. In your case the distance function would only use the spatial qualities. Besides, in k-means clustering you have to specify a k, that you probably don't know beforehand.
It seems to me you just want to merge all rectangles whose borders are closer together than some predetermined threshold. So as a first idea try to merge all rectangles that are touching or that are closer together than half a players height.
You probably want to include a size check to minimize the risk of merging two players into one.
Edit: If you really want to use a clustering algorithm use one that estimates the number of clusters for you.
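A sketch of the merge-close-rectangles idea (the gap threshold in pixels is a placeholder; here the gap test simply inflates one box and checks whether it overlaps the other):

#include <vector>
#include <opencv2/opencv.hpp>

// Repeatedly merge any two boxes whose borders are closer than `gap` pixels
// (tested by inflating one box and intersecting), until nothing changes.
std::vector<cv::Rect> MergeCloseRects(std::vector<cv::Rect> boxes, int gap) {
    bool merged = true;
    while (merged) {
        merged = false;
        for (size_t i = 0; i < boxes.size() && !merged; i++) {
            for (size_t j = i + 1; j < boxes.size() && !merged; j++) {
                cv::Rect inflated(boxes[i].x - gap, boxes[i].y - gap,
                                  boxes[i].width + 2 * gap, boxes[i].height + 2 * gap);
                if ((inflated & boxes[j]).area() > 0) {
                    boxes[i] = boxes[i] | boxes[j];  // bounding box of the pair
                    boxes.erase(boxes.begin() + j);
                    merged = true;
                }
            }
        }
    }
    return boxes;
}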
I guess you can improve your original attempt by using morphological transformations. Take a look at http://docs.opencv.org/master/d9/d61/tutorial_py_morphological_ops.html#gsc.tab=0. After that you can probably obtain a single closed region for each entity, especially for the separated players you got in your original image.
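For example (a sketch of the closing operation; the kernel size is a placeholder to tune):

#include <opencv2/opencv.hpp>

// Close the gaps between a player's limbs in the thresholded foreground mask
// so that each player becomes a single blob before contours are extracted.
cv::Mat CloseForegroundMask(const cv::Mat &foreground_mask) {
    cv::Mat closed;
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(15, 15));
    cv::morphologyEx(foreground_mask, closed, cv::MORPH_CLOSE, kernel);
    return closed;
}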

How to unify normal orientation

I've been trying to produce a mesh that has all face normals pointing outward.
In order to do this, I load a mesh from a *.ctm file, then walk over all
triangles to determine each normal using a cross product, and if the normal
points in the negative z direction, I flip v1 and v2 (and thus the normal orientation).
After this is done I save the result to a *.ctm file and view it with Meshlab.
The result in Meshlab still shows normals pointing in both the positive and
negative z direction (as can be seen from the black triangles). Also, when viewing
the normals in Meshlab, they really are pointing backwards.
Can anyone give me some advice on how to solve this?
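As a side note, the per-face rule described above (cross product, then flip the winding when the normal's z component is negative) can be written compactly with Eigen; this is only an illustration of the idea, not the code actually used, which follows below:

#include <utility>
#include <Eigen/Dense>

// Compute the face normal from a cross product and, if it points towards -z,
// swap two vertex indices so the winding (and therefore the normal) flips.
inline Eigen::Vector3f OrientFaceTowardsPositiveZ(const Eigen::Vector3f &v0,
                                                  const Eigen::Vector3f &v1,
                                                  const Eigen::Vector3f &v2,
                                                  int &i1, int &i2) {
    Eigen::Vector3f n = (v1 - v0).cross(v2 - v0);
    if (n.z() < 0.0f) {
        std::swap(i1, i2);  // triangle becomes (v0, v2, v1)
        n = -n;
    }
    return n.normalized();
}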
The source code for the normalization part is:
pcl::PointCloud<pcl::PointXYZRGBA>::Ptr cloud1 (new pcl::PointCloud<pcl::PointXYZRGBA> ());
pcl::fromROSMsg (meshFixed.cloud,*cloud1);
for(std::vector<pcl::Vertices>::iterator it = meshFixed.polygons.begin(); it != meshFixed.polygons.end(); ++it)
{
alglib::real_2d_array v0;
double _v0[] = {cloud1->points[it->vertices[0]].x,cloud1->points[it->vertices[0]].y,cloud1->points[it->vertices[0]].z};
v0.setcontent(3,1,_v0); //3 rows, 1col
alglib::real_2d_array v1;
double _v1[] = {cloud1->points[it->vertices[1]].x,cloud1->points[it->vertices[1]].y,cloud1->points[it->vertices[1]].z};
v1.setcontent(3,1,_v1); //3 rows, 1col
alglib::real_2d_array v2;
double _v2[] = {cloud1->points[it->vertices[2]].x,cloud1->points[it->vertices[2]].y,cloud1->points[it->vertices[2]].z};
v2.setcontent(1,3,_v2); //3 rows, 1col
alglib::real_2d_array normal;
normal = cross(v1-v0,v2-v0);
//if z<0 change indices order v1->v2 and v2->v1
alglib::real_2d_array normalizedNormal;
if(normal[2][0]<0)
{
int index1,index2;
index1 = it->vertices[1];
index2 = it->vertices[2];
it->vertices[1] = index2;
it->vertices[2] = index1;
//make normal of length 1
double normalScaling = 1.0/sqrt(dot(normal,normal));
normal[0][0] = -1*normal[0][0];
normal[1][0] = -1*normal[1][0];
normal[2][0] = -1*normal[2][0];
normalizedNormal = normalScaling * normal;
}
else
{
//make normal of length 1
double normalScaling = 1.0/sqrt(dot(normal,normal));
normalizedNormal = normalScaling * normal;
}
//add to normal cloud
pcl::Normal pclNormalizedNormal;
pclNormalizedNormal.normal_x = normalizedNormal[0][0];
pclNormalizedNormal.normal_y = normalizedNormal[1][0];
pclNormalizedNormal.normal_z = normalizedNormal[2][0];
normalsFixed.push_back(pclNormalizedNormal);
}
The result from this code is:
I've found some code in the VCG library to orient the face and vertex normals.
After using this a large part of the mesh has correct face normals, but not all.
The new code:
// VCG library implementation
MyMesh m;
// Convert pcl::PolygonMesh to VCG MyMesh
m.Clear();
// Create temporary cloud in to have handy struct object
pcl::PointCloud<pcl::PointXYZRGBA>::Ptr cloud1 (new pcl::PointCloud<pcl::PointXYZRGBA> ());
pcl::fromROSMsg (meshFixed.cloud,*cloud1);
// Now convert the vertices to VCG MyMesh
int vertCount = cloud1->width*cloud1->height;
vcg::tri::Allocator<MyMesh>::AddVertices(m, vertCount);
for(unsigned int i=0;i<vertCount;++i)
m.vert[i].P()=vcg::Point3f(cloud1->points[i].x,cloud1->points[i].y,cloud1->points[i].z);
// Now convert the polygon indices to VCG MyMesh => make VCG faces..
int triCount = meshFixed.polygons.size();
if(triCount==1)
{
if(meshFixed.polygons[0].vertices[0]==0 && meshFixed.polygons[0].vertices[1]==0 && meshFixed.polygons[0].vertices[2]==0)
triCount=0;
}
Allocator<MyMesh>::AddFaces(m, triCount);
for(unsigned int i=0;i<triCount;++i)
{
m.face[i].V(0)=&m.vert[meshFixed.polygons[i].vertices[0]];
m.face[i].V(1)=&m.vert[meshFixed.polygons[i].vertices[1]];
m.face[i].V(2)=&m.vert[meshFixed.polygons[i].vertices[2]];
}
vcg::tri::UpdateBounding<MyMesh>::Box(m);
vcg::tri::UpdateNormal<MyMesh>::PerFace(m);
vcg::tri::UpdateNormal<MyMesh>::PerVertexNormalizedPerFace(m);
printf("Input mesh vn:%i fn:%i\n",m.VN(),m.FN());
// Start to flip all normals to outside
vcg::face::FFAdj<MyMesh>::FFAdj();
vcg::tri::UpdateTopology<MyMesh>::FaceFace(m);
bool oriented, orientable;
if ( vcg::tri::Clean<MyMesh>::CountNonManifoldEdgeFF(m)>0 ) {
std::cout << "Mesh has some not 2-manifold faces, Orientability requires manifoldness" << std::endl; // text
return; // can't continue, mesh can't be processed
}
vcg::tri::Clean<MyMesh>::OrientCoherentlyMesh(m, oriented,orientable);
vcg::tri::Clean<MyMesh>::FlipNormalOutside(m);
vcg::tri::Clean<MyMesh>::FlipMesh(m);
//vcg::tri::UpdateTopology<MyMesh>::FaceFace(m);
//vcg::tri::UpdateTopology<MyMesh>::TestFaceFace(m);
vcg::tri::UpdateNormal<MyMesh>::PerVertexNormalizedPerFace(m);
vcg::tri::UpdateNormal<MyMesh>::PerVertexFromCurrentFaceNormal(m);
// now convert VCG back to pcl::PolygonMesh
pcl::PointCloud<pcl::PointXYZRGBA>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZRGBA>);
cloud->is_dense = false;
cloud->width = vertCount;
cloud->height = 1;
cloud->points.resize (vertCount);
// Now fill the pointcloud of the mesh
for(int i=0; i<vertCount; i++)
{
cloud->points[i].x = m.vert[i].P()[0];
cloud->points[i].y = m.vert[i].P()[1];
cloud->points[i].z = m.vert[i].P()[2];
}
pcl::toROSMsg(*cloud,meshFixed.cloud);
std::vector<pcl::Vertices> polygons;
// Now fill the indices of the triangles/faces of the mesh
for(int i=0; i<triCount; i++)
{
pcl::Vertices vertices;
vertices.vertices.push_back(m.face[i].V(0)-&*m.vert.begin());
vertices.vertices.push_back(m.face[i].V(1)-&*m.vert.begin());
vertices.vertices.push_back(m.face[i].V(2)-&*m.vert.begin());
polygons.push_back(vertices);
}
meshFixed.polygons = polygons;
Which results in: (Meshlab still shows normals are facing both sides)
I finally solved the problem, still using the VCG library. From the new code above I slightly updated the following section:
vcg::tri::Clean<MyMesh>::OrientCoherentlyMesh(m, oriented,orientable);
//vcg::tri::Clean<MyMesh>::FlipNormalOutside(m);
//vcg::tri::Clean<MyMesh>::FlipMesh(m);
//vcg::tri::UpdateTopology<MyMesh>::FaceFace(m);
//vcg::tri::UpdateTopology<MyMesh>::TestFaceFace(m);
vcg::tri::UpdateNormal<MyMesh>::PerVertexNormalizedPerFace(m);
vcg::tri::UpdateNormal<MyMesh>::PerVertexFromCurrentFaceNormal(m);
Now I've updated the vcg::tri::Clean<MyMesh>::OrientCoherentlyMesh() function in clean.h. The update orients the first polygon of each group correctly; also, after swapping an edge, the face normal is recalculated and updated.
static void OrientCoherentlyMesh(MeshType &m, bool &Oriented, bool &Orientable)
{
RequireFFAdjacency(m);
assert(&Oriented != &Orientable);
assert(m.face.back().FFp(0)); // This algorithms require FF topology initialized
Orientable = true;
Oriented = true;
tri::UpdateSelection<MeshType>::FaceClear(m);
std::stack<FacePointer> faces;
for (FaceIterator fi = m.face.begin(); fi != m.face.end(); ++fi)
{
if (!fi->IsD() && !fi->IsS())
{
// each face put in the stack is selected (and oriented)
fi->SetS();
// New section of code to orient the initial face correctly
if(fi->N()[2]>0.0)
{
face::SwapEdge<FaceType,true>(*fi, 0);
face::ComputeNormal(*fi);
}
// End of new code section.
faces.push(&(*fi));
// empty the stack
while (!faces.empty())
{
FacePointer fp = faces.top();
faces.pop();
// make consistently oriented the adjacent faces
for (int j = 0; j < 3; j++)
{
//get one of the adjacent face
FacePointer fpaux = fp->FFp(j);
int iaux = fp->FFi(j);
if (!fpaux->IsD() && fpaux != fp && face::IsManifold<FaceType>(*fp, j))
{
if (!CheckOrientation(*fpaux, iaux))
{
Oriented = false;
if (!fpaux->IsS())
{
face::SwapEdge<FaceType,true>(*fpaux, iaux);
// New line to update face normal
face::ComputeNormal(*fpaux);
// end of new section.
assert(CheckOrientation(*fpaux, iaux));
}
else
{
Orientable = false;
break;
}
}
// put the oriented face into the stack
if (!fpaux->IsS())
{
fpaux->SetS();
faces.push(fpaux);
}
}
}
}
}
if (!Orientable) break;
}
}
Besides that, I also updated the function bool CheckOrientation(FaceType &f, int z) to perform the check based on the normals' z-directions.
template <class FaceType>
bool CheckOrientation(FaceType &f, int z)
{
// Added next section to calculate the difference between normal z-directions
FaceType *original = f.FFp(z);
double nf2,ng2;
nf2=f.N()[2];
ng2=original->N()[2];
// End of additional section
if (IsBorder(f, z))
return true;
else
{
FaceType *g = f.FFp(z);
int gi = f.FFi(z);
// changed if statement from: if (f.V0(z) == g->V1(gi))
if (nf2/abs(nf2)==ng2/abs(ng2))
return true;
else
return false;
}
}
The result is as I expect and desire from the algorithm: