I am trying to convert point cloud (x, y, z) data acquired from a Kinect V2 using libfreenect2 into a virtual 2D laser scan (e.g., a horizontal angle/distance vector).
I am currently assigning, per pixel column, the point cloud distance value, as shown below:
std::vector<float> scan(512, 0);
for (unsigned int row = 0; row < 424; ++row) {
    for (unsigned int col = 0; col < 512; ++col) {
        float x, y, z;
        registration->getPointXYZ(depth, row, col, x, y, z);
        if (std::isnan(x) || std::isnan(y) || std::isnan(z)) {
            continue;
        }
        Eigen::Vector3f values = rotate_translate((-1 * x), y - 1.186, z);
        if (scan[col] == 0) {
            scan[col] = values[1];
        }
        if (values[1] < scan[col]) {
            scan[col] = values[1];
        }
    }
}
You can ignore the rotate_translate method; it simply transforms local coordinates to global coordinates using the sensor pose.
The problem is best shown using the pictures below:
Whereas the LIDAR range sensor produces the following points map:
the Kinect 2D range scan is curved, and of course narrower, since its horizontal FOV is 70.6 degrees compared to the 270-degree range of the LIDAR.
It is this curvature that I am trying to fix. The SLAM/ICP library I'm using is MRPT, and the actual scan data is inserted into an mrpt::obs::CObservation2DRangeScan observation:
auto obs = mrpt::obs::CObservation2DRangeScan();
obs.loadFromVectors(scan.size(), scan.data(), (char*)scan.data());
obs.aperture = mrpt::utils::DEG2RAD(70.6f);
obs.maxRange = 6.0;
obs.rightToLeft = true;
obs.timestamp = mrpt::system::now();
obs.setSensorPose(sensor);
I've searched around Google and SO, and the only answers that seem to address this question are this one and that one. So while I understand that the curvature is the result of assigning each pixel column the point cloud value, I am uncertain how to use that insight to remove the curvature.
Each reply seems to take a different approach, and from what I understand the task is a linear interpolation based on the angle-per-pixel ratio and the current pixel coordinates?
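For reference, here is a rough, untested sketch of how I currently picture that angle-based binning (it reuses the loop above; the range/angle math, bin layout, and literal pi value are my own assumptions, not a verified fix):
// Bin each point by its horizontal angle instead of its pixel column,
// keeping the closest planar range per bin. 512 bins and the 70.6-degree
// FOV are taken from the code above; needs <cmath> and <vector>.
const float aperture = 70.6f * 3.14159265f / 180.0f;
std::vector<float> scan(512, 0.0f);
for (unsigned int row = 0; row < 424; ++row) {
    for (unsigned int col = 0; col < 512; ++col) {
        float x, y, z;
        registration->getPointXYZ(depth, row, col, x, y, z);
        if (std::isnan(x) || std::isnan(y) || std::isnan(z)) {
            continue;
        }
        float range = std::sqrt(x * x + z * z);   // planar (horizontal) distance
        float angle = std::atan2(x, z);           // horizontal angle, 0 = straight ahead
        int bin = static_cast<int>((angle + aperture / 2.0f) / aperture * scan.size());
        if (bin >= 0 && bin < static_cast<int>(scan.size()) &&
            (scan[bin] == 0.0f || range < scan[bin])) {
            scan[bin] = range;                    // keep the nearest return per bin
        }
    }
}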
Related
I'm writing code which calculates the optical flow with the iterative Lucas-Kanade method: calcOpticalFlowPyrLK().
I have an array of two vectors of points; see the example below:
vector <Point2f> points[2];
The x and y coordinates are stored in the points, and the points are stored in the vectors. When outputting them, for instance with cout << points[0], the coordinates are currently displayed on the screen as follows:
Output example: [261.837, 65.093]
Now I want to extract the x and y coordinates, separate them, and store them in different variables. I have already tried several approaches with an iterator, with no result. I would appreciate it if someone could help me with this, thanks.
The following example applies the pyramidal Lucas-Kanade (PLK) method to a regular grid and shows how to read the x and y coordinates. The points are stored as cv::Point2f objects in a std::vector. The class has public x and y members you can use directly. This example uses no iterator.
std::vector<cv::Point2f> prevPoints, currPoints;
std::vector<float> error;  // stores the SSD error
std::vector<uchar> status; // stores a flag for successful tracking; I recommend ignoring it
cv::Mat prevGrayImg, currGrayImg;
// <- insert code for reading the images
// initialize the grid or the features you want to track
for (int r = 0; r < prevGrayImg.rows; r += 5) {
    for (int c = 0; c < prevGrayImg.cols; c += 5) {
        prevPoints.push_back(cv::Point2f(c, r));
    }
}
// apply pyramidal Lucas-Kanade
cv::calcOpticalFlowPyrLK(prevGrayImg, currGrayImg, prevPoints, currPoints, status, error);
for (unsigned int i = 0; i < prevPoints.size(); i++) {
    float x0 = prevPoints[i].x;
    float y0 = prevPoints[i].y;
    float x1 = currPoints[i].x;
    float y1 = currPoints[i].y;
}
With an iterator it would be:
for (auto i = prevPoints.begin(); i != prevPoints.end(); ++i) {
    float x0 = i->x; // and so on
}
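Applied directly to the vector <Point2f> points[2] from the question, a short usage example could look like this (assuming <iostream> is included; the loop and output format are just an illustration):
// x and y are public floats, so they can be copied into separate variables directly
for (std::size_t i = 0; i < points[0].size(); ++i) {
    float x = points[0][i].x;   // x coordinate of the i-th point
    float y = points[0][i].y;   // y coordinate of the i-th point
    std::cout << "x: " << x << "  y: " << y << '\n';
}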
I have a 2D matrix that represents an image. First, I extract a line from this image (of arbitrary orientation) and project the pixel values of this line into a 1D vertical array (with the size of the image's height).
This works well, and I can perform many operations on this array.
After that, I need to re-insert this vertical array at the same place and orientation of the line in the 2D matrix.
The problem comes from this inverse projection: I end up with many holes in the re-inserted line.
Mat DataRaw::InsertLine(Mat image_full, Mat image, Point pointH, Point pointL)
{
    float offset = 0;
    float coef_dir = 0;
    // Equation of the line
    coef_dir = (float)(pointH.y - pointL.y) / (pointH.x - pointL.x);
    offset = pointH.y - (coef_dir * pointH.x);
    float x_cur = 0;
    int x = 0;
    float x_prev = 0;
    for (int y = 0; y < image.rows; y++)
    {
        x_cur = (float)(y - offset) / coef_dir;              // current x
        if (y > 0)
            x_prev = (float)((y - 1) - offset) / coef_dir;   // x at y-1
        x = (int)x_cur;
        if (x_cur - x_prev > 1)
        {
            if (y >= 1)
                image_full.at<uchar>(y - 1, x) = image.at<uchar>(y, 0);
        }
        image_full.at<uchar>(y, x) = image.at<uchar>(y, 0);
    }
    return image_full;
}
PointL and PointH are two points that the line passes through.
I calculate the line equation using these two points.
The function above re-inserts my line into the 2D matrix; I try to check the x difference at each y step, but...
Thanks for your help!
/***** EDIT ******/
My problem on the left, what I want on the right:
http://i.stack.imgur.com/bTB0s.png
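For reference, a rough, untested sketch of one alternative: let OpenCV's cv::LineIterator enumerate every pixel on the segment (so no gaps can appear) and sample the 1D column by the current pixel's row. The helper name below is hypothetical, and it assumes single-channel 8-bit images and 8-connectivity.
#include <opencv2/imgproc.hpp>

cv::Mat InsertLineNoHoles(cv::Mat image_full, const cv::Mat& image,
                          cv::Point pointH, cv::Point pointL)
{
    cv::LineIterator it(image_full, pointH, pointL, 8);   // 8-connected line
    for (int i = 0; i < it.count; ++i, ++it) {
        const int y = it.pos().y;                         // row of the current pixel
        image_full.at<uchar>(it.pos()) = image.at<uchar>(y, 0);
    }
    return image_full;
}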
I'm currently working on optical flow with OpenCV in C++. I'm using calcOpticalFlowPyrLK with a grid of points (one interest point for each 5x5-pixel square).
What is the best way to:
1) Compute the histogram of the computed values (orientation and distance) for each frame
2) Compute a histogram of the values (orientation and distance) that a given pixel took over several frames (for instance 100)
Are OpenCV's functions suited to this task? How can I use them in a simple way in combination with calcOpticalFlowPyrLK?
I was searching for the same OpenCV tools a couple of months ago. Unfortunately, OpenCV does not include any motion-histogram implementation. Instead, what you have to do is run calcOpticalFlowPyrLK for each frame and calculate the orientation/length of each displacement. Then you have to create and fill the histograms yourself. Not as hard as it sounds, believe me :)
The OpenCV implementation for the first part of HOOF can look like below:
const int rows = flow1.rows;
const int cols = flow1.cols;
for (int y = 0; y < rows; ++y)
    for (int x = 0; x < cols; ++x)
    {
        Vec2f flow1_at_point = flow1.at<Vec2f>(y, x);
        float u1 = flow1_at_point[0];
        float v1 = flow1_at_point[1];
        magnitudeImage += sqrt((u1 * u1) + (v1 * v1));
        orientationImage += atan2(u1, v1);
    }
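To make the "create/fill the histograms yourself" step concrete, here is a minimal sketch (my own bin layout, not an OpenCV function) that reuses the flow1 field above: 8 orientation bins, each accumulating flow magnitude. It assumes <cmath> and <vector> are included.
const int nBins = 8;
std::vector<float> hist(nBins, 0.0f);
for (int y = 0; y < flow1.rows; ++y)
    for (int x = 0; x < flow1.cols; ++x)
    {
        cv::Vec2f f = flow1.at<cv::Vec2f>(y, x);
        float mag = std::sqrt(f[0] * f[0] + f[1] * f[1]);
        float ang = std::atan2(f[1], f[0]);                       // in [-pi, pi]
        int bin = static_cast<int>((ang + CV_PI) / (2.0 * CV_PI) * nBins);
        if (bin >= nBins) bin = nBins - 1;                        // guard the upper edge
        hist[bin] += mag;                                         // magnitude-weighted vote
    }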
Preface
Yes, there is plenty to cover here... but I'll do my best to keep this as well-organized, informative and straight-to-the-point as I possibly can!
Using the HGE library in C++, I have created a simple tile engine.
And thus far, I have implemented the following designs:
A CTile class, representing a single tile within a CTileLayer, containing row/column information as well as an HGE::hgeQuad (which stores vertex, color and texture information, see here for details).
A CTileLayer class, representing a two-dimensional 'plane' of tiles (which are stored as a one-dimensional array of CTile objects), containing the # of rows/columns, X/Y world-coordinate information, tile pixel width/height information, and the layer's overall width/height in pixels.
A CTileLayer is responsible for rendering any tiles which are either fully or partially visible within the boundaries of a virtual camera 'viewport', and to avoid doing so for any tiles which are outside of this visible range. Upon creation, it pre-calculates all information to be stored within each CTile object, so the core of the engine has more room to breathe and can focus strictly on the render loop. Of course, it also handles proper deallocation of each contained tile.
Issues
The problem I am now facing essentially boils down to the following architectural/optimization issues:
In my render loop, even though I am not rendering any tiles which are outside of visible range, I am still looping through all of the tiles, which seems to have a major performance impact for larger tilemaps (i.e., anything above 100x100 rows/columns at 64x64 tile dimensions still drops the framerate by 50% or more).
Eventually, I intend to create a fancy tilemap editor to coincide with this engine.
However, since I am storing all two-dimensional information inside one or more 1D arrays, I don't know how feasible it would be to implement some sort of rectangular-select & copy/paste feature without a MAJOR performance hit -- involving looping through every tile twice per frame. And yet if I used 2D arrays, there would be a slightly smaller but more universal FPS drop!
Bug
As stated before... In my render code for a CTileLayer object, I have optimized which tiles are to be drawn based upon whether or not they are within viewing range. This works great, and for larger maps I noticed only a 3-8 FPS drop (compared to a 100+ FPS drop without this optimization).
But I think I'm calculating this range incorrectly, because after scrolling halfway through the map you can start to see a gap (on the topmost & leftmost sides) where tiles aren't being rendered, as if the clipping range is increasing faster than the camera can move (even though they both move at the same speed).
This gap gradually increases in size the further along into the X & Y axis you go, eventually eating up nearly half of the top & left sides of the screen on a large map.
My render code for this is shown below...
Code
//
// [Allocate]
// For pre-calculating tile information
// - Rows/Columns = Map Dimensions (in tiles)
// - Width/Height = Tile Dimensions (in pixels)
//
void CTileLayer::Allocate(UINT numColumns, UINT numRows, float tileWidth, float tileHeight)
{
    m_nColumns = numColumns;
    m_nRows = numRows;

    float x, y;
    UINT column = 0, row = 0;
    const ULONG nTiles = m_nColumns * m_nRows;
    hgeQuad quad;

    m_tileWidth = tileWidth;
    m_tileHeight = tileHeight;
    m_layerWidth = m_tileWidth * m_nColumns;
    m_layerHeight = m_tileHeight * m_nRows;

    if(m_tiles != NULL) Free();
    m_tiles = new CTile[nTiles];

    for(ULONG l = 0; l < nTiles; l++)
    {
        m_tiles[l] = CTile();
        m_tiles[l].column = column;
        m_tiles[l].row = row;

        x = (float(column) * m_tileWidth) + m_offsetX;
        y = (float(row) * m_tileHeight) + m_offsetY;

        quad.blend = BLEND_ALPHAADD | BLEND_COLORMUL | BLEND_ZWRITE;
        quad.tex = HTEXTURE(nullptr); // Replaced for the sake of brevity (in the engine's code, I used a globally allocated texture array and did some random tile generation here)

        for(UINT i = 0; i < 4; i++)
        {
            quad.v[i].z = 0.5f;
            quad.v[i].col = 0xFF7F7F7F;
        }

        quad.v[0].x = x;
        quad.v[0].y = y;
        quad.v[0].tx = 0;
        quad.v[0].ty = 0;

        quad.v[1].x = x + m_tileWidth;
        quad.v[1].y = y;
        quad.v[1].tx = 1.0;
        quad.v[1].ty = 0;

        quad.v[2].x = x + m_tileWidth;
        quad.v[2].y = y + m_tileHeight;
        quad.v[2].tx = 1.0;
        quad.v[2].ty = 1.0;

        quad.v[3].x = x;
        quad.v[3].y = y + m_tileHeight;
        quad.v[3].tx = 0;
        quad.v[3].ty = 1.0;

        memcpy(&m_tiles[l].quad, &quad, sizeof(hgeQuad));

        if(++column > m_nColumns - 1) {
            column = 0;
            row++;
        }
    }
}
//
// [Render]
// For drawing the entire tile layer
// - X/Y = world position
// - Top/Left = screen 'clipping' position
// - Width/Height = screen 'clipping' dimensions
//
bool CTileLayer::Render(HGE* hge, float cameraX, float cameraY, float cameraTop, float cameraLeft, float cameraWidth, float cameraHeight)
{
    // Calculate the current number of tiles
    const ULONG nTiles = m_nColumns * m_nRows;

    // Calculate min & max X/Y world pixel coordinates
    const float scalarX = cameraX / m_layerWidth;  // How far (from 0 to 1, in world coordinates) along the X-axis we are within the layer
    const float scalarY = cameraY / m_layerHeight; // How far (from 0 to 1, in world coordinates) along the Y-axis we are within the layer
    const float minX = cameraTop + (scalarX * float(m_nColumns) - m_tileWidth);  // Leftmost pixel coordinate within the world
    const float minY = cameraLeft + (scalarY * float(m_nRows) - m_tileHeight);   // Topmost pixel coordinate within the world
    const float maxX = minX + cameraWidth + m_tileWidth;   // Rightmost pixel coordinate within the world
    const float maxY = minY + cameraHeight + m_tileHeight; // Bottommost pixel coordinate within the world

    // Loop through all tiles in the map
    for(ULONG l = 0; l < nTiles; l++)
    {
        CTile tile = m_tiles[l];

        // Calculate this tile's X/Y world pixel coordinates
        float tileX = (float(tile.column) * m_tileWidth) - cameraX;
        float tileY = (float(tile.row) * m_tileHeight) - cameraY;

        // Check if this tile is within the boundaries of the current camera view
        if(tileX > minX && tileY > minY && tileX < maxX && tileY < maxY) {
            // It is, so draw it!
            hge->Gfx_RenderQuad(&tile.quad, -cameraX, -cameraY);
        }
    }
    return false;
}
//
// [Free]
// Gee, I wonder what this does? lol...
//
void CTileLayer::Free()
{
    delete [] m_tiles;
    m_tiles = NULL;
}
Questions
What can be done to fix those architectural/optimization issues, without greatly impacting any other rendering optimizations?
Why is that bug occurring? How can it be fixed?
Thank you for your time!
Optimising the iteration of the map is fairly straightforward.
Given a visible rect in world coordinates (left, top, right, bottom), it's fairly trivial to work out the tile positions, simply by dividing by the tile size.
Once you have those tile coordinates (tl, tt, tr, tb) you can very easily calculate the first visible tile in your 1D array. (The way you calculate any tile index from a 2D coordinate is (y*width)+x - remember to make sure the input coordinate is valid first, though.) You then just have a double for loop to iterate the visible tiles:
int visiblewidth = tr - tl + 1;
int visibleheight = tb - tt + 1;

for( int rowidx = ( tt * layerwidth ) + tl; visibleheight--; rowidx += layerwidth )
{
    for( int tileidx = rowidx, cx = visiblewidth; cx--; tileidx++ )
    {
        // render m_Tiles[ tileidx ]...
    }
}
You can use a similar system for selecting a block of tiles. Just store the selection coordinates and calculate the actual tiles in exactly the same way.
As for your bug: why do you have x, y, left, right, width, and height for the camera? Just store the camera position (x, y) and calculate the visible rect from the dimensions of your screen/viewport, along with any zoom factor you have defined.
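For example, a sketch of deriving (tl, tt, tr, tb) from just the camera position and the viewport size, clamped to the map bounds (member names reused from your code; viewportWidth/viewportHeight are assumed parameters, and std::max/std::min need <algorithm>):
int tl = std::max(0, static_cast<int>(cameraX / m_tileWidth));
int tt = std::max(0, static_cast<int>(cameraY / m_tileHeight));
int tr = std::min(static_cast<int>(m_nColumns) - 1,
                  static_cast<int>((cameraX + viewportWidth) / m_tileWidth));
int tb = std::min(static_cast<int>(m_nRows) - 1,
                  static_cast<int>((cameraY + viewportHeight) / m_tileHeight));
// tl/tt/tr/tb then feed straight into the double loop shown above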
This is a pseudo-code-ish example; geometry variables are 2D vectors. Both the camera object and the tilemap have a center position and an extent (half size). The math is the same even if you decide to stick with plain numbers, and even if you don't use center coordinates and extents, perhaps you'll get an idea of the math. All of this code lives in the render function and is rather simplified. Also, this example assumes you already have a 2D-array-like object that holds the tiles.
So, first a full example, and I'll explain each part further down.
// x and y are counters; sx is a placeholder for x's start value, since x will
// be in the inner loop and needs to be reset each iteration.
// mx and my are the values x and y will count towards.
x = 0,
y = 0,
sx = 0,
mx = total_number_of_tiles_on_x_axis,
my = total_number_of_tiles_on_y_axis

// calculate the lowest and highest worldspace values of the cam
min = cam.center - cam.extent
max = cam.center + cam.extent

// subtract the tilemap corners and divide by the tile size to get
// the number of tiles that are outside of the camera's view
floor = Math.floor( (min - (tilemap.center - tilemap.extent)) / tilesize )
ceil  = Math.ceil(  (max - (tilemap.center + tilemap.extent)) / tilesize )

if (floor.x > 0)
    sx += floor.x
if (floor.y > 0)
    y += floor.y
if (ceil.x < 0)
    mx += ceil.x
if (ceil.y < 0)
    my += ceil.y

for (; y < my; y++)
    // x needs to be reset each y iteration; the start value is stored in sx
    for (x = sx; x < mx; x++)
        // render tile x in row y
Explained bit by bit. First, in the render function, we set up a few variables.
// x and y are counters; sx is a placeholder for x's start value, since x will
// be in the inner loop and needs to be reset each iteration.
// mx and my are the values x and y will count towards.
x = 0,
y = 0,
sx = 0,
mx = total_number_of_tiles_on_x_axis,
my = total_number_of_tiles_on_y_axis
To prevent rendering all tiles, you need to provide either a camera-like object or information about where the visible area starts and stops (in worldspace if the scene is movable).
In this example I'm passing a camera object to the render function, which has a center and an extent stored as 2D vectors.
// calculate the lowest and highest worldspace values of the cam
min = cam.center - cam.extent
max = cam.center + cam.extent

// subtract the tilemap corners and divide by the tile size to get
// the number of tiles that are outside of the camera's view
floor = Math.floor( (min - (tilemap.center - tilemap.extent)) / tilesize )
ceil  = Math.ceil(  (max - (tilemap.center + tilemap.extent)) / tilesize )
// floor & ceil are 2D vectors
Now, if floor is greater than 0 or ceil is less than 0 on any axis, it means that there are that many tiles outside of the camera's view on that side.
// check if there are any tiles outside to the left of or above the camera
if (floor.x > 0)
    sx += floor.x  // set the start value of sx to the number of tiles outside the camera
if (floor.y > 0)
    y += floor.y   // set the start value of y to the number of tiles outside the camera
// check if there are any tiles outside to the right of or below the camera
if (ceil.x < 0)
    mx += ceil.x   // add the negative value to mx (max x)
if (ceil.y < 0)
    my += ceil.y   // add the negative value to my (max y)
A normal render of the tilemap would go from 0 to the number of tiles on each axis, using a loop within a loop to cover both axes. But thanks to the above code, x and y will always stay within the borders of the camera.
// will loop through only the visible tiles
for (; y < my; y++)
    // x needs to be reset each y iteration; the start value is stored in sx
    for (x = sx; x < mx; x++)
        // render tile x in row y
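For completeness, a rough C++ rendering of the pseudo-code above; the Vec2 struct, the cam/tilemap objects, the tile counts, and the renderTile call are all assumed placeholders, and std::floor/std::ceil need <cmath>.
struct Vec2 { float x, y; };

int sx = 0, y = 0;
int mx = numTilesX, my = numTilesY;   // assumed total tile counts per axis

Vec2 camMin { cam.center.x - cam.extent.x,         cam.center.y - cam.extent.y };
Vec2 camMax { cam.center.x + cam.extent.x,         cam.center.y + cam.extent.y };
Vec2 mapMin { tilemap.center.x - tilemap.extent.x, tilemap.center.y - tilemap.extent.y };
Vec2 mapMax { tilemap.center.x + tilemap.extent.x, tilemap.center.y + tilemap.extent.y };

// number of whole tiles outside the camera on each side
int floorX = static_cast<int>(std::floor((camMin.x - mapMin.x) / tileSize));
int floorY = static_cast<int>(std::floor((camMin.y - mapMin.y) / tileSize));
int ceilX  = static_cast<int>(std::ceil((camMax.x - mapMax.x) / tileSize));
int ceilY  = static_cast<int>(std::ceil((camMax.y - mapMax.y) / tileSize));

if (floorX > 0) sx += floorX;
if (floorY > 0) y  += floorY;
if (ceilX  < 0) mx += ceilX;
if (ceilY  < 0) my += ceilY;

for (; y < my; ++y)
    for (int x = sx; x < mx; ++x)
        renderTile(x, y);   // hypothetical per-tile render call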
Hope this helps!
I've been playing with the optical flow functions in OpenCV and am stuck. I've successfully generated X and Y optical flow fields/maps using the Farneback method, but I don't know how to apply this to the input image coordinates to warp the images. The resulting X and Y fields are of 32-bit float type (0-1.0), but how does this translate to the coordinates of the input and output images? For example, 1.0 of what? The width of the image? The difference between the two?
Plus, I'm not sure what my loop would look like to apply the transform/warp. I've done plenty of loops to change color, but the pixels always remain in the same location. Moving pixels around is new territory for me!
Update: I got this to work, but the resulting image is messy:
//make a float copy of 8 bit grayscale source image
IplImage *src_img = cvCreateImage(img_sz, IPL_DEPTH_32F, 1);
cvConvertScale(input_img, src_img, 1/255.0); //convert 8 bit to float

//create destination image
IplImage *dst_img = cvCreateImage(img_sz, IPL_DEPTH_32F, 1);

for(y = 0; y < flow->height; y++){
    //grab flow maps for X and Y
    float* vx = (float*)(velx->imageData + velx->widthStep*y);
    float* vy = (float*)(vely->imageData + vely->widthStep*y);
    //coords for source and dest image
    const float *srcpx = (const float*)(src_img->imageData + (src_img->widthStep*y));
    float *dstpx = (float*)(dst_img->imageData + (dst_img->widthStep*y));
    for(x = 0; x < flow->width; x++)
    {
        int newx = x + (vx[x]);
        int newy = (int)(vy[x]) * flow->width;
        dstpx[newx + newy] = srcpx[x];
    }
}
I could not get this to work. The output was just garbled noise:
cvRemap(src_img,dst_img,velx,vely,CV_INTER_CUBIC,cvScalarAll(0));
The flow vectors are velocity values. If the pixel in image 1 at position (x, y) has the flow vector (vx, vy), it is estimated to be at position (x+vx, y+vy) (so the values aren't really in the [0, 1] range - they can be bigger, and negative too). The easiest way to do the warping is to create floating-point map images with those values (x+vx for the x direction, and similarly for y) and then use cv::remap.
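A rough sketch of that approach with the C++ API (variable names are mine; flow is assumed to be a CV_32FC2 dense field such as calcOpticalFlowFarneback produces, and src is the image to warp; needs <opencv2/imgproc.hpp>):
cv::Mat mapX(flow.size(), CV_32FC1);
cv::Mat mapY(flow.size(), CV_32FC1);
for (int y = 0; y < flow.rows; ++y) {
    for (int x = 0; x < flow.cols; ++x) {
        const cv::Vec2f f = flow.at<cv::Vec2f>(y, x);
        mapX.at<float>(y, x) = x + f[0];   // remap will sample src at this x
        mapY.at<float>(y, x) = y + f[1];   // and this y for output pixel (x, y)
    }
}
cv::Mat warped;
cv::remap(src, warped, mapX, mapY, cv::INTER_LINEAR);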
Using OpenCV
https://github.com/opencv/opencv/blob/master/samples/python/opt_flow.py
def warp_flow(img, flow):
    h, w = flow.shape[:2]
    flow = -flow
    flow[:,:,0] += np.arange(w)
    flow[:,:,1] += np.arange(h)[:,np.newaxis]
    res = cv2.remap(img, flow, None, cv2.INTER_LINEAR)
    return res