I am writing this question fishing for any state-of-the-art software or methods that can quickly compute the intersection of N 2D polygons (the convex hulls of projected convex polyhedra) and M 2D polygons, where typically N >> M. N may be on the order of at least 1M polygons and M on the order of 50k. I've searched for some time now, but I keep coming up with the same answer, shown below.
Use boost and a loop to
compute the projection of the polyhedron (not the bottleneck)
compute the convex hull of said polyhedron (bottleneck)
compute the intersection of the projected polyhedron and existing 2D polygon (major bottleneck).
This loop is repeated NK times where typically K << M, and K is the average number of 2D polygons intersecting a single projected polyhedron. This is done to reduce the number of computations.
The problem with this is that if I have N=262144 and M=19456 it takes about 129 seconds (when multithreaded by polyhedron), and this must be done about 300 times. Ideally, I would like to reduce the computation time to about 1 second for the above sizes, so I was wondering if someone could help point to some software or literature that could improve efficiency.
[EDIT]
At @sehe's request, I'm posting the most relevant parts of the code. I haven't compiled it, so this is just to give the gist. The code assumes there are voxels and pixels, but the shapes can be anything. The points in the grid can be stored in any order, but the indices of where the points reside in the grid are the same.
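#include <algorithm>
#include <cmath>
#include <exception>
#include <vector>
#include <boost/geometry/geometry.hpp>
#include <boost/geometry/geometries/point.hpp>
#include <boost/geometry/geometries/ring.hpp>
#include <boost/geometry/geometries/box.hpp>
#include <boost/geometry/geometries/polygon.hpp>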
const std::size_t Dimension = 2;
typedef boost::geometry::model::point<float, Dimension, boost::geometry::cs::cartesian> point_2d;
typedef boost::geometry::model::polygon<point_2d, false /* is cw */, true /* closed */> polygon_2d;
typedef boost::geometry::model::box<point_2d> box_2d;
std::vector<float> getOverlaps(std::vector<float> & projected_grid_vx, // projected voxels
std::vector<float> & pixel_grid_vx, // pixels
std::vector<int> & projected_grid_m, // number of voxels in each dimension
std::vector<int> & pixel_grid_m, // number of pixels in each dimension
std::vector<float> & pixel_grid_omega, // size of the pixel grid in cm
int projected_grid_size, // total number of voxels
int pixel_grid_size) { // total number of pixels
std::vector<float> overlaps(projected_grid_size * pixel_grid_size);
std::vector<float> h(pixel_grid_m.size());
for(int d=0; d < pixel_grid_m.size(); d++) {
h[d] = (pixel_grid_omega[2*d+1] - pixel_grid_omega[2*d]) / pixel_grid_m[d];
}
for(int i=0; i < projected_grid_size; i++){
std::vector<int> point_indices(8); // indices must be integral to be used as subscripts
point_indices[0] = i;
point_indices[1] = i + 1;
point_indices[2] = i + projected_grid_m[0];
point_indices[3] = i + projected_grid_m[0] + 1;
point_indices[4] = i + projected_grid_m[0] * projected_grid_m[1];
point_indices[5] = i + projected_grid_m[0] * projected_grid_m[1] + 1;
point_indices[6] = i + (projected_grid_m[1] + 1) * projected_grid_m[0];
point_indices[7] = i + (projected_grid_m[1] + 1) * projected_grid_m[0] + 1;
std::vector<float> vx_corners(8 * projected_grid_m.size());
for(int vn = 0; vn < 8; vn++) {
for(int d = 0; d < projected_grid_m.size(); d++) {
vx_corners[vn + d * 8] = projected_grid_vx[point_indices[vn] + d * projected_grid_size];
}
}
polygon_2d proj_voxel;
for(int vn = 0; vn < 8; vn++) {
point_2d poly_pt(vx_corners[2 * vn], vx_corners[2 * vn + 1]);
boost::geometry::append(proj_voxel, poly_pt);
}
boost::geometry::correct(proj_voxel);
polygon_2d proj_voxel_hull;
boost::geometry::convex_hull(proj_voxel, proj_voxel_hull);
box_2d bb_proj_vox;
boost::geometry::envelope(proj_voxel_hull, bb_proj_vox);
point_2d min_pt = bb_proj_vox.min_corner();
point_2d max_pt = bb_proj_vox.max_corner();
// then get min and max indices of intersecting bins
std::vector<float> min_idx(projected_grid_m.size() - 1),
max_idx(projected_grid_m.size() - 1);
// compute min and max indices of incidence on the pixel grid
// this is easy assuming you have a regular grid of pixels
min_idx[0] = std::min(std::max(std::floor((min_pt.get<0>() - pixel_grid_omega[0]) / h[0] - 0.5f), 0.0f), float(pixel_grid_m[0] - 1));
min_idx[1] = std::min(std::max(std::floor((min_pt.get<1>() - pixel_grid_omega[2]) / h[1] - 0.5f), 0.0f), float(pixel_grid_m[1] - 1));
max_idx[0] = std::min(std::max(std::floor((max_pt.get<0>() - pixel_grid_omega[0]) / h[0] + 0.5f), 0.0f), float(pixel_grid_m[0] - 1));
max_idx[1] = std::min(std::max(std::floor((max_pt.get<1>() - pixel_grid_omega[2]) / h[1] + 0.5f), 0.0f), float(pixel_grid_m[1] - 1));
// iterate only over pixels which intersect the projected voxel
for(int iy = min_idx[1]; iy <= max_idx[1]; iy++) {
for(int ix = min_idx[0]; ix <= max_idx[0]; ix++) {
int idx = ix + iy * pixel_grid_m[0]; // `first' index of pixel corner point
polygon_2d pix_poly;
for(int pn = 0; pn < 4; pn++) {
point_2d pix_corner_pt(
pixel_grid_vx[idx + pn % 2 + (pn / 2) * pixel_grid_m[0]],
pixel_grid_vx[idx + pn % 2 + (pn / 2) * pixel_grid_m[0] + pixel_grid_size]
);
boost::geometry::append(pix_poly, pix_corner_pt);
}
boost::geometry::correct( pix_poly );
//make this into a convex hull since the order of the point may be any
polygon_2d pix_hull;
boost::geometry::convex_hull(pix_poly, pix_hull);
// on to perform intersection
std::vector<polygon_2d> vox_pix_ints;
polygon_2d vox_pix_int;
try {
boost::geometry::intersection(proj_voxel_hull, pix_hull, vox_pix_ints);
} catch (const std::exception &) {
// skip since these may coincide at a point or line
continue;
}
// both are convex so at most one intersection polygon is expected
if (vox_pix_ints.empty()) continue;
vox_pix_int = vox_pix_ints[0];
overlaps[i + idx * projected_grid_size] = boost::geometry::area(vox_pix_int);
}
} // end intersection for
} //end projected_voxel for
return overlaps;
}
You could create the ratio of polygon to bounding box:
This could be computed once to arrive at an average polygon-area-to-BB ratio, a constant R.
Or you could do it geometrically, using a circle bounded by its BB, since you're only using projected polyhedra:
R = 0.0;
count = 0;
for (each poly) {
count++;
R += polyArea / itsBoundingBoxArea;
}
R = R/count;
Then calculate the summation of intersection of bounding boxes.
Sbb = 0.0;
for (box1, box2 where box1.isIntersecting(box2)) {
Sbb += box1.intersect(box2);
}
Then:
Approximation = R * Sbb
None of this would work if concave polys were allowed, because a concave poly can occupy less than 1% of its bounding box. You will still have to find the convex hull.
Alternatively, if you can find the polygon's area quicker than its hull, you could use the actual computed average poly area. This would give you a decent approximation as well, while avoiding both poly intersection and wrapping.
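To make the estimate above a bit more concrete, here is a minimal C++ sketch. It assumes axis-aligned bounding boxes and that the polygon areas are already known; the names `Box`, `overlapArea`, `averageFillRatio` and `approxOverlap` are made up for illustration and are not from the question's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Axis-aligned bounding box (hypothetical helper type for this sketch).
struct Box {
    double xmin, ymin, xmax, ymax;
    double area() const { return (xmax - xmin) * (ymax - ymin); }
};

// Area of the overlap of two boxes; 0 when they do not overlap.
inline double overlapArea(const Box& a, const Box& b) {
    const double w = std::min(a.xmax, b.xmax) - std::max(a.xmin, b.xmin);
    const double h = std::min(a.ymax, b.ymax) - std::max(a.ymin, b.ymin);
    return (w > 0.0 && h > 0.0) ? w * h : 0.0;
}

// R: the average ratio of polygon area to bounding-box area.
inline double averageFillRatio(const std::vector<double>& polyAreas,
                               const std::vector<Box>& boxes) {
    assert(polyAreas.size() == boxes.size() && !boxes.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < boxes.size(); ++i)
        sum += polyAreas[i] / boxes[i].area();
    return sum / static_cast<double>(boxes.size());
}

// Approximate polygon overlap for one pair: R times the box overlap.
inline double approxOverlap(double R, const Box& a, const Box& b) {
    return R * overlapArea(a, b);
}
```

Summing `approxOverlap` over all intersecting box pairs gives the `R * Sbb` estimate. Whether that accuracy is good enough for your application is something only measurement can tell.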
Hm, the problem seems similar to doing "collision detection" in game engines, or computing "potentially visible sets".
While I don't know much about the current state of the art, I remember that one optimization was to enclose objects in spheres, since checking for overlaps between spheres (or circles in 2D) is really cheap.
In order to speed up collision checks, objects were often put into search structures (e.g. a sphere tree, or a circle tree in the 2D case), which organize the space into a hierarchical structure to make overlap queries fast.
So basically my suggestion boils down to: try looking at algorithms for collision detection in game engines.
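For what it's worth, the cheap circle test mentioned above looks roughly like this in C++. This is only a sketch: `Circle` and `circlesOverlap` are hypothetical names, and the circles are assumed to be bounding circles computed elsewhere.

```cpp
#include <cassert>

// Bounding circle of one polygon (hypothetical helper type).
struct Circle {
    double x, y, r;
};

// True when the two circles overlap or touch.
// Compares squared distances, so no square root is needed.
inline bool circlesOverlap(const Circle& a, const Circle& b) {
    assert(a.r >= 0.0 && b.r >= 0.0);
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double rs = a.r + b.r;
    return dx * dx + dy * dy <= rs * rs;
}
```

Only pairs that pass this test would need the expensive polygon intersection.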
Assumption
I'm assuming that you mean "intersections" and not "intersection". Moreover, it is not the expected use case that most of the individual polys from M and N overlap at the same time. If this assumption is true, then:
Answer
The way this is done with 2D game engines is by having a scene graph where every object has a bounding box. Then place all the polygons into the nodes of a quadtree according to their location, as determined by the bounding box. The task then becomes parallel, because each node can be processed separately for intersections.
Here is the wiki for quadtree:
Quadtree Wiki
An octree could be used when in 3D.
It actually doesn't even have to be an octree. You could get the same results with any space partition. You could find the maximum separation of polys (let's call it S) and create, say, space partitions of width S/10. Then you would have 10 separate spaces to execute in parallel. Not only would it be concurrent, but it would no longer take M * N time, since not every poly must be compared against every other poly.
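As a rough sketch of the "any space partition" idea, here is a uniform grid that bins polygons by bounding box, so each polygon is only compared against the ones sharing a cell with it. It is written in C++ under the assumption that the bounding boxes are axis-aligned and already computed; `Aabb` and `UniformGrid` are illustrative names, not an existing library API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Axis-aligned bounding box of one polygon (hypothetical helper type).
struct Aabb { double xmin, ymin, xmax, ymax; };

// Uniform grid of nx * ny square cells of side `cell`, starting at (x0, y0).
// Boxes outside the grid are clamped into the border cells.
class UniformGrid {
public:
    UniformGrid(double x0, double y0, double cell, int nx, int ny)
        : x0_(x0), y0_(y0), cell_(cell), nx_(nx), ny_(ny),
          bins_(static_cast<std::size_t>(nx) * static_cast<std::size_t>(ny)) {
        assert(cell > 0.0 && nx > 0 && ny > 0);
    }

    // Register polygon `id` in every cell its bounding box touches.
    void insert(std::size_t id, const Aabb& b) {
        for (int iy = cellY(b.ymin); iy <= cellY(b.ymax); ++iy)
            for (int ix = cellX(b.xmin); ix <= cellX(b.xmax); ++ix)
                bins_[index(ix, iy)].push_back(id);
    }

    // Ids of all polygons sharing at least one cell with `b` (sorted, unique).
    std::vector<std::size_t> query(const Aabb& b) const {
        std::vector<std::size_t> out;
        for (int iy = cellY(b.ymin); iy <= cellY(b.ymax); ++iy)
            for (int ix = cellX(b.xmin); ix <= cellX(b.xmax); ++ix) {
                const std::vector<std::size_t>& bin = bins_[index(ix, iy)];
                out.insert(out.end(), bin.begin(), bin.end());
            }
        std::sort(out.begin(), out.end());
        out.erase(std::unique(out.begin(), out.end()), out.end());
        return out;
    }

private:
    static int clampCell(double t, int n) {
        return std::min(std::max(static_cast<int>(std::floor(t)), 0), n - 1);
    }
    int cellX(double x) const { return clampCell((x - x0_) / cell_, nx_); }
    int cellY(double y) const { return clampCell((y - y0_) / cell_, ny_); }
    std::size_t index(int ix, int iy) const {
        return static_cast<std::size_t>(iy) * static_cast<std::size_t>(nx_)
             + static_cast<std::size_t>(ix);
    }

    double x0_, y0_, cell_;
    int nx_, ny_;
    std::vector<std::vector<std::size_t>> bins_;
};
```

You would insert the M polygons once, then call `query` with each of the N bounding boxes. The queries are independent of each other, so they can run in parallel.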
I have been generating noise textures to use as height maps for terrain generation. In this application, initially there is a 256x256 noise texture that is used to create a block of land that the user is free to roam around. When the user reaches a certain boundary in-game the application generates a new texture and thus another block of terrain.
In the code, a table of 64x64 random values is generated, and the values in the texture are the result of interpolating between these points at various 'frequencies' and 'wavelengths' using a smoothstep function, and then combined to form the final noise texture; finally, the values in the texture are divided through by the largest value to effectively normalize it. When the player is at the boundary and a new texture is created, the random number table that is created re-uses the values from the appropriate edge of the previous texture (e.g. if the new texture is for a block of land that is on the +X side of the previous one, the last value in every row of the previous texture is used as the first value in every row of random numbers in the next).
My problem is this: even though the same values are being used across the edges of adjacent textures, they are nowhere near seamless; some neighboring points on the terrain are mismatched by many, many metres. My guess is that the changing frequencies that are used to sample the random number table are probably having a significant effect on all areas of the texture. So how might one generate fractal noise procedurally, i.e. as needed, AND have it look continuous with adjacent values?
Here is a section of the code that returns a value interpolated between the points on the random number table given a point P:
float MainApp::assessVal(glm::vec2 P){
//Integer component of P
int xi = (int)P.x;
int yi = (int)P.y;
//Fractional component of P
float xr = P.x - xi;
float yr = P.y - yi;
//Find the grid square P lies inside of
int x0 = xi % randX;
int x1 = (xi + 1) % randX;
int y0 = yi % randY;
int y1 = (yi + 1) % randY;
//Get random values for the 4 nodes
float r00 = randNodes->randNodes[y0][x0];
float r10 = randNodes->randNodes[y0][x1];
float r01 = randNodes->randNodes[y1][x0];
float r11 = randNodes->randNodes[y1][x1];
//Smoother interpolation so
//texture appears less blocky
float sx = smoothstep(xr);
float sy = smoothstep(yr);
//Find the weighted value of the 4
//random values. This will be the
//final value in the noise texture
float sx0 = mix(r00, r10, sx);
float sx1 = mix(r01, r11, sx);
return mix(sx0, sx1, sy);
}
Where randNodes is a 2 dimensional array containing the random values.
And here is the code that takes all the values returned from the above function and constructs texture data:
int layers = 5;
float wavelength = 1, frequency = 1;
for (int k = 0; k < layers; k++) {
for (int i = 0; i < stepsY; i++) {
for(int j = 0; j < stepsX; j++){
//Compute value for (stepsX * stepsY) interpolation points
//across the grid of random numbers
glm::vec2 P = glm::vec2((float)j/stepsX * randX, (float)i/stepsY * randY);
buf[i * stepsX + j] += assessVal(P * wavelength) * frequency; // row-major: row i, column j
}
}
//repeat (layers) times with different signals
wavelength *= 0.5;
frequency *= 2;
}
for(int i = 0; i < buf.size(); i++){
//divide all data by the largest value.
//this normalises the data to avoid saturation
buf[i] /= largestVal;
}
Finally, here is an example of two textures generated by these functions that should be seamless, but aren't:
The 2 images placed side by side as they are now are obviously mis-matched.
Your code wraps the values only in the domain of the noise texture you read from, but not in the domain of the texture being generated.
For the texture T of size stepsX to be repeatable (let's consider the 1-d case for simplicity) you must have
T(0) == T(stepsX)
Or in your case (substitute j = 0 and j = stepsX):
assessVal(0) == assessVal(randX * wavelength)
For when k >= 1 this is clearly not true in your code, because
(randX / pow(2, k)) % randX != 0
One solution is to decrease randX and randY while you go up the frequencies.
But my typical approach would rather be starting from a 2x2 random texture, upscale it to 4x4 with GL_REPEAT, add a bit more per-pixel noise, continue upscaling to 8x8 etc.. till I get to the desired size.
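To illustrate the first suggestion (shrinking the wrap period as the octaves go up), here is a 1-d C++ sketch. It is not the asker's code: `wrappedValueNoise1D` and `tileableFractal1D` are hypothetical names, the per-layer weight doubles as in the question's loop, and the period is halved with integer division for each layer.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Smoothstep weight on [0, 1].
inline float smoothstep01(float t) { return t * t * (3.0f - 2.0f * t); }

// 1-d value noise whose lattice index wraps with the given period,
// so that noise(p) == noise(p + period).
inline float wrappedValueNoise1D(const std::vector<float>& table, float p, int period) {
    assert(period >= 1 && period <= static_cast<int>(table.size()));
    const float fl = std::floor(p);
    const int xi = static_cast<int>(fl);
    const float xr = p - fl;                           // fractional part
    const int x0 = ((xi % period) + period) % period;  // wrap, also for negative p
    const int x1 = (x0 + 1) % period;
    return table[x0] + (table[x1] - table[x0]) * smoothstep01(xr);
}

// Sum of `layers` octaves. u in [0, 1] spans one texture width.
// Layer k samples the range [0, period_k) and wraps at period_k,
// so every layer satisfies T(0) == T(1).
inline float tileableFractal1D(const std::vector<float>& table, float u, int layers) {
    int period = static_cast<int>(table.size());
    float weight = 1.0f;
    float sum = 0.0f;
    for (int k = 0; k < layers && period >= 1; ++k) {
        sum += wrappedValueNoise1D(table, u * static_cast<float>(period), period) * weight;
        period /= 2;     // shrink the wrap period with each layer
        weight *= 2.0f;  // same per-layer weighting as the question's loop
    }
    return sum;
}
```

The 2-d version would wrap x and y the same way. Note that this only makes a single texture repeat with itself; matching two different neighbouring textures is a separate problem.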
The root cause, of course, is that your smoothing changes pixels to match their neighbors, but you later add new neighbors and do not re-smooth the pixels that got new neighbors.
One simple and common workaround is to keep an edge of invisible pixels, the width of which is half that of your smoothing kernel. Now, when expanding the area, you can resmooth those invisible pixels just before they're revealed. Don't forget to add a new edge of invisible pixels!
Ok, I'm having a bit of trouble finding a solution for this that seems to be a simple geometry problem.
I have a list of coordinate triples, each of which forms a right angle.
Among all these triples I want to find a pair that forms a square.
I believe the best I can do to exemplify is show an image:
1. and 2. are irrelevant. 3. and 4. are what I'm looking for.
For each triple I have the middle point, where the angle is, and two other points that describe the two segments that form the angle.
Summing it up: given six points, 2 for the diagonal + 4 other points, how can I find out whether these make a square?
Note: the two lines that make the angle are consistent but don't have the same length.
Note 2: the lines from different triples may not intersect.
Thank you for your time and any help and insight provided.
If any term I used is incorrect or just plain hard to understand let me know, I'm not a native English speaker.
Edit: The code as it stands.
//for all triples
for (size_t i = 0; i < toTry.size() - 1; i++) {
Vec2i center_i = toTry[i].avg;
//NormalizedDiagonal = ((Side1 - Center) + (Side2 - Center));
Vec2i a = toTry[i].p, b = toTry[i].q;
Vec2f normalized_i = normalizedDiagonal(center_i, toTry[i].p, toTry[i].q);
for (size_t j = i + 1; j < toTry.size(); j++) {
Vec2i center_j = toTry[j].avg;
//If the points are close, they don't matter
if (areClose(center_i, center_j, 25))
continue;
Vec2f normalized_j = normalizedDiagonal(center_j, toTry[j].p, toTry[j].q);
line(src, Point(center_i[0], center_i[1]), Point(center_i[0] + 1 * normalized_i[0], center_i[1] + 1 * normalized_i[1]), Scalar(255, 255, 255), 1);
//test if antiparallel
if (abs(normalized_i[0] - normalized_j[0]) > 0.1 || abs(normalized_i[1] - normalized_j[1] > 0.1))
continue;
Vec2f delta;
delta[0] = center_j[0] - center_i[0]; delta[1] = center_j[1] - center_i[1];
double dd = sqrt(pow((center_i[0] - center_j[0]), 2) + pow((center_i[1] - center_j[1]), 2));
//delta[0] = delta[0] / dd;
//delta[1] = delta[1] / dd;
float dotProduct = normalized_i[0] * delta[0] + normalized_i[1] * delta[1];
//test if dot product < 0
if (dotProduct < 0)
continue;
float deltaDotDiagonal = delta[0] * normalized_i[0] + delta[1] * normalized_i[1];
menor_d[0] = delta[0] - deltaDotDiagonal * normalized_i[0];
menor_d[1] = delta[1] - deltaDotDiagonal * normalized_i[1];
dd = sqrt(pow((center_j[0] - menor_d[0]), 2) + pow((center_j[1] - menor_d[1]), 2));
if(dd < 25)
[...]
Just to be clear, the actual lengths of the side segments is irrelevant, right? All you care about is whether the semi-infinite lines formed by the side segments of two triples form a square? Or do the actual segments need to intersect?
Assuming the former, a method to check whether two triples form a square is as follows. Let's use the Point3D and Vector3D from the System.Windows.Media.Media3D namespace to define some terminology, since these are decent general-purpose 3d double precision points and vectors that support basic linear algebra methods. These are c# so you can't use them directly but I'd like to be able to refer to some of the basic methods mentioned there.
Here is the basic method to check if two triples form a square:
Define a triple as follows: Center, Side1 and Side2 as three Point3D structures.
For each triple, define the normalized diagonal vector as
NormalizedDiagonal = ((Side1 - Center) + (Side2 - Center));
NormalizedDiagonal.Normalize()
(You might want to cache this for performance.)
Check if the two centers are equal within some linear tolerance you define. If equal, return false -- it's a degenerate case.
Check if the two diagonal vectors are antiparallel within some angular tolerance you define. (I.e. NormalizedDiagonal1 == -NormalizedDiagonal2 with some tolerance.) If not, return false, not a square.
Compute the vector from triple1.Center to triple2.Center: delta = triple2.Center - triple1.Center.
If double deltaDotDiagonal = DotProduct(delta, triple1.NormalizedDiagonal) < 0, return false - the two triples point away from each other.
Finally, compute the distance from the center of triple2 to the (infinite) diagonal line passing through the center triple1. If zero (within your linear tolerance) they form a square.
To compute that distance: distance = (delta - deltaDotDiagonal*triple1.NormalizedDiagonal).Length
Note: deltaDotDiagonal*triple1.NormalizedDiagonal is the projection of the delta vector onto triple1.NormalizedDiagonal, and thus delta - deltaDotDiagonal*triple1.NormalizedDiagonal is the component of delta that is perpendicular to that diagonal. Its length is the distance we seek.
Finally, if your definition of a square requires that the actual side segments intersect, you can add an extra check that the lengths of all the side segments are at least delta.Length / sqrt(2), which is the side length of a square whose diagonal is delta.
This method checks if two triples form a square. Finding all triples that form squares is, of course, O(N^2). If this is a problem, you can put them in an array and sort them by angle = Atan2(NormalizedDiagonal.Y, NormalizedDiagonal.X). Having done that, you can find triples that potentially form squares with a given triple by binary-searching the array for triples with angles = +/- π from the angle of the current triple, within your angular tolerance. (When the angle is near π you will need to check both the beginning and end of the array.)
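Here is a rough C++ sketch of that sort-and-search step, including the wrap-around near ±π. `AngleEntry` and `candidatesOpposite` are hypothetical names; the entries are assumed to be pre-sorted by angle, with angles as returned by `atan2`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One triple's diagonal direction (hypothetical helper type).
struct AngleEntry {
    double angle;       // atan2(diag.y, diag.x), in [-pi, pi]
    std::size_t index;  // position of the triple in the caller's list
};

// Indices of entries whose angle is within `tol` of (angle +/- pi).
// `sorted` must be sorted by angle in ascending order.
inline std::vector<std::size_t> candidatesOpposite(const std::vector<AngleEntry>& sorted,
                                                   double angle, double tol) {
    const double kPi = 3.14159265358979323846;
    assert(tol >= 0.0 && tol < kPi);
    // Opposite direction, folded back into [-pi, pi].
    const double target = (angle > 0.0) ? angle - kPi : angle + kPi;
    std::vector<std::size_t> out;
    // Collect every entry with lo <= angle <= hi, using a binary search for lo.
    auto collect = [&](double lo, double hi) {
        auto it = std::lower_bound(sorted.begin(), sorted.end(), lo,
            [](const AngleEntry& e, double v) { return e.angle < v; });
        for (; it != sorted.end() && it->angle <= hi; ++it)
            out.push_back(it->index);
    };
    collect(target - tol, target + tol);
    // Near +/- pi the tolerance window wraps to the other end of the array.
    if (target + tol > kPi)  collect(-kPi, target + tol - 2.0 * kPi);
    if (target - tol < -kPi) collect(target - tol + 2.0 * kPi, kPi);
    return out;
}
```

Each candidate returned still has to go through the full pair check described above; this only cuts down how many pairs get that far.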
Update
OK, let's see if I can do this with your classes. I don't have definitions for Vec2i and Vec2f so I could get this wrong...
double getLength(Vec2f vector)
{
return sqrt(pow(vector[0], 2) + pow(vector[1], 2));
}
Vec2f scaleVector(Vec2f vec, float scale)
{
Vec2f scaled;
scaled[0] = vec[0] * scale;
scaled[1] = vec[1] * scale;
return scaled;
}
Vec2f subtractVectorsAsFloat(Vec2i first, Vec2i second)
{
// return first - second as float.
Vec2f diff;
diff[0] = first[0] - second[0];
diff[1] = first[1] - second[1];
return diff;
}
Vec2f subtractVectorsAsFloat(Vec2f first, Vec2f second)
{
// return first - second as float.
Vec2f diff;
diff[0] = first[0] - second[0];
diff[1] = first[1] - second[1];
return diff;
}
double dot(Vec2f first, Vec2f second)
{
return first[0] * second[0] + first[1] * second[1];
}
//for all triples
for (size_t i = 0; i < toTry.size() - 1; i++) {
Vec2i center_i = toTry[i].avg;
//NormalizedDiagonal = ((Side1 - Center) + (Side2 - Center));
Vec2i a = toTry[i].p, b = toTry[i].q;
Vec2f normalized_i = normalizedDiagonal(center_i, toTry[i].p, toTry[i].q);
for (size_t j = i + 1; j < toTry.size(); j++) {
Vec2i center_j = toTry[j].avg;
//If the points are close, they don't matter
if (areClose(center_i, center_j, 25))
continue;
Vec2f normalized_j = normalizedDiagonal(center_j, toTry[j].p, toTry[j].q);
//test if antiparallel (normalized_i == -normalized_j within tolerance)
if (abs(normalized_i[0] + normalized_j[0]) > 0.1 || abs(normalized_i[1] + normalized_j[1]) > 0.1)
continue;
// get a vector pointing from center_i to center_j.
Vec2f delta = subtractVectorsAsFloat(center_j, center_i);
//test if dot product < 0
float deltaDotDiagonal = dot(delta, normalized_i);
if (deltaDotDiagonal < 0)
continue;
Vec2f deltaProjectedOntoDiagonal = scaleVector(normalized_i, deltaDotDiagonal);
// Subtracting the projection of delta onto normalized_i from delta leaves the component
// of delta which is perpendicular to normalized_i...
Vec2f distanceVec = subtractVectorsAsFloat(delta, deltaProjectedOntoDiagonal);
// ... the length of which is the distance from center_j
// to the diagonal through center_i.
double distance = getLength(distanceVec);
if(distance < 25) {
// center_j lies on the diagonal through center_i: triples i and j form a square
}
}
}
There are two approaches to solving this. One is a very direct approach that involves finding the intersection of two line segments.
You just use the triple coordinates to figure out the midpoint, and the two line segments that protrude from it (trivial). Do this for both triple-sets.
Now calculate the intersection points, if they exist, for all four possible permutations of the extending line segments. From the original answer to a similar question:
You might look at the code I wrote for Computational Geometry in C,
which discusses this question in detail (Chapter 1, Section 5). The
code is available as SegSegInt from the links at that web site.
In a nutshell, I recommend a different approach, using signed area of
triangles. Then comparing appropriate triples of points, one can
distinguish proper from improper intersections, and all degenerate
cases. Once they are distinguished, finding the point of intersection
is easy.
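As a sketch of the signed-area idea from the quote, here is how the test and the intersection point might look in C++. This is not the SegSegInt code referred to above; `Pt`, `area2`, `properIntersect` and `intersectionPoint` are illustrative names, and only the proper (non-degenerate) case is handled.

```cpp
#include <cassert>

struct Pt { double x, y; };

// Twice the signed area of triangle (a, b, c).
// Positive when c lies to the left of the directed line a -> b (y axis up).
inline double area2(const Pt& a, const Pt& b, const Pt& c) {
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// True when segments ab and cd cross at a single point interior to both.
// Touching endpoints and collinear overlaps are deliberately reported as false.
inline bool properIntersect(const Pt& a, const Pt& b, const Pt& c, const Pt& d) {
    const double abc = area2(a, b, c), abd = area2(a, b, d);
    const double cda = area2(c, d, a), cdb = area2(c, d, b);
    if (abc == 0.0 || abd == 0.0 || cda == 0.0 || cdb == 0.0) return false;
    return ((abc > 0.0) != (abd > 0.0)) && ((cda > 0.0) != (cdb > 0.0));
}

// Intersection point of two properly intersecting segments.
// The signed area of (c, d, P) is linear along ab, so its zero gives P.
inline Pt intersectionPoint(const Pt& a, const Pt& b, const Pt& c, const Pt& d) {
    const double cda = area2(c, d, a), cdb = area2(c, d, b);
    assert(cda != cdb);  // guaranteed when properIntersect(a, b, c, d) is true
    const double t = cda / (cda - cdb);  // parameter along ab, in (0, 1)
    return Pt{a.x + t * (b.x - a.x), a.y + t * (b.y - a.y)};
}
```

With integer pixel coordinates the area tests are exact; with floating-point input you would probably want a tolerance instead of the comparisons against zero.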
An alternative, image-processing approach would be to render the lines, define one unique color for the lines, and then apply a seed/flood-fill algorithm to the first white zone found, applying a new unique color to each subsequent zone, until you flood-fill an enclosed area that doesn't touch the border of the image.
Good luck!
References
finding the intersection of two line segments in 2d (with potential degeneracies), Accessed 2014-08-18, <https://math.stackexchange.com/questions/276735/finding-the-intersection-of-two-line-segments-in-2d-with-potential-degeneracies>
In each pair of segments, call one "the base segment"; "the other segment" is then the one obtained by rotating the base segment by π/2 counterclockwise.
For each triple, compute the angle between the base segment and the X axis. Call this its principal angle.
Sort triples by the principal angle.
Now for each triple with the principal angle of α any potential square-forming mate has the principal angle of α+π (mod 2π). This is easy to find by binary search.
Furthermore, for two candidate triples with vertices a and a' and principal angles α and α+π, the angle of vector aa' should be α+π/4.
Finally, if each of the four segments is at least |aa'|/√2 long, we have a square.
Language/Compiler: C++ (Visual Studio 2013)
Experience: ~2 months
I am working in a rectangular grid in 3D space (size: xdim by ydim by zdim), where "xgrid", "ygrid", and "zgrid" are 3D arrays of the x-, y-, and z-coordinates, respectively. Now, I am interested in finding all points that lie within a sphere of radius "r" centered about the point "(vi,vj,vk)". I want to store the index locations of these points in the vectors "xidx,yidx,zidx". For a single point this algorithm works and is fast enough, but when I wish to iterate over many points within the 3D space I run into very long run times.
Does anyone have any suggestions on how I can improve the implementation of this algorithm in C++? After running some profiling software I found online (very sleepy, Luke stackwalker) it seems that the "std::vector::size" and "std::vector::operator[]" member functions are bogging down my code. Any help is greatly appreciated.
Note: Since I do not know a priori how many voxels are within the sphere, I set the length of vectors xidx,yidx,zidx to be larger than necessary and then erase all the excess elements at the end of the function.
void find_nv(int vi, int vj, int vk, vector<double> &xidx, vector<double> &yidx, vector<double> &zidx, double*** &xgrid, double*** &ygrid, double*** &zgrid, int r, double xdim,double ydim,double zdim, double pdim)
{
double xcor, ycor, zcor,xval,yval,zval;
vector<double>xyz(3);
xyz[0] = xgrid[vi][vj][vk];
xyz[1] = ygrid[vi][vj][vk];
xyz[2] = zgrid[vi][vj][vk];
int counter = 0;
// Confine loop to be within boundaries of sphere
int istart = vi - r;
int iend = vi + r;
int jstart = vj - r;
int jend = vj + r;
int kstart = vk - r;
int kend = vk + r;
if (istart < 0) {
istart = 0;
}
if (iend > xdim-1) {
iend = xdim-1;
}
if (jstart < 0) {
jstart = 0;
}
if (jend > ydim - 1) {
jend = ydim-1;
}
if (kstart < 0) {
kstart = 0;
}
if (kend > zdim - 1)
kend = zdim - 1;
//-----------------------------------------------------------
// Begin iterating through all points
//-----------------------------------------------------------
for (int k = kstart; k < kend+1; ++k)
{
for (int j = jstart; j < jend+1; ++j)
{
for (int i = istart; i < iend+1; ++i)
{
if (i == vi && j == vj && k == vk)
continue;
else
{
xcor = pow((xgrid[i][j][k] - xyz[0]), 2);
ycor = pow((ygrid[i][j][k] - xyz[1]), 2);
zcor = pow((zgrid[i][j][k] - xyz[2]), 2);
double rsqr = pow(r, 2);
double sphere = xcor + ycor + zcor;
if (sphere <= rsqr)
{
xidx[counter]=i;
yidx[counter]=j;
zidx[counter] = k;
counter = counter + 1;
}
else
{
}
//cout << "counter = " << counter - 1;
}
}
}
}
// erase all appending zeros that are not voxels within sphere
xidx.erase(xidx.begin() + (counter), xidx.end());
yidx.erase(yidx.begin() + (counter), yidx.end());
zidx.erase(zidx.begin() + (counter), zidx.end());
}
You already appear to have used my favourite trick for this sort of thing, getting rid of the relatively expensive square root functions and just working with the squared values of the radius and center-to-point distance.
One other possibility which may speed things up (a) is to replace all the:
xyzzy = pow (plugh, 2)
calls with the simpler:
xyzzy = plugh * plugh
You may find the removal of the function call could speed things up, however marginally.
Another possibility, if you can establish the maximum size of the target array, is to use a real array rather than a vector. I know they make the vector code as insanely optimal as possible, but it still won't match a fixed-size array for performance (since it has to do everything the fixed-size array does plus handle possible expansion).
Again, this may only offer very marginal improvement at the cost of more memory usage but trading space for time is a classic optimisation strategy.
Other than that, ensure you're using the compiler optimisations wisely. The default build in most cases has a low level of optimisation to make debugging easier. Ramp that up for production code.
(a) As with all optimisations, you should measure, not guess! These suggestions are exactly that: suggestions. They may or may not improve the situation, so it's up to you to test them.
One of your biggest problems, and one that is probably preventing the compiler from making a lot of optimisations is that you are not using the regular nature of your grid.
If you are really using a regular grid then
xgrid[i][j][k] = x_0 + i * dxi + j * dxj + k * dxk
ygrid[i][j][k] = y_0 + i * dyi + j * dyj + k * dyk
zgrid[i][j][k] = z_0 + i * dzi + j * dzj + k * dzk
If your grid is axis aligned then
xgrid[i][j][k] = x_0 + i * dxi
ygrid[i][j][k] = y_0 + j * dyj
zgrid[i][j][k] = z_0 + k * dzk
Replacing these inside your core loop should result in significant speedups.
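For the axis-aligned case, the inner test could then look something like this C++ sketch. It assumes constant spacings; `GridSpacing`, `squaredDistance` and `insideSphere` are made-up names, and `radius` is in the same units as the spacings.

```cpp
#include <cassert>

// Spacing of an axis-aligned regular grid (hypothetical helper type).
struct GridSpacing { double dx, dy, dz; };

// Squared distance between grid points (i, j, k) and (vi, vj, vk).
// The grid origin cancels out, so no coordinate arrays are read at all.
inline double squaredDistance(const GridSpacing& g,
                              int i, int j, int k, int vi, int vj, int vk) {
    const double ddx = (i - vi) * g.dx;
    const double ddy = (j - vj) * g.dy;
    const double ddz = (k - vk) * g.dz;
    return ddx * ddx + ddy * ddy + ddz * ddz;
}

// True when grid point (i, j, k) lies within `radius` of (vi, vj, vk).
inline bool insideSphere(const GridSpacing& g,
                         int i, int j, int k, int vi, int vj, int vk, double radius) {
    assert(radius >= 0.0);
    return squaredDistance(g, i, j, k, vi, vj, vk) <= radius * radius;
}
```

Besides avoiding three triple-pointer lookups per point, this also replaces the `pow` calls with plain multiplications.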
You could do two things. Reduce the number of points you are testing for inclusion and simplify the problem to multiple 2d tests.
If you take the sphere and look at it down the z axis, you have all the points from y+r to y-r in the sphere; using each of these points you can slice the sphere into circles that contain all the points in the x/z plane, limited to the circle radius at the specific y you are testing. Calculating the radius of the circle is a simple matter of solving for the length of the base of a right-angled triangle.
Right now you are testing all the points in a cube, but the upper ranges of the sphere exclude most points. The idea behind the above algorithm is that you can limit the points tested at each level of the sphere to the square containing the radius of the circle at that height.
Here is a simple hand-drawn graphic, showing the sphere from the side view.
Here we are looking at the slice of the sphere that has the radius ab. Since you know the lengths ac and bc of the right-angled triangle, you can calculate ab using Pythagoras' theorem. Now you have a simple circle that you can test the points in; then move down, reduce the length ac, recalculate ab, and repeat.
Now once you have that you can actually do a little more optimization. Firstly, you do not need to test every point against the circle; you only need to test one quarter of the points. If you test the points in the upper left quadrant of the circle (the slice of the sphere), then the points in the other three quadrants are just mirror images of that same point, offset either to the right, to the bottom, or diagonally from the point determined to be in the first quadrant.
Then finally, you only need to do the circle slices of the top half of the sphere, because the bottom half is just a mirror of the top half. In the end you have only tested an eighth of the points for containment in the sphere. This should be a huge performance boost.
I hope that makes sense, I am not at a machine now that I can provide a sample.
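A sketch of the slicing idea in C++ might look like the following. Note that it goes one step further than described above: instead of testing points inside each slice's square, it computes the extent of every row directly, so no per-point test is left. It assumes unit grid spacing (the radius is measured in index units), and `Index3` and `sphereIndices` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Index3 { int i, j, k; };

// Indices of all grid points within r index units of (vi, vj, vk),
// clamped to the grid and excluding the centre itself (as in the question).
inline std::vector<Index3> sphereIndices(int vi, int vj, int vk, int r,
                                         int xdim, int ydim, int zdim) {
    assert(r >= 0 && xdim > 0 && ydim > 0 && zdim > 0);
    std::vector<Index3> out;
    const int r2 = r * r;
    for (int k = std::max(vk - r, 0); k <= std::min(vk + r, zdim - 1); ++k) {
        const int dz = k - vk;
        const int slice2 = r2 - dz * dz;  // squared radius of this slice's circle
        const int rj = static_cast<int>(std::sqrt(static_cast<double>(slice2)));
        for (int j = std::max(vj - rj, 0); j <= std::min(vj + rj, ydim - 1); ++j) {
            const int dy = j - vj;
            const int row2 = slice2 - dy * dy;  // squared half-width of this row
            const int ri = static_cast<int>(std::sqrt(static_cast<double>(row2)));
            for (int i = std::max(vi - ri, 0); i <= std::min(vi + ri, xdim - 1); ++i) {
                if (i == vi && j == vj && k == vk) continue;  // skip the centre
                out.push_back(Index3{i, j, k});
            }
        }
    }
    return out;
}
```

Because the result vector grows with `push_back`, there is also no need to over-allocate and erase at the end. The mirroring trick could be layered on top, but with the row extents computed directly there is not much left for it to save.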
A simple thing here would be a 3D flood fill from the center of the sphere rather than iterating over the enclosing cube, as you need to visit fewer points. Moreover, you should implement the iterative version of the flood fill for better efficiency.
Flood Fill
I need to generate a set of vertices for a simple convex polygon, in order to do a minimum-weight triangulation of that polygon using dynamic programming. I thought about taking a circle of radius r and picking 20 vertices on it, moving counterclockwise, to form a 20-vertex convex polygon, but how can I do that?
How would I know which vertices lie on a circle of radius r?
And is there an easier way of generating vertices for a convex polygon than that?
Any help greatly appreciated
Generate your 20 random numbers between 0 and 2*pi, and sort them.
Now use a little basic trigonometry to convert to X,Y coordinates.
for (int i = 0; i < 20; i++)
{
x = x0 + r*cos(angle[i]);
y = y0 + r*sin(angle[i]);
// ...
}
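For completeness, here is a self-contained C++ sketch of the whole procedure, from random angles to vertices. The function name `randomConvexPolygon`, the `Vertex` struct and the seed parameter are my own choices for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Vertex { double x, y; };

// n vertices on the circle of radius r around (x0, y0), in counterclockwise
// order. Points on a circle taken in angular order always form a convex polygon.
inline std::vector<Vertex> randomConvexPolygon(int n, double x0, double y0,
                                               double r, unsigned seed) {
    assert(n >= 3 && r > 0.0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 2.0 * 3.14159265358979323846);

    // Random angles, sorted so the vertices come out in order around the circle.
    std::vector<double> angle(static_cast<std::size_t>(n));
    for (double& a : angle) a = dist(gen);
    std::sort(angle.begin(), angle.end());

    std::vector<Vertex> poly;
    poly.reserve(angle.size());
    for (double a : angle)
        poly.push_back(Vertex{x0 + r * std::cos(a), y0 + r * std::sin(a)});
    return poly;
}
```

Calling `randomConvexPolygon(20, 0.0, 0.0, 10.0, 42u)` gives the 20-vertex polygon from the question; a different seed gives a different polygon.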
btw. +1 for nice approach with that circle ...
If you do not care about the number of vertices:
{
double x0=50.0,y0=50.0,r=50.0; // circle params
double a,da,x,y;
// [view] // my view engine stuff can skip this
glview2D::_lin l;
view.pic_clear();
l.col=0x00FFFFFF;
// [/view]
for (a=0.0;a<2.0*M_PI;) // full circle
{
x=x0+(r*cos(a));
y=y0+(r*sin(a));
a+=(20.0+(40.0*Random()))*M_PI/180.0; // random angle step < 20,60 > degrees
// here add your x,y point to polygon
// [view] // my view engine stuff can skip this
l.p0=l.p1; // just add line (last x,y and actual x,y)
l.p1.p[0]=x;
l.p1.p[1]=y;
view.lin.add(l);
// [/view]
}
// [view] // my view engine stuff can skip this
view.lin[0].p0=l.p1; // just join first and last point in first line (was point0,point0)
// [/view]
}
If the number of vertices is known (= N):
Set the random step to be, on average, a little less than 2PI / N, for example:
da=a0+(a1*Random());
a0=0.75*(2*M_PI/N) ... minimal da
a1=0.40*(2*M_PI/N) ... a0+(0.5*a1) is avg = 0.95 ... is less than 2PI/N
Inside the for loop, add a break when the vertex count reaches N. If the vertex count is not N after the loop, recompute everything from the beginning, because with random numbers you cannot assume that you will always hit N vertices this way!
sample output from source code above
PS.
You can also use ellipse if the circle shape is not good enough
x=x0+(rx*cos(a));
y=y0+(ry*sin(a));
rx != ry
Here is a flexible and efficient way to generate a convex polygon:
Generate random points on a circle centred at (xc,yc).
Tweak any point (xi,yi) in the sequence of consecutive points.
Check that (x(i-1),y(i-1)), (xi,yi), (x(i+1),y(i+1)) still form a left turn, and that the turns at the two neighbouring vertices remain left turns as well; otherwise reject the tweak.
If the points are arranged anti-clockwise, the left-turn test at point (x2,y2) is:
double crosspro = (x2-x1)*(y3-y2) - (y2-y1)*(x3-x2);
if(crosspro>0) return(left_turn);
else return(right_turn);
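The left-turn test can be extended into a check over the whole polygon, which is handy for validating a tweak. The following is a rough sketch that assumes a simple (non-self-intersecting) polygon with vertices stored counter-clockwise; the names `Point`, `cross`, and `isConvexCCW` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;

// Cross product of (p2 - p1) and (p3 - p2).
// Positive when p1 -> p2 -> p3 turns left (counter-clockwise).
double cross(const Point& p1, const Point& p2, const Point& p3)
{
    return (p2.first - p1.first) * (p3.second - p2.second)
         - (p2.second - p1.second) * (p3.first - p2.first);
}

// Returns true when every consecutive vertex triple makes a strict left turn.
bool isConvexCCW(const std::vector<Point>& poly)
{
    assert(poly.size() >= 3);
    const std::size_t n = poly.size();
    for (std::size_t i = 0; i < n; ++i)
        if (cross(poly[i], poly[(i + 1) % n], poly[(i + 2) % n]) <= 0.0)
            return false;
    return true;
}
```

After tweaking vertex i, only the three triples that contain it can change sign, so a production version would test just those instead of the whole polygon.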
This is my version of the circle method in JavaScript.
var x = [0];
var y = [0];
var r = 0;
var angle = 0
for (var i = 1; i < 20; i++) {
angle += 0.3 + Math.random() * 0.3
if (angle > 2 * Math.PI) {
break; // stop before it becomes non-convex
}
r = (5 + Math.random() * 20+Math.random()*50)
x.push(x[i - 1] + r * Math.cos(angle));
y.push(y[i - 1] + r * Math.sin(angle));
}
I am looking for optimized functions in C++ for calculating areal averages of floats. The function is passed a source float array, a destination float array (the same size as the source array), the array width and height, and the "blurring" area width and height.
The function should "wrap around" the edges when calculating the blur/averages.
Here is example code that blurs with a rectangular shape:
/*****************************************
* Find averages extended variations
*****************************************/
void findaverages_ext(float *floatdata, float *dest_data, int fwidth, int fheight, int scale, int aw, int ah, int weight, int xoff, int yoff)
{
printf("findaverages_ext scale: %d, width: %d, height: %d, weight: %d \n", scale, aw, ah, weight);
float total = 0.0;
int spos = scale * fwidth * fheight;
int apos;
int w = aw;
int h = ah;
float* f_temp = new float[fwidth * fheight];
// Horizontal
for(int y=0;y<fheight ;y++)
{
Sleep(10); // Do not burn your processor
total = 0.0;
// Process entire window for first pixel (including wrap-around edge)
for (int kx = 0; kx <= w; ++kx)
if (kx >= 0 && kx < fwidth)
total += floatdata[y*fwidth + kx];
// Wrap
for (int kx = (fwidth-w); kx < fwidth; ++kx)
if (kx >= 0 && kx < fwidth)
total += floatdata[y*fwidth + kx];
// Store first window
f_temp[y*fwidth] = (total / (w*2+1));
for(int x=1;x<fwidth ;x++) // x width changes with y
{
// Subtract pixel leaving window (wrapping around the left edge)
if (x-w-1 >= 0)
total -= floatdata[y*fwidth + x-w-1];
else
total -= floatdata[y*fwidth + x-w-1+fwidth];
// Add pixel entering window
if (x+w < fwidth)
total += floatdata[y*fwidth + x+w];
else
total += floatdata[y*fwidth + x+w-fwidth];
// Store average
apos = y * fwidth + x;
f_temp[apos] = (total / (w*2+1));
}
}
// Vertical
for(int x=0;x<fwidth ;x++)
{
Sleep(10); // Do not burn your processor
total = 0.0;
// Process entire window for first pixel
for (int ky = 0; ky <= h; ++ky)
if (ky >= 0 && ky < fheight)
total += f_temp[ky*fwidth + x];
// Wrap
for (int ky = fheight-h; ky < fheight; ++ky)
if (ky >= 0 && ky < fheight)
total += f_temp[ky*fwidth + x];
// Store first if not out of bounds
dest_data[spos + x] = (total / (h*2+1));
for(int y=1;y< fheight ;y++) // y width changes with x
{
// Subtract pixel leaving window (wrapping around the top edge)
if (y-h-1 >= 0)
total -= f_temp[(y-h-1)*fwidth + x];
else
total -= f_temp[(y-h-1+fheight)*fwidth + x];
// Add pixel entering window
if (y+h < fheight)
total += f_temp[(y+h)*fwidth + x];
else
total += f_temp[(y+h-fheight)*fwidth + x];
// Store average
apos = y * fwidth + x;
dest_data[spos+apos] = (total / (h*2+1));
}
}
delete[] f_temp; // allocated with new[], so it must be released with delete[]
}
What I need are similar functions that, for each pixel, find the average (blur) of the pixels covered by shapes other than a rectangle.
The specific shapes are: "S" (sharp edges), "O" (rectangular but hollow), "+" and "X", where the average float is stored at the center pixel in the destination data array. The size of the blur shape (width and height) should be variable.
The functions do not need to be pixel-perfect, only optimized for performance. There could be a separate function for each shape.
I would also be happy if anyone could give me tips on how to optimize the example function above for rectangular blurring.
What you are trying to implement are various sorts of digital filters for image processing. This is equivalent to convolving two signals, where the second one is the filter's impulse response. So far, you have recognized that a "rectangular average" is separable. By separable I mean that you can split the filter into two parts: one that operates along the X axis and one that operates along the Y axis, each of them a 1D filter. This is nice and can save you lots of cycles. But not every filter is separable. Averaging over the other shapes (S, O, +, X) is not separable. You need to actually compute a 2D convolution for these.
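As a baseline for the non-separable shapes, a direct 2D convolution with wrap-around could look roughly like the sketch below. The function name `maskedAverage` and the 0/1 mask representation are assumptions for illustration; the cost is proportional to the number of set mask cells per pixel, so this is a reference version rather than an optimized one.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: for every pixel, averages src over the non-zero cells
// of an mw x mh mask (odd sizes) centred on that pixel, wrapping at the edges.
std::vector<float> maskedAverage(const std::vector<float>& src, int width, int height,
                                 const std::vector<int>& mask, int mw, int mh)
{
    assert(static_cast<int>(src.size()) == width * height);
    assert(static_cast<int>(mask.size()) == mw * mh);
    assert(mw % 2 == 1 && mh % 2 == 1);

    // Precompute the (dx, dy) offsets of the cells that belong to the shape.
    std::vector<std::pair<int, int>> offsets;
    for (int my = 0; my < mh; ++my)
        for (int mx = 0; mx < mw; ++mx)
            if (mask[my * mw + mx] != 0)
                offsets.emplace_back(mx - mw / 2, my - mh / 2);
    assert(!offsets.empty());

    std::vector<float> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float total = 0.0f;
            for (std::size_t i = 0; i < offsets.size(); ++i) {
                // Double modulo keeps the index non-negative for negative offsets.
                const int sx = ((x + offsets[i].first) % width + width) % width;
                const int sy = ((y + offsets[i].second) % height + height) % height;
                total += src[sy * width + sx];
            }
            dst[y * width + x] = total / static_cast<float>(offsets.size());
        }
    }
    return dst;
}
```

A "+" shape of size 3x3 would be passed as the mask `{0,1,0, 1,1,1, 0,1,0}`.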
As for performance, you can speed up your 1D averages by properly implementing a "moving average". A proper "moving average" implementation requires only a small, fixed amount of work per pixel, regardless of the size of the averaging "window". This works because neighbouring pixels of the target image are averages of almost the same source pixels. You can reuse the sum for the neighbouring target pixel by adding one new pixel intensity and subtracting an old one (for the 1D case).
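A minimal sketch of such a moving average for a single row, with wrap-around at both ends, might look like this. The function name `movingAverageWrap` is hypothetical, and the sketch assumes the window of 2*w+1 samples is no longer than the row.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: circular moving average of one row.
// Each output is the mean of the 2*w+1 samples centred on that index.
std::vector<float> movingAverageWrap(const std::vector<float>& row, int w)
{
    const int n = static_cast<int>(row.size());
    assert(n > 0 && w >= 0 && 2 * w + 1 <= n);
    const float norm = 1.0f / static_cast<float>(2 * w + 1);

    std::vector<float> out(n);

    // Full sum for the window centred on index 0 (indices -w..w, wrapped).
    float total = 0.0f;
    for (int k = -w; k <= w; ++k)
        total += row[(k + n) % n];
    out[0] = total * norm;

    // Slide the window: drop the sample that leaves, add the one that enters.
    for (int x = 1; x < n; ++x) {
        total -= row[(x - w - 1 + n) % n];
        total += row[(x + w) % n];
        out[x] = total * norm;
    }
    return out;
}
```

The per-pixel cost is one subtraction, one addition, and one multiplication, independent of w. For very long rows the running float sum can accumulate rounding error, so accumulating in double may be worth considering.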
For arbitrary non-separable filters, your best bet performance-wise is "fast convolution", which is FFT-based. Check out www.dspguide.com. If I recall correctly, there is even a chapter on how to properly do "fast convolution" using the FFT algorithm. Although it is explained there for 1-dimensional signals, it also applies to 2-dimensional signals. For images you have to perform 2D FFT/iFFT transforms.
To add to sellibitze's answer, you can use a summed area table for your O, S and + kernels (not for the X one though). That way you can convolve a pixel in constant time, and it's probably the fastest method to do it for kernel shapes that allow it.
Basically, a SAT is a data structure that lets you calculate the sum of any axis-aligned rectangle. For the O kernel, after you've built a SAT, you'd take the sum of the outer rect's pixels and subtract the sum of the inner rect's pixels. The S and + kernels can be implemented similarly.
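As a rough illustration of the O kernel, here is a small summed area table sketch. The type name `SummedAreaTable` and its methods are hypothetical, and for brevity the sketch assumes rectangles that lie inside the image (no wrap-around).

```cpp
#include <cassert>
#include <vector>

// Hypothetical summed area table with a one-cell zero border, so
// sat[(y+1)*(width+1) + (x+1)] holds the sum of all pixels in [0..x] x [0..y].
struct SummedAreaTable {
    int width, height;
    std::vector<double> sat;

    SummedAreaTable(const std::vector<float>& src, int w, int h)
        : width(w), height(h), sat((w + 1) * (h + 1), 0.0)
    {
        assert(static_cast<int>(src.size()) == w * h);
        const int s = w + 1;
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
                sat[(y + 1) * s + (x + 1)] = src[y * w + x]
                    + sat[y * s + (x + 1)] + sat[(y + 1) * s + x] - sat[y * s + x];
    }

    // Sum over the half-open rectangle [x0, x1) x [y0, y1): four lookups.
    double rectSum(int x0, int y0, int x1, int y1) const
    {
        assert(0 <= x0 && x0 <= x1 && x1 <= width);
        assert(0 <= y0 && y0 <= y1 && y1 <= height);
        const int s = width + 1;
        return sat[y1 * s + x1] - sat[y0 * s + x1] - sat[y1 * s + x0] + sat[y0 * s + x0];
    }

    // "O" kernel: outer rectangle minus the inner rectangle, border thickness t.
    double ringSum(int x0, int y0, int x1, int y1, int t) const
    {
        assert(t >= 0 && x0 + 2 * t <= x1 && y0 + 2 * t <= y1);
        return rectSum(x0, y0, x1, y1) - rectSum(x0 + t, y0 + t, x1 - t, y1 - t);
    }
};
```

Dividing `ringSum` by the number of pixels in the ring gives the average. The table is accumulated in double here because sums over large images can exceed float precision.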
For the X kernel you can use a different approach. A skewed box filter is separable:
You can convolve with two long, thin skewed box filters, then add the two resulting images together. The center of the X will be counted twice, so you will need to convolve with another skewed box filter and subtract that.
Apart from that, you can optimize your box blur in many ways.
Remove the two ifs from the inner loop by splitting that loop into three loops - two short loops that do checks, and one long loop that doesn't. Or you could pad your array with extra elements from all directions - that way you can simplify your code.
Calculate values like h * 2 + 1 outside the loops.
An expression like f_temp[ky*fwidth + x] does two adds and one multiplication. You can initialize a pointer to &f_temp[ky*fwidth] outside the loop, and just increment that pointer in the loop.
Don't do the division by w * 2 + 1 in the horizontal step. Instead, divide once by (w * 2 + 1) * (h * 2 + 1) in the vertical step (this is the square of h * 2 + 1 when w equals h).