Data structure to represent multidimensional ranges with fast "open space" lookup? - c++

So I'm looking to represent non-overlapping ranges in an N-dimensional space.
I think CGAL has this functionality and facilitates fast querying of points, as the example below shows.
What I'm not sure of is how to extend this kind of query to find open windows.
In this case I make 2 rectangles, and it would be nice if there was a way to find an opening of a certain size.
#include <CGAL/Cartesian.h>
#include <CGAL/Segment_tree_k.h>
#include <CGAL/Range_segment_tree_traits.h>
#include <list>
typedef CGAL::Cartesian<double> K;
typedef CGAL::Segment_tree_map_traits_2<K, char> Traits;
typedef CGAL::Segment_tree_2<Traits > Segment_tree_2_type;
int main()
{
typedef Traits::Interval Interval;
typedef Traits::Pure_interval Pure_interval;
typedef Traits::Key Key;
std::list<Interval> InputList, OutputList1, OutputList2;
InputList.push_back(Interval(Pure_interval(Key(1,2), Key(1,2)),'a'));
InputList.push_back(Interval(Pure_interval(Key(2,3), Key(2,3)),'b'));
Segment_tree_2_type Segment_tree_2(InputList.begin(),InputList.end());
// ??? probably has multiple solutions?
// (find_opening below is the hypothetical query I am asking for, not an existing CGAL call)
Interval find_me=Interval(Pure_interval(Key(0,3), Key(0,1)),' ');
Interval opening = Segment_tree_2.find_opening(find_me);
return 0;
}

I don't think the Segment Tree from the CGAL library can help you to solve this problem, because this tree was designed to perform only two types of queries (window_query and enclosing_query). In both cases the search process returns a subset of the original set of D-dimensional intervals, which was used to build the tree - however you are interested in the open space "between" these intervals, which is not represented explicitly by this data structure.
Problems similar to yours have been studied in computational geometry for a long time. The simplest case is finding the largest (by area or by perimeter) empty rectangle among a set of points in the plane. Generalizations of this problem have been studied as well, for more general obstacles (segments, rectangles, polygons) and for higher dimensions. Please see this Wikipage for more information (including references).
However, finding the largest rectangle might be overkill for you: any rectangle at least as large as the given size will suffice, and stopping at the first one saves time compared to the largest-rectangle search. If your set of rectangles is static, but the size of the empty rectangle you wish to find varies, then it makes sense to preprocess this set into some data structure (as you mentioned above). There are publications that present algorithms to find all the maximal empty rectangles and save them in a list. A maximal empty rectangle is defined as a rectangle that can’t be extended in any direction without intersecting an obstacle. Sorry to say, I couldn’t find any such publications that can be accessed for free.
I’m suggesting a simple recursive algorithm that finds an empty rectangle whose width and height are greater than or equal to the requested size. The idea is to start from the bounding box of the rectangle set and process the rectangles from this set one by one. Each such rectangle is subtracted from the current empty rectangle; the result of this subtraction is represented as a set of maximal rectangles, which may overlap. For example, subtracting the rectangle [0,2)x[0,2) from the rectangle [1,3)x[1,3) gives a set of two rectangles, [2,3)x[1,3) and [1,3)x[2,3). The algorithm returns an empty rectangle and a Boolean flag indicating success or failure.
using RVec = std::vector<Rectangle>;
using Result = std::pair<Rectangle, bool>;
Result find(const RVec& V, double W, double H, const Rectangle& R, unsigned J)
{
if (R.sizeIsLess(W, H))
{
// ------ the empty rectangle R is too small
return {R, false};
}
else if (J < V.size())
{
// ------ process the obstacle rectangle with number J
for (const auto& r: subtract(R, V[J]))
{
const auto res = find(V, W, H, r, J + 1);
if (res.second) return {res.first, true};
}
return {R, false};
}
else
{
// ------ the empty rectangle R is big enough, and all the obstacles are processed
return {R, true};
}
}
auto find(const RVec& V, double W, double H)
{
return find(V, W, H, bbox(V), 0);
}
I can’t prove that this algorithm works correctly for any possible set of rectangular obstacles and for any requested width and height, however it worked well in all my tests. The algorithm is recursive, so the limited stack size might be a problem for really large rectangle sets.
The algorithm can be randomized by shuffling the rectangle set and/or the result of each rectangle subtraction; you may then get different solutions for the same rectangle set, width and height. The algorithm can be extended to higher dimensions as well, in which case the function subtract will need to be modified. If you are interested I’ll add this function (for the 2D case) to this answer.
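The answer leaves Rectangle, subtract and bbox undefined. The following is only one possible sketch of them, not the answerer's own code. It assumes half-open rectangles stored as four doubles; the field names x0, y0, x1, y1 are made up for illustration. subtract returns the (possibly overlapping) maximal strips of R that avoid the obstacle, which reproduces the [1,3)x[1,3) minus [0,2)x[0,2) example above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Half-open axis-aligned rectangle [x0, x1) x [y0, y1).
// The layout and field names are assumptions made for this sketch.
struct Rectangle {
    double x0, y0, x1, y1;

    // True if the rectangle is narrower than W or shorter than H.
    bool sizeIsLess(double W, double H) const {
        return (x1 - x0) < W || (y1 - y0) < H;
    }
};

using RVec = std::vector<Rectangle>;

// Subtract obstacle B from R. The result is the set of maximal
// sub-rectangles of R that avoid B; they may overlap each other.
RVec subtract(const Rectangle& R, const Rectangle& B) {
    // No overlap: R survives unchanged.
    if (B.x1 <= R.x0 || B.x0 >= R.x1 || B.y1 <= R.y0 || B.y0 >= R.y1)
        return {R};
    RVec out;
    if (B.x0 > R.x0) out.push_back({R.x0, R.y0, B.x0, R.y1}); // strip left of B
    if (B.x1 < R.x1) out.push_back({B.x1, R.y0, R.x1, R.y1}); // strip right of B
    if (B.y0 > R.y0) out.push_back({R.x0, R.y0, R.x1, B.y0}); // strip below B
    if (B.y1 < R.y1) out.push_back({R.x0, B.y1, R.x1, R.y1}); // strip above B
    return out;
}

// Bounding box of a non-empty set of rectangles.
Rectangle bbox(const RVec& V) {
    assert(!V.empty());
    Rectangle b = V.front();
    for (const auto& r : V) {
        b.x0 = std::min(b.x0, r.x0);
        b.y0 = std::min(b.y0, r.y0);
        b.x1 = std::max(b.x1, r.x1);
        b.y1 = std::max(b.y1, r.y1);
    }
    return b;
}
```

If the obstacle covers R completely, subtract returns an empty vector, so the loop in find has nothing to recurse into and reports failure for that branch.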

Related

Custom data structures with various access methods

So what I've got is a Grid class and a Tile class. At the moment Grid contains a two-dimensional vector of Tiles (vector<vector<Tile>>). These Tiles hold info about their x, y and z (it's a top-down map) and, for example, erosion rate etc.
My problem is that I need to efficiently access these tiles by their x/y coordinates, find the tile with the median z value (or any other position from 0 to 1, the median being 0.5) among all z coordinates (to set the sea level), and also loop through all of them from the highest z to the lowest (for creating the erosion map).
What would you suggest as the best data structure to hold these, so that I can efficiently do everything listed above, and maybe something else as well if I find out later that I need it? Right now I just create a temporary sorted structure or map to do the job, copying all the tiles into it and working with it, which is really slow.
The options I've considered are a map, which doesn't have direct access and is also always sorted, which would make picking tiles by their x/y hard.
Then a single vector, which would allow direct access, but if I were to sort the tiles the direct access would become pointless, because the position of a Tile in the vector would no longer be its x + y * width.
Here is a small sample code:
class Grid {
public:
class Tile {
public:
unsigned x;
unsigned y;
float z; // used for drawing height map
static float seaLevel; // static value for all the tiles
unsigned erosionLevel; // used for drawing erosion map
};
void setSeaLevel(float pos) {
// set seaLevel to z of tile on pos from 0 to 1 in tile grid
}
void generateErosionMap() {
// loop through all tiles from highest z to lowest z and set their erosion
}
void draw() {
// loop through all tiles by their x/y and draw them
}
vector<vector<Tile>> tileGrid;
};
The C++ library provides a basic set of containers. Each container is optimized for access in a specific way.
When you have a requirement to be able to optimally access the same set of data in different ways, the way to do this is to combine several containers together, all referencing the same underlying data, with each container being used to locate a single chunk of data in one particular way.
Let's take two of your requirements, as an example:
Locate a Grid object based on its X and Y coordinates, and
Iterate over all Grids in monotonically increasing or decreasing order, by their z coordinates.
We can implement the first requirement by using a simple two-dimensional vector:
typedef std::vector<std::vector<std::shared_ptr<Grid>>> lookup_by_xy_t;
lookup_by_xy_t lookup_by_xy;
This is rather obvious, on its face value. But note that the vector does not store the actual Grids, but a std::shared_ptr to these objects. If you are not familiar with std::shared_ptrs, read up on them, and understand what they are.
This is fairly basic: you construct a new Grid:
auto g = std::make_shared<Grid>( /* arguments to Grid's constructor */);
// Any additional initialization...
//
// g->foo(); g->bar=4;
//
// etc...
and simply insert it into the lookup vector:
lookup_by_xy[g->x][g->y]=g;
Now, we handle your second requirement: being able to iterate over all these objects by their z coordinates:
typedef std::multimap<double, std::shared_ptr<Grid>> lookup_by_z_t;
lookup_by_z_t lookup_by_z;
This is assuming that your z coordinate is a double. The multimap will, by default, iterate over its contents in strict weak ordering according to the key, from lowest to the highest key. You can either iterate over the map backwards, or use the appropriate comparison class with the multimap, to order its keys from highest to lowest values.
Now, simply insert the same std::shared_ptr into this lookup container:
lookup_by_z.insert(std::make_pair(g->z, g));
Now, you can find each Grid object by either its x/y coordinate, or iterate over all objects by their z coordinates. Both of the two-dimensional vector, and the multimap, contain shared_ptrs to the same Grid objects. Either one can be used to access them.
Simply create other containers, as needed, to access the same underlying objects, in different ways.
Now, of course, all of this additional framework does impose some additional overhead, in terms of dynamic memory allocations, and the overhead for each container itself. There is no free lunch. A custom allocator might become necessary if the amount of raw data becomes an issue.
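To show the fragments above working together, here is a minimal compilable sketch. It is an illustration under assumptions rather than the definitive layout: the element type Tile and the wrapper TileIndex with its add function are hypothetical names, and the grid is assumed to have a fixed size known up front.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical element type; only x, y and z matter for this sketch.
struct Tile {
    unsigned x, y;
    double z;
};

struct TileIndex {
    // Access path 1: direct lookup by [x][y].
    std::vector<std::vector<std::shared_ptr<Tile>>> by_xy;
    // Access path 2: ordered traversal by z (lowest first).
    std::multimap<double, std::shared_ptr<Tile>> by_z;

    TileIndex(std::size_t width, std::size_t height)
        : by_xy(width, std::vector<std::shared_ptr<Tile>>(height)) {}

    // Register one tile in both containers; both point at the same object.
    void add(unsigned x, unsigned y, double z) {
        assert(x < by_xy.size() && y < by_xy[x].size());
        auto t = std::make_shared<Tile>(Tile{x, y, z});
        by_xy[x][y] = t;
        by_z.insert({z, t});
    }
};
```

One thing to watch: the multimap key is a copy of z taken at insertion time, so if a tile's z changes later, its entry has to be erased and re-inserted to keep the ordering valid.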
So after asking this question on my university and getting bit deeper explanation, I've come to this solution.
If you need a data structure that supports various access methods (like in my case direct access by x/y, linear access through sorted z etc.), the best solution is to make your own class for handling it. Also, shared_ptr carries reference-counting overhead that unique_ptr does not, so it shouldn't be used unless shared ownership is really necessary. So in my case the implementation would look something like this:
#ifndef TILE_GRID_H
#define TILE_GRID_H
#include "Tile.h"
#include <algorithm> // std::sort
#include <cmath> // fmax, fmin
#include <memory>
#include <vector>
using Matrix = std::vector<std::vector<std::unique_ptr<Tile>>>;
using Sorted = std::vector<Tile*>;
class TileGrid {
public:
TileGrid(unsigned w, unsigned h) : width(w), height(h) {
// Resize _dA to desired size
_directAccess.resize(height);
for (unsigned j = 0; j < height; ++j)
for (unsigned i = 0; i < width; ++i)
_directAccess[j].push_back(std::make_unique<Tile>(i, j));
// Link _sZ to _dA
for (auto& i : _directAccess)
for (auto& j : i)
_sortedZ.push_back(j.get());
}
// Sorts the data by its z value (highest first)
void sortZ() {
std::sort(_sortedZ.begin(), _sortedZ.end(), [](Tile* a, Tile* b) { return b->z < a->z; });
}
// Operator to read directly from this container
Tile& operator()(unsigned x, unsigned y) {
return *_directAccess[y][x];
}
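// Operator returning the tile at relative position level (0 to 1) among the sorted tiles (in my case used for setting sea level)
Tile& operator()(float level) {
level = fmax(fmin(level, 1), 0);
// Scale by (size - 1) so that level == 1 maps to the last element instead of one past the end
return *_sortedZ[static_cast<std::size_t>((_sortedZ.size() - 1) * level)];
}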
// Iterators
auto begin() { return _sortedZ.begin(); }
auto end() { return _sortedZ.end(); }
auto rbegin() { return _sortedZ.rbegin(); }
auto rend() { return _sortedZ.rend(); }
const unsigned width; // x dimension
const unsigned height; // y dimension
private:
Matrix _directAccess;
Sorted _sortedZ;
};
#endif // TILE_GRID_H
You could also use a template, but in my case I only needed this for the Tile class. So as you can see, the main _directAccess matrix holds all the unique_ptrs while _sortedZ has only raw pointers to the data stored in _directAccess. This is much faster and also safe because these pointers are tied to one class, and all of them are deleted at the same time. Also I've added overloaded () operators for accessing the data and reused the iterators from the _sortedZ vector. And again, width and height are const only because of the intended usage of this data structure (not resizable, immovable tiles etc.).
If you have any questions or suggestions on what to improve, feel free to comment.

Is there a find () of a map to use a comparator with parameters?

To explain what I want, suppose there is a map
std::map<Point, SomeClass> hm_map;
Is there a way to use a comparator with a parameter in the map's find ()? The parameter is the radius around the Point I am interested in, so find () would have to return the set of matching pairs. I think I have chosen the wrong container for this.
Edit:
comparator::distance = someNumber;
setOfProperPairs = hm_map.find (key, comparator);
where
struct comparator
{
static double distance;
bool operator()(Point ptg, Point p) const
{ return ptg.Hit (p, distance); }
};
Do you know which container to use for this?
std::map supports one-dimensional sorted data.
If you want geometric sorting in two or more dimensions, there is no std support for that.
You will want a quad tree (or oct tree or higher dimensional analogues), or an r tree, or a kd-tree, or similar.
They are a bit tricky to code.
Now, if you know the radius you care about before you build your structure, you can hack a simpler implementation. Create a square n-dimensional grid where the spacing between the grid lines is about 1/2 to 2/3 of said radius.
Store the data in a multimap from grid cell to exact location and data.
Now when doing a lookup, figure out which grid cell the center is in, work out which grid cells could have hits in them, and search through those grid cells, doing a final check on the exact location to see if it is a hit.
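The grid idea above could be sketched roughly as follows for the 2D case. This is only an illustration under assumptions: the names GridIndex, insert, query and cellOf are invented here, only bare points are stored (no payload), and the candidate cells are simply all cells overlapping the bounding square of the query circle.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct Point { double x, y; };

// Fixed-radius neighbour search over a uniform grid of buckets.
class GridIndex {
public:
    explicit GridIndex(double cellSize) : cell_(cellSize) { assert(cellSize > 0.0); }

    // File the point under the grid cell that contains it.
    void insert(const Point& p) { buckets_.insert({cellOf(p), p}); }

    // All stored points within `radius` of `c` (boundary included).
    std::vector<Point> query(const Point& c, double radius) const {
        std::vector<Point> hits;
        // Cells overlapping the bounding square of the query circle.
        const Cell lo = cellOf({c.x - radius, c.y - radius});
        const Cell hi = cellOf({c.x + radius, c.y + radius});
        for (long i = lo.first; i <= hi.first; ++i)
            for (long j = lo.second; j <= hi.second; ++j) {
                auto range = buckets_.equal_range(Cell(i, j));
                for (auto it = range.first; it != range.second; ++it) {
                    const double dx = it->second.x - c.x;
                    const double dy = it->second.y - c.y;
                    if (dx * dx + dy * dy <= radius * radius)  // exact final check
                        hits.push_back(it->second);
                }
            }
        return hits;
    }

private:
    using Cell = std::pair<long, long>;

    Cell cellOf(const Point& p) const {
        return Cell(static_cast<long>(std::floor(p.x / cell_)),
                    static_cast<long>(std::floor(p.y / cell_)));
    }

    double cell_;
    std::multimap<Cell, Point> buckets_;
};
```

With a cell size of about half the radius, a query touches roughly a 5x5 block of cells. A hash-based multimap would avoid the logarithmic lookup per cell, at the cost of having to supply a hash for the cell key.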

Finding all possible pairs of subsets using recursion

I am given
struct point
{
int x;
int y;
};
and the table of points:
point tab[MAX];
The program should return the minimal distance between the centers of gravity of any possible pair of subsets from tab. A subset can be any size (of course >=1 and < MAX).
I am obliged to write this program using recursion.
So my function will be of type int because I have to return an int.
I set a global variable min (because during the recursion I have to compare some values with this min)
int min = 0;
My function should for sure, take number of elements I add, sum of Y coordinates and sum of X coordinates.
int return_min_distance(int sY, int sX, int number, bool iftaken[])
I will be glad for any help further.
I thought about another table of bools which I pass as a parameter to determine if I took value or not from table. Still my problem is how to implement this, I do not know how to even start.
I think you need a function that can iterate through all subsets of the table, starting with either nothing or an existing iterator. The code then gets easy:
int min_distance = MAXINT;
SubsetIterator si1(0, tab);
while (si1.hasNext())
{
SubsetIterator si2(&si1, tab);
while (si2.hasNext())
{
int d = subsetDistance(tab, si1.subset(), si2.subset());
if (d < min_distance)
{
min_distance = d;
}
}
}
The SubsetIterators can be simple binary counters MAX bits wide, where a 1 bit indicates membership in the subset. Yes, the nested loops are quadratic in the number of subsets (on the order of 4^MAX pairs), but I think it has to be.
The trick is incorporating recursion. Sorry, I just don't see how it helps here. If I can think of a way to use it, I'll edit my answer.
Update: I thought about this some more, and while I still can't see a use for recursion, I found a way to make the subset processing easier. Rather than run through the entire table for every distance computation, the SubsetIterators could store precomputed sums of the x and y values for easy distance computation. Then, on every iteration, you subtract the values that are leaving the subset and add the values that are joining. A simple bitwise XOR of the old and new membership bitmaps reveals these. To be even more efficient, you could use Gray coding instead of plain binary counting to step through the membership bitmaps. This guarantees that at each iteration exactly one value enters or leaves the subset. Minimal work.
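A rough sketch of the Gray-code idea, under a few assumptions: the table is passed as a std::vector instead of a fixed array, the names SubsetSum and allSubsetSums are invented for this example, and all sums are collected into a vector, which is only sensible for small tables since there are 2^n - 1 non-empty subsets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct point { int x; int y; };

// Running coordinate sums of one subset; sx/count and sy/count give
// its centre of gravity.
struct SubsetSum { long sx; long sy; int count; };

// Visit every non-empty subset of tab in Gray-code order, updating the
// running sums with a single addition or subtraction per step.
std::vector<SubsetSum> allSubsetSums(const std::vector<point>& tab) {
    const std::size_t n = tab.size();
    assert(n < 8 * sizeof(unsigned long));   // the bitmap must fit in one word
    std::vector<SubsetSum> out;
    SubsetSum cur{0, 0, 0};
    unsigned long prev = 0;                  // Gray code of 0: the empty subset
    for (unsigned long k = 1; k < (1UL << n); ++k) {
        const unsigned long gray = k ^ (k >> 1);
        const unsigned long changed = gray ^ prev;   // exactly one bit is set
        std::size_t i = 0;
        while (!((changed >> i) & 1UL)) ++i;         // index of the flipped bit
        if (gray & changed) {                        // element i joins the subset
            cur.sx += tab[i].x; cur.sy += tab[i].y; ++cur.count;
        } else {                                     // element i leaves the subset
            cur.sx -= tab[i].x; cur.sy -= tab[i].y; --cur.count;
        }
        out.push_back(cur);
        prev = gray;
    }
    return out;
}
```

The distance between two centres of gravity can then be computed from two SubsetSum entries without touching the table again.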

How to join a negative polygon with inner negative polygons?

I'm currently working on a private project which depends on some operations on polygons using the Boost C++ Libraries.
I'm currently trying to work with the inner polygon/negative polygon concept.
What I need to do now is to join three polygons, two of which have a positive (counterclockwise) outer polygon and a negative (clockwise) inner polygon.
The third one is a negative polygon: a new polygon object with a negative area, its points given in clockwise direction. And this is the point where I'm not fully sure how to handle the situation.
Here's a picture of those three polygons. The middle one which connects the left upper polygon with the right lower one is the negative one.
Now what I would like to do is to join all three polygons through the union function.
What I expect union to do is to cut away the positive parts of the polygons 1 and 3 (the positive polygons) and return the remaining two polygons of 1 and 3.
What I actually get are my polygons 1 and 3 untouched, as if there were no negative polygon 2.
Any help will be appreciated.
Edit:
What I need to get is a vector, not a bitmap or a picture or whatever.
These pictures are just used to better visualize what I have and what I need.
Those three polygons are actually nothing more than vectors of x and y points.
Here's a picture of what I would expect to be the correct result of union of all three polygons:
Edit2: Corrected the result
How do you want unions to work? Usually a union of polygons 1 and 2 would result in polygon 3, but I suspect for your use case you want it to result in polygon 4. If that's the case, you can simply do a union of all the clockwise paths, then do a union of the counterclockwise paths, then take the difference of the former from the latter. If you want the union to result in polygon 3, then I don't think there's a consistent way to do what you want.
A good plan is to consider your polygons as bitmaps (of booleans):
Every polygon is blitted to a bitmap of type (R,R)->bool. Once it's in bitmap format, negative polygons are just and-not operations on the booleans:
class Bitmap {
public:
virtual ~Bitmap() {}
virtual bool Map(float x, float y) const=0;
};
class AndNot : public Bitmap {
public:
AndNot(Bitmap &bm1, Bitmap &bm2) : bm1(bm1), bm2(bm2) { }
bool Map(float x, float y) const {
return bm1.Map(x,y) && !bm2.Map(x,y);
}
private:
Bitmap &bm1, &bm2;
};

Fastest way to find if a 3D coordinate is already used

Using C++ (and Qt), I need to process a big amount of 3D coordinates.
Specifically, when I receive a 3D coordinate (made of 3 doubles), I need to check in a list if this coordinate has already been processed.
If not, then I process it and add it to the list (or container).
The amount of coordinates can become very big, so I need to store the processed coordinates in a container which will ensure that checking if a 3D coordinate is already contained in the container is fast.
I was thinking of using a map of a map of a map, storing the x coordinate, then the y coordinate then the z coordinate, but this makes it quite tedious to use, so I'm actually hoping there is a much better way to do it that I cannot think of.
Probably the simplest way to speed up such processing is to store the already-processed points in an octree. Checking for duplication will become close to logarithmic.
Also, make sure you tolerate round-off errors by checking the distance between the points, not the equality of the coordinates.
Divide your space into discrete bins. Could be infinitely deep squares, or could be cubes. Store your processed coordinates in a simple linked list, sorted if you like in each bin. When you get a new coordinate, jump to the enclosing bin, and walk the list looking for the new point.
Be wary of floating point comparisons. You need to either turn values into integers (say multiply by 1000 and truncate), or decide how close 2 values are to be considered equal.
You can easily use a set as follows:
#include <set>
#include <cassert>
const double epsilon(1e-8);
class Coordinate {
public:
Coordinate(double x, double y, double z) :
x_(x), y_(y), z_(z) {}
private:
double x_;
double y_;
double z_;
friend bool operator<(const Coordinate& cl, const Coordinate& cr);
};
bool operator<(const Coordinate& cl, const Coordinate& cr) {
if (cl.x_ < cr.x_ - epsilon) return true;
if (cl.x_ > cr.x_ + epsilon) return false;
if (cl.y_ < cr.y_ - epsilon) return true;
if (cl.y_ > cr.y_ + epsilon) return false;
if (cl.z_ < cr.z_ - epsilon) return true;
return false;
}
typedef std::set<Coordinate> Coordinates;
// Not thread safe!
// Return true if real processing is done
bool Process(const Coordinate& coordinate) {
static Coordinates usedCoordinates;
// Already processed?
if (usedCoordinates.find(coordinate) != usedCoordinates.end()) {
return false;
}
usedCoordinates.insert(coordinate);
// Here goes your processing code
return true;
}
// Test it
int main() {
assert(Process(Coordinate(1, 2, 3)));
assert(Process(Coordinate(1, 3, 3)));
assert(!Process(Coordinate(1, 3, 3)));
assert(!Process(Coordinate(1+epsilon/2, 2, 3)));
}
Assuming you already have a Coordinate class, add a hash function and maintain a hash set of the coordinates (std::unordered_set in standard C++; hash_set and hash_map were pre-standard extensions).
Would look something like:
struct coord_eq
{
bool operator()(const Coordinate &s1, const Coordinate &s2) const
{
return s1 == s2;
// or: return s1.x() == s2.x() && s1.y() == s2.y() && s1.z() == s2.z();
}
};
struct coord_hash
{
size_t operator()(const Coordinate &s) const
{
// Hash each component with std::hash<double> (from <functional>) rather than
// reading the bits through a union, then combine the three asymmetrically.
std::hash<double> h;
return (3 * h(s.x())) ^ (5 * h(s.y())) ^ (7 * h(s.z()));
}
};
// needs #include <unordered_set>
std::unordered_set<Coordinate, coord_hash, coord_eq> existing_coords;
Well, it depends on what's most important... if a triple map is too tedious to use, then is implementing other data structures not worth the effort?
If you want to get around the ugliness of the triple map solution, just wrap it up in another container class with an access function with three parameters, and hide all the messing around with maps internally in that.
If you're more worried about the runtime performance of this thing, storing the coordinates in an Octree might be a good idea.
Also worth mentioning is that when doing these sorts of things with floats or doubles you should be very careful about precision -- is (0, 0, 0.01) the same coordinate as (0, 0, 0.01000001)? If it is, you'll need to look at the comparison functions you use, regardless of the data structure. That also depends on the source of your coordinates I guess.
Are you expecting/requiring exact matches? These might be hard to enforce with doubles. For example, if you have processed (1.0, 1.0, 1.0) and you then receive (0.9999999999999, 1.0, 1.0) would you consider it the same? If so, you will need to either apply some kind of approximation or else define error bounds.
However, to answer the question itself: the first method that comes to mind is to create a single index (either a string or a bitstring, depending how readable you want things to be). For example, create the string "(1.0,1.0,1.0)" and use that as the key to your map. This will make it easy to look up the map, keeps the code readable (and also lets you easily dump the contents of the map for debugging purposes) and gives you reasonable performance. If you need much faster performance you could use a hashing algorithm to combine the three coordinates numerically without going via a string.
Use any unique transformation of your 3D coordinates and store only the list of the results.
Example:
md5('X, Y, Z') is unique for all practical purposes, and you can store only the resulting string.
The hash is not a performant idea but you get the concept. Find any mathematical unique transformation and you have it.
/Vey
Use an std::set. Define a type for the 3d coordinate (or use a boost::tuple) that has operator< defined. When adding elements, you can add it to the set, and if it was added, do your processing. If it was not added (because it already exists in there), do not do your processing.
However, if you are using doubles, be aware that your algorithm can potentially lead to unpredictable behavior. I.e., is (1.0, 1.0, 1.0) the same as (1.0, 1.0, 1.000000001)?
How about using a boost::tuple for the coordinates, and storing the tuple as the index for the map?
(You may also need to do the divide-by-epsilon idea from this answer.)
Pick a constant to scale the coordinates by so that 1 unit describes an acceptably small box and yet the integer part of the largest component by magnitude will fit into a 32-bit integer; convert the X, Y and Z components of the result to integers and hash them together. Use that as a hash function for a map or hashtable (NOT as an array index, you need to deal with collisions).
You may also want to consider using a fudge factor when comparing the coordinates, since you may get floating point values which are only slightly different, and it is usually preferable to weld those together to avoid cracks when rendering.
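A sketch of the scale-to-integers approach described above, with several assumptions: the names Cell, CellHash, SeenCoordinates and markSeen are invented for this example, std::unordered_set stands in for "a map or hashtable", and 64-bit integers are used instead of 32-bit ones to leave more headroom.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Integer cell that a scaled coordinate falls into.
struct Cell {
    std::int64_t x, y, z;
    bool operator==(const Cell& o) const { return x == o.x && y == o.y && z == o.z; }
};

struct CellHash {
    std::size_t operator()(const Cell& c) const {
        // Combine the three component hashes asymmetrically so that
        // permuted coordinates such as (1,2,3) and (3,2,1) differ.
        std::size_t h = std::hash<std::int64_t>()(c.x);
        h = (h * 1000003u) ^ std::hash<std::int64_t>()(c.y);
        h = (h * 1000003u) ^ std::hash<std::int64_t>()(c.z);
        return h;
    }
};

class SeenCoordinates {
public:
    // cellSize is the edge length of the "acceptably small box".
    explicit SeenCoordinates(double cellSize) : cell_(cellSize) { assert(cellSize > 0.0); }

    // Returns true the first time a cell is seen, false for repeats.
    bool markSeen(double x, double y, double z) {
        const Cell c{quantize(x), quantize(y), quantize(z)};
        return seen_.insert(c).second;
    }

private:
    // Scale by the cell size and round to the nearest integer.
    std::int64_t quantize(double v) const {
        return static_cast<std::int64_t>(std::llround(v / cell_));
    }

    double cell_;
    std::unordered_set<Cell, CellHash> seen_;
};
```

Note that two points lying very close together but on opposite sides of a cell boundary still end up in different cells; if that matters, the neighbouring cells would have to be checked as well.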
If you write a helper class with a simple public interface, that greatly reduces the practical tedium of implementation details like use of a map<map<map<>>>. The beauty of encapsulation!
That said, you might be able to rig a hashmap to do the trick nicely. Just hash the three doubles together to get the key for the point as a whole. If you're concerned about too many collisions between points with symmetric coordinates (e.g., (1, 2, 3) and (3, 2, 1) and so on), just make the hash key asymmetric with respect to the x, y, and z coordinates, using bit shift or some such.
You could use a hash_set of any hashable type - for example, turn each tuple into a string "(x, y, z)". hash_set does fast lookups and handles collisions well.
Whatever your storage method, I would suggest you decide on an epsilon (minimum floating point distance that differentiates two coordinates), then divide all coordinates by the epsilon, round and store them as integers.
Something in this direction maybe:
struct Coor {
Coor(double x, double y, double z)
: X(x), Y(y), Z(z) {}
double X, Y, Z;
};
struct coords_thesame
{
bool operator()(const Coor& c1, const Coor& c2) const {
return c1.X == c2.X && c1.Y == c2.Y && c1.Z == c2.Z;
}
};
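// std::unordered_map is the standard replacement for hash_map; a std::hash<Coor> specialization still has to be provided
std::unordered_map<Coor, bool, std::hash<Coor>, coords_thesame> m_SeenCoordinates;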
Untested, use at your own peril :)
You can easily define a comparator for a one-level std::map, so that lookup becomes way less cumbersome. There is no reason of being afraid of that. The comparator defines an ordering of the _Key template argument of the map. It can then also be used for the multimap and set collections.
An example:
#include <map>
#include <cassert>
struct Point {
double x, y, z;
};
struct PointResult {
};
PointResult point_function( const Point& p ) { return PointResult(); }
// helper: binary function for comparison of two points
struct point_compare {
bool operator()( const Point& p1, const Point& p2 ) const {
return p1.x < p2.x
|| ( p1.x == p2.x && ( p1.y < p2.y
|| ( p1.y == p2.y && p1.z < p2.z )
)
);
}
};
typedef std::map<Point, PointResult, point_compare> pointmap;
int main()
{
pointmap pm;
Point p1 = { 0.0, 0.0, 0.0 };
Point p2 = { 0.1, 1.0, 1.0 };
pm[ p1 ] = point_function( p1 );
pm[ p2 ] = point_function( p2 );
assert( pm.find( p2 ) != pm.end() );
return 0;
}
There are more than a few ways to do it, but you have to ask yourself first what are your assumptions and conditions.
So, assuming that your space is limited in size and you know the maximum accuracy, you can write a function that converts a given (x,y,z) to a unique number or string. This can be done only if you know that your accuracy is limited (for example, no two entities can occupy the same cubic centimeter).
Encoding the coordinate allows you to use a single map or hash table (the latter with O(1) average lookup).
If this is not the case, you can always use 3 nested maps as you suggested, or go towards space-partitioning structures (such as the octree mentioned above), which, although they take O(log N) for an average search, also give you additional information you might want (neighbors, population, etc.), but are of course harder to implement.
You can either use a std::set of 3D coordinates, or a sorted std::vector. Both will give you logarithmic time lookup. In either case, you'll need to implement the less than comparison operator for your 3D coordinate class.
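For the sorted std::vector variant, a minimal sketch could look like the following. The names Coord3 and insertIfNew are made up for this example, and the comparison is exact (no tolerance), so the floating-point caveats raised in the other answers still apply.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

struct Coord3 { double x, y, z; };

// Lexicographic ordering on (x, y, z); exact comparison, no tolerance.
inline bool operator<(const Coord3& a, const Coord3& b) {
    return std::tie(a.x, a.y, a.z) < std::tie(b.x, b.y, b.z);
}

// Inserts c into the sorted vector unless an equal element already exists.
// Returns true if c was inserted.
bool insertIfNew(std::vector<Coord3>& sorted, const Coord3& c) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), c);
    if (it != sorted.end() && !(c < *it)) return false;  // already present
    sorted.insert(it, c);                                // keeps the vector sorted
    return true;
}
```

The lookup is logarithmic, but inserting into the middle of a vector is linear in its size, so std::set is likely the better fit when insertions are frequent.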
Why bother? What "processing" are you doing? Unless it's very complex, it's probably faster to just do the calculation again, rather than waste time looking things up in a huge map or hashtable.
This is one of the more counter-intuitive things about modern CPUs. Computation is fast, memory is slow.
I realize this isn't really an answer to your question, it's questioning your question.
Good question... it's one that has many solutions, because this type of problem comes
up many times in Graphical and Scientific applications.
Depending on the solution you require it may be rather complex under the hood, in this
case less code doesn't necessarily mean faster.
"but this makes it quite tedious to use" --- generally, you can get around this by
typedefs or wrapper classes (wrappers in this case would be highly recommended).
If you don't need to use the 3D co-ordinates in any kind of spatially significant way (
things like "give me all the points within X distance of point P") then I suggest you
just find a way to hash each point, and use a single hash map... O(n) creation, O(1)
access (checking to see if it's been processed), you can't do much better than that.
If you do need more spatial information you'll need a container that explicitly takes
it into account.
The type of container you choose will be dependent on your data set. If you have good
knowledge of the range of values that you receive this will help.
If you are receiving well distributed data over a known range... go with an octree.
If you have a distribution that tends to cluster, then go with k-d trees. You'll need
to rebuild a k-d tree after inputting new co-ordinates (not necessarily every time,
just when it becomes overly imbalanced). Put simply, k-d trees are like octrees, but with non-uniform division.