Custom data structures with various access methods

So what I've got is a Grid class and a Tile class. At the moment Grid contains a two-dimensional vector of Tiles (vector<vector<Tile>>). These Tiles hold info about their x, y and z (it's a top-down map) and, for example, erosion rate etc.
My problem is that I need to efficiently access these tiles by their x/y coordinates, find the tile at a given position in the ordering of all z values (a value from 0 to 1, the median being 0.5) to set the sea level, and also loop through all of them from the highest z to the lowest (for creating the erosion map).
What would you suggest as the best data structure to hold these, so I can efficiently do everything I listed above, and maybe something else as well if I find out later I need it? Right now I just create a temporary sorted structure or map, copy all the tiles into it and work with that, which is really slow.
The options I've considered are a map, which doesn't have direct access and is also always sorted, which would make picking tiles by their x/y hard.
Then a single vector, which would allow direct access, but if I were to sort the tiles the direct access would be pointless, because it relies on the position of a Tile in the vector being equal to its x + y * width.
Here is a small sample code:
class Grid {
public:
    class Tile {
    public:
        unsigned x;
        unsigned y;
        float z; // used for drawing height map
        static float seaLevel; // static value for all the tiles
        unsigned erosionLevel; // used for drawing erosion map
    };
    void setSeaLevel(float pos) {
        // set seaLevel to z of tile on pos from 0 to 1 in tile grid
    }
    void generateErosionMap() {
        // loop through all tiles from highest z to lowest z and set their erosion
    }
    void draw() {
        // loop through all tiles by their x/y and draw them
    }
    vector<vector<Tile>> tileGrid;
};

The C++ library provides a basic set of containers. Each container is optimized for access in a specific way.
When you have a requirement to be able to optimally access the same set of data in different ways, the way to do this is to combine several containers together, all referencing the same underlying data, with each container being used to locate a single chunk of data in one particular way.
Let's take two of your requirements, as an example:
Locate a Grid object based on its X and Y coordinates, and
Iterate over all Grids in monotonically increasing or decreasing order, by their z coordinates.
We can implement the first requirement by using a simple two-dimensional vector:
typedef std::vector<std::vector<std::shared_ptr<Grid>>> lookup_by_xy_t;
lookup_by_xy_t lookup_by_xy;
This is rather obvious, on its face value. But note that the vector does not store the actual Grids, but a std::shared_ptr to these objects. If you are not familiar with std::shared_ptrs, read up on them, and understand what they are.
This is fairly basic: you construct a new Grid:
auto g = std::make_shared<Grid>( /* arguments to Grid's constructor */);
// Any additional initialization...
//
// g->foo(); g->bar=4;
//
// etc...
and simply insert it into the lookup vector:
lookup_by_xy[g->x][g->y]=g;
Now, we handle your second requirement: being able to iterate over all these objects by their z coordinates:
typedef std::multimap<double, std::shared_ptr<Grid>> lookup_by_z_t;
lookup_by_z_t lookup_by_z;
This is assuming that your z coordinate is a double. The multimap will, by default, iterate over its contents in strict weak ordering according to the key, from lowest to the highest key. You can either iterate over the map backwards, or use the appropriate comparison class with the multimap, to order its keys from highest to lowest values.
Now, simply insert the same std::shared_ptr into this lookup container:
lookup_by_z.insert(std::make_pair(g->z, g));
Now, you can find each Grid object by either its x/y coordinate, or iterate over all objects by their z coordinates. Both of the two-dimensional vector, and the multimap, contain shared_ptrs to the same Grid objects. Either one can be used to access them.
Simply create other containers, as needed, to access the same underlying objects, in different ways.
Now, of course, all of this additional framework does impose some additional overhead, in terms of dynamic memory allocations, and the overhead for each container itself. There is no free lunch. A custom allocator might become necessary if the amount of raw data becomes an issue.
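To make the idea concrete, here is a minimal sketch of two containers indexing the same objects. The Tile struct and the TileIndex class are hypothetical stand-ins (not from the question's code), and the sketch assumes z does not change after a tile has been added, because the multimap key would otherwise go stale.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Tile; only the fields used here.
struct Tile {
    unsigned x;
    unsigned y;
    double z;
};

// Two lookup containers that reference the same Tile objects.
struct TileIndex {
    std::vector<std::vector<std::shared_ptr<Tile>>> by_xy;  // by_xy[x][y]
    std::multimap<double, std::shared_ptr<Tile>> by_z;      // iterates in ascending z

    TileIndex(std::size_t width, std::size_t height)
        : by_xy(width, std::vector<std::shared_ptr<Tile>>(height)) {}

    // Creates one tile and registers the same pointer in both containers.
    void add(unsigned x, unsigned y, double z) {
        auto t = std::make_shared<Tile>(Tile{x, y, z});
        by_xy[x][y] = t;
        by_z.insert(std::make_pair(z, t));
    }
};
```

Iterating by_z from rbegin() to rend() visits the tiles from the highest z to the lowest, while by_xy[x][y] still gives direct access to the same object.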

So after asking this question at my university and getting a bit deeper explanation, I've come to this solution.
If you need a data structure with several access methods (in my case direct access by x/y, linear access through sorted z, etc.), the best solution is to make your own class for handling it. Also, shared_ptr carries reference-counting overhead that unique_ptr doesn't, so it shouldn't be used unless shared ownership is actually needed. So in my case the implementation would look something like this:
#ifndef TILE_GRID_H
#define TILE_GRID_H
#include "Tile.h"
#include <algorithm> // std::sort, std::min, std::max
#include <cstddef>   // std::size_t
#include <memory>
#include <vector>
using Matrix = std::vector<std::vector<std::unique_ptr<Tile>>>;
using Sorted = std::vector<Tile*>;
class TileGrid {
public:
    TileGrid(unsigned w, unsigned h) : width(w), height(h) {
        // Resize _directAccess to the desired size
        _directAccess.resize(height);
        for (unsigned j = 0; j < height; ++j)
            for (unsigned i = 0; i < width; ++i)
                _directAccess[j].push_back(std::make_unique<Tile>(i, j));
        // Link _sortedZ to _directAccess
        for (auto& row : _directAccess)
            for (auto& tile : row)
                _sortedZ.push_back(tile.get());
    }
    // Sorts the tiles by their z value, highest first
    void sortZ() {
        std::sort(_sortedZ.begin(), _sortedZ.end(), [](const Tile* a, const Tile* b) { return b->z < a->z; });
    }
    // Operator to read directly from this container
    Tile& operator()(unsigned x, unsigned y) {
        return *_directAccess[y][x];
    }
    // Operator returning the tile at relative position level (0 to 1) among the sorted tiles
    // (in my case used for setting sea level); assumes the grid is not empty
    Tile& operator()(float level) {
        level = std::max(0.0f, std::min(level, 1.0f));
        // Clamp the index so that level == 1 maps to the last tile, not one past the end
        const std::size_t last = _sortedZ.size() - 1;
        return *_sortedZ[std::min(last, static_cast<std::size_t>(_sortedZ.size() * level))];
    }
    // Iterators
    auto begin() { return _sortedZ.begin(); }
    auto end() { return _sortedZ.end(); }
    auto rbegin() { return _sortedZ.rbegin(); }
    auto rend() { return _sortedZ.rend(); }
    const unsigned width; // x dimension
    const unsigned height; // y dimension
private:
    Matrix _directAccess;
    Sorted _sortedZ;
};
#endif // TILE_GRID_H
You could also make this a template, but in my case I only needed it for the Tile class. As you can see, the main _directAccess matrix holds all the unique_ptrs, while _sortedZ holds only raw pointers to the data owned by _directAccess. This avoids reference counting and is still safe, because the pointers live in the same class as their owners and are all destroyed at the same time. I've also added overloaded () operators for accessing the data and reused the iterators of the _sortedZ vector. Finally, width and height are const only because of the intended usage of this data structure (not resizable, immovable tiles, etc.).
If you have any questions or suggestions on what to improve, feel free to comment.

Related

Data structure to store a grid (that will have negative indices)

I'm studying robotics at the university and I have to implement on my own SLAM algorithm. To do it I will use ROS, Gazebo and C++.
I have a doubt about what data structure I have to use to store the map (and what I'm going to store it, but this is another story).
I have thought of representing the map as a 2D grid, with the robot's start location at (0,0). But I don't know exactly where the robot is in the world that I have to map. It could be at the top left corner, in the middle of the world, or at any other unknown location inside the world.
Each cell of the grid will be 1x1 meters. I will use a laser to find out where the obstacles are. Using the robot's current location, I will set to 1 all the cells that represent an obstacle. For example, if the laser detects an obstacle 2 meters in front of the robot, I will set the cell at (0,2) to 1.
Using a vector or a 2D matrix is a problem here, because vector and matrix indices start at 0, and there could be more room behind the robot to map. And that room could have an obstacle at (-1,-3).
On this data structure, I will need to store the cells that have an obstacle and the cells that I know they are free.
Which kind of data structure will I have to use?
UPDATE:
The process to store the map will be the following:
Robot starts at (0,0) cell. It will detect the obstacles and store them in the map.
Robot moves to (1,0) cell. And again, detect and store the obstacles in the map.
Continue moving to free cells and storing the obstacles it finds.
The robot will detect the obstacles that are in front of it and to the sides, but never behind it.
My problem comes when the robot detects an obstacle on a negative cell (like (0,-1)). I don't know how to store that obstacle if I have previously stored only obstacles on "positive" cells. So maybe the "offset" is not a solution here (or maybe I'm wrong).

This is where you can write a class to help you:
class RoboArray
{
    static constexpr int width_ = ...
    static constexpr int height_ = ...
    Cell grid_[width_ * 2][height_ * 2];
    ...
public:
    ...
    Cell get(int x, int y) // can make this use [x][y] notation with a helper class
    {
        // shift by the half-extents so that (-width_, -height_) maps to grid_[0][0]
        return grid_[x + width_][y + height_];
    }
    ...
};

The options you have:
Have an offset. Simple and dirty. Your grid is 100x100 but stores (-50,-50) to (49,49).
Have multiple offset grids. When you go out of the grid, allocate a new one beside it, with a different offset. Keep a list or map of grids.
Have a sparse structure. A set or map of coordinates.
Have a hierarchical structure. Your whole, say 50x50, grid is one cell in a grid at a higher level. Implement it with a linked list or something similar, so that when you move you build a tree of nested grids. Very efficient for memory and compute time, but much more complex to implement.

You can use a std::set to represent a grid layout by using a position class you create. It contains an x and a y variable and can therefore be used intuitively to find points inside the grid. You can also use a std::map if you want to store information about a certain location inside the grid.
Please don't forget to fulfill the C++ named requirements for set/map keys, such as Compare (here via operator<), if you don't want to provide a comparison object externally.
example:
position.h
/* this class is used to store the position of things
* it is made up by a horizontal and a vertical position.
*/
class position{
private:
int32_t horizontalPosition;
int32_t verticalPosition;
public:
position(const int hPos = 0,const int vPos = 0) : horizontalPosition{hPos}, verticalPosition{vPos}{}
position(const position& inputPos) : position(inputPos.getHorPos(),inputPos.getVerPos()){}
//insertion operator, it enables the use of cout on this object: cout << position(0,0) << endl;
friend std::ostream& operator<<(std::ostream& os, const position& dt){
os << dt.getHorPos() << "," << dt.getVerPos();
return os;
}
//greater than operator, defined in terms of operator<
bool operator>(const position& rh) const noexcept{
return(rh < *this);
}
//lesser than operator: orders by horizontal position first, then by vertical position.
//Comparing the fields directly keeps negative coordinates correct; packing two
//sign-extended values into one 64-bit integer would make distinct positions collide.
bool operator<(const position& rh) const noexcept{
if(getHorPos() != rh.getHorPos()){
return(getHorPos() < rh.getHorPos());
}
return(getVerPos() < rh.getVerPos());
}
//equal comparison operator
bool operator==(const position& inputPos)const noexcept {
return((getHorPos() == inputPos.getHorPos()) && (getVerPos() == inputPos.getVerPos()));
}
//not equal comparison operator
bool operator!=(const position& inputPos)const noexcept {
return((getHorPos() != inputPos.getHorPos()) || (getVerPos() != inputPos.getVerPos()));
}
void movNorth(void) noexcept{
++verticalPosition;
}
void movEast(void) noexcept{
++horizontalPosition;
}
void movSouth(void) noexcept{
--verticalPosition;
}
void movWest(void) noexcept{
--horizontalPosition;
}
position getNorthPosition(void)const noexcept{
position aPosition(*this);
aPosition.movNorth();
return(aPosition);
}
position getEastPosition(void)const noexcept{
position aPosition(*this);
aPosition.movEast();
return(aPosition);
}
position getSouthPosition(void)const noexcept{
position aPosition(*this);
aPosition.movSouth();
return(aPosition);
}
position getWestPosition(void)const noexcept{
position aPosition(*this);
aPosition.movWest();
return(aPosition);
}
int32_t getVerPos(void) const noexcept {
return(verticalPosition);
}
int32_t getHorPos(void) const noexcept {
return(horizontalPosition);
}
};
std::set<position> gridNoData;
std::map<position, bool> gridWithData;
gridNoData.insert(position(1,1));
gridWithData.insert({position(1,1),true});
gridNoData.insert(position(0,0));
gridWithData.insert({position(0,0),true});
auto searchNoData = gridNoData.find(position(0,0));
if (searchNoData != gridNoData.end()) {
std::cout << "0,0 exists" << '\n';
} else {
std::cout << "0,0 doesn't exist\n";
}
auto searchWithData = gridWithData.find(position(0,0));
if (searchWithData != gridWithData.end()) {
std::cout << "0,0 exists with value " << searchWithData->second << '\n';
} else {
std::cout << "0,0 doesn't exist\n";
}
The above class was used by me in a similar setting and we used a std::map defined as:
std::map<position,directionalState> exploredMap;
To store if we had found any walls at a certain position.
By using this std::map based method you avoid having to do the math to work out what offset you need inside a 2D array (or some structure like that). It also allows you to move freely, as there is no chance that you'll travel outside predefined bounds set at construction. This structure is also more space-efficient than a 2D array, as it only stores the areas where the robot has been. This is also a C++ way of doing things: relying on the STL instead of creating your own 2D map using C constructs.

With the offset solution (translating the values by a fixed formula, which we called a "mapping function" in math class, like adding +50 to all coordinates, i.e. [-30,-29] becomes [+20,+21] and [0,0] becomes [+50,+50]) you still need to have an idea of your maximum size.
In case you want to be dynamic like std::vector<>, going from 0 to some N (as much as free memory allows), you can create a more complex mapping function, for example map(x) = x*2 when (0 <= x) and x*(-2)-1 when (x < 0). This puts the non-negative coordinates at even indices and the negative ones at odd indices, so you can use a standard std::vector and let it grow as needed when new maximum coordinates are reached.
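A minimal sketch of that mapping function for one axis, assuming int coordinates and int cells. The names to_index and SignedLine are made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Maps 0, -1, 1, -2, 2, ... to the indices 0, 1, 2, 3, 4, ...
// i.e. x*2 for x >= 0 and x*(-2)-1 for x < 0.
inline std::size_t to_index(int x) {
    if (x >= 0) {
        return static_cast<std::size_t>(x) * 2;
    }
    // -(x + 1) is computed in long long so the most negative int cannot overflow.
    return static_cast<std::size_t>(-(static_cast<long long>(x) + 1)) * 2 + 1;
}

// A one-dimensional line of cells that accepts negative coordinates
// and grows on demand.
class SignedLine {
public:
    // Returns the cell for coordinate x, growing the storage if needed.
    // The returned reference is only valid until the next call that grows the line.
    int& at(int x) {
        const std::size_t i = to_index(x);
        if (i >= cells_.size()) {
            cells_.resize(i + 1, 0);
        }
        return cells_[i];
    }
    std::size_t size() const { return cells_.size(); }

private:
    std::vector<int> cells_;
};
```

Note that the storage grows with the largest absolute coordinate touched so far, so one far-away cell allocates everything in between as well.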
With a 2D grid this is a bit more complicated, as a vector of vectors is sometimes not the best idea from a performance point of view. But as long as your code can prefer shortness and simplicity over performance, you can use the same mapping for both coordinates and use a vector of vectors. Call reserve(..) on all of them with some reasonable default to avoid resizing in common use cases: if you know a 100m x 100m area will be the usual maximum, you can reserve everything to capacity 201 initially, and it can still grow (until heap memory is exhausted) in less common situations.
You can also add another mapping function converting 2D coordinates to 1D and use a single vector only. And if you really want to complicate things, you can for example map the 2D coordinates to a 0,1,2,... sequence growing outward from the area around the center, to save memory for small areas. You will probably easily spend 2-4 weeks debugging it if you are fairly new to C++ development and don't use unit testing and a TDD approach. (I.e. just go with a simple vector of vectors for a start; this paragraph is just FYI on how things can get complicated if you try to be too smart :) ).

class RobotArray
{
public:
    RobotArray();
private:
    int (*left)[50];
    int (*right)[50];
};
RobotArray::RobotArray()
{
    int (*a)[50] = new int[50][50]();
    int (*b)[50] = new int[50][50]();
    // left for the negative space and right for the positive space, with
    // 0,0 of the two arrays removed
    left = a + 1;
    right = b + 1;
}

I think I see what you are after here: you don't know how big the space is, or even what the coordinates may be.
This is very general, but I would create a class that holds all of the data using vectors (another option -- vector of pairs, or vector of Eigen (the library) vectors). As you discover new regions, you'll add the coordinates and occupancy information to the Map (via AddObservation(), or something similar).
Later, you can determine the minimum and maximum x and y coordinates, and create the appropriate grid, if you like.
class RoboMap{
public:
    std::vector<int> map_x_coord;
    std::vector<int> map_y_coord;
    std::vector<bool> occupancy;
    RoboMap() = default;
    // Append one observation: the cell coordinates and its occupancy flag
    void AddObservation(int x, int y, bool in_out){
        map_x_coord.push_back(x);
        map_y_coord.push_back(y);
        occupancy.push_back(in_out);
    }
};

Which data structure to use to store edges of graph so that I can access edge weight in constant time in c++?

I am using an adjacency list to store a graph. Since I am using an adjacency list, I cannot access the edge weight of a graph in constant time. So I wonder which EXTRA data structure to use just to store the edges, indexed by the two nodes u and v?
Currently I am trying map<pair<int, int>, int>, but it has O(log N) complexity, and unordered_map does not provide a default hash for pairs. I know that an edge weight is independent of the order of {u,v}, but I have not been able to take advantage of this.

Use an adjacency matrix; a 2D array where each element in the array[x][y] is the weight of the edge between nodes x and y.

A quite simple solution would be to create an array that stores the outgoing edges for each node plus the weight. You'd simply jump to one node, search for the other node in its outgoing edges and take the weight. The complexity is the maximal degree, which I usually assume to be capped.
Only thing to make sure is that all the redundant information is kept consistent.
Like
#include <cstddef>
#include <stdexcept>
#include <vector>
class AdjacentWeightedEdges {
public:
    struct OutgoingWeightedEdge {
        std::size_t target_node;
        int weight;
    };
    std::vector<OutgoingWeightedEdge> edges;
    // Linear scan over this node's outgoing edges: O(degree of the node)
    int edge_weight(const std::size_t index) const {
        for (const auto& edge : edges) {
            if (edge.target_node == index) {
                return edge.weight;
            }
        }
        throw std::out_of_range("no edge to the requested node");
    }
};
class Graph {
public:
    // your stuff as it is right inserted here
    std::vector<AdjacentWeightedEdges> adjacencies;
    int edge_weight(const std::size_t index_1, const std::size_t index_2) const {
        return adjacencies[index_1].edge_weight(index_2);
    }
};
If even a 1d approach like this creates memory problems, consider only storing the edges for index_1 < index_2.
Another similar method:
Store an array of pointers to the nodes, keep the edge weights in the adjacency list and do the same lookup directly on it. This fits if you are not working with indices anyway, or if memory is a problem.
Another answer here talks about the adjacency matrix - this one could even work if a certain structure for a sparse matrix class is used that stores first non-zero in row/column pointers and then pointers to every following non-zero entry. Although this essentially collapses to my approach. Might be worthwhile if you need a sparse matrix class anyway.
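Since the question mentions that unordered_map has no hash for pairs, here is a sketch of a different technique from the ones above: pack the two node ids into a single 64-bit key, smaller id first, so that {u,v} and {v,u} hit the same entry. It assumes node ids fit in 32 bits; the class name EdgeWeights is made up for this example. Lookups are O(1) on average, not in the worst case.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>

// Extra lookup structure for undirected edge weights.
class EdgeWeights {
public:
    void set(std::uint32_t u, std::uint32_t v, int weight) {
        weights_[key(u, v)] = weight;
    }

    // Returns true and writes the weight if the edge exists, false otherwise.
    bool get(std::uint32_t u, std::uint32_t v, int& weight) const {
        const auto it = weights_.find(key(u, v));
        if (it == weights_.end()) {
            return false;
        }
        weight = it->second;
        return true;
    }

private:
    // Order-independent key: smaller id in the high 32 bits, larger id in the low 32 bits.
    static std::uint64_t key(std::uint32_t u, std::uint32_t v) {
        const std::uint32_t lo = std::min(u, v);
        const std::uint32_t hi = std::max(u, v);
        return (static_cast<std::uint64_t>(lo) << 32) | hi;
    }

    std::unordered_map<std::uint64_t, int> weights_;
};
```

As with the other approaches, this duplicates information that is already in the adjacency list, so both have to be updated together.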

Looping through a map with structure as a key.

I have a struct
struct key
{
int x;
int y;
int z;
};
say x, y, z can take values from 1 to 10.
I also have a map
std::map<key,double> myMap;
which I populate with different key values.
Is there a way to loop through all the key values where say z=5. That is (in terms of pseudo code)
loop over myMap
double v += myMap.find({x=anything,y=anything,z=5})->second;
It would be very kind if someone can provide some comments as to whether this is achievable (I do not want to use boost containers).

If you sort key struct using z first, you may do it this way:
bool operator<( const key &k1, const key &k2 )
{
return std::make_tuple( k1.z, k1.x, k1.y ) < std::make_tuple( k2.z, k2.x, k2.y ); // z must be first
}
auto min = std::numeric_limits<int>::min();
auto end = myMap.lower_bound( key{ min, min, 6 } );
for( auto it = myMap.lower_bound( key{ min, min, 5 } ); it != end; ++it ) {
...
}
but if you need to iterate by x or y as well, you will have to either create a separate multimap per coordinate holding pointers to the structure, or use Boost.MultiIndex.

The standard associative containers use a one-dimensional ordering, i.e. they only know whether one key is less, equal or greater than another. So this is not efficiently possible. You can achieve such filtering in linear time using std::find_if().
Maintaining O(log n) lookup time, it is however possible to create multiple maps, with different ways of indexing; e.g. one with X, one with Y and one with Z as the key. If the values are big objects, you could use pointers to not needlessly duplicate them. Of course, this can all be hidden behind an encapsulating class, that provides axis-based ordering to the outside.
Another approach, which is reasonable for small spaces (like x,y,z from 1 to 10), is to not use std::map but a 3D array/matrix instead. This can be implemented using a 1D std::vector, by mapping indices from 3 dimensions to 1.
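A rough sketch of that last idea, assuming 0-based indices (so the question's 1 to 10 range becomes 0 to 9) and a z-major layout, which keeps each z plane contiguous. The class name Grid3D and its members are invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Dense 3D grid stored in one std::vector<double>.
class Grid3D {
public:
    Grid3D(std::size_t nx, std::size_t ny, std::size_t nz)
        : nx_(nx), ny_(ny), data_(nx * ny * nz, 0.0) {}

    // Maps (x, y, z) to a single index; z varies slowest.
    double& at(std::size_t x, std::size_t y, std::size_t z) {
        return data_[(z * ny_ + y) * nx_ + x];
    }

    // Sums every value whose third coordinate equals z.
    // With this layout the plane is one contiguous run of nx * ny elements.
    double sum_plane_z(std::size_t z) const {
        const std::size_t plane = nx_ * ny_;
        double sum = 0.0;
        for (std::size_t i = z * plane; i < (z + 1) * plane; ++i) {
            sum += data_[i];
        }
        return sum;
    }

private:
    std::size_t nx_;
    std::size_t ny_;
    std::vector<double> data_;
};
```

Fixing x or y instead works the same way with a strided loop, so no second container is needed for the other axes.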

Why not create an array of arrays like
#define ARRAY_SIZE 2500 // 2500 is the size of the array
double map[2][ARRAY_SIZE];
// map[0][x] - the key of the Xth value in the array (example - 10th/1st/2nd ...)
// map[1][x] - the value of the Xth value in the array
Just saying, it's better when you don't complicate things!

what is the best way between : instantiate an object or use pointers

We want to create our own list of triangles from an existing list taken from an STL model (a 3D geometric model composed of triangles). Several triangles can share the same point, and we want to pick the best solution:
S1) Go through the list, use the coordinates of each triangle (element) of this list to create a triangle object, and put it in our list (vector). But here some points must be created multiple times because, as I said, many triangles can share the same point.
S2) There is already another list that contains all the points. We go through the existing list of triangles and, for each point of a triangle, search for it in the list of points (so we have to use sort and search algorithms), in order to use pointers to these points: we create objects that contain 3 pointers (*p1, *p2, *p3) and put them in our list.

Store the points in a std::unordered_set, then store the triangles as a list of structures, each containing 3 pointers to points in the set. (Pointers are used rather than std::unordered_set::const_iterator because a later insert may rehash the set, which invalidates iterators but not pointers or references to its elements.)
Insertion of the points into the set will be approximately constant time on average, and insert returns a pair whose first member is an iterator to the point, whether it was newly inserted or already present.
See the documentation of std::unordered_set::insert for more details on how insert works.
Here's the basic structure of the code (untested)
struct Point
{
float x;
float y;
float z;
};
typedef std::unordered_set<Point, hashFunc, equalsFunc> pset;
// Note, see http://stackoverflow.com/questions/16792751/hashmap-for-2d3d-coordinates-i-e-vector-of-doubles for more details on how to store complex structures in unordered_sets
struct RefTriangle
{
const Point* p[3];
};
pset allPoints;
std::list<RefTriangle> refTriangles;
for (const Triangle& t : triangleList)
{
RefTriangle rt;
rt.p[0] = &*allPoints.insert(t.p1).first;
rt.p[1] = &*allPoints.insert(t.p2).first;
rt.p[2] = &*allPoints.insert(t.p3).first;
refTriangles.push_back(rt);
}
In the end you'll have a set of unique points and a list of reference triangle objects that hold pointers to those points in the unique set.
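For completeness, here is one possible way to fill in the hash and equality functors, as a sketch only. PointHash, PointEq and deduplicate are hypothetical names, and the sketch assumes that shared points have exactly equal float coordinates (no tolerance). It stores pointers to the set's elements, which stay valid when the set rehashes.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

struct Point { float x, y, z; };
struct Triangle { Point p1, p2, p3; };

// Hypothetical hash: combines the hashes of the three coordinates.
struct PointHash {
    std::size_t operator()(const Point& p) const {
        std::size_t h = std::hash<float>{}(p.x);
        h = h * 31 + std::hash<float>{}(p.y);
        h = h * 31 + std::hash<float>{}(p.z);
        return h;
    }
};

// Exact comparison: two points merge only if all coordinates compare equal.
struct PointEq {
    bool operator()(const Point& a, const Point& b) const {
        return a.x == b.x && a.y == b.y && a.z == b.z;
    }
};

using PointSet = std::unordered_set<Point, PointHash, PointEq>;

// A triangle that refers to shared points instead of owning copies.
struct RefTriangle { const Point* p[3]; };

// Inserts every corner into points (duplicates are ignored by the set)
// and returns triangles that point at the unique copies.
inline std::vector<RefTriangle> deduplicate(const std::vector<Triangle>& input,
                                            PointSet& points) {
    std::vector<RefTriangle> out;
    out.reserve(input.size());
    for (const Triangle& t : input) {
        RefTriangle rt;
        rt.p[0] = &*points.insert(t.p1).first;
        rt.p[1] = &*points.insert(t.p2).first;
        rt.p[2] = &*points.insert(t.p3).first;
        out.push_back(rt);
    }
    return out;
}
```

The returned triangles are only valid while the PointSet is alive and no point is erased from it.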

Negative coordinates with Tile maps

How can you map negative map coordinates in a 2D tile based game?
ex. (-180,100)
or (10, -8)
I need to access them in O(1). I don't want to create a huge 2D vector and treat (500,500) as (0,0) just to support negative coordinates.
Kind of a dumb question, but I really have no clue.
Thank you.

I'm assuming you've got an infinite, procedurally-generated world, because if the world isn't infinite, it's a simple matter of setting the lower bound of the X and Y coordinate at zero, then wrapping a function around your tile map array that automatically returns zero if someone asks for a tile that's out of bounds.
In the case of an infinite world, you're not going to be able to access your tiles in guaranteed O(1) time -- realistically you get O(log n) with an ordered map, or O(1) on average with a hash table. Here's how I'd go about it:
Divide your tile map into square chunks of whatever size you find reasonable (we'll call this size S)
Create a hash map of vectors, each vector being one chunk of your map.
When the player moves close to a new chunk, generate it in a background thread and toss it into the hash. The hash should be based on the x, y coordinates of the chunk (x/S, y/S).
If you want to access a map tile at a particular position, grab the vector for the appropriate chunk and then access tile (x%S, y%S) in that vector. (With negative coordinates, use floor division and a non-negative remainder here, since C++ integer / and % truncate toward zero.)
If you wrap this inside a class, you can add features to it, such as loading and saving chunks to disk so you don't have to hold the entire map in memory. You can also write a getTile function that can take arbitrary coordinates, and takes care of picking the correct chunk and position inside that chunk for you.
Your vector is always accessible in O(1) time, and the chunk lookup should take O(log n) time with a tree-based map such as std::map, or O(1) on average with a true hash table such as std::unordered_map. Furthermore, as long as you choose a sane chunk size, the number of chunks won't get out of hand, so the performance impact will be essentially nil.
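The chunk scheme above could be sketched roughly as follows. Everything here is an assumption made for illustration: int tiles, a chunk size of 16, chunks created on first access instead of in a background thread, and the names ChunkedMap, floor_div and floor_mod. The helpers round toward negative infinity so that tile (-1, -1) lands in chunk (-1, -1) at local position (15, 15).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

constexpr int kChunkSize = 16;  // "S" in the text; the value is arbitrary

// Floor division and non-negative remainder, so negative coordinates work.
inline int floor_div(int a, int b) {
    return a / b - (((a % b) != 0 && ((a < 0) != (b < 0))) ? 1 : 0);
}
inline int floor_mod(int a, int b) {
    return a - floor_div(a, b) * b;
}

class ChunkedMap {
public:
    using Chunk = std::array<int, kChunkSize * kChunkSize>;

    // Returns the tile at world position (x, y).
    // A missing chunk is created on first access, filled with zeros.
    int& tile(int x, int y) {
        Chunk& chunk = chunks_[key(floor_div(x, kChunkSize), floor_div(y, kChunkSize))];
        return chunk[floor_mod(y, kChunkSize) * kChunkSize + floor_mod(x, kChunkSize)];
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    // Packs the two chunk coordinates into one 64-bit hash key.
    static std::uint64_t key(int cx, int cy) {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32) |
               static_cast<std::uint32_t>(cy);
    }

    std::unordered_map<std::uint64_t, Chunk> chunks_;
};
```

Only chunks that have been touched take up memory, so a call such as tile(-180, 100) allocates a single 16x16 chunk.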

If your data is dense (all or most of the points in a known 2D range are used), nothing will beat a 2D array (O(1)). Accept the coordinate shift.
If your data is sparse, how about a hash table on the coordinate pairs. Close to O(1).
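A hash table on coordinate pairs might look like this. It is a sketch with made-up names (Coord, CoordHash, SparseTiles, tile_or) and int tile values. The standard library has no built-in hash for a pair of ints, so a small functor is supplied.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

struct Coord {
    int x;
    int y;
    bool operator==(const Coord& other) const { return x == other.x && y == other.y; }
};

// Hashes a coordinate pair by packing both values into one 64-bit integer.
struct CoordHash {
    std::size_t operator()(const Coord& c) const {
        const std::uint64_t packed =
            (static_cast<std::uint64_t>(static_cast<std::uint32_t>(c.x)) << 32) |
            static_cast<std::uint32_t>(c.y);
        return std::hash<std::uint64_t>{}(packed);
    }
};

using SparseTiles = std::unordered_map<Coord, int, CoordHash>;

// Returns the tile stored at (x, y), or fallback if nothing is stored there.
inline int tile_or(const SparseTiles& tiles, int x, int y, int fallback) {
    const auto it = tiles.find(Coord{x, y});
    return it == tiles.end() ? fallback : it->second;
}
```

Negative coordinates need no shift here, and memory use is proportional to the number of stored tiles.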

std::map or hash_map doesn't seem to suit my needs. They are overly complex and not flexible enough.
I decided to go for std::vector and accept the high memory usage.
I'll leave how I did it here for future reference.
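#include <vector>
const unsigned Xlimit = 500;
const unsigned Ylimit = 500;
class Tile
{
public:
    Tile(){someGameData=nullptr;}
    void *someGameData;
};
class Game
{
public:
    Game()
    {
        // Allocate 2*limit cells per axis so coordinates in [-limit, limit) fit after the shift
        tiles.resize(2 * Xlimit, std::vector<Tile>(2 * Ylimit, Tile()));
    }
    inline Tile* GetTileAtCoord(int x, int y)
    {
        // Shift by the limits so that (-Xlimit, -Ylimit) maps to tiles[0][0]
        return &tiles[x + static_cast<int>(Xlimit)][y + static_cast<int>(Ylimit)];
    }
protected:
    std::vector<std::vector<Tile>> tiles;
};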
