Looping through a map with a struct as a key - C++

I have a struct
struct key
{
    int x;
    int y;
    int z;
};
say x, y, z can take values from 1 to 10.
I also have a map
std::map<key,double> myMap;
which I populate with different key values.
Is there a way to loop through all the key values where, say, z = 5? That is (in pseudocode):
loop over myMap
double v += myMap.find({x=anything, y=anything, z=5})->second;
It would be very kind if someone could comment on whether this is achievable (I do not want to use Boost containers).

If you sort the key struct by z first, you can do it this way:
bool operator<( const key &k1, const key &k2 )
{
    return std::make_tuple( k1.z, k1.x, k1.y ) < std::make_tuple( k2.z, k2.x, k2.y ); // z must be compared first
}
auto min = std::numeric_limits<int>::min();
auto end = myMap.lower_bound( key{ min, min, 6 } );
for( auto it = myMap.lower_bound( key{ min, min, 5 } ); it != end; ++it ) {
    // ...
}
but if you need to iterate over x or y as well, you will have to either create a separate multimap per coordinate holding pointers to the structure, or use boost::multi_index.
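Putting it together, a minimal sketch (assuming the key struct from the question and the operator< above, wrapped in a hypothetical sum_for_z helper):
#include <limits>
#include <map>
#include <tuple>

double sum_for_z(const std::map<key, double>& myMap, int z)
{
    const int min = std::numeric_limits<int>::min();
    auto first = myMap.lower_bound(key{ min, min, z });      // first element with this z
    auto last  = myMap.lower_bound(key{ min, min, z + 1 });  // first element past this z
    double v = 0.0;
    for (auto it = first; it != last; ++it)
        v += it->second;
    return v;
}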

The standard associative containers use a one-dimensional ordering, i.e. they only know whether one key is less, equal or greater than another. So this is not efficiently possible. You can achieve such filtering in linear time using std::find_if().
It is, however, possible to maintain O(log n) lookup time by creating multiple maps with different ways of indexing, e.g. one with X, one with Y and one with Z as the key. If the values are big objects, you could store pointers so as not to needlessly duplicate them. Of course, this can all be hidden behind an encapsulating class that provides axis-based ordering to the outside.
Another approach, which is reasonable for small spaces (like x,y,z from 1 to 10), is to not use std::map but a 3D array/matrix instead. This can be implemented using a 1D std::vector, by mapping indices from 3 dimensions to 1.
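For the small fixed range in the question, a minimal sketch of that flat-array idea (a hypothetical index() helper maps the 1-based coordinates to one dimension):
#include <cstddef>
#include <vector>

const int N = 10;                            // x, y, z each take values 1..10
std::vector<double> grid(N * N * N, 0.0);    // flat 10x10x10 "map"

// Map 1-based (x, y, z) to a single index into the flat vector.
std::size_t index(int x, int y, int z)
{
    return (x - 1) + N * ((y - 1) + N * (z - 1));
}

// Sum of all stored values with a fixed z:
double sum_for_z(int z)
{
    double v = 0.0;
    for (int y = 1; y <= N; ++y)
        for (int x = 1; x <= N; ++x)
            v += grid[index(x, y, z)];
    return v;
}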

Why not create an array of arrays, like:
#define ARRAY_SIZE 2500 // number of entries in the array
double map[2][ARRAY_SIZE];
// map[0][i] - the key of the i-th entry in the array (for example the 10th/1st/2nd ...)
// map[1][i] - the value of the i-th entry in the array
Just saying, it's better when you don't complicate!

Related

Custom data structures with various access methods

So what I've got is a Grid class and a Tile class. At the moment, Grid contains a two-dimensional vector of Tiles (vector<vector<Tile>>). These Tiles hold info about their x, y and z (it's a top-down map) and, for example, their erosion rate etc.
My problem is that I need to efficiently access these tiles by their x/y coordinates, find the tile at the median (or another 0-to-1 quantile, the median being 0.5) of all z coordinates (to set the sea level), and also loop through all of them from the highest z to the lowest (for creating the erosion map).
What would you suggest would be the best data structure to hold these in, so I can efficiently do everything I listed above, and maybe something else as well if I find out later I need it? Right now I just create a temporary sorted structure or map to do the thing, copying all the tiles into it and working with it, which is really slow.
The options I've considered are a map, which doesn't offer direct access and is also always sorted, which would make picking tiles by their x/y hard.
Then a single vector, which would allow direct access, but if I were to sort the tiles the direct access would become pointless, because a Tile's position in the vector would no longer equal its x + y * width.
Here is a small sample code:
class Grid {
public:
    class Tile {
    public:
        unsigned x;
        unsigned y;
        float z; // used for drawing the height map
        static float seaLevel; // static value shared by all the tiles
        unsigned erosionLevel; // used for drawing the erosion map
    };

    void setSeaLevel(float pos) {
        // set seaLevel to the z of the tile at position pos (0 to 1) in the sorted tile grid
    }
    void generateErosionMap() {
        // loop through all tiles from highest z to lowest z and set their erosion
    }
    void draw() {
        // loop through all tiles by their x/y and draw them
    }

    vector<vector<Tile>> tileGrid;
};
The C++ library provides a basic set of containers. Each container is optimized for access in a specific way.
When you have a requirement to be able to optimally access the same set of data in different ways, the way to do this is to combine several containers together, all referencing the same underlying data, with each container being used to locate a single chunk of data in one particular way.
Let's take two of your requirements, as an example:
Locate a Grid object based on its X and Y coordinates, and
Iterate over all Grids in monotonically increasing or decreasing order, by their z coordinates.
We can implement the first requirement by using a simple two-dimensional vector:
typedef std::vector<std::vector<std::shared_ptr<Grid>>> lookup_by_xy_t;
lookup_by_xy_t lookup_by_xy;
This is rather obvious at face value. But note that the vector does not store the actual Grids, but a std::shared_ptr to these objects. If you are not familiar with std::shared_ptrs, read up on them, and understand what they are.
This is fairly basic: you construct a new Grid:
auto g = std::make_shared<Grid>( /* arguments to Grid's constructor */);
// Any additional initialization...
//
// g->foo(); g->bar=4;
//
// etc...
and simply insert it into the lookup vector:
lookup_by_xy[g->x][g->y]=g;
Now, we handle your second requirement: being able to iterate over all these objects by their z coordinates:
typedef std::multimap<double, std::shared_ptr<Grid>> lookup_by_z_t;
lookup_by_z_t lookup_by_z;
This is assuming that your z coordinate is a double. The multimap will, by default, iterate over its contents in strict weak ordering according to the key, from lowest to the highest key. You can either iterate over the map backwards, or use the appropriate comparison class with the multimap, to order its keys from highest to lowest values.
Now, simply insert the same std::shared_ptr into this lookup container:
lookup_by_z.insert(std::make_pair(g->z, g));
Now, you can find each Grid object by either its x/y coordinate, or iterate over all objects by their z coordinates. Both of the two-dimensional vector, and the multimap, contain shared_ptrs to the same Grid objects. Either one can be used to access them.
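For illustration, iterating over all the objects by z then looks like this (a sketch; process() stands for whatever per-object work you need):
// From lowest z to highest:
for (auto& entry : lookup_by_z)
    process(*entry.second);

// Or from highest z to lowest:
for (auto it = lookup_by_z.rbegin(); it != lookup_by_z.rend(); ++it)
    process(*it->second);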
Simply create other containers, as needed, to access the same underlying objects, in different ways.
Now, of course, all of this additional framework does impose some additional overhead, in terms of dynamic memory allocations, and the overhead for each container itself. There is no free lunch. A custom allocator might become necessary if the amount of raw data becomes an issue.
So after asking this question at my university and getting a somewhat deeper explanation, I've come to this solution.
If you need a data structure with various access methods (like, in my case, direct access by x/y and linear access through sorted z), the best solution is to make your own class for handling it. Also, shared_ptr is noticeably slower than unique_ptr and shouldn't be used unless necessary. So in my case the implementation would look something like this:
#ifndef TILE_GRID_H
#define TILE_GRID_H

#include "Tile.h"
#include <algorithm>
#include <cmath>
#include <memory>
#include <vector>

using Matrix = std::vector<std::vector<std::unique_ptr<Tile>>>;
using Sorted = std::vector<Tile*>;

class TileGrid {
public:
    TileGrid(unsigned w, unsigned h) : width(w), height(h) {
        // Resize _directAccess to the desired size
        _directAccess.resize(height);
        for (unsigned j = 0; j < height; ++j)
            for (unsigned i = 0; i < width; ++i)
                _directAccess[j].push_back(std::make_unique<Tile>(i, j));
        // Link _sortedZ to _directAccess
        for (auto& row : _directAccess)
            for (auto& tile : row)
                _sortedZ.push_back(tile.get());
    }

    // Sorts the data by its z value, from highest to lowest
    void sortZ() {
        std::sort(_sortedZ.begin(), _sortedZ.end(), [](Tile* a, Tile* b) { return b->z < a->z; });
    }

    // Operator to read directly from this container
    Tile& operator()(unsigned x, unsigned y) {
        return *_directAccess[y][x];
    }

    // Operator returning the i-th position from the sorted tiles (in my case used for setting the sea level)
    Tile& operator()(float level) {
        level = std::fmax(std::fmin(level, 1.0f), 0.0f);
        auto index = std::min<std::size_t>(static_cast<std::size_t>(width * height * level), _sortedZ.size() - 1);
        return *_sortedZ[index];
    }

    // Iterators
    auto begin() { return _sortedZ.begin(); }
    auto end() { return _sortedZ.end(); }
    auto rbegin() { return _sortedZ.rbegin(); }
    auto rend() { return _sortedZ.rend(); }

    const unsigned width;  // x dimension
    const unsigned height; // y dimension

private:
    Matrix _directAccess;
    Sorted _sortedZ;
};

#endif // TILE_GRID_H
You could also use a template, but in my case I only needed this for the Tile class. So as you can see, the main _directAccess matrix holds all the unique_ptrs, while _sortedZ holds only raw pointers to the data stored in _directAccess. This is much faster, and also safe, because these pointers are tied to one class and all of them are deleted at the same time. I've also added overloaded () operators for accessing the data and reused the iterators from the _sortedZ vector. And again, width and height are const only because of the intended usage of this data structure (not resizable, immovable tiles, etc.).
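For illustration, a short usage sketch (made-up sizes; it assumes the Tile members shown in the question are public):
TileGrid grid(512, 512);          // builds 512x512 tiles in the constructor
grid(0, 0).z = 1.0f;              // direct access by x/y
grid.sortZ();                     // order the tiles from highest to lowest z
Tile::seaLevel = grid(0.5f).z;    // sea level = height of the median tile
for (Tile* t : grid)              // iterate from highest z to lowest
{
    // build the erosion map here, e.g. update t->erosionLevel
}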
If you have any questions or suggestions on what to improve, feel free to comment.

map range of float values to single value in c++

I have an array of key-frames that looks like
struct keyframe
{
    float time;
    matrix4x4 transformMatrix;
};
The array is sorted according to the time value. I also have
float value;
I access this array millions of times; the array itself remains unchanged. My goal is to find the index of the first keyframe whose time is bigger than value. Can this be done in constant time? Does C++ have something that maps non-overlapping ranges of values to a particular value?
The std::upper_bound function will do exactly this in O(log n) time for collections that support random access iterators (yours most surely does). You'll need to tell it how to compare the searched value with the structs, which you can do with a lambda like so:
auto keyframe_iterator =
    std::upper_bound(keyframes.begin(), keyframes.end(), value,
                     [](float t, const keyframe& k) { return t < k.time; });
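If you also need the numeric index rather than the iterator, a small sketch (assuming keyframes is a std::vector<keyframe> sorted by time):
#include <algorithm>
#include <cstddef>
#include <vector>

std::size_t first_keyframe_after(const std::vector<keyframe>& keyframes, float value)
{
    auto it = std::upper_bound(keyframes.begin(), keyframes.end(), value,
                               [](float t, const keyframe& k) { return t < k.time; });
    return static_cast<std::size_t>(it - keyframes.begin()); // equals keyframes.size() if no keyframe is later
}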

Mapping float value to function depending on specified intervals

The goal
Given a main interval, [0,1] for example, break that interval in any number of subintervals, for example [0,0.2) , [0.2,0.5) , [0.5,1].
Now map different functions to each subinterval generated:
[0,0.2) ~> a( float x )
[0.2,0.5) ~> b( float x )
[0.5,1] ~> c( float x )
Call that mapping function map.
The map mapping function is designed to take a floating-point value in the main interval and call the corresponding mapped function. That is, given an input value x = 0.3, map calls b(0.3):
map(0.3); //Should call b(0.3)
My question is: what is the proper/best way to implement this in C++?
Attempted solutions:
I have tried a solution which consists of representing intervals as pairs of float values, i.e. using interval = std::pair<float,float>;, and using that interval type as the key of an (unordered) map:
void map_function( float x )
{
    using interval = std::pair<float, float>;
    std::map<interval, std::function<void(float)>> map;
    map[{0.0f, 0.2f}] = [](float){ /* ... */ }; // a
    map[{0.2f, 0.5f}] = [](float){ /* ... */ }; // b
    map[{0.5f, 1.0f}] = [](float){ /* ... */ }; // c

    auto it = std::find_if( std::begin( map ),
                            std::end( map ),
                            [x]( const std::pair<const interval, std::function<void(float)>>& entry )
                            {
                                return x >= entry.first.first && x < entry.first.second;
                            });
    if( it != std::end( map ) )
        it->second( x );
    else
        throw "x out of bounds or subintervals ill-formed";
}
This solution seems to work, but it has some minor problems, I think:
It has O(n) complexity, given n subintervals. Is there any way to perform this kind of lookup in O(1)?
Is std::map the proper container for this job? The purpose of associative containers is to map from a key to a value, but here the key of the map is not the input itself, it is a processed form of the input (the interval to which the input value belongs).
I have tried C++11's std::unordered_map too, but it seems there is no standard hash function for float pairs. That surprises me, but falls into another question. Keep on topic :P
Alternative solutions? Requirements
I know about interval libraries, like Boost Interval and Boost Interval Container libraries, but I need a solution which relies on Standard Library facilities only.
You can use binary search to get O(log n) complexity: specifically, std::lower_bound from the <algorithm> library. If you have a vector of std::pair<double, function pointer> sorted by the starting points, you can binary-search it. If the ranges all have a specific constant length, or lengths that are multiples of some number, you can do it in O(1) time. For example, with bounds that are multiples of 0.1:
Ranges: [0;0.4) = a, [0.4;0.5) = b, [0.5;1) = c
table = {a,a,a,a,b,c,c,c,c,c}
Lookup for x: table[floor(x*10)]
Edit: If you want to keep a map, you can use the map's lower_bound().
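A minimal sketch of the O(1) table idea for the example ranges above, assuming all bounds are multiples of 0.1, x stays within [0,1], and using stub definitions of the a, b, c handlers from the question:
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

void a(float) { /* ... */ }
void b(float) { /* ... */ }
void c(float) { /* ... */ }

// table[i] handles the subinterval [i*0.1, (i+1)*0.1)
std::vector<std::function<void(float)>> table = { a, a, a, a, b, c, c, c, c, c };

void map_function(float x)
{
    std::size_t slot = static_cast<std::size_t>(std::floor(x * 10.0f));
    if (slot >= table.size())
        slot = table.size() - 1;   // fold x == 1.0 into the last (closed) subinterval
    table[slot](x);
}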
If your intervals are adjacent to each other, then use just the starting points as keys, and instead of find() use lower_bound(). You cannot make it faster than log2(N) in the general case. If you know what the maximum decimal precision is, I suggest you use int64_t as the key. The transformation is int64_t ikey = dkey * 10^X, where X is the maximum precision.
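A minimal sketch of that approach, keyed by each subinterval's starting point (assuming adjacent intervals covering [0,1]; the lookup uses upper_bound followed by a decrement, so that x lands in the interval whose start is the greatest key not exceeding it):
#include <functional>
#include <map>
#include <stdexcept>

std::map<float, std::function<void(float)>> intervals; // key = start of each subinterval
// e.g. intervals[0.0f] = a; intervals[0.2f] = b; intervals[0.5f] = c;

void map_function(float x)
{
    if (intervals.empty() || x < intervals.begin()->first)
        throw std::out_of_range("x below the first interval");
    auto it = intervals.upper_bound(x); // first start strictly greater than x
    --it;                               // the interval x falls into
    it->second(x);
}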

Slow performance of sparse matrix using std::vector

I'm trying to implement the functionality of MATLAB function sparse.
Insert a value in sparse matrix at a specific index such that:
If a value with same index is already present in the matrix, then the new and old values are added.
Else the new value is appended to the matrix.
The function addNode performs correctly, but the problem is that it is extremely slow. I call this function in a loop about 100,000 times, and the program takes more than 3 minutes to run, while MATLAB accomplishes this task in a matter of seconds. Is there any way to optimize the code, or to use STL algorithms instead of my own function, to achieve what I want?
Code:
struct SparseMatNode
{
    int x;
    int y;
    float value;
};

std::vector<SparseMatNode> SparseMatrix;

void addNode(int x, int y, float val)
{
    SparseMatNode n;
    n.x = x;
    n.y = y;
    n.value = val;

    bool alreadyPresent = false;
    int i = 0;
    for(i = 0; i < SparseMatrix.size(); i++)
    {
        if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y))
        {
            alreadyPresent = true;
            break;
        }
    }

    if(alreadyPresent)
    {
        SparseMatrix[i].value += val;
        if(SparseMatrix[i].value == 0.0f)
            SparseMatrix.erase(SparseMatrix.begin() + i);
    }
    else
        SparseMatrix.push_back(n);
}
Sparse matrices aren't typically stored as a vector of triplets as you are attempting.
MATLAB (as well as many other libraries) uses a Compressed Sparse Column (CSC) data structure, which is very efficient for static matrices. The MATLAB function sparse also does not build the matrix one entry at a time (as you are attempting) - it takes an array of triplet entries and packs the whole sequence into a CSC matrix. If you are attempting to build a static sparse matrix this is the way to go.
If you want a dynamic sparse matrix object, that supports efficient insertion and deletion of entries, you could look at different structures - possibly a std::map of triplets, or an array of column lists - see here for more information on data formats.
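For a dynamic structure built with the standard library only, a minimal sketch of the std::map-of-triplets idea (keeping the same addNode semantics as in the question):
#include <map>
#include <utility>

std::map<std::pair<int, int>, float> SparseMatrix;

void addNode(int x, int y, float val)
{
    auto key = std::make_pair(x, y);
    auto it = SparseMatrix.find(key);      // O(log n) instead of the O(n) linear scan
    if (it == SparseMatrix.end())
        SparseMatrix.insert({key, val});
    else
    {
        it->second += val;
        if (it->second == 0.0f)
            SparseMatrix.erase(it);        // drop explicit zeros, as in the original code
    }
}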
Also, there are many good libraries. If you're wanting to do sparse matrix operations/factorisations etc - SuiteSparse is a good option, otherwise Eigen also has good sparse support.
Sparse matrices are usually stored in compressed sparse row (CSR) or compressed sparse column (CSC, also called Harwell-Boeing) format. MATLAB by default uses CSC, IIRC, while most sparse matrix packages tend to use CSR.
Anyway, if this is for production usage rather than a learning exercise, I'd recommend using a matrix package with support for sparse matrices. In the C++ world, my favourite is Eigen.
The first thing that stands out is that you are implementing your own functionality for finding an element: that's what std::find_if is for. So, instead of:
bool alreadyPresent = false;
int i = 0;
for(i = 0; i < SparseMatrix.size(); i++)
{
    if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y))
    {
        alreadyPresent = true;
        break;
    }
}
You should write:
auto it = std::find_if(SparseMatrix.begin(), SparseMatrix.end(), predicate);
where predicate is a function (or lambda) that checks whether a SparseMatNode has the given x and y coordinates.
But the main improvement will come from using the appropriate container. Instead of std::vector, you will be much better off using an associative container. This way, finding an element will have just O(log N) complexity (or amortized O(1) for an unordered container) instead of O(N). You may slightly modify your SparseMatNode type as follows:
typedef std::pair<int, int> Coords;
typedef std::pair<const Coords, float> SparseMatNode;
You may wrap these typedefs inside a class to provide a better interface, of course.
And then:
std::unordered_map<Coords, float> SparseMatrix;
This way you can use:
auto it = SparseMatrix.find(std::make_pair(x, y));
to find elements much more efficiently.
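One caveat: std::unordered_map needs a hash for its key type, and the standard library provides no std::hash specialization for std::pair, so you have to supply a small hasher yourself (a minimal sketch; the way the two ints are combined here is just one reasonable choice):
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

typedef std::pair<int, int> Coords;

struct CoordsHash
{
    std::size_t operator()(const Coords& c) const
    {
        // Pack both coordinates into one 64-bit value and hash that.
        long long packed = (static_cast<long long>(c.first) << 32)
                         ^ static_cast<unsigned int>(c.second);
        return std::hash<long long>()(packed);
    }
};

std::unordered_map<Coords, float, CoordsHash> SparseMatrix;
With std::map instead, the default operator< of std::pair already works, so no extra code is needed.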
Have you tried sorting your vector of sparse nodes? Performing a linear search becomes costly every time you add a node. You could insert in place (keeping the vector sorted by coordinates) and always perform a binary search.
Because a sparse matrix may be huge and needs to be compressed, you may use std::unordered_map. I assume the matrix indices (x and y) are always positive.
#include <unordered_map>

const size_t MAX_X = 1000*1000*1000;
std::unordered_map<size_t, float> matrix;

void addNode(size_t x, size_t y, float val)
{
    size_t index = x + y * MAX_X;
    matrix[index] += val;       // this can still be made faster
    if (matrix[index] == 0)     // using the find()/insert() methods
        matrix.erase(index);
}
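As the comments hint, the double lookup can be avoided by using insert() and working with the returned iterator (a sketch reusing the matrix and MAX_X declarations above):
void addNodeFaster(size_t x, size_t y, float val)
{
    size_t index = x + y * MAX_X;
    auto result = matrix.insert({index, val});   // tries to insert (index, val)
    if (!result.second)                          // the key already existed
    {
        result.first->second += val;
        if (result.first->second == 0.0f)
            matrix.erase(result.first);
    }
}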
If std::unordered_map is not available on your system, you may try std::tr1::unordered_map or stdext::hash_map...
If you can use more memory, then use double instead of float; this may improve your processing speed a bit.

C++ map question

I have an integral position-based algorithm. (That is, the output of the algorithm is based on a curvilinear position, and each result is influenced by the values of the previous results).
To avoid recalculating each time, I would like to pre-calculate at a given sample rate, and subsequently perform a lookup and either return a pre-calculated result (if I land directly on one), or interpolate between two adjacent results.
This would be trivial for me in F# or C#, but my C++ is very rusty (and wasn't ever that good to begin with).
Is map the right construct to use? And could you be so kind as to give me an example of how I'd perform the lookup? (I'm thinking of precalculating in millimetres, which means the key could be an int and the value would be a double.)
UPDATE OK, maybe what I need is a sorted dictionary. (Rolls up sleeves), pseudocode:
//Initialisation
fun MyFunction(int position, double previousresult) returns double {/*etc*/};
double lastresult = 0.0;
for(int s = startposition to endposition by sampledist)
{
lastresult = MyFunction(s, lastresult);
MapOrWhatever.Add(s, lastresult);
}
//Using for lookup
fun GetValueAtPosition(int position) returns double
{
CheckPositionIsInRangeElseException(position);
if(MapOrWhatever.ContainsKey(position))
return MapOrWhatever[position];
else
{
int i = 0;
//or possibly something clever with position % sampledist...
while(MapOrWhatever.Keys[i] < position) i+=sampledist;
return Interpolate(MapOrWhatever, i, i+sampledist, position);
}
}
Thinks... maybe if I keep a constant sampledist, I could just use an array and index it...
A std::map sounds reasonable for memoization here provided your values are guaranteed not to be contiguous.
#include <map>
// ...
std::map<int, double> memo;
memo.insert(std::make_pair(5, 0.5));
double x = memo[5]; // x == 0.5
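For the interpolation part, a minimal sketch using map::lower_bound (assuming the map was filled at a fixed sample distance and position lies inside the sampled range):
#include <iterator>
#include <map>

double GetValueAtPosition(const std::map<int, double>& memo, int position)
{
    auto upper = memo.lower_bound(position);   // first key >= position
    if (upper->first == position)
        return upper->second;                  // exact hit, no interpolation needed
    auto lower = std::prev(upper);             // greatest key < position
    double t = double(position - lower->first) / (upper->first - lower->first);
    return lower->second + t * (upper->second - lower->second);
}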
If you consider a map, always consider a vector, too. For values that aren't changed much (or even not at all) while the application is running, a pre-sorted std::vector< std::pair<Key,Value> > (with O(N) lookup) more often than not performs faster for lookups than a std::map<Key,Value> (with O(log N) lookup), despite all the theory.
You need to try and measure.
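To make that concrete, a small sketch of the pre-sorted vector variant with a binary search on the key (hypothetical table contents; the vector must stay sorted by key):
#include <algorithm>
#include <utility>
#include <vector>

std::vector<std::pair<int, double>> table = { {0, 0.0}, {10, 0.5}, {20, 0.9} };

double lookup(int position)
{
    // Binary search on the key only.
    auto it = std::lower_bound(table.begin(), table.end(), position,
                               [](const std::pair<int, double>& entry, int pos)
                               { return entry.first < pos; });
    return it->second; // caller must ensure position does not exceed the last key
}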
std::map is probably fine as long as speed is not too critical. If the speed of the lookup is critical you could try a vector as mentioned above where you go straight to the element you need (don't use a binary search since you can compute the index from the position). Something like:
vector<double> stored;
// store the values in the vector
double lastresult = 0.0;
for (int s = startposition; s <= endposition; s += sampledist)
{
    lastresult = MyFunction(s, lastresult);
    stored.push_back(lastresult);
}

// then to look up
double GetValueAtPosition(int position)
{
    int index = (position - startposition) / sampledist;
    double lower = stored[index];
    double upper = stored[index + 1];
    return interpolate(lower, upper, position);
}
please see my comment, but here is map documentation
http://www.cplusplus.com/reference/stl/map/
An important note that another poster did not mention: if you use [] to look up a key that doesn't exist in the map, the map will default-construct and insert a value so that there's something there.
edit: see docs here for this info http://msdn.microsoft.com/en-us/library/fe72hft9%28VS.80%29.aspx
Instead, use find(), which returns an iterator. Then test this iterator against map.end(); if they are equal, there was no match.
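For example, a lookup that never inserts anything:
#include <map>

std::map<int, double> memo;

double lookup_or_default(int key)
{
    auto it = memo.find(key);   // does not modify the map
    if (it != memo.end())
        return it->second;      // key present
    return 0.0;                 // key absent; memo is unchanged
}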
Refer : http://www.cplusplus.com/reference/stl/map/
You can use std::map:
typedef std::map<int, double> mapType;
The complexity of map::find is logarithmic in the size of the container.
Beware of operator[] in map:
If x matches the key of an element in the container, the function returns a reference to its mapped value.
If x does not match the key of any element in the container, the function inserts a new element with that key and returns a reference to its mapped value. Notice that this always increases the map size by one, even if no mapped value is assigned to the element (the element is constructed using its default constructor).
A hash map gives faster lookup than any other container here. Filling it takes a little more time than a map or vector, and it is not sorted, but any value search takes constant time on average.
hash_map<int, double> memo;
memo.insert(std::make_pair(5, 0.5));
memo.insert(std::make_pair(7,0.8));
.
.
.
hash_map<int,double>::iterator cur = memo.find(5);
hash_map<int,double>::iterator prev = cur;
hash_map<int,double>::iterator next = cur;
++next;
--prev;
Interpolate the current value using the next->second and prev->second values.