Boost.MultiIndex spatial operations


I needed a spatial map for an application. I found Boost.MultiIndex.
I followed its tutorial and understood how to create a type:
typedef boost::multi_index_container<MapNode,
indexed_by<
ordered_non_unique<member<MapNode, int, &MapNode::X>>,
ordered_non_unique<member<MapNode, int, &MapNode::Y>>
>
> Map_T;
And how to insert to it:
Map.insert(Node);
How do I retrieve a value based on its x and y coordinates? And how do I check whether there is a value there?

First of all, you do not need boost::multi_index to solve this problem. Simply overload operator< for your type MapNode and use std::set (or std::multiset if multiple nodes with identical coordinates can occur). The operator< compares, for example, the x values first; if and only if they are equal, it proceeds to compare the y coordinates.
There is only one reason to use boost::multi_index: if different access methods are required. For instance, if you also want an additional "view" in which the nodes are sorted first by y and then by x.
However, the member approach of multi_index (as in your code above) is not a good idea, because each index then orders by a single coordinate only. Use identity twice, but provide two different comparison functions. Details can be found in the Boost documentation.
Finally, all these approaches may not be the best possible ones, depending on your application. Specialized data structures are described, for example, in the book "Computational Geometry" by de Berg et al.
Unfortunately, I do not know of any free implementation of these algorithms...
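The std::set suggestion above can be sketched as follows. This is only a minimal illustration: the question does not show MapNode's definition, so the payload member and the find_at helper are assumptions made for the example.

```cpp
#include <set>
#include <tuple>

// Hypothetical node type; only X and Y appear in the question.
struct MapNode {
    int X;
    int Y;
    int payload;  // assumed extra data carried by the node
};

// Order by X first; only if the X values are equal, compare Y.
// The payload deliberately takes no part in the ordering.
bool operator<(const MapNode& a, const MapNode& b) {
    return std::tie(a.X, a.Y) < std::tie(b.X, b.Y);
}

using Map_T = std::set<MapNode>;

// Returns a pointer to the node stored at (x, y), or nullptr if that
// coordinate is empty. The probe's payload value is irrelevant.
const MapNode* find_at(const Map_T& map, int x, int y) {
    auto it = map.find(MapNode{x, y, 0});
    return it == map.end() ? nullptr : &*it;
}
```

A null result answers the "is there a value there?" part of the question; a non-null result gives access to the stored node.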

Related

Searching data using different keys

I am no expert in C++ and STL.
I use a structure in a Map as data. Key is some class C1.
I would like to access the same data but using a different key C2 too (where C1 and C2 are two unrelated classes).
Is this possible without duplicating the data?
I tried searching in google, but had a tough time finding an answer that I could understand.
This is for an embedded target where boost libraries are not supported.
Can somebody offer help?

You may store pointers to Data as std::map values, and you can have two maps with different keys pointing to the same data.
I think a smart pointer like std::shared_ptr is a good option in this case of shared ownership of data:
#include <map> // for std::map
#include <memory> // for std::shared_ptr
....
std::map<C1, std::shared_ptr<Data>> map1;
std::map<C2, std::shared_ptr<Data>> map2;
Instances of Data can be allocated using std::make_shared().
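A minimal sketch of the two-map idea, assuming int and std::string as stand-ins for the unrelated key classes C1 and C2 (the question does not define them). The TwoKeyStore wrapper and its insert function are hypothetical names introduced only for this example.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Data { int value; };

// Hypothetical stand-ins for the key classes from the question.
using C1 = int;
using C2 = std::string;

struct TwoKeyStore {
    std::map<C1, std::shared_ptr<Data>> map1;
    std::map<C2, std::shared_ptr<Data>> map2;

    // Allocates one Data object and registers it under both keys,
    // so the data itself is never duplicated.
    void insert(const C1& k1, const C2& k2, Data d) {
        auto p = std::make_shared<Data>(std::move(d));
        map1[k1] = p;
        map2[k2] = p;
    }
};
```

A change made through one map is visible through the other, because both entries point to the same object.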

Not in the Standard Library, but Boost offers boost::multi_index

Two keys of different types
I must admit I misread a bit and didn't really notice that you want two keys of different types, not values. The solution for that will build on what's below, though. Other answers have pretty much what is needed for that; I'd just add that you could make a universal lookup function (C++14-ish pseudocode):
template<class Key>
auto lookup (Key const& key) { }
And specialize it for your keys (arguably easier than SFINAE)
template<>
auto lookup<KeyA> (KeyA const& key) { return map_of_keys_a[key]; }
And the same for KeyB.
If you wanted to encapsulate it in a class, an obvious choice would be to change lookup to operator[].
Key of the same type, but different value
Idea 1
The simplest solution I can think of in 60 seconds: (simplest meaning exactly that it should be really thought through). I'd also switch to unordered_map as default.
map<Key, Data> data;
map<Key2, Key> keys;
Access via data[keys["multikey"]].
This will obviously waste some space (duplicating objects of Key type), but I am assuming they are much smaller than the Data type.
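As a rough sketch, the lookup idea and Idea 1 can be combined in one small class. It uses plain overloads instead of template specialization, which is a swapped-in technique rather than the pseudocode above. KeyA, KeyB, Store and its member names are assumptions for illustration.

```cpp
#include <map>
#include <string>

struct Data { int value; };

// Hypothetical key types; any types with operator< would do.
using KeyA = int;
using KeyB = std::string;

class Store {
public:
    // Stores the data under the primary key and records which primary
    // key the secondary key refers to (Idea 1).
    void insert(const KeyA& a, const KeyB& b, const Data& d) {
        data_[a] = d;
        a_of_b_[b] = a;
    }

    // One overload per key type. Both throw std::out_of_range
    // (from std::map::at) if the key is unknown.
    Data& lookup(const KeyA& key) { return data_.at(key); }
    Data& lookup(const KeyB& key) { return data_.at(a_of_b_.at(key)); }

private:
    std::map<KeyA, Data> data_;
    std::map<KeyB, KeyA> a_of_b_;
};
```

The secondary lookup costs two map searches, which is the price for storing each Data object only once without pointers.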
Idea 2
Another solution would be to use pointers; then the only cost of duplication is a (smart) pointer:
map<Key, shared_ptr<Data>> data;
Object of Data will be alive as long as there is at least one key pointing to it.

What I usually do in these cases is use non-owned pointers. I store my data in a vector:
std::vector<Data> myData;
And then I map pointers to each element. Since it is possible that pointers are invalidated because of the future growth of the vector, though, I will choose to use the vector indexes in this case.
std::map<Key1, int> myMap1;
std::map<Key2, int> myMap2;
Don't expose the data containers to your clients. Encapsulate element insertion and removal in specific functions, which insert everywhere and remove everywhere.
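A minimal sketch of this encapsulation, assuming int and std::string as the two key types and using std::size_t rather than int for the indexes. The class name IndexedStore is hypothetical, and removal is left out because erasing from the vector would require re-indexing both maps.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Data { int value; };

// Hypothetical key types for illustration.
using Key1 = int;
using Key2 = std::string;

class IndexedStore {
public:
    // Inserts the element once and registers its index under both keys.
    void insert(const Key1& k1, const Key2& k2, const Data& d) {
        data_.push_back(d);
        const std::size_t idx = data_.size() - 1;
        by1_[k1] = idx;
        by2_[k2] = idx;
    }

    // Return nullptr when the key is unknown. The returned pointer is
    // only valid until the next insert, since the vector may reallocate.
    Data* find(const Key1& k) { return lookup(by1_, k); }
    Data* find(const Key2& k) { return lookup(by2_, k); }

private:
    template <class Map, class Key>
    Data* lookup(const Map& m, const Key& k) {
        auto it = m.find(k);
        return it == m.end() ? nullptr : &data_[it->second];
    }

    std::vector<Data> data_;             // the only copy of the data
    std::map<Key1, std::size_t> by1_;    // key -> index into data_
    std::map<Key2, std::size_t> by2_;
};
```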

Bartek's "Idea 1" is good (though there's no compelling reason to prefer unordered_map to map).
Alternatively, you could have a std::map<C2, Data*>, or std::map<C2, std::map<C1, Data>::iterator> to allow direct access to Data objects after one C2-keyed search, but then you'd need to be more careful not to access invalid (erased) Data (or more precisely, to erase from both containers atomically from the perspective of any other users).
It's also possible for one or both maps to move to shared_ptr<Data> - the other could use weak_ptr<> if that's helpful ownership-wise. (These are in the C++11 Standard, otherwise the obvious source - boost - is apparently out for you, but maybe you've implemented your own or selected another library? Pretty fundamental classes for modern C++).
EDIT - hash tables versus balanced binary trees
This isn't particularly relevant to the question, but has received comments/interest below and I need more space to address it properly. Some points:
1) Bartek's casually advising to change from map to unordered_map without recommending an impact study re iterator/pointer invalidation is dangerous, and unwarranted given there's no reason to think it's needed (the question doesn't mention performance) and no recommendation to profile.
3) Relatively few data structures in a program are important to performance-critical behaviours, and there are plenty of times when the relative performance of one versus another is of insignificant interest. Supporting this claim - masses of code were written with std::map to ensure portability before C++11, and perform just fine.
4) When performance is a serious concern, the advice should be "Care => profile", but saying that a rule of thumb is ok - in line with "Don't pessimise prematurely" (see e.g. Sutter and Alexandrescu's C++ Coding Standards) - and if asked for one here I'd happily recommend unordered_map by default - but that's not particularly reliable. That's a world away from recommending every std::map usage I see be changed.
5) This container performance side-track has started to pull in ad-hoc snippets of useful insight, but is far from being comprehensive or balanced. This question is not a sane venue for such a discussion. If there's another question addressing this where it makes sense to continue this discussion and someone asks me to chip in, I'll do it sometime over the next month or two.

You could consider having a plain std::list holding all your data, and then various std::map objects mapping arbitrary key values to iterators pointing into the list:
std::list<Data> values;
std::map<C1, std::list<Data>::iterator> byC1;
std::map<C2, std::list<Data>::iterator> byC2;
I.e. instead of fiddling with more-or-less-raw pointers, you use plain iterators. And iterators into a std::list have very good invalidation guarantees.
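A short sketch of the list-plus-iterators layout, including removal. To keep erasing simple, this example assumes that each Data element stores its own two keys (as int and std::string) and that keys are unique; neither assumption comes from the question, and ListStore is a hypothetical name.

```cpp
#include <list>
#include <map>
#include <string>

// Assumption: the element carries both of its keys.
struct Data {
    int c1;
    std::string c2;
    int value;
};

class ListStore {
public:
    // Assumes both keys are not yet present.
    void insert(const Data& d) {
        auto it = values_.insert(values_.end(), d);
        byC1_[d.c1] = it;
        byC2_[d.c2] = it;
    }

    const Data* findByC1(int k) const {
        auto it = byC1_.find(k);
        return it == byC1_.end() ? nullptr : &*it->second;
    }

    const Data* findByC2(const std::string& k) const {
        auto it = byC2_.find(k);
        return it == byC2_.end() ? nullptr : &*it->second;
    }

    // Removes the element from the list and from both maps together.
    bool eraseByC1(int k) {
        auto it = byC1_.find(k);
        if (it == byC1_.end()) return false;
        auto pos = it->second;
        byC2_.erase(pos->c2);
        byC1_.erase(it);
        values_.erase(pos);  // other list iterators stay valid
        return true;
    }

private:
    std::list<Data> values_;
    std::map<int, std::list<Data>::iterator> byC1_;
    std::map<std::string, std::list<Data>::iterator> byC2_;
};
```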

I had the same problem. At first, holding two maps of shared pointers sounded very cool, but you still need to manage these two maps (inserting, removing, etc.).
Then I came up with another way of doing this.
My use case was accessing data by x-y or by radius-angle. Each point holds data, but the point can be described either as Cartesian x, y or as radius-angle.
So I wrote a struct like this:
struct MyPoint
{
    std::pair<int, int> cartesianPoint;
    std::pair<int, int> radianPoint;
    // Two points compare equal if either representation matches.
    bool operator== (const MyPoint& rhs) const
    {
        return cartesianPoint == rhs.cartesianPoint || radianPoint == rhs.radianPoint;
    }
};
After that I could use it as a key:
std::unordered_map<MyPoint, DataType> myMultIndexMap; // also requires a hash function for MyPoint
I am not sure whether your case is the same or can be adapted to this scenario, but it can be an option.

How to find right data structure for a searching application?

My question can be asked in two different aspects: one is from data structure perspective, and the other is from image processing perspective. Let's begin with the data structure perspective: suppose now I have a component composed of several small items as the following class shows:
class Component
{
public:
struct Point
{
float x_;
float y_;
};
Point center;
Point bottom;
Point top;
};
In the above example, the Component class is composed of member variables such as center, bottom and top (small items).
Now I have a stack of components (between 1000 and 10000 of them), and each component in the stack has been assigned different values, which means there are no duplicate components in the stack. So if one small item in the component is known, for example 'center' in the class above, we can find the unique component in the stack. After that, we can retrieve the other properties of the component. My question is how to build the right container data structure to make the search easier. I am currently considering a vector and the find algorithm from the STL (pseudocode):
vector<Component> comArray;
comArray.push_back( component1);
.....
comArray.push_back(componentn);
find(comArray.begin(), comArray.end(), center);
I was wondering whether there are more efficient containers to solve this problem.
I can also explain my question from the image processing perspective. In image processing, connected component analysis is a very important step for object recognition. For my application I can obtain all the connected components in the image, and I have also found that interesting objects should fulfill the following requirement: their connected component centers should be in a specific range. Given this constraint, I can eliminate many connected components and then work on the candidate ones. The key step in the above procedure is how to search for candidate connected components when the center coordinate constraint is given. Any ideas will be appreciated.

If you need to be able to get them rather fast, here's a slightly strange solution for you.
Note that it is a bad solution generally speaking, but it may suit you.
You could make an ordinary vector< Component >. Or that can even be a simple array. Then make three maps:
map< Point, Component * > center;
map< Point, Component * > bottom;
map< Point, Component * > top;
Fill them with all the available values of center, bottom and top as keys, and provide pointers to the corresponding Components as values (you could also use just indexes in a vector, so it would be map< Point, int >).
After that, you just use center[Key], bottom[Key] or top[Key], and get either your value (if you store pointers), or the index of your value in the array (if you store indexes).
I wouldn't use such an approach often, but it could work if the values will not change (so you can fill the index maps once), and the data amount is rather big (so searching through an unsorted vector is slow), and you will need to search often.
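A minimal sketch of one such index map, using vector indexes as values. std::map needs an ordering for its key, so the example adds a comparator for Point; PointLess, PointIndex and index_by_center are hypothetical names, and Point is shown as a free struct rather than nested in Component to keep the example short. Keys are matched by exact float comparison, which only works if the searched value is bit-for-bit the stored one.

```cpp
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Point { float x_; float y_; };
struct Component { Point center; Point bottom; Point top; };

// Strict weak ordering for Point: x first, then y.
struct PointLess {
    bool operator()(const Point& a, const Point& b) const {
        return std::tie(a.x_, a.y_) < std::tie(b.x_, b.y_);
    }
};

using PointIndex = std::map<Point, std::size_t, PointLess>;

// Builds a center -> index map over a vector that is assumed not to
// change afterwards, so the stored indexes stay meaningful.
PointIndex index_by_center(const std::vector<Component>& comps) {
    PointIndex idx;
    for (std::size_t i = 0; i < comps.size(); ++i)
        idx[comps[i].center] = i;
    return idx;
}
```

The same function shape would apply to bottom and top, giving the three maps described above.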

You probably want spatial indexing data structures.

I think you want to use a map or a hash_map to efficiently lookup your component based on a "center" value.
std::map<Component::Point, Component> lookuptable;
lookuptable[component1.center] = component1;
....
auto iterator = lookuptable.find(someCenterValue);
if (iterator != lookuptable.end())
{
componentN = iterator->second;
}
As for finding the elements in your set that are within a given coordinate range, there are several ways to do this. One easy way is to keep two sorted arrays of the component list, one sorted on the X axis and the other on the Y axis. To find the matching elements, do a binary search on either axis for the element closest to your target, then scan up and down the array until you go out of range. You could also look at using a kd-tree and find all the nearest neighbors.
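A simplified variant of the sorted-array idea can be sketched like this. It keeps a single array sorted on x, binary-searches both ends of the x range, and filters the candidates on y, rather than maintaining two sorted arrays. The function names and the bare Point struct are assumptions for the example.

```cpp
#include <algorithm>
#include <vector>

struct Point { float x; float y; };

// Sorts the centers by x. Call once after the vector has been filled.
void sort_by_x(std::vector<Point>& centers) {
    std::sort(centers.begin(), centers.end(),
              [](const Point& a, const Point& b) { return a.x < b.x; });
}

// Returns all centers with xmin <= x <= xmax and ymin <= y <= ymax.
// Precondition: centers is sorted by x.
std::vector<Point> in_range(const std::vector<Point>& centers,
                            float xmin, float xmax,
                            float ymin, float ymax) {
    // First element with x >= xmin.
    auto first = std::lower_bound(
        centers.begin(), centers.end(), xmin,
        [](const Point& p, float v) { return p.x < v; });
    // First element with x > xmax.
    auto last = std::upper_bound(
        first, centers.end(), xmax,
        [](float v, const Point& p) { return v < p.x; });

    std::vector<Point> out;
    for (auto it = first; it != last; ++it)
        if (it->y >= ymin && it->y <= ymax)
            out.push_back(*it);
    return out;
}
```

For a few thousand components this is likely sufficient; a kd-tree becomes more attractive when the x range alone leaves many candidates to filter.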

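If you want fast lookups (logarithmic time) and don't need to modify the elements, I think std::set is a good choice for your code.
set<Component> comArray;
comArray.insert(component1);
.....
comArray.insert(componentn);
// probe is a Component whose center holds the value you are searching for
set<Component>::iterator iter = comArray.find(probe);
Of course, you should write operator< for class Component, comparing the nested struct Point, because std::set orders and finds elements with operator< rather than operator==.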

map inside map ( Map as key)

I have created a map in the following way.
Ex: map first;
and I have to create a second map in the following way, as per my requirement.
map second.
So first is the key in the second map.
I have inserted data into both maps.
first.insert("Test1",1);
second.insert(first,2).
First, I just want to know whether this is a correct way to implement it, or whether I should use another STL container.
I am facing one issue with this code (not a compilation issue). If I get data from the database in the following way, then the value is not inserted into the second map.
first.insert("Test1",2);
second.insert(first,1).
But I believe that it should be inserted into the map, since ("Test1" && 1) and
("Test" && 2) are different keys for the second map.

Why would you like to use a map as a key type?
Keys should be small, since you have no guarantee how many copies of them the STL will make. Using a (potentially large) std::map as a key will kill your application's performance.

First of all, for "STL", let me quote !stl from ##c++ at freenode:
`STL' is sometimes used to mean: (1) C++ standard library; (2) the library Stepanov designed at HP; (3) the parts of [1] based on [2]; (4) specific vendor implementations of either [1], [2], or [3]; (5) the underlying principles of [2]. As such, the term is highly ambiguous, and must be used with extreme caution. If you meant [1] and insist on abbreviating, "stdlib" is a far better choice.
Next: of course you can use a map as a key. std::map provides a lexicographical operator<, so the default std::less comparator works as long as the inner map's keys and values are themselves comparable. But remember: the comparator doesn't check whether its parameters are equal. It checks whether the first is less than the second, because every possible relation can be modelled using "less than":
a == b <=> !(a < b) && !(b < a)
And now, more ontopic:
From what you have written, I don't quite get the point of having map<map, anything else>. Could you provide a test case? Then I will be able to give you a complete answer.
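A minimal sketch of a map used as a key, assuming the stripped template arguments in the question were std::string and int (an assumption; the original declarations are not visible). It also illustrates one possible cause of the reported behaviour: std::map::insert does nothing when the key already exists, so inserting ("Test1", 2) into a map that still holds ("Test1", 1) leaves the map unchanged, and the outer insert then sees an already existing key.

```cpp
#include <map>
#include <string>

// Assumed types; the question does not show the template arguments.
using Inner = std::map<std::string, int>;
using Outer = std::map<Inner, int>;

Outer build_example() {
    Outer second;

    Inner first;
    first.insert({"Test1", 1});
    second.insert({first, 2});

    // Start from an empty inner map. Without this clear(), the next
    // insert would be ignored because "Test1" is already present.
    first.clear();
    first.insert({"Test1", 2});
    second.insert({first, 1});

    return second;
}
```

With the clear() in place the outer map ends up with two entries, because {"Test1": 1} and {"Test1": 2} compare as different keys.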

how boost multi_index is implemented

I have some difficulties understanding how Boost.MultiIndex is implemented. Let's say I have the following:
typedef multi_index_container<
employee,
indexed_by<
ordered_unique<member<employee, std::string, &employee::name> >,
ordered_unique<member<employee, int, &employee::age> >
>
> employee_set;
I imagine that I have one array, Employee[], which actually stores the employee objects, and two maps
map<std::string, employee*>
map<int, employee*>
with name and age as keys. Each map has employee* value which points to the stored object in the array. Is this ok?

A short explanation on the underlying structure is given here, quoted below:
The implementation is based on nodes interlinked with pointers, just as say your favorite std::set implementation. I'll elaborate a bit on this: A std::set is usually implemented as an rb-tree where nodes look like
struct node
{
// header
color c;
pointer parent,left,right;
// payload
value_type value;
};
Well, a multi_index_container's node is basically a "multinode" with as many headers as indices as well as the payload. For instance, a multi_index_container with two so-called ordered indices uses an internal node that looks like
struct node
{
// header index #0
color c0;
pointer parent0,left0,right0;
// header index #1
color c1;
pointer parent1,left1,right1;
// payload
value_type value;
};
(The reality is more complicated, these nodes are generated through some metaprogramming etc. but you get the idea) [...]
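The "multinode" idea can be illustrated with a deliberately simplified sketch. It is not Boost's implementation: instead of two red-black tree headers, each node here carries two plain "next" links, one per sorted singly linked list, and a vector of unique_ptr owns the nodes. All names are hypothetical; the point is only that each employee is stored once while being reachable through two orderings.

```cpp
#include <memory>
#include <string>
#include <vector>

struct employee { std::string name; int age; };

// One node, two index headers, one payload.
struct multinode {
    multinode* next_by_name = nullptr;  // header for index #0
    multinode* next_by_age = nullptr;   // header for index #1
    employee value;                     // payload, stored exactly once
};

class two_index_list {
public:
    void insert(const employee& e) {
        nodes_.push_back(std::unique_ptr<multinode>(new multinode()));
        multinode* n = nodes_.back().get();
        n->value = e;

        // Link the node into index #0 (sorted by name).
        multinode** p = &head_by_name_;
        while (*p && (*p)->value.name < e.name) p = &(*p)->next_by_name;
        n->next_by_name = *p;
        *p = n;

        // Link the same node into index #1 (sorted by age).
        multinode** q = &head_by_age_;
        while (*q && (*q)->value.age < e.age) q = &(*q)->next_by_age;
        n->next_by_age = *q;
        *q = n;
    }

    const multinode* first_by_name() const { return head_by_name_; }
    const multinode* first_by_age() const { return head_by_age_; }

private:
    std::vector<std::unique_ptr<multinode>> nodes_;  // owns the nodes
    multinode* head_by_name_ = nullptr;
    multinode* head_by_age_ = nullptr;
};
```

Following next_by_name visits the employees alphabetically, following next_by_age visits the same nodes by age, and no employee object is copied between the two views.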

Conceptually, yes.
From what I understand of Boost.MultiIndex (I've used it, but not seen the implementation), your example with two ordered_unique indices will indeed create two sorted associative containers (like std::map) which store pointers/references/indices into a common set of employees.
In any case, every employee is stored only once in the multi-indexed container, whereas a combination of map<string,employee> and map<int,employee> would store every employee twice.
It may very well be that there is indeed a (dynamic) array inside some multi-indexed containers, but there is no guarantee that this is true:
[Random access indices] do not provide memory contiguity, a property of std::vectors by which elements are stored adjacent to one another in a single block of memory.
Also, Boost.Bimap is based on Boost.MultiIndex and the former allows for different representations of its "backbone" structure.

Actually I do not think it is.
Based on what is located in detail/node_type.hpp, it seems to me that, as in a std::map, the node contains both the value and the index. Except that in this case the various indices differ from one another, and thus the node interleaving would actually differ depending on the index you're following.
I am not sure about this though; Boost headers are definitely hard to parse. However, it would make sense if you think in terms of memory:
fewer allocations: faster allocation/deallocation
better cache locality
I would appreciate a definitive answer though, if anyone knows about the gore.

element-wise operations with boost c++ ublas matrix and vector types

I'd like to perform element-wise functions on boost matrix and vector types, e.g. take the logarithm of each element, exponentiate each element, apply special functions such as gamma and digamma, etc. (similar to MATLAB's treatment of these functions applied to matrices and vectors).
I suppose writing a helper function that brute-forced this for each desired function would suffice, but this seems wasteful.
Likewise, the boost wiki offers some code to vectorize standard functions, but this seems quite complex.
valarray has been suggested, but I'd like to avoid converting between data types, as I need the ublas data types for other operations (matrix products, sparse matrices, etc.).
Any help is greatly appreciated.

The use of begin1() / end1() won't work because it provides access to the element in the default column position (0): consequently, you just access all the elements in the first column. It is better (in the sense that you get the expected behavior) to get sequential access via:
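// A lambda is used because boost::math::tgamma is an overloaded template
// and cannot be passed by name (assumes a matrix of double).
std::transform(mat.data().begin(), mat.data().end(),
mat.data().begin(), [](double x) { return boost::math::tgamma(x); });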
I suspect this may be a case where the implementation is not quite complete.
Enjoy!
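The same pattern can be written as a small generic helper. In this sketch a std::vector<double> stands in for the matrix's underlying storage, and std::tgamma from the standard library stands in for boost::math::tgamma so that the example has no Boost dependency. Passing a uBLAS dense matrix's data() to the helper is an assumption based on the call above, not something verified here; apply_elementwise and gamma_inplace are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Applies f to every element of a storage range, in place.
// Storage only needs begin() and end().
template <class Storage, class F>
void apply_elementwise(Storage& storage, F f) {
    std::transform(storage.begin(), storage.end(), storage.begin(), f);
}

// Example: the gamma function over flat storage. The lambda avoids
// passing an overloaded function name directly to std::transform.
void gamma_inplace(std::vector<double>& flat) {
    apply_elementwise(flat, [](double x) { return std::tgamma(x); });
}
```

Other element-wise functions (log, exp, and so on) only need a different lambda, so no per-function helper has to be written by hand.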

WARNING
The following answer is incorrect. See Edit at the bottom. I've left the original answer as is to give context and credit to those who pointed out the error.
I'm not particularly familiar with the boost libraries, so there may be a more standard way to do this, but I think you can do what you want with iterators and the STL transform function template. The introduction to the uBLAS library documentation says its classes are designed to be compatible with the same iterator behavior that is used in the STL. The boost matrix and vector templates all have iterators which can be used to access the individual elements. The vector has begin() and end(), and the matrix has begin1(), end1(), begin2(), and end2(). The 1 varieties are column-wise iterators and the 2 varieties are row-wise iterators. See the boost documentation on VectorExpression and MatrixExpression for a little more info.
Using the STL transform algorithm, you can apply a function to each element of an iterable sequence and assign the result to a different iterable sequence of the same length, or the same sequence. So to use this on a boost uBLAS vector you could do this:
using namespace boost::numeric::ublas;
// Create a 30 element vector of doubles
vector<double> vec(30);
// Assign 8.0 to each element.
std::fill(vec.begin(), vec.end(), 8.0);
// Perform the "Gamma" function on each element and assign the result back
// to the original element in the vector.
std::transform(vec.begin(), vec.end(), vec.begin(), [](double x) { return boost::math::tgamma(x); });
For a matrix it would be basically the same thing, you would use either the 1 or 2 family of iterators. Which one you choose to use depends on whether the memory layout of your matrix is row major or column major. A cursory scan of the uBLAS documentation leads me to believe that it could be either one, so you will need to examine the code and determine which one is being used so you choose the most efficient iteration order.
matrix<double> mat(30, 30);
.
.
.
std::transform(mat.begin1(), mat.end1(), mat.begin1(), boost::math::tgamma);
The function you pass as the last argument can be a function taking a single double argument and returning a double value. It can also be a functor.
This is not exactly the same as the vectorization example you cited, but it seems like it should be pretty close to what you want.
EDIT
Looks like I should have tested my recommendation before making it. As has been pointed out by others, the '1' and '2' iterators only iterate along a single row / column of the matrix. The overview documentation in Boost is seriously misleading on this. It claims that begin1() "Returns a iterator1 pointing to the beginning of the matrix" and that end1() "Returns a iterator1 pointing to the end of the matrix". Would it have killed them to say "a column of the matrix" instead of "matrix"? I assumed that an iterator1 was a column-wise iterator that would iterate over the whole matrix. For the correct way to do this, see Lantern Rouge's answer.
