How can I associate "user data" - i.e. arbitrary data for my application - with bodies in the PlayRho 0.10.0 physics engine?
In the Box2D 2.4.1 physics engine, I can associate "user data" with bodies, using the userData field of a b2BodyDef instance that I pass to the b2World::CreateBody function and get the value back by calling b2Body::GetUserData(). How do you do this in PlayRho?
Possible Solution:
In your application, you can use an array whose elements are your user-data values and whose indices match the underlying values returned from creating your bodies in PlayRho.
For example, a simple/naive implementation for any void* compatible user data might be like:
int main() {
struct MyEntity {
int totalHealth;
int currentHealth;
int strength;
int intellect;
};
std::vector<void*> myUserData;
auto world = World{};
// If you want to pre-allocate 100 spaces...
myUserData.resize(100);
const auto body = world.CreateBody();
// If your # bodies is unlimited or you don't want to preallocate space...
if (body.get() >= myUserData.size()) myUserData.resize(body.get() + 1);
// Set your user data...
myUserData[body.get()] = new MyEntity();
// Gets your user data...
const auto myEntity = static_cast<MyEntity*>(myUserData[body.get()]);
// Frees your dynamically allocated MyEntity instances.
// Even with Box2D `userData` you may want to free these.
for (const auto& element: myUserData) {
delete static_cast<MyEntity*>(element); // deleting through a void* is undefined behavior
}
return 0;
}
But if you'd like to avoid headaches like memory leaks, myUserData could instead be declared as std::vector<MyEntity> myUserData; so that the new MyEntity() and delete calls are no longer needed.
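A minimal sketch of that value-based variant follows. To stay self-contained it does not call PlayRho at all: `BodyIndex` is a hypothetical stand-in for the integer you get from `body.get()`, and `UserDataStore` is an illustrative name, not a library type.

```cpp
#include <cstddef>
#include <vector>

struct MyEntity {
    int totalHealth = 0;
    int currentHealth = 0;
    int strength = 0;
    int intellect = 0;
};

// Hypothetical stand-in for the underlying value of a body ID.
using BodyIndex = std::size_t;

// Illustrative helper (not a PlayRho type): user data stored by value and
// indexed by the body's underlying ID value.
class UserDataStore {
public:
    // Grows the storage on demand so that `index` is always a valid slot.
    MyEntity& at(BodyIndex index) {
        if (index >= data_.size()) {
            data_.resize(index + 1); // new slots are value-initialized
        }
        return data_[index];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<MyEntity> data_;
};
```

Because the entities live inside the vector, there is nothing to delete by hand. One caveat of this sketch: a reference returned by `at()` may be invalidated the next time the vector grows, so look the entity up again rather than holding on to the reference.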
Some advantages to this:
Provides greater flexibility. User data is often application specific. Since you implement this storage yourself when using PlayRho, you're freer to make elements be any type you want and you don't have to make all of your uses of the physics engine have the same user data type. Your user data type could be world specific for instance whereas in Box2D all of your uses of its userData field would have the same type.
Avoids wasting memory. Not all applications need user data so those that don't won't be wasting this memory or having you modify the library code to avoid it.
Disadvantages:
This is different than in Box2D.
This may require more coding effort on your part if you don't care about having the extra flexibility (since the extra flexibility might save you some coding effort too).
Background/explanation:
While the PlayRho physics engine started off as a fork of the Box2D physics engine, PlayRho has moved away from reference semantics (pointers) and towards value semantics (values). So pointers were replaced and "user data" was outright removed in favor of alternatives like this possible solution. Additionally, with this shift to value semantics, the concept of creating a body changed from getting a pointer to a new body back from the world, to basically getting an integer index to the new body back from the world instead. That index acts as an identifier to the new body within the world and is basically an incremented counter from the world starting from 0 and incremented every time you create a new body. This means that you can have O(1) lookups from an array using the underlying body ID value as the index to the element that stores your user data. Using std::unordered_map<b2Body*, b2BodyUserData> would also provide O(1) lookups but hashed maps tend to be less cache friendly on modern hardware than arrays so it makes more sense to avoid such overhead in Box2D by setting aside storage for a user data value per body than it does in PlayRho.
I need to know a few things about array element allocation over a domain map in Chapel
Let me keep this as short as possible
region = {1..10,5..10}
regionbox = {1..5,1..5}
grid2d = /*a 2D arrangement of locales*/
Space = domain(2) dmapped Block( boundingBox = regionbox,
target_locales = grid2d
) = region.
var : myarray[Space] int;
Now Space is a distributed domain.
So here is my question.
In a distributed domain, do we have to keep all of the indices on every locale?
For the above example:
do the indices that map to particular locales have to be stored locally on all locales?
I assume that domain maps support global-view programming, so that when we access myarray[3,5], the access is dynamically mapped to the associated locale using the distribution.
Please correct me if I'm wrong.
And how are arrays allocated over the distributed domains?
Do domain maps compute each locale's local size up front from the given parameters, and then allocate local_size elements on each locale?
For example,
blocking 10 elements over 2 locales needs a local size of 5.
I want to know how the array elements are created over the distributed domain, and also whether the indices that the distribution maps to a locale are stored on that locale.
Please let me know if this question needs more info
Thank you for your kind help
As with your previous question, the answer to this question depends on the specific domain map. The Chapel language and compiler expect a domain map (and its implementation of domains and arrays) to support a standard interface, but how it implements that interface is completely up to its author. This interface includes things like "allocate a new domain for me", "allocate a new array over that domain for me", "iterate over the indices/elements of the domain/array", "randomly access the array", etc. Thus, a given domain map implementation may be very space efficient and minimal, or it can allocate everything on every locale redundantly, as its author thinks best.
That said, if we consider standard domain maps like Block, they behave the way you would expect: E.g., for a {1..n, 1..n} array mapped across 4 locales, each locale will store ~(n**2 / 4) elements of the array rather than all n**2 elements. A random access to that array will be implemented by determining which locale owns the element and having the compiler/runtime manage the communication required to get at that remote element (as implemented by the domain map). Information is stored redundantly when it only requires O(1) storage, since this redundancy is better than communicating to get the values. E.g., each locale would store the {1..n, 1..n} bounds of the domain/array since it is cheaper to store those bounds than to communicate with some centralized location to get them.
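To make "determining which locale owns the element" concrete, here is a small sketch in C++ of one common block-partitioning rule for a single dimension. It is only an illustration of the idea; `ownerOf` is a hypothetical helper, and the arithmetic is not claimed to be exactly what Chapel's Block implementation does.

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative only: maps index `i` to one of `numLocales` blocks that evenly
// divide the bounding range [lo, hi]. Indices that fall outside the bounding
// range are clamped to the first or last locale.
inline int ownerOf(std::int64_t i, std::int64_t lo, std::int64_t hi, int numLocales) {
    const std::int64_t extent = hi - lo + 1;
    const std::int64_t block = (i - lo) * numLocales / extent;
    const std::int64_t first = 0;
    const std::int64_t last = numLocales - 1;
    return static_cast<int>(std::min(std::max(block, first), last));
}
```

With a bounding range of 1..10 and 2 locales, indices 1..5 land on locale 0 and 6..10 on locale 1, which matches the "local size of 5" intuition from the question. For a 2D domain the same rule would be applied once per dimension.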
This is one of those cases where a picture can be worth a thousand words. Taking a look at the slides for the talks where we introduced these concepts (like slide 34 of this presentation) could be much more instructive than the following text-based description.
Walking through your declarations and cleaning them up a bit, here's roughly what happens as this code is executed:
const region = {1..10,5..10},
regionbox = {1..5,1..5},
grid2d = /*a 2D arrangement of locales*/;
Nothing about these declarations refers to other locales (no on-clauses, no dmapped clauses), so these would all result in domains and arrays that are stored locally on the locale where the task encountering the declarations is executing (locale #0 at the program's start-up time).
const Space : domain(2) dmapped Block( boundingBox = regionbox,
targetLocales = grid2d
) = region;
The dmapped Block(...) clause causes an instance of the Block domain map class to be allocated on each locale in grid2d. Each instance of the class stores the bounding box (regionbox) and the set of target locales. Each locale also gets an instance of a class representing the local view of the distribution named LocBlock which stores the subset of the 2D plane which is owned by that locale as defined by the bounding box and the target locale set.
The declaration and initialization of Space invokes a method on the current locale's copy of the Block domain map object created in the previous step, asking it to create a new domain. This causes each locale in grid2d to allocate a pair of classes corresponding to the global and local views of the domain, respectively. The global descriptor describes the domain's indices as a whole (e.g., region) while the local descriptor describes that locale's personal subset of region.
var myarray: [Space] int;
This declaration asks the current locale's copy of the global Space domain class created in the previous step to create a new array. This causes each locale in grid2d to allocate a pair of classes representing the global and local views of the array. The global view of the array tends not to store much state and is used primarily to dispatch methods on the array to the appropriate local descriptor. The local descriptor stores the array elements corresponding to the locale's subarray.
I hope this helps clarify the issues you are asking about.
I have an object-based adjacency list graph that consists of nodes and edges stored in a vector.
class Graph
{
struct NodePrivate
{
QVector<int> m_FromEdges, m_ToEdges;
};
struct EdgePrivate
{
int m_iFrom, m_iFromIndex, m_iTo, m_iToIndex;
};
//...
private:
QVector<NodePrivate> m_Nodes;
QVector<EdgePrivate> m_Edges;
};
In order to ensure contiguity (and constant speed) of the graph elements when removing them I do removals by swapping the last element with the one to be removed.
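For reference, the swap-with-last removal described here can be sketched as follows. It uses `std::vector` rather than `QVector` to stay dependency-free, and `swapRemove` is an illustrative name, not something from the graph class above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Removes items[index] in O(1) by moving the last element into its place.
// Precondition: index < items.size().
// Returns the old index of the element that was moved (the former last
// index), so the caller can fix up anything that referred to it.
template <typename T>
std::size_t swapRemove(std::vector<T>& items, std::size_t index) {
    const std::size_t last = items.size() - 1;
    if (index != last) {
        items[index] = std::move(items[last]);
    }
    items.pop_back();
    return last;
}
```

The return value is exactly the index that becomes stale after the call, which is the piece of information the handle-updating strategies below have to deal with.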
Now when a user of the graph accesses the elements, they do so via Node and Edge classes that are really just wrappers around an index into the graph (an int).
class Item
{
//...
private:
int m_Index = -1; //or QSharedPointer<int>, see below
const Graph *m_Graph = nullptr;
};
class Node : public Item {};
class Edge : public Item {};
Removing a node or an edge might invalidate these indexes. I would like them to be persistent, and so far I have tried (successfully) two strategies, but I do not like either of them very much:
1) Track all objects of type Node and Edge by registering them and deregistering them in constructor(s) and destructor respectively. These are then used to update the internal index whenever the relevant index changes. Biggest drawback of this is quite a lot of unnecessary registered temporaries.
2) The other option is to use smart-pointer approach by having the index dynamic (std::shared_ptr<int>). The index is then updated through that which is arguably better than updating all objects but at the cost of dynamic memory.
Is there any other option to implement this or improve upon these two designs?
First of all, I must admit that I don't think this problem can be solved perfectly. If you really want to make a lot of small changes to your graphs regularly, then you should switch to storing everything in linked lists instead of arrays. Also, you can just give up and say explicitly, that all Node and Edge handles are invalidated, just like std::vector::iterator-s are invalidated when you add an element to std::vector.
General discussion
In your case, vertices and adjacency lists are stored in arrays. You also have Node and Edge helpers, which allow the user to point to the real nodes and edges whenever they want. I'll call them handles (they are like C++ iterators without any iteration capabilities). I see two different ways of maintaining the handles after changes.
The first way is to store a direct pointer (or index) to a physical object in each handle, as you do now. In this case you have to update all handles to an object whenever the object is moved. That is why you absolutely must register all the handles you give away somewhere. This is exactly the first solution you suggest, and it leads to "heavy" handles: creating, deleting and copying handles becomes costly, regardless of whether any objects are actually moved.
The second way is to store a pointer to some intermediate thing inside a handle, and then make sure that this thing never changes during the object's lifetime, even if the object moves. Clearly, the thing you point to in a handle must be something different from the real physical index of your node or edge, since those change. In this approach you have to pay for an indirect access each time a handle is dereferenced, so handle access becomes slightly heavier.
The second solution you propose is following this second approach. The intermediate things (which are being pointed to by your handles) are dynamically allocated int-s wrapped in shared_ptr, one never-moving int per object. You have to suffer at least from separate dynamic allocation (+deallocation) per each object created, also from reference counters updates. The reference counters can be easily removed: store unique_ptr-s in NodePrivate and EdgePrivate objects, and raw pointers in Node and Edge objects.
New approach
The other solution following the second approach is to use IDs as the intermediate things pointed to by handles. Whenever you create a node, assign it a new node ID, and do the same for edges. Assign IDs sequentially, starting from zero. Now you can maintain a bidirectional correspondence between physical indices and these IDs, and update it in O(1) time on a change.
struct NodePrivate
{
QVector<int> m_FromEdges, m_ToEdges;
int id; //getting ID by physical index
};
struct EdgePrivate
{
int m_iFrom, m_iFromIndex, m_iTo, m_iToIndex;
int id; //getting ID by physical index
};
private:
QVector<NodePrivate> m_Nodes;
QVector<EdgePrivate> m_Edges;
QVector<int> m_NodeById; //getting physical index by ID
QVector<int> m_EdgeById; //getting physical index by ID
Note that these new m_NodeById and m_EdgeById vectors grow when objects are created, but do not shrink when objects are deleted. So you'll have empty cells in these arrays, which will only be deallocated when you delete your graph. So you can use this solution only if you are sure that the total amount of nodes and edges created during graph's lifetime is relatively small, since you take 4 bytes of memory per each such object.
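A minimal sketch of how the two maps could be kept in sync during a swap-with-last removal is shown below. `StableStore` and its member names are hypothetical, it uses `std::vector` instead of `QVector`, and it keeps the ID in a parallel vector instead of inside the node or edge struct, purely to keep the example generic.

```cpp
#include <utility>
#include <vector>

// Illustrative container: payloads stay contiguous, handles hold stable IDs.
template <typename T>
class StableStore {
public:
    // Creates an item and returns its ID, which never changes afterwards.
    int create(T value) {
        const int id = static_cast<int>(indexById_.size());
        indexById_.push_back(static_cast<int>(items_.size()));
        items_.push_back(std::move(value));
        idByIndex_.push_back(id);
        return id;
    }

    // Removes the item with the given (still alive) ID using swap-with-last,
    // then repairs both maps for the element that was moved. O(1).
    void remove(int id) {
        const int index = indexById_[id];
        const int last = static_cast<int>(items_.size()) - 1;
        if (index != last) {
            items_[index] = std::move(items_[last]);
            idByIndex_[index] = idByIndex_[last];
            indexById_[idByIndex_[index]] = index;
        }
        items_.pop_back();
        idByIndex_.pop_back();
        indexById_[id] = -1; // this cell stays empty; IDs are never reused here
    }

    bool alive(int id) const { return indexById_[id] != -1; }
    T& get(int id) { return items_[indexById_[id]]; }
    int indexOf(int id) const { return indexById_[id]; }

private:
    std::vector<T> items_;
    std::vector<int> idByIndex_; // physical index -> ID
    std::vector<int> indexById_; // ID -> physical index, -1 if deleted
};
```

A handle then stores only the ID and goes through `get(id)` on each access, which is the indirect access cost mentioned earlier.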
Improving memory consumption
You might have already noticed the similarity between the new solution just presented and the shared_ptr-based solution you had. In fact, if we do not distinguish C pointers and array indices, then they are the same, except for: in your solution int-s are allocated in heap, but in the proposed solution int-s are allocated in a pool allocator.
A very well-known improvement to a no-free pool allocator is the technique known as 'free lists', and we can apply it to the solution described above. Instead of always assigning new IDs to created objects, we allow IDs to be reused. To achieve that, we store a stack of free IDs. When an object is removed, we add its ID to this stack. When a new object is created, we take an ID for it from the stack. If the stack is empty, we assign a new ID.
struct EdgePrivate
{
int m_iFrom, m_iFromIndex, m_iTo, m_iToIndex;
int id; //getting ID by physical index
};
private:
QVector<EdgePrivate> m_Edges;
QVector<int> m_EdgeById; //getting physical index by ID
QVector<int> m_FreeEdgeIds; //freelist: stack of IDs to be reused
This improvement makes sure that memory consumption is proportional to the maximum number of objects you ever had alive simultaneously (not the total number of objects created). But of course it increases memory overhead per object even further. It saves you from malloc/free cost, but you can have issues with memory fragmentation for large graphs after many operations.
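The ID-reuse logic on its own is small. Here is a sketch, with `IdAllocator` as a hypothetical name and `std::vector` standing in for `QVector`:

```cpp
#include <vector>

// Illustrative ID allocator: hands out a fresh sequential ID unless a
// previously released ID is waiting on the free stack.
class IdAllocator {
public:
    int acquire() {
        if (!freeIds_.empty()) {
            const int id = freeIds_.back();
            freeIds_.pop_back();
            return id;
        }
        return nextId_++;
    }

    // Makes `id` available for reuse by a later acquire().
    void release(int id) { freeIds_.push_back(id); }

    // Number of distinct IDs ever handed out, i.e. the size that an
    // ID -> physical index table needs.
    int capacity() const { return nextId_; }

private:
    std::vector<int> freeIds_; // stack of IDs available for reuse
    int nextId_ = 0;
};
```

One consequence worth keeping in mind: once an ID is reused, a stale handle that still holds it will silently refer to the new object, so handles to deleted objects must not be used afterwards.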
I'm coding a chess engine in C++ and I'm currently working on move generation. I'm confused as to how I should be storing moves as they are generated. I'm relatively new to C++, but is there some sort of dynamic object that I can use to store moves as they come (since I cannot know how many there are)?
C++ has many containers; depending on the situation you can use a std::vector or something else.
Choosing a container requires more information about your chess engine (how often it will be resized, whether moves can be added at both the front and the back of the container, etc.), so we cannot give you a definitive answer from the information you gave.
Please take a look at this question to decide which one best suits your case.
You're looking for something like an std::vector - a template that represents a collection whose size changes dynamically:
Vectors are sequence containers representing arrays that can change in size.
Just like arrays, vectors use contiguous storage locations for their elements, which means that their elements can also be accessed using offsets on regular pointers to its elements, and just as efficiently as in arrays. But unlike arrays, their size can change dynamically, with their storage being handled automatically by the container.
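As a small illustration of collecting moves in a `std::vector`, here is a toy generator. The `Move` struct, the 0..63 square numbering, and `generateRookMoves` are all made up for this example and are not taken from any particular engine.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical move representation: squares are numbered 0..63.
struct Move {
    std::uint8_t from;
    std::uint8_t to;
};

// Toy generator: a rook on an otherwise empty board, just to show how moves
// are appended without knowing their count in advance.
std::vector<Move> generateRookMoves(int square) {
    std::vector<Move> moves;
    moves.reserve(14); // optional: a rook on an empty board has 14 moves
    const int rank = square / 8;
    const int file = square % 8;
    for (int f = 0; f < 8; ++f) {
        if (f != file) {
            moves.push_back({static_cast<std::uint8_t>(square),
                             static_cast<std::uint8_t>(rank * 8 + f)});
        }
    }
    for (int r = 0; r < 8; ++r) {
        if (r != rank) {
            moves.push_back({static_cast<std::uint8_t>(square),
                             static_cast<std::uint8_t>(r * 8 + file)});
        }
    }
    return moves;
}
```

The `reserve` call is only a hint that avoids reallocations when you have a rough upper bound; `push_back` works correctly without it.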
Many chess engines rely heavily on recursion. When you think, say, 5 moves (5 ply) ahead, you're actually making 5 nested recursive calls. When you enter a call, the local variables of that function invocation are stored on the stack. So theoretically it would suffice to have a local "chessboard", e.g. an array of fields, each holding a piece (or being empty), since all the chessboards would be retained on the stack automatically until their function invocations return. Since stack space is usually limited, you could also have each invocation (stack frame) hold only a pointer to a piece of heap memory, allocating it when you enter the function and deallocating it when you leave again. Each function invocation returns to its caller the "score" (cumulative value) of the combinations at a "deeper" recursion level, using the usual piece values (pawn = 1, queen = 9, etc.). Instead of allocating separate boards you could store them in a vector. The advantage is that your memory is less likely to get fragmented. Each invocation can then, e.g., hold the index of its chessboard state in the vector.
I have a small class Entity with some int fields and a field that is a two-dimensional array of 50 ints. Nothing special.
I generate a lot of these entities (millions), and each one is different: the array differs and the fields differ.
To my surprise, I found that it is more than 2x faster not to create a new entity each time, but to reuse an existing one and just set its
fields and array to 0. Is memory initialization/deletion really that time-consuming?
There is overhead associated with the memory management of objects. This can result in slowdowns.
The best way to know is to time it, as you have done.
Sometimes it won't bother you, other times, you will be very sensitive to it.
Think about which loop would be faster:
while (/* not done */) {
Ask system for memory
Create object
Write into object
Destroy object
}
or
while (/* not done */) {
Write into object
}
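In C++ terms, the two loops could look roughly like the sketch below. The `Entity` layout (5 x 10 ints) and the function names are assumptions made for illustration; the question does not show the real class or even name the language.

```cpp
#include <array>
#include <memory>

struct Entity {
    int id = 0;
    int score = 0;
    std::array<std::array<int, 10>, 5> grid{}; // assumed 5 x 10 layout, 50 ints

    // Puts the object back into its freshly constructed state.
    void reset() { *this = Entity{}; }
};

// Variant 1: allocate, fill, and destroy a new entity on every iteration.
long long sumWithAllocation(int count) {
    long long total = 0;
    for (int i = 0; i < count; ++i) {
        auto e = std::make_unique<Entity>();
        e->id = i;
        e->grid[0][0] = i;
        total += e->id + e->grid[0][0];
    }
    return total;
}

// Variant 2: reuse a single entity and reset it on every iteration.
long long sumWithReuse(int count) {
    long long total = 0;
    Entity e;
    for (int i = 0; i < count; ++i) {
        e.reset();
        e.id = i;
        e.grid[0][0] = i;
        total += e.id + e.grid[0][0];
    }
    return total;
}
```

Both variants compute the same result; only the first pays for an allocation and deallocation per iteration. How large the difference is depends on the allocator, the language runtime, and what the optimizer does, so timing both on your own workload remains the reliable test.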
I'm writing a little arcade-like game in C++ (a multidirectional 2d space shooter) and I'm finishing up the collision detection part.
Here's how I organized it (I just made it up so it might be a shitty system):
Every ship is composed of circular components - the number of components in each ship is sort of arbitrary (more components, more CPU cycles). I have a maxComponentDistance which I calculate upon creation of the ship, which is basically the longest line I can draw from the center of the ship to the edge of the furthest component. I keep track of stuff onscreen and use this maxComponentDistance to see if they're even close enough to be colliding.
If they are in close proximity I start checking to see if the components of different ships intersect. Here is where my efficiency question comes in.
I have a (x,y) locations of the component relative to the ship's center, but it doesn't account for how the ship is currently rotated. I keep them relative because I don't want to have to recalculate components every single time the ship moves. So I have a little formula for the rotation calculation and I return a 2d-vector corresponding to rotation-considerate position relative to the ships center.
The collision detection is in the GameEngine and it uses the 2d-vector. My question is about the return types. Should I just create and return a 2d-vector object every time that function is called
or
should I give that component object an additional private 2d-vector variable, edit the private variable when the function is called, and return a pointer to that object?
I'm not sure about the efficiency of memory allocation vs having a permanent, editable, private variable. I know that memory would also have to be allocated for the private variable, but not every time it was checked for collisions, only when a new component was created. Components are not constant in my environment as they are deleted when the ship is destroyed.
That's my main dilemma. I would also appreciate any pointers with the design of my actual collision detection system. It's my first time giving a hack at it (maybe should have read up a bit)
Thanks in advance.
You should absolutely try to avoid doing memory allocations for your component-vector on each call to the getter-function. Do the allocation as seldom as possible, instead. For instance, you could do it when the component composition of the ship changes, or even more seldom (by over-allocating).
You could of course also investigate memory pools, where you pre-allocate lots of such components and put in a pool, so you can allocate a new component in constant time.
As a general (and apologies if it's too obvious) point when doing this kind of collision-detection: square the distances, rather than computing the square roots. :)
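The squared-distance tip can be sketched like this for two circular components. `Circle` and `circlesOverlap` are illustrative names, not taken from the question's code.

```cpp
struct Circle {
    double x;
    double y;
    double radius;
};

// True when the two circles overlap or touch. Both sides of the comparison
// are squared, so no square root is needed.
inline bool circlesOverlap(const Circle& a, const Circle& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double reach = a.radius + b.radius;
    return dx * dx + dy * dy <= reach * reach;
}
```

The same comparison works for the coarse ship-level check by using each ship's maxComponentDistance as the radius.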
If your 2D vector is just:
struct Vector2D { double x, y; };
Then by all means return it! E.g:
Vector2D function( ... );
Or fill in an output parameter supplied by the caller:
void function( Vector2D * theReturnedVector2D, ... );
Avoid at all costs:
vector<double> function(...);
The constant heap allocation/deallocation inherent to std::vector is a mistake here!
Copying your own Vector2D class is very cheap, computationally speaking. Unlike std::vector<>, your own Vector2D class can incorporate whatever methods you like.
I've used this feature in the past to incorporate methods such as distanceToOtherPointSquared(), scanfFromCommandLineArguments(), printfNicelyFormatted(), and operator[](int).
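As a sketch of what returning such a small struct by value might look like for the rotation described in the question: the function names and the angle convention (radians, counter-clockwise) below are assumptions, since the question does not show its formula.

```cpp
#include <cmath>

struct Vector2D {
    double x;
    double y;
};

// Rotates a component's ship-relative offset by the ship's heading (radians,
// counter-clockwise) and returns the result by value. No heap allocation.
inline Vector2D rotated(const Vector2D& offset, double angle) {
    const double c = std::cos(angle);
    const double s = std::sin(angle);
    return Vector2D{offset.x * c - offset.y * s, offset.x * s + offset.y * c};
}

// World position of a component: ship center plus the rotated offset.
inline Vector2D componentPosition(const Vector2D& center, const Vector2D& offset,
                                  double angle) {
    const Vector2D r = rotated(offset, angle);
    return Vector2D{center.x + r.x, center.y + r.y};
}
```

The stored offsets never change; only the temporary result depends on the current rotation, so nothing has to be recalculated and cached when the ship moves.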
or should I give that component object an additional private 2d-vector variable, edit the private variable when the function is called, and return a pointer to that object?
Watch out for multiple function calls invalidating previous data. That's a recipe for disaster!
You can start by just returning a vector, and benchmark it. Who knows, it could be fast enough. With a profiler you can even see what part takes the run time.
You can use a memory pool to reuse vectors and reduce copying.
You can try the Flyweight pattern for the coordinates to reduce copying and allocating, if they repeat throughout the engine.
Keeping the data in the component is a good way to reduce allocations, but introduces some gotchas into your design, like that whoever uses the vector depends on the lifecycle of the component. A memory pool is probably better.
Do not use a 2D vector. Rather, use a vector of points. Likewise for your collision detection. Using a 2D vector here is just the wrong data structure.
Depending on the content of the function, the compiler will be able to perform NRVO (that is, named return value optimization) which means that in the optimal case, returning a vector has no overhead, i.e. it's never copied. However, this only happens when you use the return value of your function to initialize a new instance, and when the compiler is able to trace the execution paths inside the function and see that for each return path, the same object is returned. Consider the following two:
vector<int> f(int baz) {
vector<int> ret;
if (baz == 42)
ret.push_back(42);
return ret;
}
vector<int> g(int baz) {
if (baz == 42)
return vector<int>(1, 42);
else
return vector<int>();
}
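The compiler can perform NRVO for calls to f. In g there is no named object, so NRVO does not apply there; the temporaries that g returns can still be elided through plain (unnamed) return value optimization, which is guaranteed since C++17.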
There's a big difference between allocating memory on the heap and on the stack. Allocating on the heap, e.g., using new/delete or malloc/free, is comparatively slow. Allocating on the stack is really quite fast. With the stack, usually the slow part is copying your object. So watch out for std::vector and the like, but returning simple structures is probably OK.