Getting Unique Numbers and Knowing When They're Freed - c++

I have a physics simulation (using Box2D) where bodies with identical integer IDs do not collide, for instance, bodies that belong to the same character. I have a problem though in that I need to be able to get a unique number for each possible entity, so that no two characters accidentally get the same ID. There's a finite number of bodies, but they are created and destroyed as the simulation dictates, so it's necessary to free unique IDs once the body they belonged to is gone.
A class World is responsible for creating and destroying all bodies, and is also the entity that manages the unique number generation, and anything else where physics simulation is concerned.
I thought of two methods so far but I'm not sure which would be better, if either of them at all:
Keep a vector<short>, with the data being the number of references floating around, and the position in the vector being the ID itself. This method has the disadvantage of creating unneeded complexity when coding entities that manipulate group IDs, since they would need to ensure they tell the World how many references they're taking out.
Keep a vector<bool>, with the data being if that ID is free or not, and the position in the vector being the ID itself. The vector would grow with every new call for a unique ID, if there exist no free slots. The disadvantage is that once the vector reaches a certain size, an audit of the entire simulation would need to be done, but has the advantage of entities being able to grab unique numbers without having to help manage reference counting.
What do you folks think, is there a better way?

You could maintain a "free" list of unused IDs as a singly linked list inside your master World object.
When an object is destroyed by World (making its ID unused) you could push that ID onto the head of the free list.
When you are creating a new object you could do the following:
If the free list is non-empty: pop the head item and take that ID.
Else increment a global ID counter and assign its current value.
While you could still run out of IDs (if you simultaneously had more objects than the max value of your counter), this strategy will allow you to recycle IDs, and to do everything with O(1) runtime complexity.
EDIT: As per @Matthieu's comments below, a std::deque container could also be used to maintain the "free" list. This container also supports the push_front and pop_front operations with O(1) complexity.
Hope this helps.
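A minimal sketch of the free-list idea described above. The class name IdAllocator and its member names are made up for illustration, and std::forward_list stands in for the singly linked list; in the question's design this logic would live inside World.

```cpp
#include <cassert>
#include <forward_list>

// Hands out integer IDs and recycles the ones that are released.
class IdAllocator {
    std::forward_list<int> free_;  // IDs that were released and can be reused
    int next_ = 0;                 // next never-used ID
public:
    // O(1): reuse a released ID if one exists, otherwise mint a new one.
    int acquire() {
        if (!free_.empty()) {
            int id = free_.front();
            free_.pop_front();
            return id;
        }
        return next_++;
    }

    // O(1): mark an ID as reusable. The caller must not release an ID twice.
    void release(int id) {
        assert(id >= 0 && id < next_);
        free_.push_front(id);
    }
};
```

World would call acquire() when it creates a character and release() when it destroys the last body using that ID. Note that this sketch does not detect a double release; that check would need extra bookkeeping.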

How many bodies are there? Is it realistic that you'd ever run out of integers if you didn't reassign them? The simplest solution is to just have one integer storing the next ID -- you would increment this integer when you assign a new ID to a body.

Related

Moving values between lockfree lists

Background
I am trying to design and implement a lock-free hashmap using the chaining method in C++. Each hash table cell is supposed to contain a lock-free list. To enable resizing, my data structure is supposed to contain two arrays: a small one which is always available, and a bigger one for resizing when the smaller one is no longer sufficient. When the bigger one is created, I would like the data stored in the small one to be transferred to the bigger one, one by one, whenever any thread does something with the data structure (adds an element, searches, or removes one). When all data is transferred, the bigger array is moved in place of the smaller one and the latter is deleted. The cycle repeats whenever the array needs to be enlarged.
Problem
As mentioned before, each array is supposed to contain lists in its cells. I am trying to find a way to transfer a value or node from one lock-free list to another in a manner that keeps the value visible in either (or both) of the lists. This is needed to ensure that a search in the hash map won't give the user false negatives. So my questions are:
Is such lockfree list implementation possible?
If so, what would be the general concept of such list and "moving node/value" operation? I would be thankful for any pseudocode, C++ code or scientific article describing it.
To be able to resize the array, while maintaining the lock-free progress guarantees, you will need to use operation descriptors. Once the resize starts, add a descriptor that contains references to the old and the new arrays.
On any operation (add, search, or remove):
Add: search the old array first. If the element already exists there, move it to the new array before returning. Indicate, with a descriptor or a special null value, that the element has already been moved so that other threads don't attempt the move again.
Search: search the old array and move the element as indicated above.
Remove: this too has to search the old array first.
Now the problem is that you will have a thread that has to verify that the move is complete, so that you can remove the descriptor and free up the old array. To maintain lock-freedom, you will need to have all active threads attempt to do this validation, thus it becomes very expensive.
You can look at:
https://dl.acm.org/citation.cfm?id=2611495
https://dl.acm.org/citation.cfm?id=3210408

Which data structure is sorted by insertion and has fast "contains" check?

I am looking for a data structure that preserves the order in which the elements were inserted and offers a fast "contains" predicate. I also need iterator and random access. Performance during insertion or deletion is not relevant. I am also willing to accept overhead in terms of memory consumption.
Background: I need to store a list of objects. The objects are instances of a class called Neuron and stored in a Layer. The Layer object has the following public interface:
class Layer {
public:
    Neuron *neuronAt(const size_t &index) const;
    NeuronIterator begin();
    NeuronIterator end();
    bool contains(const Neuron *const &neuron) const;
    void addNeuron(Neuron *const &neuron);
};
The contains() method is called quite often when the software runs; I've confirmed that using callgrind. I tried to circumvent some of the calls to contains(), but it is still a hot spot. Now I hope to optimize exactly this method.
I thought of using std::set, using the template argument to provide my own comparator struct. But the Neuron class itself does not give its position in the Layer away. Additionally, I'd like to have *someNeuronIterator = anotherNeuron to work without screwing up the order.
Another idea was to use a plain old C array. Since I do not care about the performance of adding a new Neuron object, I thought I could make sure that the Neuron objects are always stored contiguously in memory. But that would invalidate the pointer I pass to addNeuron(); at least I'd have to change it to point to the new copy I created to keep things contiguous. Right?
Another idea was to use two data structures in the Layer object: A vector/list for the order, and a map/hash for lookup. But that would contradict my wish for an iterator that allowed operator* without a const reference, wouldn't it?
I hope somebody can hint an idea for a data structure or a concept that would satisfy my needs, or at least give me an idea for an alternative. Thanks!
If this contains check is really where you need the fastest execution, and assuming you can be a little intrusive with the source code, the fastest way to check if a Neuron belongs in a layer is to simply flag it when you insert it into a layer (ex: bit flag).
You have guaranteed O(1) checks at that point to see if a Neuron belongs in a layer and it's also fast at the micro-level.
If there can be numerous layer objects, this can get a little trickier, as you'll need a separate bit for each potential layer a neuron can belong to, unless a Neuron can only belong to a single layer at once. This is reasonably manageable, however, if the number of layers is relatively fixed.
In the latter case, where a Neuron can only belong to one layer at once, all you need is a backpointer of type Layer*. To see if a Neuron belongs to a layer, simply check whether that backpointer points to the layer object.
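A minimal sketch of the backpointer check, assuming each Neuron belongs to at most one Layer. The member names owner_ and neurons_ are invented for illustration, and the method signatures are simplified compared with the interface in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Layer;

class Neuron {
    Layer* owner_ = nullptr;  // backpointer; null while the neuron is in no layer
    friend class Layer;
};

class Layer {
    std::vector<Neuron*> neurons_;  // preserves insertion order
public:
    void addNeuron(Neuron* neuron) {
        assert(neuron->owner_ == nullptr);  // assumption: one layer per neuron
        neuron->owner_ = this;
        neurons_.push_back(neuron);
    }

    // O(1): a single pointer comparison, no search through neurons_.
    bool contains(const Neuron* neuron) const { return neuron->owner_ == this; }

    Neuron* neuronAt(std::size_t index) const { return neurons_[index]; }
};
```

The vector keeps insertion order and random access, while contains() never touches it.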
If a Neuron can belong to multiple layers at once, but not too many at one time, then you could store like a little array of backpointers like so:
struct Neuron
{
    ...
    Layer* layers[4]; // use whatever small size that usually fits the common case
    Layer* ptr;
    int num_layers;
};
Initialize ptr to point to layers if there are 4 or fewer layers to which the Neuron belongs. If there are more, allocate it on the free store. In the destructor, free the memory if ptr != layers. You can also optimize away num_layers if the common case is like 1 layer, in which case a null-terminated solution might work better. To see if a Neuron belongs to a layer, simply do a linear search through ptr. That's practically constant-time complexity with respect to the number of Neurons provided that they don't belong in a mass number of layers at once.
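One way the small-array idea might look when filled in. This is a sketch under assumptions: the names inline_, capacity_, addLayer and belongsTo are hypothetical, the pointer is declared as Layer** so it can address an array of Layer*, and copying is disabled to keep the ownership of the heap block simple.

```cpp
#include <algorithm>
#include <cassert>

struct Layer {};

class Neuron {
    static constexpr int kInline = 4;  // small size that fits the common case
    Layer* inline_[kInline] = {};      // storage used while the neuron is in few layers
    Layer** ptr_ = inline_;            // points at inline_ or at a heap array
    int count_ = 0;
    int capacity_ = kInline;
public:
    Neuron() = default;
    Neuron(const Neuron&) = delete;
    Neuron& operator=(const Neuron&) = delete;
    ~Neuron() { if (ptr_ != inline_) delete[] ptr_; }

    void addLayer(Layer* layer) {
        if (count_ == capacity_) {  // inline storage (or heap block) is full: grow
            int newCapacity = capacity_ * 2;
            Layer** bigger = new Layer*[newCapacity];
            std::copy(ptr_, ptr_ + count_, bigger);
            if (ptr_ != inline_) delete[] ptr_;
            ptr_ = bigger;
            capacity_ = newCapacity;
        }
        ptr_[count_++] = layer;
    }

    // Linear search over a handful of pointers.
    bool belongsTo(const Layer* layer) const {
        return std::find(ptr_, ptr_ + count_, layer) != ptr_ + count_;
    }
};
```

As long as a neuron sits in four or fewer layers, belongsTo() only reads memory inside the Neuron object itself.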
You can also use a vector here but you might reduce cache hits on those common case scenarios since it'll always put its contents in a separate block, even if the Neuron only belongs to like 1 or 2 layers.
This might be a bit different from what you were looking for with a general-purpose, non-intrusive data structure, but if your performance needs are really skewed towards these kinds of set operations, an intrusive solution is going to be the fastest in general. It's not quite as pretty and couples your element to the container, but hey, if you need max performance...
Another idea was to use a plain old C array. Since I do not care about the performance of adding a new Neuron object, I thought I could make sure that the Neuron objects are always stored linear in memory. But that would invalidate the pointer I pass to addNeuron(); [...]
Yes, but it won't invalidate indices. While not as convenient to use as pointers, if you're working with mass data like vertices of a mesh or particles of an emitter, it's common to use indices here to avoid the invalidation and possibly to save an extra 32 bits per entry on 64-bit systems.
Update
Given that Neurons only exist in one Layer at a time, I'd go with the back pointer approach. Seeing if a neuron belongs to a layer becomes a simple matter of checking if the back pointer points to the same layer.
Since there's an API involved, I'd suggest, just because it sounds like you're pushing around a lot of data and have already profiled it, that you focus on an interface which revolves around aggregates (layers, e.g.) rather than individual elements (neurons). It'll just leave you a lot of room to swap out underlying representations when your clients aren't performing operations at the individual scalar element-type interface.
With the O(1) contains implementation and the unordered requirement, I'd go with a simple contiguous structure like std::vector. However, you do expose yourself to potential invalidation on insertion.
Because of that, if you can, I'd suggest working with indices here. However, that becomes a little unwieldy since it requires your clients to store both a pointer to the layer to which a neuron belongs and its index (though if you do this, the backpointer becomes unnecessary as the client is tracking where things belong).
One way to mitigate this is to simply use something like std::vector<Neuron*> or ptr_vector if available. However, that can expose you to cache misses and heap overhead, and if you want to optimize that, this is where the fixed allocator comes in handy. However, that's a bit of a pain with alignment issues and a bit of a research topic, and so far it seems like your main goal is not to optimize insertion or sequential access quite as much as this contains check, so I'd start with the std::vector<Neuron*>.
You can get an O(1) contains check and O(1) insert while preserving insertion order. If you are using Java, look at LinkedHashMap. If you are not using Java, look at LinkedHashMap anyway and find an equivalent data structure in your language, or implement it yourself.
It's just a hashmap combined with a doubly linked list. The linked list preserves order and the hashmap allows O(1) access. So when you insert an element, it makes an entry with the key, and the map will point to a node in the linked list where your data will reside. To look up, you go to the hash table to find the pointer directly to your linked list node (not the head), and get the value in O(1). To access them sequentially, you just traverse the linked list.
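A rough C++ counterpart of that hashmap-plus-linked-list combination, built from std::list and std::unordered_map. The class name InsertionOrderedSet is made up for this sketch, and it assumes the element type is hashable (pointers such as Neuron* are).

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <unordered_map>

template <typename T>
class InsertionOrderedSet {
    std::list<T> order_;  // elements in insertion order
    std::unordered_map<T, typename std::list<T>::iterator> index_;  // element -> its list node
public:
    // Average O(1). Returns false if the element was already present.
    bool insert(const T& value) {
        if (index_.count(value) != 0) return false;
        order_.push_back(value);
        index_.emplace(value, std::prev(order_.end()));
        return true;
    }

    // Average O(1) membership test.
    bool contains(const T& value) const { return index_.count(value) != 0; }

    // Average O(1): the map gives the list node directly, so no traversal is needed.
    bool erase(const T& value) {
        auto it = index_.find(value);
        if (it == index_.end()) return false;
        order_.erase(it->second);
        index_.erase(it);
        return true;
    }

    typename std::list<T>::const_iterator begin() const { return order_.begin(); }
    typename std::list<T>::const_iterator end() const { return order_.end(); }
};
```

Iteration visits elements in insertion order. A linked list gives no random access, though, so if neuronAt(index) matters, a std::vector for the order plus the same kind of map for lookup may be a better fit.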
A heap sounds like it could be useful to you. It's like a tree, but the newest element is always inserted at the top, and then works its way down based on its value, so there is a method to quickly check if it's there.
Otherwise, you could store a hash table (quick method to check if the neuron is contained in the table) with the key for the neuron, and values of: the neuron itself, and the timestep upon which the neuron was inserted (to check its chronological insertion time).

Arrays vs. Doubly Linked Lists for Queue simulation

I am working on an assignment for school simulating a line with students and multiple windows open at the Registrar's Office.
I got the queue for the students down but it was suggested by someone that I use an array for the windows implementing our queue class we made on our own.
I don't understand why an array would work when there are other variables I want to know about each window besides just the student time decrementing.
I'm just looking for some direction, or a more in-depth explanation, on how it's possible to use an array to store just the time each student spends at the window, as opposed to using another doubly linked list.
The way I see it you've got a variable number of students and a fixed number of windows (buildings don't usually change all that often). If I were to make a representation of this in code I would use a dynamically sized container (a list, vector, queue, etc.) to contain all the students and a fixed-size array for the registers. This would embody the intent of the real situation in code, making it less likely that someone else using your code makes any mistakes related to the size of the Registrar's Office. Often choosing a container type is all about its intended use!
Thus you can design a class to hold all the registers using a fixed-size array (or even nicer: a template-dictated size, seeing as you're using C++). Then you can write all your other Registrar-related functions using the given size argument and thus never go out of bounds in your Registrar array.
Lastly: an array holds whatever information you want it to hold. You can have it hold only numbers (like int) but you can also have it hold objects of any type! What I mean to say is: create a Registrar class that holds all the information you want to collect for every individual Registrar. Then create an array that holds Registrar objects. Then whenever you access an individual element in the array you can access all the information of the individual Registrar through that single reference.
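To make that concrete, here is a small sketch of a fixed-size array of window objects. The struct Window, its fields, and the class RegistrarOffice are all hypothetical; the assignment will dictate which statistics actually need tracking.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Everything we want to know about one window, not just the remaining time.
struct Window {
    bool occupied = false;
    int timeRemaining = 0;   // minutes left for the current student
    int totalIdleTime = 0;   // minutes this window has spent empty
    int studentsServed = 0;
};

template <std::size_t N>
class RegistrarOffice {
    std::array<Window, N> windows_{};  // the number of windows is fixed at compile time
public:
    // Index of the first free window, or N if every window is busy.
    std::size_t firstFreeWindow() const {
        for (std::size_t i = 0; i < N; ++i)
            if (!windows_[i].occupied) return i;
        return N;
    }

    void seat(std::size_t i, int minutes) {
        assert(i < N && !windows_[i].occupied && minutes > 0);
        windows_[i].occupied = true;
        windows_[i].timeRemaining = minutes;
    }

    // Advance the simulation by one minute.
    void tick() {
        for (Window& w : windows_) {
            if (!w.occupied) { ++w.totalIdleTime; continue; }
            if (--w.timeRemaining == 0) { w.occupied = false; ++w.studentsServed; }
        }
    }

    const Window& window(std::size_t i) const { return windows_.at(i); }
};
```

The student line stays in your own queue class; on each tick you would move students from the front of the queue to whatever firstFreeWindow() returns.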

Determining the best ADT for a priority queue with changeable elements (C++)

First post here and I'm a beginner - hope I'm making myself useful...
I'm trying to find and understand the ADT/concept that does the job I'm after. I'm guessing it's already out there.
I have an array/list/tree (container to be decided) of objects, each of which has a count associated with how long it hasn't been used over iterations of a process. As iterations proceed, the count for each object increases by 1. The idea is that sooner or later I'm going to need the memory that any unused objects are using, so I'll delete them to make space for an object not in RAM (which will have an initial count of '0'). But if it turns out that I use an object that is still in memory, its count is reset to '0', and I pat myself on the back for not having had to access the disk for its contents.
A cache?
The main process loop would have something similar to the following in it:
if (object needs to be added && (totalNumberOfObjects > someConstant))
object with highest count deleted from RAM and the (heap??)
newObject added with a count of '0'
if (an object already in RAM is accessed by the process)
accessedObject count is set to '0'
for (All objects in RAM)
count++
I could bash about for a (long and buggy) time and build my own mess, but I thought it'd be interesting to learn the most efficient way from the word go.
Something like a heap?
You could use a heap for this, but I think it would be overkill. It sounds like you're not going to have a lot of different values for the counts, and you'll have a lot of objects with each count. If that's true, then you only need to thread the objects onto a list of objects with the same count. These lists are themselves arranged in a dequeue (or 'deque' as C++ insists on calling it).
The key here is that you need to increment the count of all objects, and presumably you want that to be O(1) if possible, rather than O(N). And it is possible: the key is that each list's header contains also the difference of its count from the next smaller count. The header of the list with the smallest count contains a delta from 0, which is the smallest count. To increment the count of all objects, you only have to increase this single number by one.
To set an object's count to 0, you remove the object from its list (which means you always need to refer to objects by their list iterator, or you need to implement your own intrusive linked list), and either (1) add it to the bottom list, if that list has a count of 0, or (2) create a new bottom list with a count of 0 containing only that object.
The procedure for creating a new object is the same, except that you don't have to unlink it from its current list.
To evict an object from memory, you choose the object at the head of the top list (which is the list with the largest count). If that list becomes empty, you pop it off the dequeue. If you need more memory, you can repeat this operation.
So all operations, including "increment all counts", are O(1). Unfortunately, the storage overhead is two pointers per object, plus two pointers and an integer per unique count (at worst, this is the same as the number of objects, but presumably in practice it's much less). Since it's hard to imagine any other algorithm which uses less than one pointer plus a count for each object, this is probably not even a space-time tradeoff; the additional space requirements are minimal.
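A sketch of the structure described above, under a few assumptions: objects are identified by string keys, the class and member names (AgingCache, touch, tick, evictOldest, countOf) are invented, and the outer sequence is a std::list rather than a deque so that a bucket emptied in the middle can be removed without invalidating the iterators held by other objects. The hash map from key to position is also an addition for convenience.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

class AgingCache {
    struct Bucket {
        long delta;                   // this bucket's count minus the next smaller bucket's count
        std::list<std::string> keys;  // every object that currently has this count
    };
    struct Slot {
        std::list<Bucket>::iterator bucket;
        std::list<std::string>::iterator pos;
    };

    std::list<Bucket> buckets_;  // front = smallest count, back = largest count
    std::unordered_map<std::string, Slot> index_;

    // Remove an object from its bucket; fold an emptied bucket's delta into the next one.
    void unlink(const Slot& s) {
        s.bucket->keys.erase(s.pos);
        if (s.bucket->keys.empty()) {
            auto next = std::next(s.bucket);
            if (next != buckets_.end()) next->delta += s.bucket->delta;
            buckets_.erase(s.bucket);
        }
    }
public:
    // Insert a new object, or reset an existing object's count to 0.
    void touch(const std::string& key) {
        auto it = index_.find(key);
        if (it != index_.end()) unlink(it->second);
        if (buckets_.empty() || buckets_.front().delta != 0)
            buckets_.push_front(Bucket{0, {}});
        Bucket& bottom = buckets_.front();
        bottom.keys.push_back(key);
        index_[key] = Slot{buckets_.begin(), std::prev(bottom.keys.end())};
    }

    // Increment every object's count by bumping a single number.
    void tick() {
        if (!buckets_.empty()) ++buckets_.front().delta;
    }

    // Evict one object with the largest count; returns false if the cache is empty.
    bool evictOldest(std::string& out) {
        if (buckets_.empty()) return false;
        Bucket& top = buckets_.back();
        out = top.keys.front();
        index_.erase(out);
        top.keys.pop_front();
        if (top.keys.empty()) buckets_.pop_back();
        return true;
    }

    // Diagnostic only: walks the buckets, so it is O(number of distinct counts).
    long countOf(const std::string& key) const {
        std::list<Bucket>::const_iterator target = index_.at(key).bucket;
        long count = 0;
        for (auto it = buckets_.begin(); it != buckets_.end(); ++it) {
            count += it->delta;
            if (it == target) return count;
        }
        return -1;  // not reached for keys present in index_
    }
};
```

In the main loop from the question, touch() covers both "newObject added" and "accessedObject count is set to 0", tick() replaces the for loop over all objects, and evictOldest() picks the victim when the object limit is exceeded.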

Efficient way to organize used and unused elements in a large concurrent array

I have about 18 million elements in an array that are initialized and ready to be used by a simple manager called ElementManager (this number will climb to a little more than a billion in later iterations of the program). A class, A, which must use the elements, communicates with ElementManager, which returns the next available element for consumption. That element is now in use and cannot be reused until recycled, which may happen often. Class A is concurrent, that is, it can ask ElementManager for an available element from several threads. Each element in this case is an object that stores three vertices to make a triangle.
Currently, the ElementManager is using Intel TBB concurrent_bounded_queue called mAllAvailableElements. There is also another container (a TBB concurrent_vector) that contains all elements, regardless of whether they are available for use or not, called mAllElements. Class A asks for the next available element, the manager tries to pop the next available element from the queue. The popped element is now in use.
Now when class A has done what it has to do, control is handed to class B which now has to iterate through all elements that are in use and create meshes (to take advantage of concurrency, the array is split into several smaller arrays to create submeshes which scales with the number of available threads - the reason for this is that creating a mesh must be done serially). For this I am currently iterating over the container mAllElements (this is also concurrent) and grabbing any element that is in use. The elements, as mentioned above, contain polygonal information to create meshes. Iteration in this case takes a long time as it has to check each element and query whether it is in use or not, because if it is not in use then it should not be part of a mesh.
Now imagine that only 1 million out of the possible 18 million elements are in use (but more than 5-6 million have been recycled). Worse yet, constant updates to only part of the mesh (which happen concurrently) mean the in-use elements are fragmented throughout the mAllElements container.
I thought about this for quite some time now and one flawed solution that I came up with was to create another queue of elements named mElementsInUse, which is also a concurrent_queue. I can push any element that is now in use. Problem with this approach is that since it is a queue, any element in that queue can be recycled at any time (an update in a part of the mesh) and declared not in use and since I can only pop the front element, this approach fails. The only other approach I can think of is to defragment the concurrent_vector mAllElements every once in a while when no operations are taking place.
I think my approach to this problem is wrong and thus my post here. I hope I explained the problem in enough detail. It seems like a common memory management problem, but I cannot come up with any search terms to search for it.
How about using a bit vector to indicate which of your elements are in use? It's easy to partition it for parallel processing when building your full mesh, and you can use atomic operations on words in the vector and thus avoid locks.
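A sketch of what such a bit vector could look like with standard C++ atomics. The class name InUseBits and its methods are invented for illustration and are not part of TBB; bit i corresponds to element i of mAllElements.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class InUseBits {
    std::vector<std::atomic<std::uint64_t>> words_;  // 64 elements per word
public:
    explicit InUseBits(std::size_t elementCount) : words_((elementCount + 63) / 64) {
        for (auto& w : words_) w.store(0, std::memory_order_relaxed);
    }

    std::size_t wordCount() const { return words_.size(); }

    // Atomically mark element i as in use. Returns true if this call claimed it.
    bool tryAcquire(std::size_t i) {
        const std::uint64_t mask = std::uint64_t{1} << (i % 64);
        return (words_[i / 64].fetch_or(mask, std::memory_order_acq_rel) & mask) == 0;
    }

    // Atomically mark element i as free again.
    void release(std::size_t i) {
        const std::uint64_t mask = std::uint64_t{1} << (i % 64);
        words_[i / 64].fetch_and(~mask, std::memory_order_release);
    }

    bool inUse(std::size_t i) const {
        return (words_[i / 64].load(std::memory_order_acquire) >> (i % 64)) & 1u;
    }

    // Visit the in-use elements whose bits live in words [firstWord, lastWord).
    // Each worker thread can be given its own disjoint word range.
    template <typename Visitor>
    void forEachInUse(std::size_t firstWord, std::size_t lastWord, Visitor visit) const {
        for (std::size_t w = firstWord; w < lastWord; ++w) {
            std::uint64_t bits = words_[w].load(std::memory_order_acquire);
            for (std::size_t b = 0; bits != 0; ++b, bits >>= 1)
                if (bits & 1u) visit(w * 64 + b);
        }
    }
};
```

A word that is entirely zero is skipped with one load, so sparse usage (1 million of 18 million) costs far less than querying every element. The scan still sees a snapshot per word, so it should run while class A is not recycling elements, as in the hand-off to class B described in the question.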