Efficiently processing large number of unique elements (std::set vs other containers)

Efficiently processing large number of unique elements (std::set vs other containers) - c++

I have std::set having large number unique objects as its elements.
In the main thread of program:
I take some objects from the set
Assign data to be processed to each of them
Remove those objects from set
And finally pass the objects to threads in threadpool for processing
Once those threads finish processing objects, they adds them back to the set. (So that in the next iteration, main thread can again
assigns next batch of data to those objects for processing)
This arrangement works perfect. But if I encounter error while adding back object to the set (for example, std::set.insert() throws bad_alloc) then it all goes on toss.
If I ignore that error and proceed, then there is no way for the object to get back in the processing set and it remains out of the program flow forever causing memory leaks.
To address this issue I tried to not to remove object from set. Instead, have a member flag that indicates the object is 'being processed'. But in that case the problem is, main thread encounters 'being processed' objects again and again while iterating through all elements of set. And it badly hampers performance (Number of objects in set are quite large).
What are better alternatives here?
Can std::list be used instead of std::set? List will not have bad_alloc problem while adding back element, as it just needs to assign pointers while adding element to list. But how can we make list elements unique? If at all we achieve it, will it be efficient as std::set?
Instead of removing and adding back elements to the std::set, is there any way to move element to the start or end of the set? So that unprocessed objects and processed will accumulate together towards start and end of the set.
Any other solution please?

Related

Tracking elements for order in a std::map

I have a std::map like so: std::map<UINT32,USER_DEFINED_X> with a variable number N of elements. This map is part of an overall application that runs on a real time framework. The map contains elements such that it includes times for when certain activities are supposed to occur. During each frame, the map is scanned to see if any of those times match up with current time. There is one condition that needs to be checked though before processing the activities. I need to check to see if the element that is going to be processed is the first one in the list that is being processed. I am not sure how to do that. One approach I thought about using was to create another temporary map/array where I would store the element that has been processed in order, then get the order from that temporary array/map?
Does anybody know of a better way I can conduct this operation?

Moving values between lockfree lists

Background
I am trying to design and implement lock-free hashmap using chaining method in C++. Each hash table cell is supposed to contain lockfree list. To enable resizing, my data structure is supposed to contain two arrays - small one which is always available and a bigger one for resizing, when the smaller one is no longer sufficient. When the bigger one is created I would like the data stored in small one to be transfered to bigger one by one, whenever any thread does something with the data structure (adds element, searches or removes one). When all data is transfered, the bigger array is moved in place of smaller and the latter one is deleted. The cycle repeats whenever the array needs to be enlarged.
Problem
As mentioned before, each array is supposed to conatin lists in cells. I am trying to find a way to transfer a value or node from one lockfree list to another in such a manner that would keep value visible in any (or both) of the lists. It is needed to ensure that search in hash map won't give the user false negatives. So my questions are:
Is such lockfree list implementation possible?
If so, what would be the general concept of such list and "moving node/value" operation? I would be thankful for any pseudocode, C++ code or scientific article describing it.

To be able to resize the array, while maintaining the lock-free progress guarantees, you will need to use operation descriptors. Once the resize starts, add a descriptor that contains references to the old and the new arrays.
On any operation (add, search, or remove):
Add operation, search the old array, if the element already exists, then move the element to the new array before returning. Indicate, with a descriptor or a special null value that the element has already been moved so that other threads don't attempt the move again
Search, search the old array and move the element as indicated above.
Remove - Remove too will have to search the old array first.
Now the problem is that you will have a thread that has to verify that the move is complete, so that you can remove the descriptor and free up the old array. To maintain lock-freedom, you will need to have all active threads attempt to do this validation, thus it becomes very expensive.
You can look at:
https://dl.acm.org/citation.cfm?id=2611495
https://dl.acm.org/citation.cfm?id=3210408

Determining the best ADT for a priority queue with changeable elements (C++)

First post here and I'm a beginner - hope I'm making myself useful...
I'm trying to find and understand the ADT/concept that does the job I'm after. I'm guessing it's already out there.
I have an array/list/tree (container to be decided) of objects each of which has a count associated with how much it hasn't been used over iterations of a process. As iterations proceed the count for each object accumulates by 1. The idea is that sooner or later I'm going to need the memory that any unused objects are using so I'll delete them to make space for an object not in RAM (which will have an initial count of '0') - But, if it turns out that I use an object that is still in memory it's count is reset to '0', and I pat myself on the back for not having had to access the disk for its contents.
A cache?
The main process loop would have something similar to the following in it:
if (object needs to be added && (totalNumberOfObjects > someConstant))
object with highest count deleted from RAM and the (heap??)
newObject added with a count of '0'
if (an object already in RAM is accessed by the process)
accessedObject count is set to '0'
for (All objects in RAM)
count++
I could bash about for a (long and buggy time) and build my own mess, but I thought it'd be interesting to learn the most efficient way from word go.
Something like a heap?

You could use a heap for this, but I think it would be overkill. It sounds like you're not going to have a lot of different values for the counts, and you'll have a lot of objects with each count. If that's true, then you only need thread the objects onto a list of objects with the same count. These lists are themselves arranged in a dequeue (or 'deque' as C++ insists on calling it).
The key here is that you need to increment the count of all objects, and presumably you want that to be O(1) if possible, rather than O(N). And it is possible: the key is that each list's header contains also the difference of its count from the next smaller count. The header of the list with the smallest count contains a delta from 0, which is the smallest count. To increment the count of all objects, you only have to increase this single number by one.
To set an object's count to 0, you remove the object from its list (which means you always need to refer to objects by their list iterator, or you need to implement your own intrusive linked list), and either (1) add it to the bottom list, if that list has a count of 0, or (2) create a new bottom list with a count of 0 containing only that object.
The procedure for creating a new object is the same, except that you don't have to unlink it from its current list.
To evict an object from memory, you choose the object at the head of the top list (which is the list with the largest count). If that list becomes empty, you pop it off the dequeue. If you need more memory, you can repeat this operation.
So all operations, including "increment all counts", are O(1). Unfortunately, the storage overhead is two pointers per object, plus two pointers and an integer per unique count (at worst, this is the same as the number of objects, but presumably in practice it's much less). Since it's hard to imagine any other algorithm which uses less than one pointer plus a count for each object, this is probably not even a space-time tradeoff; the additional space requirements are minimal.

Getting Unique Numbers and Knowing When They're Freed

I have a physics simulation (using Box2D) where bodies with identical integer IDs do not collide, for instance, bodies that belong to the same character. I have a problem though in that I need to be able to get a unique number for each possible entity, so that no two characters accidentally get the same ID. There's a finite number of bodies, but they are created and destroyed as the simulation dictates, so it's necessary to free unique IDs once the body they belonged to is gone.
A class World is responsible for creating and destroying all bodies, and is also the entity that manages the unique number generation, and anything else where physics simulation is concerned.
I thought of two methods so far but I'm not sure which would be better, if either of them at all:
Keep a vector<short>, with the data being the number of references floating around, and the position in the vector being the ID itself. This method has the disadvantage of creating unneeded complexity when coding entities that manipulate group IDs, since they would need to ensure they tell the World how many references they're taking out.
Keep a vector<bool>, with the data being if that ID is free or not, and the position in the vector being the ID itself. The vector would grow with every new call for a unique ID, if there exist no free slots. The disadvantage is that once the vector reaches a certain size, an audit of the entire simulation would need to be done, but has the advantage of entities being able to grab unique numbers without having to help manage reference counting.
What do you folks think, is there a better way?

You could maintain a "free" list of unused IDs as a singly linked list inside your master World object.
When an object is destroyed by World (making its ID unused) you could push that ID onto the head of the free list.
When you are creating a new object you could do the following:
If the free list is non-empty: pop the head item and take that ID.
Else increment a global ID counter and assign it's current value.
While you could still run out of IDs (if you simultaneously had more objects than the max value of your counter), this strategy will allow you to recycle IDs, and to do everything with O(1) runtime complexity.
EDIT: As per #Matthieu's comments below, a std::deque container could also be used to maintain the "free" list. This container also supports the push_front, pop_front operations with O(1) complexity .
Hope this helps.

How many bodies are there? Is it realistic that you'd ever run out of integers if you didn't reassign them? The simplest solution is to just have one integer storing the next ID -- you would increment this integer when you assign a new ID to a body.

Efficient way to organize used and unused elements in a large concurrent array

I have about 18 million elements in an array that are initialized and ready to be used by a simple manager called ElementManager (this number will later climb to a little more than a billion in later iterations of the program). A class, A, which must use the elements communicates with ElementManager that returns the next available element for consumption. That element is now in use and cannot be reused until recycled, which may happen often. Class A is concurrent, that is, it can ask ElementManager for an available element in several threads. The elements in this case is an object that stores three vertices to make a triangle.
Currently, the ElementManager is using Intel TBB concurrent_bounded_queue called mAllAvailableElements. There is also another container (a TBB concurrent_vector) that contains all elements, regardless of whether they are available for use or not, called mAllElements. Class A asks for the next available element, the manager tries to pop the next available element from the queue. The popped element is now in use.
Now when class A has done what it has to do, control is handed to class B which now has to iterate through all elements that are in use and create meshes (to take advantage of concurrency, the array is split into several smaller arrays to create submeshes which scales with the number of available threads - the reason for this is that creating a mesh must be done serially). For this I am currently iterating over the container mAllElements (this is also concurrent) and grabbing any element that is in use. The elements, as mentioned above, contain polygonal information to create meshes. Iteration in this case takes a long time as it has to check each element and query whether it is in use or not, because if it is not in use then it should not be part of a mesh.
Now imagine if only 1 million out of the possible 18 million elements were in use (but more than 5-6 million were recycled). Worse yet, due to constant updates to only part of the mesh (which happens concurrently) means the in use elements are fragmented throughout the mAllElements container.
I thought about this for quite some time now and one flawed solution that I came up with was to create another queue of elements named mElementsInUse, which is also a concurrent_queue. I can push any element that is now in use. Problem with this approach is that since it is a queue, any element in that queue can be recycled at any time (an update in a part of the mesh) and declared not in use and since I can only pop the front element, this approach fails. The only other approach I can think of is to defragment the concurrent_vector mAllElements every once in a while when no operations are taking place.
I think my approach to this problem is wrong and thus my post here. I hope I explained the problem in enough detail. It seems like a common memory management problem, but I cannot come up with any search terms to search for it.

How about using a bit vector to indicate which of your elements are in use? It's easy to partition it for parallel processing when building your full mesh, and you can use atomic operations on words in the vector and thus avoid locks.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js