Lock-free map or hash table or ...? - c++

Multiple producers, one consumer. Producers write values tagged with a uint64, and the consumer reads them one by one. The consumer must always read the latest value for a given tag.
Right now I have a simple lock-free queue, so the consumer has to scan the entire queue to find the most recently added value with a given tag.
I want to replace the queue with a hash table, so that a new value overwrites the old one with the same tag. How should the reader get values out of it? Should it remove the value from the hash table after reading, or should I shrink the input queue so that it holds just one entry per tag, with the hash table as a buffer, or ...?
Also, please recommend C++ libraries that have a lock-free hash table implementation, because Boost doesn't have one.

Related

Blocking queue with order guarantee

I have 8 threads that process an image in strips. The strips are ordered in raster order. When each thread is finished with a strip, the thread adds its strip id number to a blocking queue. I want the queue to only allow a pop when the number is in sequence from 0 to N. So, regardless of the order in which the threads add their IDs, the queue output will be 0,1,2,3,.....N.
Is there an existing construct in the STL that has this functionality ?
I suppose a simple implementation would be a vanilla queue with a counter starting at 0. When 0 gets added, it pops and moves the counter to 1, and keeps popping until it doesn't find a match. But this sounds inefficient.
Edit: if I wrap an STL priority queue to make it blocking, this could work.
The structure you want is a min heap (see std::priority_queue, which is a max heap by default, so pass std::greater as the comparator). This gives the element with the lowest ID.
Wake up the consumer thread every time the newly added element is at the beginning of the queue.
Consume all elements that are in sequence in one go.
This doesn't look like a queue at all! Queue should only support push_back and pop_front. There is no peeking inside.
I would suggest a map<ID,image>, and maintain the last processed image ID. Then you can quickly check whether the map's first element (*begin(), since std::map has no front()) is next in your sequence, and remove it.
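A minimal sketch of the min-heap suggestion above, assuming a single consumer thread and plain std::mutex / std::condition_variable. The class name InOrderQueue and its push/pop interface are hypothetical and not part of the STL. The consumer is woken only when the smallest stored ID is the next one in sequence.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <vector>

// Hypothetical helper: hands out strip IDs strictly in the order 0, 1, 2, ...
// regardless of the order in which worker threads push them.
class InOrderQueue {
public:
    // Called by worker threads, in any order.
    void push(std::size_t id) {
        bool wake = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            heap_.push(id);
            // Wake the consumer only if the smallest stored ID is the one it wants.
            wake = (heap_.top() == next_);
        }
        if (wake) ready_.notify_one();
    }

    // Called by the single consumer. Blocks until the next ID in sequence arrives.
    std::size_t pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !heap_.empty() && heap_.top() == next_; });
        assert(heap_.top() == next_);  // guaranteed by the wait predicate
        heap_.pop();
        return next_++;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    // std::greater turns the default max heap into a min heap.
    std::priority_queue<std::size_t, std::vector<std::size_t>,
                        std::greater<std::size_t>> heap_;
    std::size_t next_ = 0;  // next ID the consumer is allowed to see
};
```

Calling pop() in a loop consumes every ID that is already in sequence without blocking, which matches the "consume all elements that are in sequence in one go" advice.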

Tracking elements for order in a std::map

I have a std::map like so: std::map<UINT32,USER_DEFINED_X> with a variable number N of elements. This map is part of an overall application that runs on a real time framework. The map contains elements such that it includes times for when certain activities are supposed to occur. During each frame, the map is scanned to see if any of those times match up with current time. There is one condition that needs to be checked though before processing the activities. I need to check to see if the element that is going to be processed is the first one in the list that is being processed. I am not sure how to do that. One approach I thought about using was to create another temporary map/array where I would store the element that has been processed in order, then get the order from that temporary array/map?
Does anybody know of a better way I can conduct this operation?

Moving values between lockfree lists

Background
I am trying to design and implement a lock-free hash map using the chaining method in C++. Each hash table cell is supposed to contain a lock-free list. To enable resizing, my data structure is supposed to contain two arrays: a small one which is always available, and a bigger one for resizing, used when the smaller one is no longer sufficient. When the bigger one is created, I would like the data stored in the small one to be transferred to the bigger one, one element at a time, whenever any thread does something with the data structure (adds an element, searches, or removes one). When all data has been transferred, the bigger array takes the place of the smaller one and the latter is deleted. The cycle repeats whenever the array needs to be enlarged.
Problem
As mentioned before, each array is supposed to contain lists in its cells. I am trying to find a way to transfer a value or node from one lock-free list to another in such a manner that the value stays visible in at least one of the lists (or both) at all times. This is needed to ensure that a search in the hash map won't give the user false negatives. So my questions are:
Is such lockfree list implementation possible?
If so, what would be the general concept of such list and "moving node/value" operation? I would be thankful for any pseudocode, C++ code or scientific article describing it.
To be able to resize the array, while maintaining the lock-free progress guarantees, you will need to use operation descriptors. Once the resize starts, add a descriptor that contains references to the old and the new arrays.
On any operation (add, search, or remove):
Add: search the old array first. If the element already exists there, move it to the new array before returning. Indicate, with a descriptor or a special null value, that the element has already been moved, so that other threads don't attempt the move again.
Search: search the old array and move the element as described above.
Remove: this too has to search the old array first.
Now the problem is that you will have a thread that has to verify that the move is complete, so that you can remove the descriptor and free up the old array. To maintain lock-freedom, you will need to have all active threads attempt to do this validation, thus it becomes very expensive.
You can look at:
https://dl.acm.org/citation.cfm?id=2611495
https://dl.acm.org/citation.cfm?id=3210408

Implementing Bentley-Ottmann in C++ using the STL

I want to implement the Bentley-Ottmann line segment crossing algorithm based on this description, using STL elements.
Bentley-Ottmann Wikipedia
What I am struggling with is the implementation of the priority queue. The description asks me to erase any intersection:
If p is the left endpoint of a line segment s, insert s into T. Find the segments r and t that are immediately below and above s in T (if they exist) and if their crossing forms a potential future event in the event queue, remove it. If s crosses r or t, add those crossing points as potential future events in the event queue.
It doesn't seem to be possible to use an STL priority queue as the event queue, since its searching complexity is linear and I would need to find and remove any crossing of s and t. Should I use a set instead? Or is it possible with a priority queue?
There are priority queue structures that you can quickly delete from, but they will require a lot of additional memory.
It is actually more efficient just to leave the r-t intersection in the queue. Then, when it comes time to process the event, just ignore it if it's invalid (because r and t are not adjacent) or if it's already been done.
In order to detect when r-t has already been done, just make sure that your priority queue is ordered by a total ordering, i.e., don't just compare the x value of the events. When multiple events have the same x value, use the identifiers of the segments involved to break ties. Then, when r-t appears multiple times in the queue, all of the occurrences will be together and you can just pop them all off in sequence.
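A minimal sketch of the total-ordering idea above, simplified to crossing events only (endpoint events are left out). The names CrossingEvent, Later, EventQueue and pop_unique are hypothetical, and segment IDs are assumed to be normalised so that seg_a < seg_b. Checking whether the two segments are still adjacent in T is left to the caller, since it depends on the status structure.

```cpp
#include <cassert>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical event record: a crossing point plus the two segments involved.
struct CrossingEvent {
    double x;
    double y;
    int seg_a;  // assumed normalised so that seg_a < seg_b
    int seg_b;
};

// Total ordering: compare x, then y, then the segment IDs as tie-breakers.
// The comparison is reversed so that std::priority_queue yields the smallest first.
struct Later {
    bool operator()(const CrossingEvent& a, const CrossingEvent& b) const {
        return std::tie(a.x, a.y, a.seg_a, a.seg_b) >
               std::tie(b.x, b.y, b.seg_a, b.seg_b);
    }
};

using EventQueue =
    std::priority_queue<CrossingEvent, std::vector<CrossingEvent>, Later>;

inline bool same_event(const CrossingEvent& a, const CrossingEvent& b) {
    return std::tie(a.x, a.y, a.seg_a, a.seg_b) ==
           std::tie(b.x, b.y, b.seg_a, b.seg_b);
}

// Pops the next event and discards identical copies, which the total
// ordering guarantees are directly behind it in the queue.
CrossingEvent pop_unique(EventQueue& q) {
    assert(!q.empty());
    CrossingEvent e = q.top();
    q.pop();
    while (!q.empty() && same_event(q.top(), e)) q.pop();
    return e;
}
```

With this ordering nothing is ever erased from the middle of the queue: duplicates are dropped when they reach the front, and an event whose segments are no longer adjacent can simply be skipped by the caller.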

Implementation of Concurrent Queue + map in c++

I am not very good at data structures, so this might be very silly question. I am looking for a way to implement a hybrid behavior of queue + maps.
I am currently using tbb::concurrent_bounded_queue (documented at Intel's developer zone) from www.threadingbuildingblocks.org in a multithreaded single producer, single consumer process. The queue holds market data quote objects, and the producer side of the process is highly time sensitive, so what I need is a queue that is keyed on a market data identifier such as USDCAD or EURUSD. The value points (through a unique_ptr) to the latest market data quote that I received for this key.
So, let us say my queue has 5 elements for 5 unique identifiers, and suddenly we get an updated market data quote for the identifier at the 3rd position in the queue. Then I just store the latest value and discard the value I previously had. Essentially, I just move my unique_ptr to the new market data quote for this key.
It's like it is similar to concurrent_bounded_queue<pair<string, unique_ptr<Quote>>> but is keyed on the first element of the pair.
I am not sure if this is already available in a third-party library (may be tbb itself) or what it is called if it is a standard data structure.
I would highly appreciate any help or guidance on this.
Thanks.
First, observe that we can easily write...
int idn_to_index(idn); // map from identifier to contiguous number sequence
...it doesn't matter much if that uses a std::map or std::unordered_map, binary search in a sorted std::vector, your own character-by-character hardcoded parser....
Then the producer could:
update (using a mutex) a std::vector<unique_ptr<Quote>> at [idn_to_index(idn)]
post the index to concurrent_bounded_queue<int>
The consumer would:
pop an index
compare the pointer in std::vector<unique_ptr<Quote>> at [index] to its own array of last-seen pointers and, if they differ, process the quote
The idea here is not to avoid having duplicate identifier-specific indices in the queue, but to make sure that the stalest of those still triggers processing of the newest quote, and that less-stale queue entries are ignored harmlessly until the data's genuinely been updated again.
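The scheme above can be sketched roughly as follows. This is an illustration under stated assumptions, not the answerer's code: a mutex-protected std::queue stands in for tbb::concurrent_bounded_queue<int>, std::shared_ptr replaces unique_ptr so that the consumer's last-seen pointer keeps the old Quote alive (which avoids a freed address being reused and mistaken for "unchanged"), and the names Quote, LatestQuoteBoard, publish and drain are hypothetical. Indices are assumed to come from something like idn_to_index.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct Quote {
    std::string symbol;
    double price;
};

// Hypothetical container: one slot per identifier plus a queue of slot indices.
class LatestQuoteBoard {
public:
    explicit LatestQuoteBoard(std::size_t symbols)
        : latest_(symbols), seen_(symbols) {}

    // Producer: overwrite the slot with the newest quote, then post its index.
    void publish(std::size_t index, Quote q) {
        assert(index < latest_.size());
        auto fresh = std::make_shared<const Quote>(std::move(q));
        std::lock_guard<std::mutex> lock(mutex_);
        latest_[index] = std::move(fresh);
        pending_.push(index);
    }

    // Consumer (single thread): pop every pending index and call handle() once
    // per slot whose quote differs from the one last processed.
    // Returns the number of quotes actually processed.
    template <class Handler>
    std::size_t drain(Handler&& handle) {
        std::size_t processed = 0;
        for (;;) {
            std::size_t index = 0;
            std::shared_ptr<const Quote> current;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (pending_.empty()) break;
                index = pending_.front();
                pending_.pop();
                current = latest_[index];
            }
            if (current == seen_[index]) continue;  // stale entry, already handled
            seen_[index] = current;                 // touched by the consumer only
            handle(*current);
            ++processed;
        }
        return processed;
    }

private:
    std::mutex mutex_;
    std::vector<std::shared_ptr<const Quote>> latest_;  // guarded by mutex_
    std::queue<std::size_t> pending_;                   // guarded by mutex_
    std::vector<std::shared_ptr<const Quote>> seen_;    // consumer-private
};
```

Publishing USDCAD twice before the consumer runs leaves two entries for that index in the queue. The first one popped triggers processing of the newest quote and the second is skipped, as the paragraph above describes.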
TBB provides
concurrent_unordered_map: no concurrent erase, stable iterators, no element access protection;
concurrent_hash_map: has concurrent erase, concurrent operations invalidate iterators, per-element access management via 'accessors'
So, if the question "It's like it is similar to concurrent_bounded_queue<pair<string, unique_ptr<Quote>>> but is keyed on the first element of the pair" means "suggest a corresponding concurrent associative container", these two are at your service. Basically, you have to choose between the ability to erase identifiers concurrently (concurrent_hash_map) and the ability to traverse all the elements concurrently (concurrent_unordered_map). concurrent_hash_map also simplifies synchronization of accesses to the elements, which looks useful for your case.
I was able to solve this problem as below:
I use a queue and a hashmap, both from the tbb library. I now push my unique identifiers onto the queue instead of the Quotes. My hashmap has the unique identifier as key and the Quote as value.
So, when I receive a Quote, I iterate through the queue and check whether it already contains that identifier. If it does, I insert the corresponding Quote directly into the hashmap and do not add the identifier to the queue. If it does not, I push the identifier onto the queue and the corresponding Quote into the hashmap. This ensures that my queue always holds a unique set of identifiers and my hashmap has the latest Quote available for each identifier.
On the consumer side, I pop the queue to get my next identifier and get the Quote for that identifier from the hashmap.
This works pretty fast. Please let me know in case I am missing any hidden issues with this.