std::map write/read from multiple threads - c++

I want to be able to read and write in a std::map from multiple threads. Is there a way to do that without mutex (maybe with std::atomic)?
If not, what's the simplest way to do that in C++11?

If the values are std::atomic<>, you can modify and read them from arbitrary threads, but you'll need to have saved their addresses somehow - pointers, references or iterators - and you'll need to know no other thread will call erase on them....
Still, even atomic values won't make it safe to modify the container (e.g. inserting or erasing elements), while other threads are also doing modifications or lookups or iteration. Even the "const" functions - like size(), empty(), begin(), end(), count(), and iterator movement - are unsafe, because the mutating operations may be in the middle of rewiring the inter-node links or updating the same data.
For anything more than the above, you will need a mutex.
For a concrete example, say you've inserted a node with std::string key "client_counter" - you could start a thread that gets an iterator to that element and does atomic updates to the counter, while other threads can find the element and read from it but must not erase it. You could still have other nodes inserted in the map, with other updaters and readers, without any extra synchronisation with the client_counter-updating thread.
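The client_counter scenario above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function name run_client_counter and its parameters are hypothetical, and the sketch assumes the map's structure is left alone (no insert or erase) while the updater threads run.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <string>
#include <thread>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical helper: starts `writers` threads that each bump the counter
// `increments` times, then returns the final value.
int run_client_counter(int writers, int increments)
{
    std::map<std::string, std::atomic<int>> stats;

    // std::atomic is neither copyable nor movable, so build it in place.
    // The insertion happens here, before any other thread exists.
    auto result = stats.emplace(std::piecewise_construct,
                                std::forward_as_tuple("client_counter"),
                                std::forward_as_tuple(0));
    assert(result.second);  // the key was new

    // Save a reference to the element; it stays valid until it is erased.
    std::atomic<int>& counter = result.first->second;

    std::vector<std::thread> pool;
    for (int t = 0; t < writers; ++t) {
        pool.emplace_back([&counter, increments] {
            for (int i = 0; i < increments; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

The threads never call a member function of the map; they only touch the atomic value through the saved reference, which is why no mutex is needed here.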

If you don't want to use a mutex, you will have to wait for concurrent containers in the standard library (C++17 did not add any) or use a third-party one. If you want std::atomic operations inside the map itself, you will probably need to write, or find on the Internet, a full implementation of a concurrent map.
If you use a std::map of std::atomic values, be aware that this protects only the elements inside the std::map, not the std::map itself.
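For the "simplest way in C++11" part of the question, one common approach is to guard every operation with a single std::mutex. The following is a minimal sketch under that assumption; LockedMap and fill_from_threads are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical wrapper: every operation takes the same mutex.
template <typename K, typename V>
class LockedMap {
public:
    void set(const K& key, const V& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = value;
    }
    // Copies the value out so that no reference escapes the lock.
    bool get(const K& key, V& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }
    bool erase(const K& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.erase(key) > 0;
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }
private:
    mutable std::mutex mutex_;  // mutable so const readers can lock it
    std::map<K, V> map_;
};

// Usage sketch: several threads insert distinct keys concurrently.
std::size_t fill_from_threads(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    LockedMap<int, std::string> m;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&m, t, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                m.set(t * per_thread + i, "v");
        });
    for (auto& th : pool) th.join();
    return m.size();
}
```

The wrapper deliberately exposes no iterators or references, since those would outlive the lock that protects them.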

Related

Concurrent hash table in C++

In my application, multiple threads perform inserts, and mostly one thread iterates through a map and removes items that meet certain criteria. I wanted to use a concurrent structure because it would provide finer-grained locking in the code that removes items, which looks similar to the code below. That code is not ideal for various reasons, including that the thread could get pre-empted while holding the lock.
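void Function_reap()
{
    while (timetaken != timeoutTime)
    {
        // Held for the whole sweep; released at the end of each outer iteration.
        std::lock_guard<std::mutex> lock(my_map_mutex);
        auto iter = my_unordered_map.begin();
        while (iter != my_unordered_map.end())
        {
            if (status_completed == iter->second.status)
            {
                // erase() returns the iterator to the next element
                iter = my_unordered_map.erase(iter);
            }
            else
            {
                // advance when nothing is erased, otherwise the loop never ends
                ++iter;
            }
        }
    }
}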
Was going through the documentation for Intel TBB(Threading Building Blocks) and more specifically the concurrent_unordered_map documentation (https://software.intel.com/en-us/node/506171) to see if this is a good fit for my application and came across this excerpt.
Description: concurrent_unordered_map and concurrent_unordered_multimap support concurrent insertion and traversal, but not concurrent erasure. The interfaces have no visible locking. They may hold locks internally, but never while calling user-defined code. They have semantics similar to std::unordered_map and std::unordered_multimap respectively, except as follows:
The erase and extract methods are prefixed with unsafe_, to indicate that they are not concurrency safe.
Why does TBB not provide safe synchronized deletion from the map? What is the technical reason for this?
What other options, if any, do I have here? Ideally something that definitely works on Linux and, if possible, is portable to Windows.
Well, it is difficult to design a solution that (efficiently) supports all operations. TBB has the concurrent_unordered_map which supports concurrent insert, find and iteration, but no erase - and the concurrent_hash_map which supports concurrent insert, find and erase, but no iteration.
There are several other libraries that provide concurrent hash maps like libcds, or my own one called xenium.
ATM xenium contains two concurrent hash map implementations:
harris_michael_hash_map - fully lock-free; supports concurrent insert, erase, find and iteration. However, the number of buckets has to be defined at construction time and cannot be adapted afterwards. Each bucket contains a linked list of items, which is not very cache friendly.
vyukov_hash_map - a very fast hash map that uses fine-grained locking for insert, erase and iteration; find operations are lock-free. However, if you are using iterators you have to be careful to avoid deadlocks (i.e., a thread should not try to insert or erase a key while holding an iterator). There is an erase overload that takes an iterator, though, so you can safely remove the item the iterator points to.
I am currently working to make xenium fully Windows compatible.
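If pulling in a library is not an option, a middle ground between one global mutex and a real concurrent hash map is lock striping: split the data over several small maps, each with its own mutex. To be clear, this is not how TBB, libcds or xenium work; it is only a sketch of a simpler technique, and ShardedMap, Job, status_pending and reap_while_inserting are hypothetical names (only status_completed is taken from the question).

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

enum Status { status_pending, status_completed };
struct Job { Status status; };

// Hypothetical striped map: each shard has its own mutex and its own table.
template <typename K, typename V, std::size_t Shards = 16>
class ShardedMap {
public:
    void insert(const K& key, const V& value) {
        Shard& s = shard_for(key);
        std::lock_guard<std::mutex> lock(s.mutex);
        s.table[key] = value;
    }

    // Erases every element whose value satisfies pred, one shard at a time,
    // so inserts into the other shards are not blocked in the meantime.
    template <typename Pred>
    std::size_t erase_if(Pred pred) {
        std::size_t removed = 0;
        for (Shard& s : shards_) {
            std::lock_guard<std::mutex> lock(s.mutex);
            for (auto it = s.table.begin(); it != s.table.end();) {
                if (pred(it->second)) { it = s.table.erase(it); ++removed; }
                else { ++it; }
            }
        }
        return removed;
    }

private:
    struct Shard {
        std::mutex mutex;
        std::unordered_map<K, V> table;
    };
    Shard& shard_for(const K& key) {
        return shards_[std::hash<K>()(key) % Shards];
    }
    std::array<Shard, Shards> shards_;
};

// Usage sketch: inserter threads add jobs while one reaper removes the
// completed ones. Returns how many jobs were reaped in total.
std::size_t reap_while_inserting(int inserters, int per_thread) {
    assert(inserters >= 0 && per_thread >= 0);
    ShardedMap<int, Job> jobs;
    std::atomic<bool> done(false);
    std::size_t reaped = 0;  // written only by the reaper thread
    auto is_completed = [](const Job& j) { return j.status == status_completed; };

    std::thread reaper([&] {
        while (!done.load()) reaped += jobs.erase_if(is_completed);
        reaped += jobs.erase_if(is_completed);  // final pass after all inserts
    });
    std::vector<std::thread> pool;
    for (int t = 0; t < inserters; ++t)
        pool.emplace_back([&jobs, t, per_thread] {
            for (int i = 0; i < per_thread; ++i)  // every other job is completed
                jobs.insert(t * per_thread + i,
                            Job{i % 2 == 0 ? status_completed : status_pending});
        });
    for (auto& th : pool) th.join();
    done.store(true);
    reaper.join();
    return reaped;
}
```

The reaper still holds a lock while it sweeps, but only on one shard at a time, so a pre-empted reaper stalls at most the inserts that hash to that shard.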

C++ do we need mutex when using map?

I have two threads: one inserts an entry in the map and the other deletes one. Do I need a mutex around these function calls? Also, one thread increments a counter inside this map and the other decrements that counter. Do I need a mutex for that as well?
Thanks,
Changes to the map itself (insertions, deletions) need to be synchronized. The same is true for traversal and lookup (i.e. begin(), find(), [], etc.).
Multiple threads can access different elements safely, though.
If you are incrementing and decrementing the SAME element in the map (or what may be the same element and you can't say for sure), then you need to have some sort of synchronisation. You could use an std::atomic<int> to avoid having to use a mutex tho'.
Any insert or remove in the tree needs to be protected with a mutex or similar. Of course, that also means any access to the content of the tree needs to be protected in the same way, because any std::map<T>::iterator you hold can be invalidated (at least by an erase in the tree). So you really need to ensure that no erase happens while you use any other access to the tree. This includes "ready-made" functions such as find.

Multithreaded access to one std::map: will it cause unsafe behavior?

If more than one thread accesses one map object, but I can make sure that no two threads ever use the same key, and the access looks like this:
//find value by key
//if find
// erase the object or change the value
//else
// add new object of the key
Will the operation cause synchronization problem?
Yes, doing concurrent updates without proper synchronization may cause crashes, even if your threads access different keys: the std::map is based on trees, trees get rebalanced, so you can cause a write to a parent of a node with a seemingly unrelated key.
Moreover, it is not safe to perform read-only access concurrently with writing, or searching unlocked + locking on write: if you have threads that may update or delete nodes, you must lock out all readers before you write.
You will have concurrency problems if any of the threads inserts into the tree. STL map is implemented using a red-black tree (or at least that's what I'm familiar with — I don't know whether the Standard mandates red-black tree). Red-black trees may be rebalanced upon insert, which would lead to all sorts of races between threads.
Read-only access (absolutely no writers) would be fine, but keep in mind operator[] is not read-only; it potentially adds a new element. You'd need to use the find() method, get the iterator, and dereference it yourself.
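The difference between find() and operator[] can be shown with a small sketch. The function names lookup_or and size_after_subscript_miss are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Read-only lookup. Taking the map by const reference makes the intent
// explicit: operator[] would not even compile on a const std::map.
std::string lookup_or(const std::map<int, std::string>& m, int key,
                      const std::string& fallback)
{
    auto it = m.find(key);
    return it == m.end() ? fallback : it->second;
}

// Shows that operator[] is a write when the key is missing.
std::size_t size_after_subscript_miss()
{
    std::map<int, std::string> m;
    m.emplace(1, "one");
    assert(m.size() == 1);
    (void)m[2];       // key 2 is absent, so an empty string is inserted
    return m.size();  // the map has grown
}
```

Passing the map to reader threads as a const reference is a cheap way to let the compiler reject accidental operator[] calls.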
Unless the docs (ie, the ISO C++11 standard) say it's thread-safe (and they don't), then that's it. Period. It's not thread-safe.
There may be implementations of a std::map that would allow this but it's by no means portable.
Maps are often built on red-black trees or other auto-balancing data structures so that a modification to the structure (such as inserting or deleting a key) will cause rebalancing.
You should wrap read and write operations on the map with something like a mutex, to ensure that synchronisation is done correctly.

thread-safe alternative for std::map?

I have a parallelized loop with write access to a std::map. I would like to write to different parts of the map at the same time, i.e. I want to access map[a] and map[b] for different a and b. I found out that this is not possible. I wonder, however, whether there is a good alternative or a different way to achieve this.
I could be wrong, but I believe that modifying existing elements of a map is safe as long as you're not touching the same elements (as this does not modify the underlying structure of the map). So if you insert map[a] and map[b] ahead of time, your separate threads should be able to modify those existing elements.
That said, it's probably cleaner and safer just to use normal synchronization techniques such as mutexes to protect access to the map.
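The "insert ahead of time" idea could look roughly like this. It is a sketch under the assumption that nothing inserts or erases while the workers run; fill_preinserted is a hypothetical name.

```cpp
#include <cassert>
#include <map>
#include <thread>
#include <vector>

// Each worker updates only the element stored under its own key.
// Returns the sum of all values once the workers have finished.
long fill_preinserted(int workers, int rounds)
{
    assert(workers >= 0 && rounds >= 0);
    std::map<int, long> totals;
    for (int k = 0; k < workers; ++k)
        totals[k] = 0;  // all insertion happens up front, on one thread

    std::vector<std::thread> pool;
    for (int k = 0; k < workers; ++k) {
        pool.emplace_back([&totals, k, rounds] {
            // at() never inserts, so the tree structure is only read here.
            long& slot = totals.at(k);
            for (int i = 0; i < rounds; ++i)
                slot += i;
        });
    }
    for (auto& th : pool) th.join();

    long sum = 0;
    for (const auto& kv : totals) sum += kv.second;
    return sum;
}
```

The workers use at() rather than operator[], so a mistyped key throws instead of silently modifying the tree under the other threads.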
It is quite possible to mutate map[a] and map[b] separately, as long as you do not mutate the underlying map.
If you wish to mutate an associative container concurrently, check out concurrent_unordered_map from PPL or TBB.
If possible, you could try giving each worker its own copy of the map and then merging the results. This way no locking would be needed at all.
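A rough sketch of the copy-and-merge approach, assuming the per-key results can be combined by simple addition. The word-counting task and the name count_words_parallel are hypothetical; they only serve to make the example concrete.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Each worker counts words into its own private map; the maps are merged
// on the calling thread after all workers have been joined.
std::map<std::string, int> count_words_parallel(
    const std::vector<std::vector<std::string>>& chunks)
{
    // One map per worker, sized up front so the vector never reallocates.
    std::vector<std::map<std::string, int>> partial(chunks.size());
    assert(partial.size() == chunks.size());

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < chunks.size(); ++w) {
        pool.emplace_back([&partial, &chunks, w] {
            for (const auto& word : chunks[w])
                ++partial[w][word];  // only worker w touches partial[w]
        });
    }
    for (auto& th : pool) th.join();

    // Single-threaded merge: no locking needed at any point.
    std::map<std::string, int> merged;
    for (const auto& part : partial)
        for (const auto& kv : part)
            merged[kv.first] += kv.second;
    return merged;
}
```

Whether this pays off depends on how expensive the merge is compared with the work done per element.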

Thread-safety of c++ maps

This is about thread safety of std::map. Now, simultaneous reads are thread-safe but writes are not. My question is: if I add a unique element to the map every time, will that be thread-safe?
So, for example, if I have a map like this: std::map<int, std::string> myMap
and I always add new keys and never modify the existing key-value, will that be thread-safe?
More importantly, will that give me any random run-time behavior?
Is adding new keys also considered modification? If the keys are always different while adding, shouldn't it be thread-safe as it modifies an independent part of the memory?
Thanks
Shiv
1) Of course not
2) Yes, I hope you'll encounter it during testing, not later
3) Yes, it is. The new element is added in a different location, but many pointers are modified during that.
The map is implemented by some sort of tree in most if not all implementations. Inserting a new element in a tree modifies it by rearranging nodes, which means resetting pointers to point to different nodes. So it is not thread-safe.
No, yes, yes. You need to obtain an exclusive lock when modifying the container (including insertion of new keys), though while no modification is going on you can, of course, safely read concurrently.
edit: http://www.sgi.com/tech/stl/thread_safety.html might be of interest for you.
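The "exclusive lock for writers, concurrent reads otherwise" rule maps directly onto a reader/writer lock. The sketch below uses std::shared_mutex, which requires C++17 (C++11 has no standard equivalent, so a plain std::mutex would be used there instead). SharedLockedMap and insert_unique_keys are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class SharedLockedMap {
public:
    // Writers take the lock exclusively, even when the key is always new.
    bool insert(int key, std::string value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        return map_.emplace(key, std::move(value)).second;
    }
    // Readers share the lock, so they only block while a writer is active.
    bool contains(int key) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return map_.find(key) != map_.end();
    }
    std::size_t size() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return map_.size();
    }
private:
    mutable std::shared_mutex mutex_;
    std::map<int, std::string> map_;
};

// Usage sketch: writers add unique keys while readers probe concurrently.
std::size_t insert_unique_keys(int writers, int per_writer) {
    assert(writers >= 0 && per_writer >= 0);
    SharedLockedMap m;
    std::vector<std::thread> pool;
    for (int w = 0; w < writers; ++w) {
        pool.emplace_back([&m, w, per_writer] {
            for (int i = 0; i < per_writer; ++i)
                m.insert(w * per_writer + i, "value");
        });
        pool.emplace_back([&m, w, per_writer] {
            for (int i = 0; i < per_writer; ++i)
                (void)m.contains(w * per_writer + i);
        });
    }
    for (auto& th : pool) th.join();
    return m.size();
}
```

A reader/writer lock tends to help only when reads clearly outnumber writes; with mostly inserts, a plain mutex is usually just as good.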