Algorithm for a simple non-locking-on-read concurrent map? - c++

I'm trying to write a map that is thread-safe, but never locks or blocks on read. My attempt is to use a read-only map that gets copied out on write. The idea is get() is lock-free, and put() copies the current read-only underlying map to a new one, does the put, and swaps out the current underlying map for a new one. (yes, put() is inefficient since it copies the entire map, but I do not care for my use case)
My first stab at this used std::atomic<*StringMap> for the read-only map BUT there is a huge bug with this, probably due to my java background. get() atomically gets a pointer to the underlying map, which may or may not be the current one when it loads it (which is ok). But put() deletes the underlying map after it swaps it out for the new one. If get() is calling the old map just as it gets deleted, this will obviously crash.
A friend of mine suggested shared_ptr, but he's not sure if the shared_ptr operations do any locking under the covers. The docs say it is thread-safe, though. Edit: As nosid points out, it is not thread safe and I need the special atomic operation from std::atomic.
So my questions are: 1. Is this algorithm viable? 2. Do shared_ptr operations do any locking, especially on access?
#include <unordered_map>
#include <atomic>
#include <pthread.h>
#include <memory>
typedef std::unordered_map<std::string, std::string> StringMap;
class NonBlockingReadMap {
private:
pthread_mutex_t fMutex;
std::shared_ptr<StringMap> fspReadMapReference;
public:
NonBlockingReadMap() {
fspReadMapReference = std::make_shared<StringMap>();
}
~NonBlockingReadMap() {
//so, nothing here?
}
std::string get(std::string &key) {
//does this access trigger any locking?
return fspReadMapReference->at(key);
}
void put(std::string &key, std::string &value) {
pthread_mutex_lock(&fMutex);
std::shared_ptr<StringMap> spMapCopy = std::make_shared<StringMap>(*fspReadMapReference);
std::pair<std::string, std::string> kvPair(key, value);
spMapCopy->insert(kvPair);
fspReadMapReference.swap(spMapCopy);
pthread_mutex_unlock(&fMutex);
}
void clear() {
pthread_mutex_lock(&fMutex);
std::shared_ptr<StringMap> spMapCopy = std::make_shared<StringMap>(*fspReadMapReference);
fspReadMapReference.swap(spMapCopy);
spMapCopy->clear();
pthread_mutex_unlock(&fMutex);
}
};

Your code contains a data race on the std::shared_ptr, and the behaviour of programs with data races is undefined in C++.
The problem is: The class std::shared_ptr is not thread-safe. However, there are special atomic operations for std::shared_ptr, which can be used to solve the problem.
You can find more information about these atomic operations on the following webpage:
http://en.cppreference.com/w/cpp/memory/shared_ptr/atomic

Reader should do two operations: get and put. get always retrieves the new pointer and increments an atomic count. Put releases the pointer and decrements. Delete the map when the count goes to zero.
The writer creates a new map and does a get on it. It then does a put on the old map to mark it for delete.

Related

How to individually lock unordered_map elements in C++

I have an unordered_map that I want to be accessible by multiple threads but locking the whole thing with a mutex would be too slow.
To get around this I put a mutex in each element of the unordered_map:
class exampleClass{
std::mutex m;
int data;
};
std::unordered_map<int,exampleClass> exampleMap;
The issue is I'm unable to safely erase elements, because in order to destroy a mutex it must be unlocked but if it's unlocked then another thread could lock it and be writing to or reading the element during destruction.
unordered_map is not suitable for fine-grained parallelism. It is not legal
to add or remove elements without ensuring mutual exclusion during the process.
I would suggest using something like tbb::concurrent_hash_map instead, which will result in less lock contention than locking the map as a whole. (There are other concurrent hash table implementations out there; the advantage of TBB is that it's well-supported and stable.)
#Sneftel's answer is good enough.
But if you insist on using std::unordered_map, I suggest you two use one mutex to protect the insertion/deletion of the map, and another mutex for each element for modifying the element.
class exampleClass{
std::mutex m;
int data;
};
std::unordered_map<int,exampleClass> exampleMap;
std::mutex mapLock;
void add(int key, int value) {
std::unique_lock<std::mutex> _(mapLock);
exampleMap.insert({key, value});
}
void delete(int key) {
std::unique_lock<std::mutex> _(mapLock);
auto it = exampleMap.find(key);
if (it != exampleMap.end()) {
std::unique_lock<std::mutex> _1(it->m);
exampleMap.erase(it);
}
}
These should perform better for a big lock on the whole map if delete is not a frequent operation.
But be careful of these kinds of code, because it is hard to reason and to get right.
I strongly recommend #Sneftel's answer.
You have the following options:
Lock the entire mutex
Use a container of shared_ptr so the actual class can be modified (with or without a mutex) unrelated to the container.

Updating cache without blocking

I currently have a program that has a cache like mechanism. I have a thread listening for updates from another server to this cache. This thread will update the cache when it receives an update. Here is some pseudo code:
void cache::update_cache()
{
cache_ = new std::map<std::string, value>();
while(true)
{
if(recv().compare("update") == 0)
{
std::map<std::string, value> *new_info = new std::map<std::string, value>();
std::map<std::string, value> *tmp;
//Get new info, store in new_info
tmp = cache_;
cache_ = new_cache;
delete tmp;
}
}
}
std::map<std::string, value> *cache::get_cache()
{
return cache_;
}
cache_ is being read from many different threads concurrently. I believe how I have it here I will run into undefined behavior if one of my threads call get_cache(), then my cache updates, then the thread tries to access the stored cache.
I am looking for a way to avoid this problem. I know I could use a mutex, but I would rather not block reads from happening as they have to be as low latency as possible, but if need be, I can go that route.
I was wondering if this would be a good use case for a unique_ptr. Is my understanding correct in that if a thread calls get_cache, and that returns a unique_ptr instead of a standard pointer, once all threads that have the old version of cache are finished with it(i.e leave scope), the object will be deleted.
Is using a unique_ptr the best option for this case, or is there another option that I am not thinking of?
Any input will be greatly appreciated.
Edit:
I believe I made a mistake in my OP. I meant to use and pass a shared_ptr not a unique_ptr for cache_. And when all threads are finished with cache_ the shared_ptr should delete itself.
A little about my program: My program is a webserver that will be using this information to decide what information to return. It is fairly high throughput(thousands of req/sec) Each request queries the cache once, so telling my other threads when to update is no problem. I can tolerate slightly out of date information, and would prefer that over blocking all of my threads from executing if possible. The information in the cache is fairly large, and I would like to limit any copies on value because of this.
update_cache is only run once. It is run in a thread that just listens for an update command and runs the code.
I feel there are multiple issues:
1) Do not leak memory: for that never use "delete" in your code and stick with unique_ptr (or shared_ptr in specific cases)
2) Protect accesses to shared data, for that either using locking (mutex) or lock-free mecanism (std::atomic)
class Cache {
using Map = std::map<std::string, value>();
std::unique_ptr<Map> m_cache;
std::mutex m_cacheLock;
public:
void update_cache()
{
while(true)
{
if(recv().compare("update") == 0)
{
std::unique_ptr<Map> new_info { new Map };
//Get new info, store in new_info
{
std::lock_guard<std::mutex> lock{m_cacheLock};
using std::swap;
swap(m_cache, new_cache);
}
}
}
}
Note: I don't like update_cache() being part of a public interface for the cache as it contains an infinite loop. I would probably externalize the loop with the recv and have a:
void update_cache(std::unique_ptr<Map> new_info)
{
{ // This inner brace is not useless, we don't need to keep the lock during deletion
std::lock_guard<std::mutex> lock{m_cacheLock};
using std::swap;
swap(m_cache, new_cache);
}
}
Now for the reading to the cache, use proper encapsulation and don't leave the pointer to the member map escape:
value get(const std::string &key)
{
// lock, fetch, and return.
// Depending on value type, you might want to allocate memory
// before locking
}
Using this signature you have to throw an exception if the value is not present in the cache, another option is to return something like a boost::optional.
Overall you can keep a low latency (everything is relative, I don't know your use case) if you take care of doing costly operations (memory allocation for instance) outside of the locking section.
shared_ptr is very reasonable for this purpose, C++11 has a family of functions for handling shared_ptr atomically. If the data is immutable after creation, you won't even need any additional synchronization:
class cache {
public:
using map_t = std::map<std::string, value>;
void update_cache();
std::shared_ptr<const map_t> get_cache() const;
private:
std::shared_ptr<const map_t> cache_;
};
void cache::update_cache()
{
while(true)
{
if(recv() == "update")
{
auto new_info = std::make_shared<map_t>();
// Get new info, store in new_info
// Make immutable & publish
std::atomic_store(&cache_,
std::shared_ptr<const map_t>{std::move(new_info)});
}
}
}
auto cache::get_cache() const -> std::shared_ptr<const map_t> {
return std::atomic_load(&cache_);
}

A thread-safe implementation of a generic container of type pair<unsigned int, boost::any> using shared_ptrs

I have created a generic message queue for use in a multi-threaded application. Specifically, single producer, multi-consumer. Main code below.
1) I wanted to know if I should pass a shared_ptr allocated with new into the enqueue method by value, or is it better to have the queue wrapper allocate the memory itself and just pass in a genericMsg object by const reference?
2) Should I have my dequeue method return a shared_ptr, have a shared_ptr passed in as a parameter by reference (current strategy), or just have it directly return a genericMsg object?
3) Will I need signal/wait in enqueue/dequeue or will the read/write locks suffice?
4) Do I even need to use shared_ptrs? Or will this depend solely on the implementation I use? I like that the shared_ptrs will free memory once all references are no longer using the object. I can easily port this to regular pointers if that's recommended, though.
5) I'm storing a pair here because I'd like to discriminate what type of message I'm dealing with else w/o having to do an any_cast. Every message type has a unique ID that refers to a specific struct. Is there a better way of doing this?
Generic Message Type:
template<typename Message_T>
class genericMsg
{
public:
genericMsg()
{
id = 0;
size = 0;
}
genericMsg (unsigned int &_id, unsigned int &_size, Message_T &_data)
{
id = _id;
size = _size;
data = _data;
}
~genericMsg()
{}
unisgned int id;
unsigned int size;
Message_T data; //All structs stored here contain only POD types
};
Enqueue Methods:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(boost::shared_ptr< genericMsg<Message_T> > data)
{
WriteLock w_lock(myLock);
this->qData.push_back(std::make_pair(data->id, data));
}
VS:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(const genericMsg<Message_T> &data_in)
{
WriteLock w_lock(myLock);
boost::shared_ptr< genericMsg<Message_T> > data =
new genericMsg<Message_T>(data_in.id, data_in.size, data_in.data);
this->qData.push_back(std::make_pair(data_in.id, data));
}
Dequeue Method:
// ----------------------------------------------------------------
// -- Thread safe function that grabs a genericMsg object from the
// -- front of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
void dequeue(boost::shared_ptr< genericMsg<Message_T> > &msg)
{
ReadLock r_lock(myLock);
msg = boost::any_cast< boost::shared_ptr< genericMsg<Message_T> > >(qData.front().second);
qData.pop_front();
}
Get message ID:
inline unsigned int getMessageID()
{
ReadLock r_lock(myLock);
unsigned int tempID = qData.front().first;
return tempID;
}
Data Types:
std::deque < std::pair< unsigned int, boost::any> > qData;
Edit:
I have improved upon my design. I now have a genericMessage base class that I directly subclass from in order to derive the unique messages.
Generic Message Base Class:
class genericMessage
{
public:
virtual ~genericMessage() {}
unsigned int getID() {return id;}
unsigned int getSize() {return size;}
protected:
unsigned int id;
unsigned int size;
};
Producer Snippet:
boost::shared_ptr<genericMessage> tmp (new derived_msg1(MSG1_ID));
theQueue.enqueue(tmp);
Consumer Snippet:
boost::shared_ptr<genericMessage> tmp = theQueue.dequeue();
if(tmp->getID() == MSG1_ID)
{
boost::shared_ptr<derived_msg1> tObj = boost::dynamic_pointer_cast<derived_msg1>(tmp);
tObj->printData();
}
New Queue:
std::deque< boost::shared_ptr<genericMessage> > qData;
New Enqueue:
void mq_class::enqueue(const boost::shared_ptr<genericMessage> &data_in)
{
boost::unique_lock<boost::mutex> lock(mut);
this->qData.push_back(data_in);
cond.notify_one();
}
New Dequeue:
boost::shared_ptr<genericMessage> mq_class::dequeue()
{
boost::shared_ptr<genericMessage> ptr;
{
boost::unique_lock<boost::mutex> lock(mut);
while(qData.empty())
{
cond.wait(lock);
}
ptr = qData.front();
qData.pop_front();
}
return ptr;
}
Now, my question is am I doing dequeue correctly? Is there another way of doing it? Should I pass in a shared_ptr as a reference in this case to achieve what I want?
Edit (I added answers for parts 1, 2, and 4).
1) You should have a factory method that creates new genericMsgs and returns a std::unique_ptr. There is absolutely no good reason to pass genericMsg in by const reference and then have the queue wrap it in a smart pointer: Once you've passed by reference you have lost track of ownership, so if you do that the queue is going to have to construct (by copy) the entire genericMsg to wrap.
2) I can't think of any circumstance under which it would be safe to take a reference to a shared_ptr or unique_ptr or auto_ptr. shared_ptrs and unique_ptrs are for tracking ownership and once you've taken a reference to them (or the address of them) you have no idea how many references or pointers are still out there expecting the shared_ptr/unique_ptr object to contain a valid naked pointer.
unique_ptr is always preferred to a naked pointer, and is preferred to a shared_ptr in cases where you only have a single piece of code (validly) pointing to an object at a time.
https://softwareengineering.stackexchange.com/questions/133302/stdshared-ptr-as-a-last-resort
http://herbsutter.com/gotw/_103/
Bad practice to return unique_ptr for raw pointer like ownership semantics? (the answer explains why it is good practice not bad).
3) Yes, you need to use a std::condition_variable in your dequeue function. You need to test whether qData is empty or not before calling qData.front() or qData.pop_front(). If qData is empty you need to wait on a condition variable. When enqueue inserts an item it should signal the condition variable to wake up anyone who may have been waiting.
Your use of reader/writer locks is completely incorrect. Don't use reader/writer locks. Use std::mutex. A reader lock can only be used on a method that is completely const. You are modifying qData in dequeue, so a reader lock will lead to data races there. (Reader writer locks are only applicable when you have stupid code that is both const and holds locks for extended period of time. You are only keeping the lock for the period of time it takes to insert or remove from the queue, so even if you were const the added overhead of reader/writer locks would be a net lose.)
An example of implementing a (bounded) buffer using mutexes and condition_variables can be found at: Is this a correct way to implement a bounded buffer in C++.
4) unique_ptr is always preferred to naked pointers, and usually preferred to shared_ptr. (The main exception where shared_ptr might be better is for graph-like data structures.) In cases like yours where you are reading something in on side, creating a new object with a factory, moving the ownership to the queue and then moving ownership out of the queue to the consumer it sounds like you should be using unique_ptr.
5) You are reinventing tagged unions. Virtual functions were added to c++ specifically so you wouldn't need to do this. You should subclass your messages from a class that has a virtual function called do_it() (or better yet, operator()() or something like that). Then instead of tagging each struct, make each struct a subclass of your message class. When you dequeue each struct (or ptr to struct) just call do_it() on it. Strong static typing, no casts. See C++ std condition variable covering a lot of share variables for an example.
Also: if you are going to stick with the tagged unions: you can't have separate calls to get the id and the data item. Consider: If thread A calls to get the id, then thread B calls to get the id, then thread B retrieves the data item, now what happens when thread A calls to retrieve a data item? It gets a data item, but not with the type that it expected. You need to retrieve the id and the data item under the same critical section.
First of all, it's better to use 3rd-party concurrency containers than to implement them yourself, except it's for education purpose.
Your messages doesn't look to have costly constructors/destructor, so you can store them by value and forget about all your other questions. Use move semantics (if available) for optimizations.
If your profiler says "by value" is bad idea in your particular case:
I suppose your producer creates messages, puts them into your queue and loses any interest in them. In this case you don't need shared_ptr because you don't have shared ownership. You can use unique_ptr or even a raw pointer. It's implementation details and better to hide them inside the queue.
From performance point of view, it's better to implement lock-free queue. "locks vs. signals" depends completely on your application. For example, if you use thread pool and kind of a scheduler it's better to allow your clients to do something useful while queue is full/empty. In simpler cases reader/writer lock is just fine.
If I want to be thread safe, I usually use const objects and modify only on copy or create constructor. In this way you don't need to use any lock mechanism. In a threaded system, it is usually more effective than use mutexes on a single instance.
In your case only deque would need lock.

Is this an acceptable way to lock a container using C++?

I need to implement (in C++) a thread safe container in such a way that only one thread is ever able to add or remove items from the container. I have done this kind of thing before by sharing a mutex between threads. This leads to a lot of mutex objects being littered throughout my code and makes things very messy and hard to maintain.
I was wondering if there is a neater and more object oriented way to do this. I thought of the following simple class wrapper around the container (semi-pseudo C++ code)
class LockedList {
private:
std::list<MyClass> m_List;
public:
MutexObject Mutex;
};
so that locking could be done in the following way
LockedList lockableList; //create instance
lockableList.Mutex.Lock(); // Lock object
... // search and add or remove items
lockableList.Mutex.Unlock(); // Unlock object
So my question really is to ask if this is a good approach from a design perspective? I know that allowing public access to members is frowned upon from a design perspective, does the above design have any serious flaws in it. If so is there a better way to implement thread safe container objects?
I have read a lot of books on design and C++ in general but there really does seem to be a shortage of literature regarding multithreaded programming and multithreaded software design.
If the above is a poor approach to solving the problem I have could anyone suggest a way to improve it, or point me towards some information that explains good ways to design classes to be thread safe??? Many thanks.
I would rather design a resourece owner that locks a mutex and returns an object that can be used by the thread. Once the thread has finished with it and stops using the object the resource is automatically returned to its owner and the lock released.
template<typename Resource>
class ResourceOwner
{
Lock lock;
Resource resource;
public:
ResourceHolder<Resource> getExclusiveAccess()
{
// Let the ResourceHolder lock and unlock the lock
// So while a thread holds a copy of this object only it
// can access the resource. Once the thread releases all
// copies then the lock is released allowing another
// thread to call getExclusiveAccess().
//
// Make it behave like a form of smart pointer
// 1) So you can pass it around.
// 2) So all properties of the resource are provided via ->
// 3) So the lock is automatically released when the thread
// releases the object.
return ResourceHolder<Resource>(lock, resource);
}
};
The resource holder (not thought hard so this can be improved)
template<typename Resource>
class ResourceHolder<
{
// Use a shared_ptr to hold the scopped lock
// When first created will lock the lock. When the shared_ptr
// destroyes the scopped lock (after all copies are gone)
// this will unlock the lock thus allowding other to use
// getExclusiveAccess() on the owner
std::shared_ptr<scopped_lock> locker;
Resource& resource; // local reference on the resource.
public:
ResourceHolder(Lock& lock, Resource& r)
: locker(new scopped_lock(lock))
, resource(r)
{}
// Access to the resource via the -> operator
// Thus allowing you to use all normal functionality of
// the resource.
Resource* operator->() {return &resource;}
};
Now a lockable list is:
ResourceOwner<list<int>> lockedList;
void threadedCode()
{
ResourceHolder<list<int>> list = lockedList.getExclusiveAccess();
list->push_back(1);
}
// When list goes out of scope here.
// It is destroyed and the the member locker will unlock `lock`
// in its destructor thus allowing the next thread to call getExclusiveAccess()
I would do something like this to make it more exception-safe by using RAII.
class LockedList {
private:
std::list<MyClass> m_List;
MutexObject Mutex;
friend class LockableListLock;
};
class LockableListLock {
private:
LockedList& list_;
public:
LockableListLock(LockedList& list) : list_(list) { list.Mutex.Lock(); }
~LockableListLock(){ list.Mutex.Unlock(); }
}
You would use it like this
LockableList list;
{
LockableListLock lock(list); // The list is now locked.
// do stuff to the list
} // The list is automatically unlocked when lock goes out of scope.
You could also make the class force you to lock it before doing anything with it by adding wrappers around the interface for std::list in LockableListLock so instead of accessing the list through the LockedList class, you would access the list through the LockableListLock class. For instance, you would make this wrapper around std::list::begin()
std::list::iterator LockableListLock::begin() {
return list_.m_List.begin();
}
and then use it like this
LockableList list;
LockableListLock lock(list);
// list.begin(); //This is a compiler error so you can't
//access the list without locking it
lock.begin(); // This gets you the beginning of the list
Okay, I'll state a little more directly what others have already implied: at least part, and quite possibly all, of this design is probably not what you want. At the very least, you want RAII-style locking.
I'd also make the locked (or whatever you prefer to call it) a template, so you can decouple the locking from the container itself.
// C++ like pesudo-code. Not intended to compile as-is.
struct mutex {
void lock() { /* ... */ }
void unlock() { /* ... */ }
};
struct lock {
lock(mutex &m) { m.lock(); }
~lock(mutex &m) { m.unlock(); }
};
template <class container>
class locked {
typedef container::value_type value_type;
typedef container::reference_type reference_type;
// ...
container c;
mutex m;
public:
void push_back(reference_type const t) {
lock l(m);
c.push_back(t);
}
void push_front(reference_type const t) {
lock l(m);
c.push_front(t);
}
// etc.
};
This makes the code fairly easy to write and (for at least some cases) still get correct behavior -- e.g., where your single-threaded code might look like:
std::vector<int> x;
x.push_back(y);
...your thread-safe code would look like:
locked<std::vector<int> > x;
x.push_back(y);
Assuming you provide the usual begin(), end(), push_front, push_back, etc., your locked<container> will still be usable like a normal container, so it works with standard algorithms, iterators, etc.
The problem with this approach is that it makes LockedList non-copyable. For details on this snag, please look at this question:
Designing a thread-safe copyable class
I have tried various things over the years, and a mutex declared beside the the container declaration always turns out to be the simplest way to go ( once all the bugs have been fixed after naively implementing other methods ).
You do not need to 'litter' your code with mutexes. You just need one mutex, declared beside the container it guards.
It's hard to say that the coarse grain locking is a bad design decision. We'd need to know about the system that the code lives in to talk about that. It is a good starting point if you don't know that it won't work however. Do the simplest thing that could possibly work first.
You could improve that code by making it less likely to fail if you scope without unlocking though.
struct ScopedLocker {
ScopedLocker(MutexObject &mo_) : mo(mo_) { mo.Lock(); }
~ScopedLocker() { mo.Unlock(); }
MutexObject &mo;
};
You could also hide the implementation from users.
class LockedList {
private:
std::list<MyClass> m_List;
MutexObject Mutex;
public:
struct ScopedLocker {
ScopedLocker(LockedList &ll);
~ScopedLocker();
};
};
Then you just pass the locked list to it without them having to worry about details of the MutexObject.
You can also have the list handle all the locking internally, which is alright in some cases. The design issue is iteration. If the list locks internally, then operations like this are much worse than letting the user of the list decide when to lock.
void foo(LockedList &list) {
for (size_t i = 0; i < 100000000; i++) {
list.push_back(i);
}
}
Generally speaking, it's a hard topic to give advice on because of problems like this. More often than not, it's more about how you use an object. There are a lot of leaky abstractions when you try and write code that solves multi-processor programming. That is why you see more toolkits that let people compose the solution that meets their needs.
There are books that discuss multi-processor programming, though they are few. With all the new C++11 features coming out, there should be more literature coming within the next few years.
I came up with this (which I'm sure can be improved to take more than two arguments):
template<class T1, class T2>
class combine : public T1, public T2
{
public:
/// We always need a virtual destructor.
virtual ~combine() { }
};
This allows you to do:
// Combine an std::mutex and std::map<std::string, std::string> into
// a single instance.
combine<std::mutex, std::map<std::string, std::string>> mapWithMutex;
// Lock the map within scope to modify the map in a thread-safe way.
{
// Lock the map.
std::lock_guard<std::mutex> locked(mapWithMutex);
// Modify the map.
mapWithMutex["Person 1"] = "Jack";
mapWithMutex["Person 2"] = "Jill";
}
If you wish to use an std::recursive_mutex and an std::set, that would also work.

unordered_map thread safety

I am changing a single thread program into multi thread using boost:thread library. The program uses unordered_map as a hasp_map for lookups. My question is..
At one time many threads will be writing, and at another many will be reading but not both reading and writing at the same time i.e. either all the threads will be reading or all will be writing. Will that be thread safe and the container designed for this? And if it will be, will it really be concurrent and improve performance? Do I need to use some locking mechanism?
I read somewhere that the C++ Standard says the behavior will be undefined, but is that all?
UPDATE: I was also thinking about Intel concurrent_hash_map. Will that be a good option?
STL containers are designed so that you are guaranteed to be able to have:
A. Multiple threads reading at the same time
or
B. One thread writing at the same time
Having multiple threads writing is not one of the above conditions and is not allowed. Multiple threads writing will thus create a data race, which is undefined behavior.
You could use a mutex to fix this. A shared_mutex (combined with shared_locks) would be especially useful as that type of mutex allows multiple concurrent readers.
http://eel.is/c++draft/res.on.data.races#3 is the part of the standard which guarantees the ability to concurrently use const functions on different threads. http://eel.is/c++draft/container.requirements.dataraces specifies some additional non-const operations which are safe on different threads.
std::unordered_map meets the requirements of Container (ref http://en.cppreference.com/w/cpp/container/unordered_map). For container thread safety see: http://en.cppreference.com/w/cpp/container#Thread_safety.
Important points:
"Different elements in the same container can be modified concurrently by different threads"
"All const member functions can be called concurrently by different threads on the same container. In addition, the member functions begin(), end(), rbegin(), rend(), front(), back(), data(), find(), lower_bound(), upper_bound(), equal_range(), at(), and, except in associative containers, operator[], behave as const for the purposes of thread safety (that is, they can also be called concurrently by different threads on the same container)."
Will that be thread safe and the container designed for this?
No, the standard containers are not thread safe.
Do I need to use some locking mechanism?
Yes, you do. Since you're using boost, boost::mutex would be a good idea; in C++11, there's std::mutex.
I read somewhere that the C++ Standard says the behavior will be undefined, but is that all?
Indeed, the behaviour is undefined. I'm not sure what you mean by "is that all?", since undefined behaviour is the worst possible kind of behaviour, and a program that exhibits it is by definition incorrect. In particular, incorrect thread synchronisation is likely to lead to random crashes and data corruption, often in ways that are very difficult to diagnose, so you would be wise to avoid it at all costs.
UPDATE: I was also thinking about Intel concurrent_hash_map. Will that be a good option?
It sounds good, but I've never used it myself so I can't offer an opinion.
The existing answers cover the main points:
you must have a lock to read or write to the map
you could use a multiple-reader / single-writer lock to improve concurrency
Also, you should be aware that:
using an earlier-retrieved iterator, or a reference or pointer to an item in the map, counts as a read or write operation
write operations performed in other threads may invalidate pointers/references/iterators into the map, much as they would if they were done in the same thread, even if a lock is again acquired before an attempt is made to continue using them...
You can use concurrent_hash_map or employ an mutex when you access unordered_map. one of issue on using intel concurrent_hash_map is you have to include TBB, but you already use boost.thread. These two components have overlapped functionality, and hence complicate your code base.
std::unordered_map is a good fit for some multi-threaded situations.
There are also other concurrent maps from Intel TBB:
tbb:concurrent_hash_map. It supports fine-grained, per-key locking for insert/update, which is something that few other hashmaps can offer. However, the syntax is slightly more wordy. See full sample code. Recommended.
tbb:concurrent_unordered_map. It is essentially the same thing, a key/value map. However, it is much lower level, and more difficult to use. One has to supply a hasher, a equality operator, and an allocator. There is no sample code anywhere, even in the official Intel docs. Not recommended.
If you don't need all of the functionality of unordered_map, then this solution should work for you. It uses mutex to control access to the internal unordered_map. The solution supports the following methods, and adding more should be fairly easy:
getOrDefault(key, defaultValue) - returns the value associated with key, or defaultValue if no association exists. A map entry is not created when no association exists.
contains(key) - returns a boolean; true if the association exists.
put(key, value) - create or replace association.
remove(key) - remove association for key
associations() - returns a vector containing all of the currently known associations.
Sample usage:
/* SynchronizedMap
** Functional Test
** g++ -O2 -Wall -std=c++11 test.cpp -o test
*/
#include <iostream>
#include "SynchronizedMap.h"
using namespace std;
using namespace Synchronized;
int main(int argc, char **argv) {
SynchronizedMap<int, string> activeAssociations;
activeAssociations.put({101, "red"});
activeAssociations.put({102, "blue"});
activeAssociations.put({102, "green"});
activeAssociations.put({104, "purple"});
activeAssociations.put({105, "yellow"});
activeAssociations.remove(104);
cout << ".getOrDefault(102)=" << activeAssociations.getOrDefault(102, "unknown") << "\n";
cout << ".getOrDefault(112)=" << activeAssociations.getOrDefault(112, "unknown") << "\n";
if (!activeAssociations.contains(104)) {
cout << 123 << " does not exist\n";
}
if (activeAssociations.contains(101)) {
cout << 101 << " exists\n";
}
cout << "--- associations: --\n";
for (auto e: activeAssociations.associations()) {
cout << e.first << "=" << e.second << "\n";
}
}
Sample output:
.getOrDefault(102)=green
.getOrDefault(112)=unknown
123 does not exist
101 exists
--- associations: --
105=yellow
102=green
101=red
Note1: The associations() method is not intended for very large datasets as it will lock the table during the creation of the vector list.
Note2: I've purposefully not extended unordered_map in order to prevent my self from accidentally using a method from unordered_map that has not been extended and therefore might not be thread safe.
#pragma once
/*
* SynchronizedMap.h
* Wade Ryan 20200926
* c++11
*/
#include <unordered_map>
#include <mutex>
#include <vector>
#ifndef __SynchronizedMap__
#define __SynchronizedMap__
using namespace std;
namespace Synchronized {
template <typename KeyType, typename ValueType>
class SynchronizedMap {
private:
mutex sync;
unordered_map<KeyType, ValueType> usermap;
public:
ValueType getOrDefault(KeyType key, ValueType defaultValue) {
sync.lock();
ValueType rs;
auto value=usermap.find(key);
if (value == usermap.end()) {
rs = defaultValue;
} else {
rs = value->second;
}
sync.unlock();
return rs;
}
bool contains(KeyType key) {
sync.lock();
bool exists = (usermap.find(key) != usermap.end());
sync.unlock();
return exists;
}
void put(pair<KeyType, ValueType> nodePair) {
sync.lock();
if (usermap.find(nodePair.first) != usermap.end()) {
usermap.erase(nodePair.first);
}
usermap.insert(nodePair);
sync.unlock();
}
void remove(KeyType key) {
sync.lock();
if (usermap.find(key) != usermap.end()) {
usermap.erase(key);
}
sync.unlock();
}
vector<pair<KeyType, ValueType>> associations() {
sync.lock();
vector<pair<KeyType, ValueType>> elements;
for (auto it=usermap.begin(); it != usermap.end(); ++it) {
pair<KeyType, ValueType> element (it->first, it->second);
elements.push_back( element );
}
sync.unlock();
return elements;
}
};
}
#endif