thread safety of find() from a STL container of std::unique_ptr - c++

Example code.
class Obj
{
public:
void doSome(void)
{
std::cout << "Hello World!" << std::endl;
}
};
std::unordered_map<int, std::unique_ptr<Obj>> map;
// insert -- done with single thread and before find()
map[123] = std::move( std::unique_ptr<Obj>(new Obj) );
// find -- run from multiple threads
auto search = map.find(123); // <=== (Q)
if (search != map.end())
{
search->second->doSome();
}
(Q)
How thread safe is this if multiple threads run the // find section with map.find(123)?
Will map.find(123) always find the object in every thread, as long as search->second is not assigned to something else?

When more than one thread accesses the same variable and at least one of them writes to it you have a data race. That's not the case here, where everyone is reading the same data. That's okay. There's another issue, though, which isn't addressed in this code: depending on when the data is stored into the map object, some threads might not see the updated version of the map object. The simplest way to deal with this synchronization problem is to set up the map object before creating any of the reader threads.
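As a minimal sketch of that advice (assuming C++14 for std::make_unique; the function name count_successful_lookups and the value() member are made up for illustration), the map is fully populated before any reader thread is created, and the readers only call find():

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <unordered_map>
#include <vector>

struct Obj {
    int value() const { return 42; }  // read-only, so safe to call concurrently
};

// Hypothetical helper: returns how many reader threads found key 123.
int count_successful_lookups(int num_threads) {
    std::unordered_map<int, std::unique_ptr<Obj>> map;
    map[123] = std::make_unique<Obj>();  // every write happens before any reader starts

    std::vector<int> found(num_threads, 0);  // one slot per thread, no shared writes
    std::vector<std::thread> readers;
    for (int t = 0; t < num_threads; ++t) {
        readers.emplace_back([&map, &found, t] {
            auto it = map.find(123);  // concurrent lookups only, no modification
            if (it != map.end() && it->second->value() == 42) {
                found[t] = 1;
            }
        });
    }
    for (auto& r : readers) r.join();

    int total = 0;
    for (int f : found) total += f;
    return total;
}
```

Creating a std::thread synchronizes with the start of the new thread, so each reader sees the insert that happened before it was started.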

Neither find() nor any other unordered map method is thread safe with respect to concurrent modification. If one execution thread can call find() while any other thread calls an unordered map method that modifies the map, the result is undefined behavior.
If multiple execution threads are calling find() with the same key, then provided that there is no undefined behavior, all execution threads will get the same value for that key.
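If the map does have to be modified while readers are running, one possible approach is a reader/writer lock. The following is only a sketch, assuming C++17 for std::shared_mutex; the Registry class and its put/get members are hypothetical names, not part of the question's code:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical wrapper: many concurrent readers, one writer at a time.
class Registry {
public:
    void put(int key, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive access
        map_[key] = std::make_unique<int>(value);
    }

    // Copies the stored value into out; returns false if the key is absent.
    bool get(int key, int& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);  // shared access
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = *it->second;  // copy while the lock is still held
        return true;
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<int, std::unique_ptr<int>> map_;
};
```

The value is copied out under the lock on purpose: handing out a pointer or reference would let the caller touch the element after the lock has been released.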

Related

accessing to map in multithreaded environment

In my application, multiple threads need to access a map object to insert new items or read existing items. (There is no 'erase' operation.)
The threads use this code to access map elements:
struct PayLoad& ref = myMap[index];
I just want to know: do I still need to wrap this block of code in a mutex, or is it safe not to use a mutex for this purpose?
Thanks.
Since there is at least one write operation, i.e. an insert, then you need to have thread synchronization when accessing the map. Otherwise you have a race condition.
Also, returning a reference to the value in a map is not thread-safe:
struct PayLoad& ref = myMap[index];
since multiple threads could access the value, and at least one of them could involve a write. That would also lead to a race condition. It is better to return the value by value like this:
Payload GetPayload(int index)
{
std::lock_guard<std::mutex> lock(mutex);
return myMap[index];
}
where mutex is an accessible std::mutex object.
Your insert/write operation also needs to lock the same mutex:
void SetPayload(int index, Payload payload)
{
std::lock_guard<std::mutex> lock(mutex);
myMap[index] = std::move(payload);
}
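Put together, the two functions could live in a small wrapper class. This is a sketch under assumptions: the Payload layout, the PayloadStore class and the fill_concurrently helper are invented for illustration, and Get uses find() instead of operator[] so that a read never inserts a new element:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

struct Payload { std::string text; };  // assumed layout, for illustration only

class PayloadStore {
public:
    Payload Get(int index) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(index);  // unlike operator[], find() never inserts
        if (it == map_.end()) return Payload{};
        return it->second;           // returned by value, copied under the lock
    }
    void Set(int index, Payload payload) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[index] = std::move(payload);
    }
    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }

private:
    mutable std::mutex mutex_;  // guards map_
    std::map<int, Payload> map_;
};

// Several threads insert distinct keys at once; returns the final size.
std::size_t fill_concurrently(PayloadStore& store, int num_threads) {
    std::vector<std::thread> writers;
    for (int t = 0; t < num_threads; ++t) {
        writers.emplace_back([&store, t] {
            store.Set(t, Payload{"item " + std::to_string(t)});
        });
    }
    for (auto& w : writers) w.join();
    return store.Size();
}
```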

is it ok to access value(entry in thread safe map) pointed by pointer inside non-thread safe container?

For example,
// I am using thread safe map from
// code.google.com/p/thread-safe-stl-containers
#include <thread_safe_map.h>
class B{
public: // b1 must be public to be reachable through C[1]->b1
vector<int> b1;
};
//Thread safe map
thread_safe::map<int, B> A;
B b_object;
A[1] = b_object;
// Non thread safe map.
map<int, B*> C;
C[1] = &A[1]; // operator[] returns a B&, so there is no .second
So are following operations still thread safe?
Thread1:
for(int i=0; i<10000; i++) {
cout << C[1]->b1[i];
}
Thread2:
for(int i=0; i<10000; i++) {
C[1]->b1.push_back(i);
}
Is there any problem in the above code? If so how can I fix it?
Is it OK to access value(entry in thread safe map) pointed by pointer inside non-thread safe container?
No, what you are doing there is not safe. The way your thread_safe_map is implemented is to take a lock for the duration of every function call:
//Element Access
T & operator[]( const Key & x ) { boost::lock_guard<boost::mutex> lock( mutex ); return storage[x]; }
The lock is released as soon as the access function ends which means that any modification you make through the returned reference has no protection.
As well as being not entirely safe this method is very slow.
A safe(er), efficient, but highly experimental way to lock containers is proposed here: https://github.com/isocpp/CppCoreGuidelines/issues/924
with source code here https://github.com/galik/GSL/blob/lockable-objects/include/gsl/gsl_lockable (shameless self promotion disclaimer).
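The general idea of keeping the lock held for the whole compound operation can be sketched as follows. This is not the API of the library linked above; Guarded, with_lock and push_from_two_threads are hypothetical names, and the sketch assumes C++14 for decltype(auto):

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: the value can only be reached while the mutex is held.
template <typename T>
class Guarded {
public:
    template <typename F>
    decltype(auto) with_lock(F&& f) {
        std::lock_guard<std::mutex> lock(mutex_);
        return f(value_);  // the lock covers the whole callback
    }

private:
    std::mutex mutex_;
    T value_{};
};

// Two threads push into the same vector; returns the final element count.
std::size_t push_from_two_threads(int per_thread) {
    Guarded<std::vector<int>> data;
    auto work = [&data, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            data.with_lock([i](std::vector<int>& v) { v.push_back(i); });
        }
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return data.with_lock([](std::vector<int>& v) { return v.size(); });
}
```

Because no reference escapes with_lock, the modification cannot outlive the lock the way it does with the operator[] shown earlier.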
In general, STL containers can be accessed from multiple threads as long as all threads either:
read from the same container
modify elements in a thread safe manner
You cannot push_back (or erase, insert, etc.) from one thread and read from another thread. Suppose that you are trying to access an element in thread 1 while push_back in thread 2 is in the middle of reallocation of vector's storage. This might crash the application, might return garbage (or might work, if you're lucky).
The second bullet point applies to situations like this:
std::vector<std::atomic_int> elements;
// Thread 1:
elements[10].store(5);
// Thread 2:
int v = elements[10].load();
In this case, you're concurrently reading and writing an atomic variable, but the vector itself is not modified - only its element is.
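A runnable version of that snippet might look like the sketch below. The function name store_then_load and the size of 16 are arbitrary choices; the important part is that the vector is sized before the threads start and never resized afterwards:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

int store_then_load() {
    std::vector<std::atomic<int>> elements(16);  // size is fixed before any thread starts
    for (auto& e : elements) e.store(0);

    std::thread writer([&elements] { elements[10].store(5); });
    std::thread reader([&elements] {
        int v = elements[10].load();  // may observe 0 or 5, but never a torn value
        (void)v;
    });
    writer.join();
    reader.join();

    return elements[10].load();  // the store is visible after the joins
}
```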
Edit: using thread_safe::map doesn't change anything in your case. While modifying the map is OK, modifying its elements is not. Putting std::vector in a thread-safe collection doesn't automagically make it thread-safe too.

Mutex when writing to queue held in map for thread safety

I have a map<int, queue<int>> with one thread writing into it, i.e. pushing messages into the queues. The key refers to a client_id, and the queue holds messages for the client. I am looking to make this read-write thread safe.
Currently, the thread that writes into it does something like this
map<int, queue<int>> msg_map;
if (msg_map.find(client_id) == msg_map.end()) // no queue for this client yet
{
queue<int> dummy_queue;
dummy_queue.push(msg); //msg is an int
msg_map.insert(make_pair(client_id, dummy_queue));
}
else
{
msg_map[client_id].push(msg);
}
There are many clients reading - and removing - from this map.
if (msg_map.find(client_id) != msg_map.end())
{
if (!msg_map.find(client_id)->second.empty())
{
int msg_rxed = msg_map[client_id].front();
//processing message
msg_map[client_id].pop();
}
}
I am reading this on mutexes (haven't used them before) and I was wondering when and where I ought to lock the mutex. My confusion lies in the fact that they are accessing individual queues (held within the same map). Do I lock the queues, or the map?
Is there a standard/accepted way to do this - and is using a mutex the best way to do this? There are 10s of client threads, and just that 1 single writing thread.
Simplifying and optimizing your code
For now we'll not concern ourselves with mutexes, we'll handle that later when the code is cleaned up a bit (it will be easier then).
First, from the code you showed there seems to be no reason to use an ordered std::map (logarithmic complexity), you could use the much more efficient std::unordered_map (average constant-time complexity). The choice is entirely up to you, if you don't need the container to be ordered you just have to change its declaration:
std::map<int, std::queue<int>> msg_map;
// or
std::unordered_map<int, std::queue<int>> msg_map; // C++11 only though
Now, maps are quite efficient by design but if you insist on doing lookups for each and every operation then you lose all the advantage of maps.
Concerning the writer thread, all your block of code (for the writer) can be efficiently replaced by just this line:
msg_map[client_id].push(msg);
Note that operator[] for both std::map and std::unordered_map is defined as:
Inserts a new element to the container using key as the key and a default constructed mapped value and returns a reference to the newly constructed mapped value. If an element with key key already exists, no insertion is performed and a reference to its mapped value is returned.
Concerning your reader threads, you can't directly use operator[] because it would create a new entry if none currently exists for a specific client_id so instead, you need to cache the iterator returned by find in order to reuse it and thus avoid useless lookups:
auto iter = msg_map.find(client_id);
// iter will be either std::map<int, std::queue<int>>::iterator
// or std::unordered_map<int, std::queue<int>>::iterator
if (iter != msg_map.end()) {
std::queue<int>& q = iter->second;
if (!q.empty()) {
int msg = q.front();
q.pop();
// process msg
}
}
The reason why I pop the message immediately, before processing it, is because it will improve concurrency when we add mutexes (we can unlock the mutex sooner, which is always good).
Making the code thread-safe
@hmjd's idea about multiple locks (one for the map, and one per queue) is interesting, but based on the code you showed us I disagree: any benefit you'll get from the additional concurrency will quite probably be negated by the additional time it takes to lock the queue mutexes (indeed, locking mutexes is a very expensive operation), not to mention the additional code complexity you'll have to handle. I'll bet my money on a single mutex (protecting the map and all the queues at once) being more efficient.
Incidentally, a single mutex solves the iterator invalidation problem if you want to use the more efficient std::unordered_map (std::map doesn't suffer from that problem though).
Assuming C++11, just declare a std::mutex along with your map:
std::mutex msg_map_mutex;
std::map<int, std::queue<int>> msg_map; // or std::unordered_map
Protecting the writer thread is quite straightforward, just lock the mutex before accessing the map:
std::lock_guard<std::mutex> lock(msg_map_mutex);
// the lock is held while the lock_guard object stays in scope
msg_map[client_id].push(msg);
Protecting the reader threads is barely any harder, the only trick is that you'll probably want to unlock the mutex ASAP in order to improve concurrency so you'll have to use std::unique_lock (which can be unlocked early) instead of std::lock_guard (which can only unlock when it goes out of scope):
std::unique_lock<std::mutex> lock(msg_map_mutex);
auto iter = msg_map.find(client_id);
if (iter != msg_map.end()) {
std::queue<int>& q = iter->second;
if (!q.empty()) {
int msg = q.front();
q.pop();
// assuming you don't need to access the map from now on, let's unlock
lock.unlock();
// process msg, other threads can access the map concurrently
}
}
If you can't use C++11, you'll have to replace std::mutex et al. with whatever your platform provides (pthreads, Win32, ...) or with the boost equivalent (which has the advantage of being as portable and as easy to use as the new C++11 classes, unlike the platform-specific primitives).
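The pieces above can be combined into one self-contained sketch. The MessageMap class, its push/try_pop members and the run_demo helper are hypothetical names chosen for illustration; the locking scheme is the single mutex described in this answer:

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class MessageMap {
public:
    void push(int client_id, int msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        queues_[client_id].push(msg);
    }

    // Pops one message for client_id into msg; returns false if none is waiting.
    bool try_pop(int client_id, int& msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto iter = queues_.find(client_id);
        if (iter == queues_.end() || iter->second.empty()) return false;
        msg = iter->second.front();
        iter->second.pop();
        return true;  // lock is released here; the caller processes msg unlocked
    }

private:
    std::mutex mutex_;  // guards the map and every queue in it
    std::map<int, std::queue<int>> queues_;
};

// One writer, one reader per client; returns the total number of messages received.
int run_demo(int num_clients, int per_client) {
    MessageMap box;
    std::vector<int> received(num_clients, 0);  // one counter per reader thread
    std::vector<std::thread> readers;
    for (int c = 0; c < num_clients; ++c) {
        readers.emplace_back([&box, &received, c, per_client] {
            int msg = 0;
            while (received[c] < per_client) {
                if (box.try_pop(c, msg)) ++received[c];
                else std::this_thread::yield();  // nothing yet, let others run
            }
        });
    }
    std::thread writer([&box, num_clients, per_client] {
        for (int i = 0; i < per_client; ++i)
            for (int c = 0; c < num_clients; ++c) box.push(c, i);
    });
    writer.join();
    for (auto& r : readers) r.join();

    int total = 0;
    for (int n : received) total += n;
    return total;
}
```

The readers here poll with yield() only to keep the sketch short; a real consumer would more likely wait on a condition variable.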
Read and write access to both the map and the queue needs to be synchronized, as both structures are being modified, including the map:
map<int, queue<int>> msg_map;
if (msg_map.find(client_id) == msg_map.end()) // no queue for this client yet
{
queue<int> dummy_queue;
dummy_queue.push(msg); //msg is an int
msg_map.insert(make_pair(client_id, dummy_queue));
}
else
{
msg_map[client_id].push(msg); // Modified here.
}
There are two options: a single mutex that locks both the map and the queues, or one mutex for the map plus one mutex per queue. The second approach is preferable as it reduces the length of time a single lock is held and means multiple threads can be updating several queues concurrently.
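A rough sketch of the second option, assuming C++11. ClientQueue, PerQueueMessageMap and push_to_two_clients are hypothetical names; each queue is held through a std::shared_ptr so that the map mutex can be released before the queue mutex is taken:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

struct ClientQueue {
    std::mutex mutex;  // guards messages only
    std::queue<int> messages;
};

class PerQueueMessageMap {
public:
    void push(int client_id, int msg) {
        std::shared_ptr<ClientQueue> q = get_or_create(client_id);
        std::lock_guard<std::mutex> lock(q->mutex);  // map mutex already released
        q->messages.push(msg);
    }

    bool try_pop(int client_id, int& msg) {
        std::shared_ptr<ClientQueue> q = find(client_id);
        if (!q) return false;
        std::lock_guard<std::mutex> lock(q->mutex);
        if (q->messages.empty()) return false;
        msg = q->messages.front();
        q->messages.pop();
        return true;
    }

private:
    std::shared_ptr<ClientQueue> get_or_create(int client_id) {
        std::lock_guard<std::mutex> lock(map_mutex_);  // held only for the lookup
        std::shared_ptr<ClientQueue>& slot = queues_[client_id];
        if (!slot) slot = std::make_shared<ClientQueue>();
        return slot;
    }
    std::shared_ptr<ClientQueue> find(int client_id) {
        std::lock_guard<std::mutex> lock(map_mutex_);
        auto it = queues_.find(client_id);
        if (it == queues_.end()) return nullptr;
        return it->second;
    }

    std::mutex map_mutex_;  // guards the map structure, not the queues
    std::map<int, std::shared_ptr<ClientQueue>> queues_;
};

// Two threads push to different clients; returns the number of messages popped afterwards.
int push_to_two_clients(int per_client) {
    PerQueueMessageMap box;
    std::thread a([&box, per_client] { for (int i = 0; i < per_client; ++i) box.push(1, i); });
    std::thread b([&box, per_client] { for (int i = 0; i < per_client; ++i) box.push(2, i); });
    a.join();
    b.join();

    int msg = 0, total = 0;
    while (box.try_pop(1, msg)) ++total;
    while (box.try_pop(2, msg)) ++total;
    return total;
}
```

Whether this actually outperforms a single mutex depends on contention and is worth measuring rather than assuming.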

Is it possible to use mutex to lock only one element of a data structure ?

Is it possible to use mutex to lock only one element of a data structure ?
e.g.
boost::mutex m_mutex;
map<int, int> myMap; // int keys, to match myMap[1] and myMap[2] below
// initialize myMap so that it has 10 elements
// then in thread 1
{
boost::unique_lock<boost::mutex> lock(m_mutex);
myMap[1] = 5 ; // write map[1]
}
// in thread 2
{
boost::unique_lock<boost::mutex> lock(m_mutex);
myMap[2] = 4 ; // write map[2]
}
My question:
When thread 1 is writing map[1], can thread 2 write map[2] at the same time?
Does the thread lock the whole map data structure or only one element, e.g. map[1] or map[2]?
thanks
If you can guarantee that nobody is modifying the container itself (via insert and erase etc.), then as long as each thread accesses a different element of the container, you should be fine.
If you need per-element locking, you could modify the element type to something that offers synchronized access. (Worst case a pair of a mutex and the original value.)
You need a different mutex for every element of the map. You can do this with a map of mutex or adding a mutex to the mapped type (in your case it is int, so you can't do it without creating a new class like SharedInt)
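A minimal sketch of per-element locking along these lines, using the SharedInt name suggested above (its layout and the increment_elements helper are assumptions for illustration). All elements are created before the threads start, so the map structure itself is never modified concurrently:

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <thread>

struct SharedInt {
    std::mutex mutex;  // guards value for this element only
    int value = 0;
};

// Three threads update two elements; returns the sum of both values.
int increment_elements(int iterations) {
    std::map<int, SharedInt> m;
    m[1].value = 0;  // create every element up front
    m[2].value = 0;

    auto bump = [&m, iterations](int key) {
        SharedInt& e = m.at(key);  // at() only looks up, it never inserts
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<std::mutex> lock(e.mutex);
            ++e.value;
        }
    };
    std::thread t1(bump, 1), t2(bump, 2), t3(bump, 1);  // t1 and t3 share element 1
    t1.join();
    t2.join();
    t3.join();

    return m.at(1).value + m.at(2).value;
}
```

Threads working on element 1 never block the thread working on element 2, which is the point of giving each element its own mutex.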
Mutexes lock executable regions, not objects. I always think about locking any code regions that read or modify shared objects. If an object is locked within one region but is accessible from another unsynchronized code region, you are not safe (of course). In your case, I'd lock access to the entire object, as insertions into and reads from containers can easily be interrupted by a context switch, which increases the likelihood of data corruption.
A mutex is all about discipline. One thread can call Write and another thread can call Write1. The C++ runtime will assume that is intentional, but in most cases it is not what the programmer intended. In summary, as long as all threads and methods follow the discipline (understand the critical section and respect it), there will be consistency.
int i=0;
std::mutex m; // intended to guard i
void Write()
{
m.lock();
i++;
m.unlock();
}
void Write1()
{
i++; // no lock taken: this races with Write()
}

Thread-Safe implementation of an object that deletes itself

I have an object that is called from two different threads and after it was called by both it destroys itself by "delete this".
How do I implement this in a thread-safe way? Thread-safe means that the object destroys itself exactly one time (it must destroy itself after the second callback).
I created some example code:
class IThreadCallBack
{
virtual void CallBack(int) = 0;
};
class M: public IThreadCallBack
{
private:
bool t1_finished, t2_finished;
public:
M(): t1_finished(false), t2_finished(false)
{
startMyThread(this, 1);
startMyThread(this, 2);
}
void CallBack(int id)
{
if (id == 1)
{
t1_finished = true;
}
else
{
t2_finished = true;
}
if (t1_finished && t2_finished)
{
delete this;
}
}
};
int main(int argc, char **argv) {
M* MObj = new M();
while(true);
}
Obviously I can't use a mutex as a member of the object and lock the delete, because this would also delete the mutex. On the other hand, if I set a "toBeDeleted" flag inside a mutex-protected area, where the finished flag is set, I am unsure whether there are situations where the object isn't deleted at all.
Note that the thread-implementation makes sure that the callback method is called exactly one time per thread in any case.
Edit / Update:
What if I change Callback(..) to:
void CallBack(int id)
{
mMutex.Obtain();
if (id == 1)
{
t1_finished = true;
}
else
{
t2_finished = true;
}
bool both_finished = (t1_finished && t2_finished);
mMutex.Release();
if (both_finished)
{
delete this;
}
}
Can this be considered safe (with mMutex being a member of the M class)?
I think it is, as long as I don't access any member after releasing the mutex?!
Use Boost's Smart Pointer. It handles this automatically; your object won't have to delete itself, and it is thread safe.
Edit:
From the code you've posted above, I can't really say, need more info. But you could do it like this: each thread has a shared_ptr object and when the callback is called, you call shared_ptr::reset(). The last reset will delete M. Each shared_ptr could be stored with thread local storage in each thread. So in essence, each thread is responsible for its own shared_ptr.
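A sketch of that idea using std::shared_ptr (the standard counterpart of the Boost smart pointer), assuming C++14 for the init-captures. The destroyed counter and run_two_callbacks are hypothetical and exist only to observe how often the destructor runs:

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

std::atomic<int> destroyed{0};  // hypothetical counter, for observation only

struct M {
    void CallBack(int /*id*/) {}  // the real work would go here
    ~M() { ++destroyed; }
};

// Returns how many times M was destroyed after both callbacks ran.
int run_two_callbacks() {
    destroyed = 0;
    auto obj = std::make_shared<M>();

    // Each thread owns its own shared_ptr copy and resets it after the callback.
    std::thread t1([p = obj]() mutable { p->CallBack(1); p.reset(); });
    std::thread t2([p = obj]() mutable { p->CallBack(2); p.reset(); });
    obj.reset();  // the creator drops its own reference too

    t1.join();
    t2.join();
    return destroyed.load();
}
```

Whichever reset() happens last destroys the object, and the reference count itself is updated atomically, so no extra mutex is needed for the lifetime management.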
Instead of using two separate flags, you could consider setting a counter to the number of threads that you're waiting on and then using interlocked decrement.
Then you can be 100% sure that when the thread counter reaches 0, you're done and should clean up.
For more info on interlocked decrement on Windows, on Linux, and on Mac.
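In portable C++11 the same counter idea can be sketched with std::atomic instead of a platform-specific interlocked call. This is only an illustration: the Create factory, the pending_ member, the g_deleted counter and run_counter_demo are assumptions, not part of the question's code:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> g_deleted{0};  // hypothetical counter, for observation only

class M {
public:
    // Objects must live on the heap because they delete themselves.
    static M* Create(int expected_callbacks) { return new M(expected_callbacks); }

    void CallBack(int /*id*/) {
        // fetch_sub returns the previous value, so exactly one caller sees 1.
        if (pending_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete this;  // no member is touched after this line
        }
    }

private:
    explicit M(int expected_callbacks) : pending_(expected_callbacks) {}
    ~M() { ++g_deleted; }

    std::atomic<int> pending_;  // callbacks still outstanding
};

// Returns how many times the object was deleted after two callbacks.
int run_counter_demo() {
    g_deleted = 0;
    M* m = M::Create(2);
    std::thread a([m] { m->CallBack(1); });
    std::thread b([m] { m->CallBack(2); });
    a.join();
    b.join();
    return g_deleted.load();
}
```

Since the decrement and the test happen in one atomic operation, only the last callback can observe the counter reaching zero, regardless of how the two threads interleave.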
I once implemented something like this that avoided the ickiness and confusion of delete this entirely, by operating in the following way:
Start a thread that is responsible for deleting these sorts of shared objects, which waits on a condition
When the shared object is no longer being used, instead of deleting itself, have it insert itself into a thread-safe queue and signal the condition that the deleter thread is waiting on
When the deleter thread wakes up, it deletes everything in the queue
If your program has an event loop, you can avoid the creation of a separate thread for this by creating an event type that means "delete unused shared objects" and have some persistent object respond to this event in the same way that the deleter thread would in the above example.
I can't imagine that this is possible, especially within the class itself. The problem is two fold:
1) There's no way to notify the outside world not to call the object so the outside world has to be responsible for setting the pointer to 0 after calling "CallBack" iff the pointer was deleted.
2) Once two threads enter this function you are, and forgive my french, absolutely fucked. Calling a function on a deleted object is UB, just imagine what deleting an object while someone is in it results in.
I've never seen "delete this" as anything but an abomination. Doesn't mean it isn't sometimes, on VERY rare conditions, necessary. Problem is that people do it way too much and don't think about the consequences of such a design.
I don't think "to be deleted" is going to work well. It might work for two threads, but what about three? You can't protect the part of code that calls delete because you're deleting the protection (as you state) and because of the UB you'll inevitably cause. So the first goes through, sets the flag and aborts....which of the rest is going to call delete on the way out?
The more robust implementation would be to implement reference counting. For each thread you start, increase a counter; for each callback call decrease the counter and if the counter has reached zero, delete the object. You can lock the counter access, or you could use the Interlocked class to protect the counter access, though in that case you need to be careful with potential race between the first thread finishing and the second starting.
Update: And of course, I completely ignored the fact that this is C++. :-) You should use InterlockedExchange to update the counter instead of the C# Interlocked class.