How can I avoid freezing other threads that try to access the same map while the current thread holds its lock? See the code below:
//pseudo code
std::map<string, CSomeClass* > gBigMap;
void AccessMapForWriting(string aString){
pthread_mutex_lock(&MapLock);
CSomeClass* obj = gBigMap[aString];
if (obj){
gBigMap.erase(aString);
delete obj;
obj = NULL;
}
pthread_mutex_unlock(&MapLock);
}
void AccessMapForReading(string aString){
pthread_mutex_lock(&MapLock);
CSomeClass* obj = gBigMap[aString];
//below code consumes much time
//sometimes it even sleeps for milliseconds
if (obj){
obj->TimeConsumingOperation();
}
pthread_mutex_unlock(&MapLock);
}
//other threads will also call
//the same function -- AccessMap
void *OtherThreadFunc(void *){
//call AccessMap here
}
Consider using a read-write lock instead: pthread_rwlock_t.
There are some details here.
It says:
"Using a normal mutex, when a thread obtains the mutex all other
threads are forced to block until that mutex is released by the owner.
What about the situation where the vast majority of threads are simply
reading the data? If this is the case then we should not care if there
is 1 or up to N readers in the critical section at the same time. In
fact the only time we would normally care about exclusive ownership is
when a writer needs access to the code section."
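A minimal sketch of how the question's functions might look with pthread_rwlock_t. The names InsertEntry, EraseEntry, ReadEntry and gMapLock, and the stand-in CSomeClass, are assumptions made for illustration; they are not from the original code.

```cpp
#include <cassert>
#include <map>
#include <pthread.h>
#include <string>

// Hypothetical stand-in for the question's CSomeClass.
struct CSomeClass {
    void TimeConsumingOperation() const { /* long-running, read-only work */ }
};

static std::map<std::string, CSomeClass*> gBigMap;
static pthread_rwlock_t gMapLock = PTHREAD_RWLOCK_INITIALIZER;

// Writer: takes the lock exclusively while it changes the map.
void InsertEntry(const std::string& key) {
    pthread_rwlock_wrlock(&gMapLock);
    CSomeClass*& slot = gBigMap[key];
    if (!slot) slot = new CSomeClass();
    pthread_rwlock_unlock(&gMapLock);
}

// Writer: removes and deletes the entry; returns whether it existed.
bool EraseEntry(const std::string& key) {
    pthread_rwlock_wrlock(&gMapLock);
    auto it = gBigMap.find(key);
    const bool found = (it != gBigMap.end());
    if (found) {
        delete it->second;
        gBigMap.erase(it);
    }
    pthread_rwlock_unlock(&gMapLock);
    return found;
}

// Reader: any number of readers may hold the lock at the same time.
bool ReadEntry(const std::string& key) {
    pthread_rwlock_rdlock(&gMapLock);
    auto it = gBigMap.find(key);  // find() does not insert, unlike operator[]
    const bool found = (it != gBigMap.end());
    if (found) it->second->TimeConsumingOperation();
    pthread_rwlock_unlock(&gMapLock);
    return found;
}
```

Readers no longer block each other, but a writer still has to wait until every reader has finished its time-consuming operation, so this helps most when reads dominate.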
You have a std::string as a key. Can you break that key down into a short suffix (possibly just a single character) and a remainder? In that case, you could implement this data structure as 256 maps with 256 locks, one pair per possible value of a one-byte suffix. Most of the time there would then be no lock contention, because keys with different suffixes use different locks.
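The suffix idea could be sketched roughly as follows. ShardedMap and its member names are hypothetical, the mapped type is simplified to int, and the shard is chosen from the last byte of the key (256 possible values).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical sharded map: one std::map plus one mutex per possible last byte.
class ShardedMap {
public:
    void put(const std::string& key, int value) {
        Shard& s = shardFor(key);
        std::lock_guard<std::mutex> guard(s.lock);
        s.items[key] = value;
    }

    // Returns true and writes to 'out' if the key exists.
    bool get(const std::string& key, int& out) {
        Shard& s = shardFor(key);
        std::lock_guard<std::mutex> guard(s.lock);
        auto it = s.items.find(key);
        if (it == s.items.end()) return false;
        out = it->second;
        return true;
    }

    bool erase(const std::string& key) {
        Shard& s = shardFor(key);
        std::lock_guard<std::mutex> guard(s.lock);
        return s.items.erase(key) > 0;
    }

private:
    struct Shard {
        std::mutex lock;
        std::map<std::string, int> items;
    };

    Shard& shardFor(const std::string& key) {
        // Empty keys go to shard 0; otherwise the last byte selects the shard.
        const std::size_t index =
            key.empty() ? 0 : static_cast<unsigned char>(key.back());
        return shards_[index];
    }

    std::array<Shard, 256> shards_;
};
```

Two threads only contend when their keys end in the same byte. Whether that spreads the load well depends on how the real keys are distributed.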
Here is an example. I just want to protect iData so that only one thread accesses it at a time.
struct myData;
myData iData;
Method 1, mutex inside the call function (multiple mutexes could be created):
void _proceedTest(myData &data)
{
    std::mutex mtx;
    std::unique_lock<std::mutex> lk(mtx);
    modifyData(data);
    lk.unlock();
}
int const nMaxThreads = std::thread::hardware_concurrency();
std::vector<std::thread> threads;
for (int iThread = 0; iThread < nMaxThreads; ++iThread)
{
    // std::ref is needed to pass iData by reference to the thread function
    threads.push_back(std::thread(_proceedTest, std::ref(iData)));
}
for (auto& th : threads) th.join();
Method2, use only one mutex:
void _proceedTest(myData &data, std::mutex &mtx)
{
    std::unique_lock<std::mutex> lk(mtx);
    modifyData(data);
    lk.unlock();
}
std::mutex mtx;
int const nMaxThreads = std::thread::hardware_concurrency();
std::vector<std::thread> threads;
for (int iThread = 0; iThread < nMaxThreads; ++iThread)
{
    // std::ref is needed to pass iData and mtx by reference to the thread function
    threads.push_back(std::thread(_proceedTest, std::ref(iData), std::ref(mtx)));
}
for (auto& th : threads) th.join();
I want to make sure that Method 1 (multiple mutexes) ensures that only one thread can access iData at a time.
If Method 1 is correct, is it better than Method 2?
Thanks!
I want to make sure that Method 1 (multiple mutexes) ensures that only one thread can access iData at a time.
Your first example creates a local mutex variable on the stack, so every call gets its own mutex and it is never shared with the other threads. Thus it's completely useless.
It won't guarantee exclusive access to iData.
If Method 1 is correct, is it better than Method 2?
It isn't correct.
The other answers are correct on the technical level, but there is an important language-independent point missing: you should always prefer to minimize the number of different mutexes/locks/...!
The reason: as soon as a thread needs to acquire more than one thing in order to do something (and then release all acquired locks), order becomes crucial.
When you have two locks and two different pieces of code, like:
getLockA()
getLockB()
do something
release B
release A
And
getLockB()
getLockA()
do something
release A
release B
you can quickly run into deadlocks, because two threads/processes can acquire one lock each, and then both are stuck waiting for the other one to release its lock. Of course, looking at the above example, "you would never make that mistake, and always go A first, then B". But what if those locks exist in completely different parts of your application? What if they aren't acquired in the same method or class, but over the course of, say, 3 or 5 nested method invocations?
Thus: when you can solve your problem with one lock, use one lock only! The more locks you need to get something done, the higher the risk of ending up in a deadlock.
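For cases where two locks really cannot be avoided, the standard library's std::lock acquires several lockables together using a deadlock-avoidance algorithm, so the order in which callers name them stops mattering. A minimal sketch, in which Account and transfer are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <mutex>

// Hypothetical data type: each account carries its own mutex.
struct Account {
    std::mutex lock;
    int balance = 0;
};

// Locks both accounts with std::lock, so transfer(a, b) and transfer(b, a)
// running concurrently cannot deadlock on lock ordering.
void transfer(Account& from, Account& to, int amount) {
    if (&from == &to) return;  // never lock the same std::mutex twice
    std::unique_lock<std::mutex> first(from.lock, std::defer_lock);
    std::unique_lock<std::mutex> second(to.lock, std::defer_lock);
    std::lock(first, second);  // acquires both, or blocks without holding one
    from.balance -= amount;
    to.balance += amount;
}  // both unique_locks release their mutex here
```

This only helps when both locks are taken at the same point in the code. It does not fix the nested-invocation case described above, where the locks are acquired far apart.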
Method 1 only works if you make the mutex variable static.
void _proceedTest(myData &data)
{
    static std::mutex mtx;
    std::unique_lock<std::mutex> lk(mtx);
    modifyData(data);
    lk.unlock();
}
This will make mtx be shared by all threads that enter _proceedTest.
Since a static function scope variable is only visible to users of the function, it is not really a sufficient lock for the passed in data. This is because it is conceivable that multiple threads could be calling different functions that each want to manipulate data.
Thus, even though Method 1 is salvageable, Method 2 is still better, even though the cohesion between the lock and the data is weak.
The mutex in version 1 goes out of scope as soon as you leave _proceedTest. Locking a mutex like that makes no sense, because it is never accessible to any other thread.
In the second version, multiple threads can share the mutex (as long as it doesn't go out of scope, for example because it is a class member). This way one thread can lock it, and the other threads can see that it is locked and won't be able to lock it as well, hence the term mutual exclusion.
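One way to tighten the cohesion between the lock and the data is to keep both in the same class, so that every access path goes through the mutex. A rough sketch, where GuardedCounter and runWorkers are hypothetical names and a plain int stands in for myData:

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: the mutex lives next to the data it protects.
class GuardedCounter {
public:
    void increment() {
        std::lock_guard<std::mutex> guard(mtx_);
        ++value_;
    }
    int value() const {
        std::lock_guard<std::mutex> guard(mtx_);
        return value_;
    }

private:
    mutable std::mutex mtx_;  // mutable so const readers can lock it too
    int value_ = 0;
};

// Starts 'threads' workers that each increment one shared counter 'perThread' times.
int runWorkers(int threads, int perThread) {
    GuardedCounter counter;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, perThread] {
            for (int i = 0; i < perThread; ++i) counter.increment();
        });
    }
    for (auto& th : pool) th.join();
    return counter.value();
}
```

Because the mutex is private, no caller can touch the value without holding it. That is the guarantee neither the local nor the function-static mutex can give.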
I'm not sure I got the terminology right but here goes - I have this function that is used by multiple threads to write data (using pseudo code in comments to illustrate what I want)
//these are initiated in the constructor
int* data;
std::atomic<size_t> size;
void write(int value) {
//wait here while "read_lock"
//set "write_lock" to "write_lock" + 1
auto slot = size.fetch_add(1, std::memory_order_acquire);
data[slot] = value;
//set "write_lock" to "write_lock" - 1
}
The order of the writes is not important; all I need is for each write to go to a unique slot.
Every once in a while, though, I need one thread to read the data using this function:
int* read() {
//set "read_lock" to true
//wait here while "write_lock"
int* ret = data;
data = new int[capacity];
size = 0;
//set "read_lock" to false
return ret;
}
So it basically swaps out the buffer and returns the old one (I've removed the capacity logic to make the snippets shorter).
In theory this should lead to 2 operating scenarios:
1 - just a bunch of threads writing into the container
2 - when some thread executes the read function, all new writers will have to wait, the reader will wait until all existing writes are finished, it will then do the read logic and scenario 1 can continue.
The question part is that I don't know what kind of a barrier to use for the locks -
A spinlock would be wasteful since there are many containers like this and they all need cpu cycles
I don't know how to apply std::mutex since I only want the write function to be in a critical section if the read function is triggered. Wrapping the whole write function in a mutex would cause unnecessary slowdown for operating scenario 1.
So what would be the optimal solution here?
If you have C++14 capability then you can use a std::shared_timed_mutex to separate out readers and writers. In this scenario it seems you need to give your writer threads shared access (allowing other writer threads at the same time) and your reader threads unique access (kicking all other threads out).
So something like this may be what you need:
class MyClass
{
public:
using mutex_type = std::shared_timed_mutex;
using shared_lock = std::shared_lock<mutex_type>;
using unique_lock = std::unique_lock<mutex_type>;
private:
mutable mutex_type mtx;
public:
// All updater threads can operate at the same time
auto lock_for_updates() const
{
return shared_lock(mtx);
}
// Reader threads need to kick all the updater threads out
auto lock_for_reading() const
{
return unique_lock(mtx);
}
};
// many threads can call this
void do_writing_work(std::shared_ptr<MyClass> sptr)
{
auto lock = sptr->lock_for_updates();
// update the data here
}
// access the data from one thread only
void do_reading_work(std::shared_ptr<MyClass> sptr)
{
auto lock = sptr->lock_for_reading();
// read the data here
}
A shared_lock lets other threads hold a shared_lock at the same time but prevents a unique_lock from being acquired simultaneously. When a reader thread requests a unique_lock, it blocks until all shared_locks have been released, and only then gets exclusive control.
You can also do this with regular mutexes and condition variables rather than a shared mutex. Supposedly shared_mutex has higher overhead, so I'm not sure which will be faster. With Gallik's solution you'd presumably be paying to lock the shared mutex on every write call; I got the impression from your post that write gets called far more often than read, so this may be undesirable.
int* data; // initialized somewhere
std::atomic<size_t> size = 0;
std::atomic<bool> reading = false;
std::atomic<int> num_writers = 0;
std::mutex entering;
std::mutex leaving;
std::condition_variable cv;
void write(int x) {
++num_writers;
if (reading) {
--num_writers;
if (num_writers == 0)
{
std::lock_guard l(leaving);
cv.notify_one();
}
{ std::lock_guard l(entering); }
++num_writers;
}
auto slot = size.fetch_add(1, std::memory_order_acquire);
data[slot] = x;
--num_writers;
if (reading && num_writers == 0)
{
std::lock_guard l(leaving);
cv.notify_one();
}
}
int* read() {
int* other_data = new int[capacity];
{
std::unique_lock enter_lock(entering);
reading = true;
std::unique_lock leave_lock(leaving);
cv.wait(leave_lock, [] () { return num_writers == 0; });
std::swap(data, other_data); // needs <utility>
size = 0;
reading = false;
}
return other_data;
}
It's a bit complicated and took me some time to work out, but I think this should serve the purpose pretty well.
In the common case where only writing is happening, reading is always false. So you do the usual work and pay for two additional atomic operations (an increment and a decrement) and two untaken branches. The common path therefore does not need to lock any mutexes, unlike the solution involving a shared mutex, which is supposedly expensive: http://permalink.gmane.org/gmane.comp.lib.boost.devel/211180.
Now, suppose read is called. The expensive, slow heap allocation happens first; meanwhile writing continues uninterrupted. Next, the entering lock is acquired, which has no immediate effect. Now, reading is set to true. Immediately, any new calls to write enter the first branch and eventually hit the entering lock, which they are unable to acquire (as it is already taken), and those threads are then put to sleep.
Meanwhile, the read thread is now waiting on the condition that the number of writers is 0. If we're lucky, this could actually go through right away. If however there are threads in write in either of the two locations between incrementing and decrementing num_writers, then it will not. Each time a write thread decrements num_writers, it checks if it has reduced that number to zero, and when it does it will signal the condition variable. Because num_writers is atomic which prevents various reordering shenanigans, it is guaranteed that the last thread will see num_writers == 0; it could also be notified more than once but this is ok and cannot result in bad behavior.
Once that condition variable has been signalled, that shows that all writers are either trapped in the first branch or are done modifying the array. So the read thread can now safely swap the data, and then unlock everything, and then return what it needs to.
As mentioned before, in typical operation there are no locks, just increments and untaken branches. Even when a read does occur, the read thread will have one lock and one condition variable wait, whereas a typical write thread will have about one lock/unlock of a mutex and that's all (one, or a small number of write threads, will also perform a condition variable notification).
I've reached a point in my project that requires communication between threads on resources that very well may be written to, so synchronization is a must. However I don't really understand synchronization at anything other than the basic level.
Consider the last example in this link: http://www.bogotobogo.com/cplusplus/C11/7_C11_Thread_Sharing_Memory.php
#include <iostream>
#include <thread>
#include <list>
#include <algorithm>
#include <mutex>
using namespace std;
// a global variable
std::list<int>myList;
// a global instance of std::mutex to protect global variable
std::mutex myMutex;
void addToList(int max, int interval)
{
// the access to this function is mutually exclusive
std::lock_guard<std::mutex> guard(myMutex);
for (int i = 0; i < max; i++) {
if( (i % interval) == 0) myList.push_back(i);
}
}
void printList()
{
// the access to this function is mutually exclusive
std::lock_guard<std::mutex> guard(myMutex);
for (auto itr = myList.begin(), end_itr = myList.end(); itr != end_itr; ++itr ) {
cout << *itr << ",";
}
}
int main()
{
int max = 100;
std::thread t1(addToList, max, 1);
std::thread t2(addToList, max, 10);
std::thread t3(printList);
t1.join();
t2.join();
t3.join();
return 0;
}
The example demonstrates how three threads, two writers and one reader, access a common resource (the list).
Two global functions are used: one which is used by the two writer threads, and one being used by the reader thread. Both functions use a lock_guard to lock down the same resource, the list.
Now here is what I just can't wrap my head around: the reader uses a lock in a different scope than the two writer threads, yet still locks down the same resource. How can this work? My limited understanding of mutexes lends itself well to the writer function, where you have two threads using the exact same function. I can understand that: a check is made right as you are about to enter the protected area, and if someone else is already inside, you wait.
But what about when the scope is different? This would indicate that there is some mechanism more powerful than the process itself, some sort of runtime environment blocking execution of the "late" thread. But I thought there were no such things in C++. So I am at a loss.
What exactly goes on under the hood here?
Let’s have a look at the relevant line:
std::lock_guard<std::mutex> guard(myMutex);
Notice that the lock_guard references the global mutex myMutex. That is, the same mutex for all three threads. What lock_guard does is essentially this:
Upon construction, it locks myMutex and keeps a reference to it.
Upon destruction (i.e. when the guard's scope is left), it unlocks myMutex.
The mutex is always the same one, it has nothing to do with the scope. The point of lock_guard is just to make locking and unlocking the mutex easier for you. For example, if you manually lock/unlock, but your function throws an exception somewhere in the middle, it will never reach the unlock statement. So, doing it the manual way you have to make sure that the mutex is always unlocked. On the other hand, the lock_guard object gets destroyed automatically whenever the function is exited – regardless how it is exited.
myMutex is global, which is what is used to protect myList. guard(myMutex) simply engages the lock and the exit from the block causes its destruction, dis-engaging the lock. guard is just a convenient way to engage and dis-engage the lock.
With that out of the way, mutex does not protect any data. It just provides a way to protect data. It is the design pattern that protects data. So if I write my own function to modify the list as below, the mutex cannot protect it.
void addToListUnsafe(int max, int interval)
{
for (int i = 0; i < max; i++) {
if( (i % interval) == 0) myList.push_back(i);
}
}
The lock only works if all pieces of code that need to access the data engage the lock before accessing it and disengage it once they are done. This design pattern of engaging and disengaging the lock around every access is what protects the data (myList in your case).
Now you might wonder why to use a mutex at all and not, say, a bool. You can, but you will have to make sure that the bool variable exhibits certain characteristics, including but not limited to the following:
It must not be cached across multiple threads (volatile).
Reads and writes must be atomic operations.
The lock must handle situations where there are multiple execution pipelines (logical cores, etc.).
There are different synchronization mechanisms that provide "better locking" (across processes versus across threads, multiple processors versus a single processor, etc.) at the cost of slower performance, so you should always choose a locking mechanism that is just sufficient for your situation.
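To make the "bool as a lock" idea concrete, here is a minimal spin-lock sketch built on std::atomic_flag, which provides the atomic test-and-set that a plain bool lacks. SpinLock and countWithSpinLock are hypothetical names. A real program would normally just use std::mutex, since a spinning thread burns CPU while it waits.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Minimal spin lock: the flag plays the role of the "bool", but every access is atomic.
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous value;
        // keep retrying while another thread already holds the lock.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Increments one shared int from several threads, guarded by the spin lock.
int countWithSpinLock(int threads, int perThread) {
    SpinLock lock;
    int total = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &total, perThread] {
            for (int i = 0; i < perThread; ++i) {
                lock.lock();
                ++total;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return total;
}
```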
Just to add onto what others here have said...
There is an idea in C++ called Resource Acquisition Is Initialization (RAII) which is this idea of binding resources to the lifetime of objects:
Resource Acquisition Is Initialization or RAII, is a C++ programming technique which binds the life cycle of a resource that must be acquired before use (allocated heap memory, thread of execution, open socket, open file, locked mutex, disk space, database connection—anything that exists in limited supply) to the lifetime of an object.
C++ RAII Info
The use of a std::lock_guard<std::mutex> class follows the RAII idea.
Why is this useful?
Consider a case where you don't use a std::lock_guard:
std::mutex m; // global mutex
void oops() {
m.lock();
doSomething();
m.unlock();
}
In this case, a global mutex is used and is locked before the call to doSomething(). Once doSomething() is complete, the mutex is unlocked.
One problem here is what happens if there is an exception? Now you run the risk of never reaching the m.unlock() line which releases the mutex to other threads.
So you need to cover the case where you run into an exception:
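std::mutex m; // global mutex
void oops() {
    m.lock(); // lock outside the try block: if lock() itself throws, there is nothing to unlock
    try {
        doSomething();
    } catch (...) {
        m.unlock(); // now the exception path is covered
        throw;      // rethrow so the caller still sees the exception
    }
    m.unlock();
}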
This works but is ugly, verbose, and inconvenient.
Now let's write our own simple lock guard.
class lock_guard {
private:
    std::mutex& m;
public:
    explicit lock_guard(std::mutex& m_) : m(m_) { m.lock(); } // lock on construction
    ~lock_guard() { m.unlock(); } // unlock on destruction
};
When the lock_guard object is destroyed, it will ensure that the mutex is unlocked.
Now we can use this lock_guard to handle the case from before in a better/cleaner way:
std::mutex m; // global mutex
void ok() {
lock_guard lk(m); // our simple lock guard, protects against exception case
doSomething();
} // when scope is exited our lock guard object is destroyed and the mutex unlocked
This is the same idea behind std::lock_guard.
Again this approach is used with many different types of resources which you can read more about by following the link on RAII.
This is precisely what a lock does. When a thread takes the lock, regardless of where in the code it does so, it must wait its turn if another thread holds the lock. When a thread releases a lock, regardless of where in the code it does so, another thread may acquire that lock.
Locks protect data, not code. They do it by ensuring all code that accesses the protected data does so while it holds the lock, excluding other threads from any code that might access that same data.
I'm implementing a Signal/Slot framework, and got to the point that I want it to be thread-safe. I already had a lot of support from the Boost mailing-list, but since this is not really boost-related, I'll ask my pending question here.
When is a signal/slot implementation (or any framework that calls functions outside itself, specified in some way by the user) considered thread-safe? Should it be safe w.r.t. its own data, i.e. the data associated to its implementation details? Or should it also take into account the user's data, which might or might not be modified whatever functions are passed to the framework?
This is an example given on the mailing-list (Edit: this is an example use-case --i.e. user code--. My code is behind the calls to the Emitter object):
int * somePtr = nullptr;
Emitter<Event> em; // just an object that can emit the 'Event' signal
void mainThread()
{
em.connect<Event>(someFunction);
// now, somehow, 2 threads are created which, at some point
// execute the thread1() and thread2() functions below
}
void someFunction()
{
// can somePtr change after the check but before the set?
if (somePtr)
*somePtr = 17;
}
void cleanupPtr()
{
// this looks safe, but compilers and CPUs can reorder this code:
int *tmp = somePtr;
somePtr = nullptr;
delete tmp;
}
void thread1()
{
em.emit<Event>();
}
void thread2()
{
em.disconnect<Event>(someFunction);
// now safe to cleanup (?)
cleanupPtr();
}
In the above code, it might happen that Event is emitted, causing someFunction to be executed. If somePtr is non-null, but becomes null just after the if, but before the assignment, we're in trouble. From the point of view of thread2, this is not obvious because it is disconnecting someFunction before calling cleanupPtr.
I can see why this could potentially lead to trouble, but whose responsibility is this? Should my library protect the user from using it in every irresponsible but imaginable way?
I suspect there is no clearly good answer, but clarity will come from documenting the guarantees you wish to make about concurrent access to an Emitter object.
One level of guarantee, which to me is what is implied by a promise of thread safety, is that:
Concurrent operations on the object are guaranteed to leave the object in a consistent state (at least, from the point of view of the accessing threads.)
Non-commutative operations will be performed as if they were scheduled serially in some (unknown) order.
Then the question is, what does the emit method promise semantically: passing control to the connected routine, or evaluation of the function? If the former, then your work sounds like it is already done; if the latter, then the 'as-if ordered' requirement would mean that you need to enforce some level of synchronisation.
Users of the library can work with either, provided it is clear what is being promised.
Firstly the simplest possibility: If you don't claim your library to be thread-safe, you don't have to bother about this.
(But even) if you do:
In your example the user would have to take care of thread safety, since both functions could be dangerous even without your event system (IMHO, this is a pretty good way to determine who should take care of this kind of problem). A possible way for them to do this in C++11 could be:
#include <mutex>
// A mutex is used to control thread access to a shared resource
std::mutex _somePtr_mutex;
int* somePtr = nullptr;
void someFunction()
{
/*
Create a 'lock_guard' to manage your mutex.
Is the mutex '_somePtr_mutex' already locked?
Yes: Wait until it's unlocked.
No: Lock it and continue execution.
*/
std::lock_guard<std::mutex> lock(_somePtr_mutex);
if(somePtr)
*somePtr = 17;
// End of scope: 'lock' gets destroyed and hence unlocks '_somePtr_mutex'
}
void cleanupPtr()
{
/*
Create a 'lock_guard' to manage your mutex.
Is the mutex '_somePtr_mutex' already locked?
Yes: Wait until it's unlocked.
No: Lock it and continue execution.
*/
std::lock_guard<std::mutex> lock(_somePtr_mutex);
int *tmp = somePtr;
somePtr = nullptr;
delete tmp;
// End of scope: 'lock' gets destroyed and hence unlocks '_somePtr_mutex'
}
The last question is easy. If you say your library is threadsafe, it should be threadsafe. It makes no sense to say it is partly threadsafe, or that it is only threadsafe if you do not abuse it. In that case you have to explain what exactly is not threadsafe.
Now to your first question regarding someFunction:
The operation is not atomic, which means the thread can be interrupted between the if and the assignment. And that will happen, I know that :-) The other thread can erase the pointer at any time, even between two short and fast-looking statements.
Now to cleanupPtr:
I am not a compiler expert, but if you want to be sure that your assignment takes place at the point where you wrote it in the code, you could write the keyword volatile in front of the declaration of somePtr; the compiler will then not keep the value in a CPU register. Note, however, that in standard C++ volatile guarantees neither atomicity nor ordering between threads, so std::atomic is the portable choice.
If you have one reader thread and one writer thread, an atomic variable can (IMHO) be enough to sync them, as long as the values you use to exchange information between the threads are primitive types.
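In standard C++, the portable tool for handing a value from one writer thread to one reader thread is std::atomic. A minimal sketch of such a handoff, where the function name handoff and the variables inside it are hypothetical:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical one-shot handoff: the writer publishes a payload, the reader waits for it.
int handoff(int payload) {
    int data = 0;                    // plain data, published through the flag
    std::atomic<bool> ready(false);  // the only variable both threads race on
    int seen = -1;

    std::thread reader([&] {
        // Spin until the writer's release-store becomes visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = data;  // safe: the acquire-load pairs with the release-store below
    });
    std::thread writer([&] {
        data = payload;
        ready.store(true, std::memory_order_release);
    });

    writer.join();
    reader.join();
    return seen;
}
```

This covers a simple flag or value exchange only. The check-then-write in someFunction combined with the delete in cleanupPtr still needs a mutex, or a different ownership scheme.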
For other situations you can use a mutex or atomics. I will give you an example with a mutex. I use C++11 for that, but it works similarly with previous versions of C++ using Boost.
Using mutex:
int * somePtr = nullptr;
Emitter<Event> em; // just an object that can emit the 'Event' signal
std::recursive_mutex g_mutex;
void mainThread()
{
em.connect<Event>(someFunction);
// now, somehow, 2 threads are created which, at some point
// execute the thread1() and thread2() functions below
}
void someFunction()
{
std::lock_guard<std::recursive_mutex> lock(g_mutex);
// can somePtr change after the check but before the set?
if (somePtr)
*somePtr = 17;
}
void cleanupPtr()
{
std::lock_guard<std::recursive_mutex> lock(g_mutex);
// this looks safe, but compilers and CPUs can reorder this code:
int *tmp = somePtr;
somePtr = nullptr;
delete tmp;
}
void thread1()
{
em.emit<Event>();
}
void thread2()
{
em.disconnect<Event>(someFunction);
// now safe to cleanup (?)
cleanupPtr();
}
I only added a recursive mutex here without changing any other code of the sample, even if it's now cargo code.
There are two kinds of mutex in std that matter here: std::mutex, which I find utterly useless, and std::recursive_mutex, which works the way you would expect a mutex to work. A std::mutex must not be locked again by the thread that already owns it (doing so is undefined behavior and usually ends in a deadlock), which can happen if a method that needs mutex protection calls a public method that uses the same mutex. std::recursive_mutex is reentrant for the same thread.
Atomics (or the Interlocked functions in Win32) are another way, but only to exchange values between threads or access them concurrently. Your example is missing such values, but in your case I would look a little deeper into them (std::atomic).
UPDATE
If you are the user of a library that is not explicitly declared threadsafe by its developer, treat it as not threadsafe and shield every call to it with a mutex lock.
To stick with the example: if you cannot change someFunction, then you have to wrap the function like this:
void threadsafeSomeFunction()
{
std::lock_guard<std::recursive_mutex> lock(g_mutex);
someFunction();
}
Is it possible to use a mutex to lock only one element of a data structure?
e.g.
boost::mutex m_mutex;
map<int, int> myMap; // int keys, to match the map[1] / map[2] accesses below
// initialize myMap so that it has 10 elements
// then in thread 1
{
    boost::unique_lock<boost::mutex> lock(m_mutex);
    myMap[1] = 5; // write map[1]
}
// in thread 2
{
    boost::unique_lock<boost::mutex> lock(m_mutex);
    myMap[2] = 4; // write map[2]
}
My question:
When thread 1 is writing map[1], can thread 2 write map[2] at the same time?
Does the thread lock the whole map data structure or only one element, e.g. map[1] or map[2]?
Thanks.
If you can guarantee that nobody is modifying the container itself (via insert and erase etc.), then as long as each thread accesses a different element of the container, you should be fine.
If you need per-element locking, you could modify the element type to something that offers synchronized access. (Worst case a pair of a mutex and the original value.)
You need a different mutex for every element of the map. You can do this with a map of mutexes or by adding a mutex to the mapped type (in your case it is int, so you can't do that without creating a new class such as SharedInt).
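A rough sketch of the "mutex in the mapped type" variant. SharedInt follows the name suggested above; ElementMap, setElement and getElement are hypothetical helpers. The sketch assumes that all elements are created up front and that no thread inserts or erases while others are working, because the map structure itself is not locked.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Element type that pairs the value with its own mutex.
struct SharedInt {
    std::mutex lock;
    int value = 0;
};

using ElementMap = std::map<std::string, SharedInt>;

// Locks only the element for 'key'. at() throws std::out_of_range if the key is missing.
void setElement(ElementMap& m, const std::string& key, int v) {
    SharedInt& element = m.at(key);
    std::lock_guard<std::mutex> guard(element.lock);
    element.value = v;
}

int getElement(ElementMap& m, const std::string& key) {
    SharedInt& element = m.at(key);
    std::lock_guard<std::mutex> guard(element.lock);
    return element.value;
}
```

With this layout, a thread writing one element does not block a thread writing another. Any insertion or removal would still need a lock around the whole map.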
Mutexes lock executable regions, not objects. I always think about locking any code regions that read or modify objects shared between threads. If an object is locked within one region but is also accessible from another, unsynchronized code region, you are not safe (of course). In your case, I'd lock access to the entire object, as insertions into and reads from containers can easily be interrupted by context switches, which increases the likelihood of data corruption.
A mutex is all about discipline. One thread can call Write and another thread can call Write1. The C++ runtime will assume this is intentional, but in most cases it is not what the programmer intended. In summary, as long as all threads and methods follow the discipline (understand the critical section and respect it), there will be consistency.
int i = 0;
void Write()
{
    // Lock
    i++;
    // Unlock
}
void Write1()
{
    i++; // no lock taken: this races with Write()
}