I was trying to understand how spinlock mutex works,
so I wrote a simple code (shown below) which measures interleaving of instructions from
different threads under protection of spinlock (or std::) mutex.
Surprisingly, it shows (in gcc at least) that std::mutex (in contrast to spinlock mutex)
seems to favor the thread that owns it, leading to very small instruction interleaving (at best 5%),
unless the instruction in question is very fast (like incrementing a counter).
In that case we can get even 50%. Spinlock mutex gives at least 80% (and typically more than 90%).
Is this a well known fact? Or maybe my code below has a bug?
I mean, I know the rule of thumb saying that mutex should be always locked for smallest amount of time.
But I was convinced that this is so, because we want to reduce serialization of threads,
and not because std::mutex favors the owning thread...
Here is the code:
#include<atomic>
#include<thread>
#include<iostream>
#include<chrono>
#include<mutex>
class SpinLockMutex{
std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
public:
void lock() { while( m_flag.test_and_set(std::memory_order_acquire) ) /*do nothing*/ ; }
void unlock() { m_flag.clear(std::memory_order_release) ; }
};//class SpinLockMutex
// ******************************************
// // // std::mutex vs SpinLockMutex
//SpinLockMutex globalMutex;
std::mutex globalMutex;
// ******************************************
// This class helps to start threads at the same time :
class Starter{
mutable std::mutex m_m;
bool m_ready = false;
public:
bool isReady() const { std::lock_guard<std::mutex> guard(m_m);
return m_ready;
}
void start() { std::this_thread::sleep_for(std::chrono::seconds(3));
std::lock_guard<std::mutex> guard(m_m);
m_ready = true;
}
};//class Starter
constexpr std::size_t LOOP_SIZE = 100;
std::size_t previous_thread_repeated = 0;
Starter starter;
void mainFcnForThread ()
{
static std::thread::id previous_thread_id = std::this_thread::get_id();
while(!starter.isReady())
; //do nothing
for(std::size_t i = 0; i!=LOOP_SIZE ; ++i){
globalMutex.lock();
if(previous_thread_id == std::this_thread::get_id() ) {
++previous_thread_repeated;
std::this_thread::sleep_for(std::chrono::microseconds(100));
}
previous_thread_id = std::this_thread::get_id();
globalMutex.unlock();
}
}//void mainFcnForThread
int main()
{
std::thread t1(mainFcnForThread);
std::thread t2(mainFcnForThread);
starter.start();
t1.join();
t2.join();
std::cout << double(previous_thread_repeated)/(2*LOOP_SIZE) << '\n';
return 0;
}
Mutex makes zero guarantees about fairness.
Unlocking a mutex does not suspend your current thread. Attempting to lock a mutex does not say "wait, someone else has been waiting longer, they should get a go at it".
Blocking on a mutex can sometimes put your thread to sleep.
After you unlock a mutex, you aren't "the owning thread". You are probably a running thread. And mutex can (and apparently does) favor running threads over threads that are suspended.
Implementing "fairness" can be done on top of C++ synchronization primitives, but it isn't free, and C++ aims not to make you pay for anything you don't ask for.
Related
I have two threads that work the producer and consumer sides of a std::queue. The queue isn't often full, so I'd like to avoid the consumer grabbing the mutex that is guarding mutating the queue.
Is it okay to call empty() outside the mutex then only grab the mutex if there is something in the queue?
For example:
struct MyData{
int a;
int b;
};
class SpeedyAccess{
public:
void AddDataFromThread1(MyData data){
const std::lock_guard<std::mutex> queueMutexLock(queueAccess);
workQueue.push(data);
}
void CheckFromThread2(){
if(!workQueue.empty()) // Un-protected access...is this dangerous?
{
queueAccess.lock();
MyData data = workQueue.front();
workQueue.pop();
queueAccess.unlock();
ExpensiveComputation(data);
}
}
private:
void ExpensiveComputation(MyData& data);
std::queue<MyData> workQueue;
std::mutex queueAccess;
}
Thread 2 does the check and isn't particularly time-critical, but will get called a lot (500/sec?). Thread 1 is very time critical, a lot of stuff needs to run there, but isn't called as frequently (max 20/sec).
If I add a mutex guard around empty(), if the queue is empty when thread 2 comes, it won't hold the mutex for long, so might not be a big hit. However, since it gets called so frequently, it might occasionally happen at the same time something is trying to get put on the back....will this cause a substantial amount of waiting in thread 1?
As written in the comments above, you should call empty() only under a lock.
But I believe there is a better way to do it.
You can use a std::condition_variable together with a std::mutex, to achieve synchronization of access to the queue, without locking the mutex more than you must.
However - when using std::condition_variable, you must be aware that it suffers from spurious wakeups. You can read about it here: Spurious wakeup - Wikipedia.
You can see some code examples here:
Condition variable examples.
The correct way to use a std::condition_variable is demonstrated below (with some comments).
This is just a minimal example to show the principle.
#include <thread>
#include <mutex>
#include <condition_variable>
#include <queue>
#include <iostream>
using MyData = int;
std::mutex mtx;
std::condition_variable cond_var;
std::queue<MyData> q;
void producer()
{
MyData produced_val = 0;
while (true)
{
std::this_thread::sleep_for(std::chrono::milliseconds(1000)); // simulate some pause between productions
++produced_val;
std::cout << "produced: " << produced_val << std::endl;
{
// Access the Q under the lock:
std::unique_lock<std::mutex> lck(mtx);
q.push(produced_val);
cond_var.notify_all(); // It's not a must to nofity under the lock but it might be more efficient (see #DavidSchwartz's comment below).
}
}
}
void consumer()
{
while (true)
{
MyData consumed_val;
{
// Access the Q under the lock:
std::unique_lock<std::mutex> lck(mtx);
// NOTE: The following call will lock the mutex only when the the condition_varible will cause wakeup
// (due to `notify` or spurious wakeup).
// Then it will check if the Q is empty.
// If empty it will release the lock and continue to wait.
// If not empty, the lock will be kept until out of scope.
// See the documentation for std::condition_variable.
cond_var.wait(lck, []() { return !q.empty(); }); // will loop internally to handle spurious wakeups
consumed_val = q.front();
q.pop();
}
std::cout << "consumed: " << consumed_val << std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(200)); // simulate some calculation
}
}
int main()
{
std::thread p(producer);
std::thread c(consumer);
while(true) {}
p.join(); c.join(); // will never happen in our case but to remind us what is needed.
return 0;
}
Some notes:
In your real code, none of the threads should run forever. You should have some mechanism to notify them to gracefully exit.
The global variables (mtx,q etc.) are better to be members of some context class, or passed to the producer() and consumer() as parameters.
This example assumes for simplicity that the producer's production rate is always low relatively to the consumer's rate. In your real code you can make it more general, by making the consumer extract all elements in the Q each time the condition_variable is signaled.
You can "play" with the sleep_for times for the producer and consumer to test varios timing cases.
I've implemented a "Ticket" class which is shared as a shared_ptr between multiple threads.
The program flow is like this:
parallelQuery() is called to start a new query job. A shared instance of Ticket is created.
The query is split into multiple tasks, each task is enqueued on a worker thread (this part is important, otherwise I'd just join threads and done). Each task gets the shared ticket.
ticket.wait() is called to wait for all tasks of the job to complete.
When one task is done it calls the done() method on the ticket.
When all tasks are done the ticket is unlocked, result data from the task aggregated and returned from parallelQuery()
In pseudo code:
std::vector<T> parallelQuery(std::string str) {
auto ticket = std::make_shared<Ticket>(2);
auto task1 = std::make_unique<Query>(ticket, str+"a");
addTaskToWorker(task1);
auto task2 = std::make_unique<Query>(ticket, str+"b");
addTaskToWorker(task2);
ticket->waitUntilDone();
auto result = aggregateData(task1, task2);
return result;
}
My code works. But I wonder if it is theoretically possible that it can lead to a deadlock in case when unlocking the mutex is executed right before it gets locked again by the waiter thread calling waitUntilDone().
Is this a possibility, and how to avoid this trap?
Here is the complete Ticket class, note the execution order example comments related to the problem description above:
#include <mutex>
#include <atomic>
class Ticket {
public:
Ticket(int numTasks = 1) : _numTasks(numTasks), _done(0), _canceled(false) {
_mutex.lock();
}
void waitUntilDone() {
_doneLock.lock();
if (_done != _numTasks) {
_doneLock.unlock(); // Execution order 1: "waiter" thread is here
_mutex.lock(); // Execution order 3: "waiter" thread is now in a dealock?
}
else {
_doneLock.unlock();
}
}
void done() {
_doneLock.lock();
_done++;
if (_done == _numTasks) {
_mutex.unlock(); // Execution order 2: "task1" thread unlocks the mutex
}
_doneLock.unlock();
}
void cancel() {
_canceled = true;
_mutex.unlock();
}
bool wasCanceled() {
return _canceled;
}
bool isDone() {
return _done >= _numTasks;
}
int getNumTasks() {
return _numTasks;
}
private:
std::atomic<int> _numTasks;
std::atomic<int> _done;
std::atomic<bool> _canceled;
// mutex used for caller wait state
std::mutex _mutex;
// mutex used to safeguard done counter with lock condition in waitUntilDone
std::mutex _doneLock;
};
One possible solution which just came to my mind when editing the question is that I can put _done++; before the _doneLock(). Eventually, this should be enough?
Update
I've updated the Ticket class based on the suggestions provided by Tomer and Phil1970. Does the following implementation avoid mentioned pitfalls?
class Ticket {
public:
Ticket(int numTasks = 1) : _numTasks(numTasks), _done(0), _canceled(false) { }
void waitUntilDone() {
std::unique_lock<std::mutex> lock(_mutex);
// loop to avoid spurious wakeups
while (_done != _numTasks && !_canceled) {
_condVar.wait(lock);
}
}
void done() {
std::unique_lock<std::mutex> lock(_mutex);
// just bail out in case we call done more often than needed
if (_done == _numTasks) {
return;
}
_done++;
_condVar.notify_one();
}
void cancel() {
std::unique_lock<std::mutex> lock(_mutex);
_canceled = true;
_condVar.notify_one();
}
const bool wasCanceled() const {
return _canceled;
}
const bool isDone() const {
return _done >= _numTasks;
}
const int getNumTasks() const {
return _numTasks;
}
private:
std::atomic<int> _numTasks;
std::atomic<int> _done;
std::atomic<bool> _canceled;
std::mutex _mutex;
std::condition_variable _condVar;
};
Don't write your own wait methods but use std::condition_variable instead.
https://en.cppreference.com/w/cpp/thread/condition_variable.
Mutexes usage
Generally, a mutex should protect a given region of code. That is, it should lock, do its work and unlock. In your class, you have multiple method where some lock _mutex while other unlock it. This is very error-prone as if you call the method in the wrong order, you might well be in an inconsistant state. What happen if a mutex is lock twice? or unlocked when already unlocked?
The other thing to be aware with mutex is that if you have multiple mutexes, it that you can easily have deadlock if you need to lock both mutexes but don't do it in consistant order. Suppose that thread A lock mutex 1 first and the mutex 2, and thread B lock them in the opposite order (mutex 2 first). There is a possibility that something like this occurs:
Thread A lock mutex 1
Thread B lock mutex 2
Thread A want to lock mutex 2 but cannot as it is already locked.
Thread B want to lock mutex 1 but cannot as it is already locked.
Both thread will wait forever
So in your code, you should at least have some checks to ensure proper usage. For example, you should verify _canceled before unlocking the mutex to ensure cancel is called only once.
Solution
I will just gave some ideas
Declare a mutux and a condition_variable to manage the done condition in your class.
std::mutex doneMutex;
std::condition_variable done_condition;
Then waitUntilDone would look like:
void waitUntilDone()
{
std::unique_lock<std::mutex> lk(doneMutex);
done_condition.wait(lk, []{ return isDone() || wasCancelled();});
}
And done function would look like:
void done()
{
std::lock_guard<std::mutex> lk(doneMutex);
_done++;
if (_done == _numTasks)
{
doneCondition.notify_one();
}
}
And cancel function would become
void done()
{
std::lock_guard<std::mutex> lk(doneMutex);
_cancelled = true;
doneCondition.notify_one();
}
As you can see, you only have one mutex now so you basically eliminate the possibility of a deadlock.
Variable naming
I suggest you to not use lock in the name of you mutex since it is confusing.
std::mutex someMutex;
std::guard_lock<std::mutex> someLock(someMutex); // std::unique_lock when needed
That way, it is far easier to know which variable refer to the mutex and which one to the lock of the mutex.
Good reading
If you are serious about multithreading, then you should buy that book:
C++ Concurrency in Action
Practical Multithreading
Anthony Williams
Code Review (added section)
Essentially same code has beed posted to CODE REVIEW: https://codereview.stackexchange.com/questions/225863/multithreading-ticket-class-to-wait-for-parallel-task-completion/225901#225901.
I have put an answer there that include some extra points.
You not need to use mutex for operate with atomic values
UPD
my answer to mainn question was wrong. I deleted one.
You can use simple (non atomic) int _numTasks; also. And you not need shared pointer - just create Task on the stack and pass pointer
Ticket ticket(2);
auto task1 = std::make_unique<Query>(&ticket, str+"a");
addTaskToWorker(task1);
or unique ptr if you like
auto ticket = std::make_unique<Ticket>(2);
auto task1 = std::make_unique<Query>(ticket.get(), str+"a");
addTaskToWorker(task1);
because shared pointer can be cut by Occam's razor :)
I am using C++11 and I have a std::thread which is a class member, and it sends information to listeners every 2 minutes. Other that that it just sleeps. So, I have made it sleep for 2 minutes, then send the required info, and then sleep for 2 minutes again.
// MyClass.hpp
class MyClass {
~MyClass();
RunMyThread();
private:
std::thread my_thread;
std::atomic<bool> m_running;
}
MyClass::RunMyThread() {
my_thread = std::thread { [this, m_running] {
m_running = true;
while(m_running) {
std::this_thread::sleep_for(std::chrono::minutes(2));
SendStatusInfo(some_info);
}
}};
}
// Destructor
~MyClass::MyClass() {
m_running = false; // this wont work as the thread is sleeping. How to exit thread here?
}
Issue:
The issue with this approach is that I cannot exit the thread while it is sleeping. I understand from reading that I can wake it using a std::condition_variable and exit gracefully? But I am struggling to find a simple example which does the bare minimum as required in above scenario. All the condition_variable examples I've found look too complex for what I am trying to do here.
Question:
How can I use a std::condition_variable to wake the thread and exit gracefully while it is sleeping? Or are there any other ways of achieving the same without the condition_variable technique?
Additionally, I see that I need to use a std::mutex in conjunction with std::condition_variable? Is that really necessary? Is it not possible to achieve the goal by adding the std::condition_variable logic only to required places in the code here?
Environment:
Linux and Unix with compilers gcc and clang.
How can I use an std::condition_variable to wake the thread and exit gracefully while it was sleeping? Or are there any other ways of achieving the same without condition_variable technique?
No, not in standard C++ as of C++17 (there are of course non-standard, platform-specific ways to do it, and it's likely some kind of semaphore will be added to C++2a).
Additionally, I see that I need to use a std::mutex in conjunction with std::condition_variable? Is that really necessary?
Yes.
Is it not possible to achieve the goal by adding the std::condition_variable logic only to required places in the code piece here?
No. For a start, you can't wait on a condition_variable without locking a mutex (and passing the lock object to the wait function) so you need to have a mutex present anyway. Since you have to have a mutex anyway, requiring both the waiter and the notifier to use that mutex isn't such a big deal.
Condition variables are subject to "spurious wake ups" which means they can stop waiting for no reason. In order to tell if it woke because it was notified, or woke spuriously, you need some state variable that is set by the notifying thread and read by the waiting thread. Because that variable is shared by multiple threads it needs to be accessed safely, which the mutex ensures.
Even if you use an atomic variable for the share variable, you still typically need a mutex to avoid missed notifications.
This is all explained in more detail in
https://github.com/isocpp/CppCoreGuidelines/issues/554
A working example for you using std::condition_variable:
struct MyClass {
MyClass()
: my_thread([this]() { this->thread(); })
{}
~MyClass() {
{
std::lock_guard<std::mutex> l(m_);
stop_ = true;
}
c_.notify_one();
my_thread.join();
}
void thread() {
while(this->wait_for(std::chrono::minutes(2)))
SendStatusInfo(some_info);
}
// Returns false if stop_ == true.
template<class Duration>
bool wait_for(Duration duration) {
std::unique_lock<std::mutex> l(m_);
return !c_.wait_for(l, duration, [this]() { return stop_; });
}
std::condition_variable c_;
std::mutex m_;
bool stop_ = false;
std::thread my_thread;
};
How can I use an std::condition_variable to wake the thread and exit gracefully while it was sleeping?
You use std::condition_variable::wait_for() instead of std::this_thread::sleep_for() and first one can be interrupted by std::condition_variable::notify_one() or std::condition_variable::notify_all()
Additionally, I see that I need to use a std::mutex in conjunction with std::condition_variable? Is that really necessary? Is it not possible to achieve the goal by adding the std::condition_variable logic only to required places in the code piece here?
Yes it is necessary to use std::mutex with std::condition_variable and you should use it instead of making your flag std::atomic as despite atomicity of flag itself you would have race condition in your code and you will notice that sometimes your sleeping thread would miss notification if you would not use mutex here.
There is a sad, but true fact - what you are looking for is a signal, and Posix threads do not have a true signalling mechanism.
Also, the only Posix threading primitive associated with any sort of timing is conditional variable, this is why your online search lead you to it, and since C++ threading model is heavily built on Posix API, in standard C++ Posix-compatible primitives is all you get.
Unless you are willing to go outside of Posix (you do not indicate platform, but there are native platform ways to work with events which are free from those limitations, notably eventfd in Linux) you will have to stick with condition variables and yes, working with condition variable requires a mutex, since it is built into API.
Your question doesn't specifically ask for code sample, so I am not providing any. Let me know if you'd like some.
Additionally, I see that I need to use a std::mutex in conjunction with std::condition_variable? Is that really necessary? Is it not possible to achieve the goal by adding the std::condition_variable logic only to required places in the code piece here?
std::condition_variable is a low level primitive. Actually using it requires fiddling with other low level primitives as well.
struct timed_waiter {
void interrupt() {
auto l = lock();
interrupted = true;
cv.notify_all();
}
// returns false if interrupted
template<class Rep, class Period>
bool wait_for( std::chrono::duration<Rep, Period> how_long ) const {
auto l = lock();
return !cv.wait_until( l,
std::chrono::steady_clock::now() + how_long,
[&]{
return !interrupted;
}
);
}
private:
std::unique_lock<std::mutex> lock() const {
return std::unique_lock<std::mutex>(m);
}
mutable std::mutex m;
mutable std::condition_variable cv;
bool interrupted = false;
};
simply create a timed_waiter somewhere both the thread(s) that wants to wait, and the code that wants to interrupt, can see it.
The waiting threads do
while(m_timer.wait_for(std::chrono::minutes(2))) {
SendStatusInfo(some_info);
}
to interrupt do m_timer.interrupt() (say in the dtor) then my_thread.join() to let it finish.
Live example:
struct MyClass {
~MyClass();
void RunMyThread();
private:
std::thread my_thread;
timed_waiter m_timer;
};
void MyClass::RunMyThread() {
my_thread = std::thread {
[this] {
while(m_timer.wait_for(std::chrono::seconds(2))) {
std::cout << "SendStatusInfo(some_info)\n";
}
}};
}
// Destructor
MyClass::~MyClass() {
std::cout << "~MyClass::MyClass\n";
m_timer.interrupt();
my_thread.join();
std::cout << "~MyClass::MyClass done\n";
}
int main() {
std::cout << "start of main\n";
{
MyClass x;
x.RunMyThread();
using namespace std::literals;
std::this_thread::sleep_for(11s);
}
std::cout << "end of main\n";
}
Or are there any other ways of achieving the same without the condition_variable technique?
You can use std::promise/std::future as a simpler alternative to a bool/condition_variable/mutex in this case. A future is not susceptible to spurious wakes and doesn't require a mutex for synchronisation.
Basic example:
std::promise<void> pr;
std::thread thr{[fut = pr.get_future()]{
while(true)
{
if(fut.wait_for(std::chrono::minutes(2)) != std::future_status::timeout)
return;
}
}};
//When ready to stop
pr.set_value();
thr.join();
Or are there any other ways of achieving the same without condition_variable technique?
One alternative to a condition variable is you can wake your thread up at much more regular intervals to check the "running" flag and go back to sleep if it is not set and the allotted time has not yet expired:
void periodically_call(std::atomic_bool& running, std::chrono::milliseconds wait_time)
{
auto wake_up = std::chrono::steady_clock::now();
while(running)
{
wake_up += wait_time; // next signal send time
while(std::chrono::steady_clock::now() < wake_up)
{
if(!running)
break;
// sleep for just 1/10 sec (maximum)
auto pre_wake_up = std::chrono::steady_clock::now() + std::chrono::milliseconds(100);
pre_wake_up = std::min(wake_up, pre_wake_up); // don't overshoot
// keep going to sleep here until full time
// has expired
std::this_thread::sleep_until(pre_wake_up);
}
SendStatusInfo(some_info); // do the regular call
}
}
Note: You can make the actual wait time anything you want. In this example I made it 100ms std::chrono::milliseconds(100). It depends how responsive you want your thread to be to a signal to stop.
For example in one application I made that one whole second because I was happy for my application to wait a full second for all the threads to stop before it closed down on exit.
How responsive you need it to be is up to your application. The shorter the wake up times the more CPU it consumes. However even very short intervals of a few milliseconds will probably not register much in terms of CPU time.
You could also use promise/future so that you don't need to bother with conditionnal and/or threads:
#include <future>
#include <iostream>
struct MyClass {
~MyClass() {
_stop.set_value();
}
MyClass() {
auto future = std::shared_future<void>(_stop.get_future());
_thread_handle = std::async(std::launch::async, [future] () {
std::future_status status;
do {
status = future.wait_for(std::chrono::seconds(2));
if (status == std::future_status::timeout) {
std::cout << "do periodic things\n";
} else if (status == std::future_status::ready) {
std::cout << "exiting\n";
}
} while (status != std::future_status::ready);
});
}
private:
std::promise<void> _stop;
std::future<void> _thread_handle;
};
// Destructor
int main() {
MyClass c;
std::this_thread::sleep_for(std::chrono::seconds(9));
}
Using MS Visual C++2012
A class has a member of type std::atomic_flag
class A {
public:
...
std::atomic_flag lockFlag;
A () { std::atomic_flag_clear (&lockFlag); }
};
There is an object of type A
A object;
who can be accessed by two (Boost) threads
void thr1(A* objPtr) { ... }
void thr2(A* objPtr) { ... }
The idea is wait the thread if the object is being accessed by the other thread.
The question is: do it is possible construct such mechanism with an atomic_flag object? Not to say that for the moment, I want some lightweight that a boost::mutex.
By the way the process involved in one of the threads is very long query to a dBase who get many rows, and I only need suspend it in a certain zone of code where the collision occurs (when processing each row) and I can't wait the entire thread to finish join().
I've tryed in each thread some as:
thr1 (A* objPtr) {
...
while (std::atomic_flag_test_and_set_explicit (&objPtr->lockFlag, std::memory_order_acquire)) {
boost::this_thread::sleep(boost::posix_time::millisec(100));
}
... /* Zone to portect */
std::atomic_flag_clear_explicit (&objPtr->lockFlag, std::memory_order_release);
... /* the process continues */
}
But with no success, because the second thread hangs. In fact, I don't completely understand the mechanism involved in the atomic_flag_test_and_set_explicit function. Neither if such function returns inmediately or can delay until the flag can be locked.
Also it is a mistery to me how to get a lock mechanism with such a function who always set the value, and return the previous value. with no option to only read the actual setting.
Any suggestion are welcome.
By the way the process involved in one of the threads is very long query to a dBase who get many rows, and I only need suspend it in a certain zone of code where the collision occurs (when processing each row) and I can't wait the entire thread to finish join().
Such a zone is known as the critical section. The simplest way to work with a critical section is to lock by mutual exclusion.
The mutex solution suggested is indeed the way to go, unless you can prove that this is a hotspot and the lock contention is a performance problem. Lock-free programming using just atomic and intrinsics is enormously complex and cannot be recommended at this level.
Here's a simple example showing how you could do this (live on http://liveworkspace.org/code/6af945eda5132a5221db823fa6bde49a):
#include <iostream>
#include <thread>
#include <mutex>
struct A
{
std::mutex mux;
int x;
A() : x(0) {}
};
void threadf(A* data)
{
for(int i=0; i<10; ++i)
{
std::lock_guard<std::mutex> lock(data->mux);
data->x++;
}
}
int main(int argc, const char *argv[])
{
A instance;
auto t1 = std::thread(threadf, &instance);
auto t2 = std::thread(threadf, &instance);
t1.join();
t2.join();
std::cout << instance.x << std::endl;
return 0;
}
It looks like you're trying to write a spinlock. Yes, you can do that with std::atomic_flag, but you are better off using std::mutex instead. Don't use atomics unless you really know what you're doing.
To actually answer the question asked: Yes, you can use std::atomic_flag to create a thread locking object called a spinlock.
#include <atomic>
class atomic_lock
{
public:
atomic_lock()
: lock_( ATOMIC_FLAG_INIT )
{}
void lock()
{
while ( lock_.test_and_set() ) { } // Spin until the lock is acquired.
}
void unlock()
{
lock_.clear();
}
private:
std::atomic_flag lock_;
};
I have a set of data structures I need to protect with a readers/writer lock. I am aware of boost::shared_lock, but I would like to have a custom implementation using std::mutex, std::condition_variable and/or std::atomic so that I can better understand how it works (and tweak it later).
Each data structure (moveable, but not copyable) will inherit from a class called Commons which encapsulates the locking. I'd like the public interface to look something like this:
class Commons {
public:
void read_lock();
bool try_read_lock();
void read_unlock();
void write_lock();
bool try_write_lock();
void write_unlock();
};
...so that it can be publicly inherited by some:
class DataStructure : public Commons {};
I'm writing scientific code and can generally avoid data races; this lock is mostly a safeguard against the mistakes I'll probably make later. Thus my priority is low read overhead so I don't hamper a correctly-running program too much. Each thread will probably run on its own CPU core.
Could you please show me (pseudocode is ok) a readers/writer lock? What I have now is supposed to be the variant that prevents writer starvation. My main problem so far has been the gap in read_lock between checking if a read is safe to actually incrementing a reader count, after which write_lock knows to wait.
void Commons::write_lock() {
write_mutex.lock();
reading_mode.store(false);
while(readers.load() > 0) {}
}
void Commons::try_read_lock() {
if(reading_mode.load()) {
//if another thread calls write_lock here, bad things can happen
++readers;
return true;
} else return false;
}
I'm kind of new to multithreading, and I'd really like to understand it. Thanks in advance for your help!
Here's pseudo-code for a ver simply reader/writer lock using a mutex and a condition variable. The mutex API should be self-explanatory. Condition variables are assumed to have a member wait(Mutex&) which (atomically!) drops the mutex and waits for the condition to be signaled. The condition is signaled with either signal() which wakes up one waiter, or signal_all() which wakes up all waiters.
read_lock() {
mutex.lock();
while (writer)
unlocked.wait(mutex);
readers++;
mutex.unlock();
}
read_unlock() {
mutex.lock();
readers--;
if (readers == 0)
unlocked.signal_all();
mutex.unlock();
}
write_lock() {
mutex.lock();
while (writer || (readers > 0))
unlocked.wait(mutex);
writer = true;
mutex.unlock();
}
write_unlock() {
mutex.lock();
writer = false;
unlocked.signal_all();
mutex.unlock();
}
That implementation has quite a few drawbacks, though.
Wakes up all waiters whenever the lock becomes available
If most of the waiters are waiting for a write lock, this is wastefull - most waiters will fail to acquire the lock, after all, and resume waiting. Simply using signal() doesn't work, because you do want to wake up everyone waiting for a read lock unlocking. So to fix that, you need separate condition variables for readability and writability.
No fairness. Readers starve writers
You can fix that by tracking the number of pending read and write locks, and either stop acquiring read locks once there a pending write locks (though you'll then starve readers!), or randomly waking up either all readers or one writer (assuming you use separate condition variable, see section above).
Locks aren't dealt out in the order they are requested
To guarantee this, you'll need a real wait queue. You could e.g. create one condition variable for each waiter, and signal all readers or a single writer, both at the head of the queue, after releasing the lock.
Even pure read workloads cause contention due to the mutex
This one is hard to fix. One way is to use atomic instructions to acquire read or write locks (usually compare-and-exchange). If the acquisition fails because the lock is taken, you'll have to fall back to the mutex. Doing that correctly is quite hard, though. Plus, there'll still be contention - atomic instructions are far from free, especially on machines with lots of cores.
Conclusion
Implementing synchronization primitives correctly is hard. Implementing efficient and fair synchronization primitives is even harder. And it hardly ever pays off. pthreads on linux, e.g. contains a reader/writer lock which uses a combination of futexes and atomic instructions, and which thus probably outperforms anything you can come up with in a few days of work.
Check this class:
//
// Multi-reader Single-writer concurrency base class for Win32
//
// (c) 1999-2003 by Glenn Slayden (glenn#glennslayden.com)
//
//
#include "windows.h"
class MultiReaderSingleWriter
{
private:
CRITICAL_SECTION m_csWrite;
CRITICAL_SECTION m_csReaderCount;
long m_cReaders;
HANDLE m_hevReadersCleared;
public:
MultiReaderSingleWriter()
{
m_cReaders = 0;
InitializeCriticalSection(&m_csWrite);
InitializeCriticalSection(&m_csReaderCount);
m_hevReadersCleared = CreateEvent(NULL,TRUE,TRUE,NULL);
}
~MultiReaderSingleWriter()
{
WaitForSingleObject(m_hevReadersCleared,INFINITE);
CloseHandle(m_hevReadersCleared);
DeleteCriticalSection(&m_csWrite);
DeleteCriticalSection(&m_csReaderCount);
}
void EnterReader(void)
{
EnterCriticalSection(&m_csWrite);
EnterCriticalSection(&m_csReaderCount);
if (++m_cReaders == 1)
ResetEvent(m_hevReadersCleared);
LeaveCriticalSection(&m_csReaderCount);
LeaveCriticalSection(&m_csWrite);
}
void LeaveReader(void)
{
EnterCriticalSection(&m_csReaderCount);
if (--m_cReaders == 0)
SetEvent(m_hevReadersCleared);
LeaveCriticalSection(&m_csReaderCount);
}
void EnterWriter(void)
{
EnterCriticalSection(&m_csWrite);
WaitForSingleObject(m_hevReadersCleared,INFINITE);
}
void LeaveWriter(void)
{
LeaveCriticalSection(&m_csWrite);
}
};
I didn't have a chance to try it, but the code looks OK.
You can implement a Readers-Writers lock following the exact Wikipedia algorithm from here (I wrote it):
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
int g_sharedData = 0;
int g_readersWaiting = 0;
std::mutex mu;
bool g_writerWaiting = false;
std::condition_variable cond;
void reader(int i)
{
std::unique_lock<std::mutex> lg{mu};
while(g_writerWaiting)
cond.wait(lg);
++g_readersWaiting;
// reading
std::cout << "\n reader #" << i << " is reading data = " << g_sharedData << '\n';
// end reading
--g_readersWaiting;
while(g_readersWaiting > 0)
cond.wait(lg);
cond.notify_one();
}
void writer(int i)
{
std::unique_lock<std::mutex> lg{mu};
while(g_writerWaiting)
cond.wait(lg);
// writing
std::cout << "\n writer #" << i << " is writing\n";
g_sharedData += i * 10;
// end writing
g_writerWaiting = true;
while(g_readersWaiting > 0)
cond.wait(lg);
g_writerWaiting = false;
cond.notify_all();
}//lg.unlock()
int main()
{
std::thread reader1{reader, 1};
std::thread reader2{reader, 2};
std::thread reader3{reader, 3};
std::thread reader4{reader, 4};
std::thread writer1{writer, 1};
std::thread writer2{writer, 2};
std::thread writer3{writer, 3};
std::thread writer4{reader, 4};
reader1.join();
reader2.join();
reader3.join();
reader4.join();
writer1.join();
writer2.join();
writer3.join();
writer4.join();
return(0);
}
I believe this is what you are looking for:
class Commons {
std::mutex write_m_;
std::atomic<unsigned int> readers_;
public:
Commons() : readers_(0) {
}
void read_lock() {
write_m_.lock();
++readers_;
write_m_.unlock();
}
bool try_read_lock() {
if (write_m_.try_lock()) {
++readers_;
write_m_.unlock();
return true;
}
return false;
}
// Note: unlock without holding a lock is Undefined Behavior!
void read_unlock() {
--readers_;
}
// Note: This implementation uses a busy wait to make other functions more efficient.
// Consider using try_write_lock instead! and note that the number of readers can be accessed using readers()
void write_lock() {
while (readers_) {}
if (!write_m_.try_lock())
write_lock();
}
bool try_write_lock() {
if (!readers_)
return write_m_.try_lock();
return false;
}
// Note: unlock without holding a lock is Undefined Behavior!
void write_unlock() {
write_m_.unlock();
}
int readers() {
return readers_;
}
};
For the record since C++17 we have std::shared_mutex, see: https://en.cppreference.com/w/cpp/thread/shared_mutex