Essentially, my question is:
What does a "good" implementation of a spinlock look like in C++ that works on the "usual" CPU/OS/compiler combinations (x86 and ARM, Windows and Linux, MSVC, clang and g++, maybe also icc)?
Explanation:
As I wrote in the answer to a different question, it is fairly easy to write a working spinlock in c++11. However, as pointed out (in the comments as well as in e.g. spinlock-vs-stdmutextry-lock), such an implementation comes with some performance problems in case of congestion, which imho can only be solved by using platform specific instructions (intrinsics / os primitives / assembly?).
I'm not looking for a super optimized version (I expect that would only make sense if you have very precise knowledge about the exact platform and workload and need every last bit of efficiency) but something that lives around the mythical 20/80 tradeoff point i.e. I want to avoid the most important pitfalls in most cases while still keeping the solution as simple and understandable as possible.
In general, I'd expect the result to look something like this:
#include <atomic>
#ifdef _MSC_VER
#include <Windows.h>
#define YIELD_CPU YieldProcessor()
#elif defined(...)
#define YIELD_CPU ...
...
#endif
class SpinLock {
std::atomic_flag locked = ATOMIC_FLAG_INIT;
public:
void lock() {
while (locked.test_and_set(std::memory_order_acquire)) {
YIELD_CPU;
}
}
void unlock() {
locked.clear(std::memory_order_release);
}
};
But I don't know:
whether a YIELD_CPU macro inside the loop is all that's needed, or whether there are other problematic aspects (e.g. can/should we indicate that we expect test_and_set to succeed most of the time?)
what the appropriate mapping for YIELD_CPU is on the different CPU/OS/compiler combinations (if possible, I'd like to avoid dragging in a heavyweight header like Windows.h)
Note: I'm also interested in answers that only cover a subset of the mentioned platforms, but might not mark them as the accepted answer and/or merge them into a separate community answer.
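For illustration, here is one possible shape for such a lock, as a rough sketch rather than a definitive answer. It spins on a plain load between exchange attempts (test-and-test-and-set), so waiting threads mostly read a cached value instead of repeatedly writing the cache line. The SPIN_PAUSE name and its per-platform mapping are assumptions made for this sketch; in particular, the "yield" instruction assumes an ARM core that supports it.

```cpp
#include <atomic>

// Assumed mapping of a CPU "relax" hint; SPIN_PAUSE is a name made up for this sketch.
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  #include <intrin.h>
  #define SPIN_PAUSE() _mm_pause()
#elif defined(_MSC_VER) && (defined(_M_ARM) || defined(_M_ARM64))
  #include <intrin.h>
  #define SPIN_PAUSE() __yield()
#elif defined(__i386__) || defined(__x86_64__)
  #define SPIN_PAUSE() __builtin_ia32_pause()   // GCC and clang builtin
#elif defined(__arm__) || defined(__aarch64__)
  #define SPIN_PAUSE() __asm__ __volatile__("yield" ::: "memory")
#else
  #define SPIN_PAUSE() ((void)0)                // unknown platform: plain busy loop
#endif

class SpinLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        for (;;) {
            // Try to take the lock; exchange returns the previous value.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Lock is held: wait with read-only loads until it looks free.
            while (locked_.load(std::memory_order_relaxed)) SPIN_PAUSE();
        }
    }
    bool try_lock() {
        // Cheap load first, so a held lock is not written to needlessly.
        return !locked_.load(std::memory_order_relaxed) &&
               !locked_.exchange(true, std::memory_order_acquire);
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};
```

Because it provides lock(), try_lock() and unlock(), the class can be used with std::lock_guard and std::unique_lock. Whether this beats std::mutex under contention still has to be measured for the workload at hand.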
I have created multiple threads in my application. I want to assign a name to each pthread so I used pthread_setname_np which worked on Ubuntu but is not working on SUSE Linux.
I googled it and learned that '_np' means 'non-portable' and that this API is not available on all flavors of Linux.
So now I want to call it only if the API is available. How can I determine whether the API is available or not? I need something like this:
#ifdef SOME_MACRO
pthread_setname_np(tid, "someName");
#endif
You can use the feature test macro _GNU_SOURCE to check whether this function might be available:
#ifdef _GNU_SOURCE
pthread_setname_np(tid, "someName");
#endif
But the manual states that pthread_setname_np and pthread_getname_np were introduced in glibc 2.12. So if you are using an older glibc (say 2.5), defining _GNU_SOURCE will not help.
So it's best to avoid these non-portable functions. You can easily name the threads yourself as part of your thread creation, for example by keeping a mapping between thread IDs and an array of names such as:
pthread_t tid[128];
char thr_names[128][256]; // each name corresponds to one thread in 'tid'
You can check the glibc version using:
getconf GNU_LIBC_VERSION
Since this function was introduced in glibc 2.12, you could use:
#if ((__GLIBC__ > 2) || ((__GLIBC__ == 2) && (__GLIBC_MINOR__ >= 12)))
pthread_setname_np(tid, "someName");
#endif
This kind of thing - finding out if a particular function exists in your compilation environment - is what people use GNU Autoconf scripts for.
If your project is already using autoconf, you can add this to your configure source, after the point where you have checked for the pthreads compiler and linker flags:
AC_CHECK_FUNCS(pthread_setname_np)
...and it will define a macro HAVE_PTHREAD_SETNAME_NP if the function exists.
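The two checks above can be combined into one small wrapper, so that call sites need no #ifdef. This is only a sketch: name_thread and CAN_NAME_THREADS are names invented here, and it assumes the two-argument Linux form of pthread_setname_np (other systems that have the function may declare it differently).

```cpp
#include <pthread.h>

// HAVE_PTHREAD_SETNAME_NP is what AC_CHECK_FUNCS would define; the glibc
// version test is the fallback when no configure script is used.
// Note: g++ defines _GNU_SOURCE by default; in C you must define it yourself
// before including any header to get the declaration.
#if defined(HAVE_PTHREAD_SETNAME_NP) || \
    (defined(__GLIBC__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 12)))
#define CAN_NAME_THREADS 1
#else
#define CAN_NAME_THREADS 0
#endif

// Returns 0 on success, an errno-style code on failure,
// or -1 when the build has no pthread_setname_np at all.
inline int name_thread(pthread_t tid, const char* name) {
#if CAN_NAME_THREADS
    return pthread_setname_np(tid, name);
#else
    (void)tid;
    (void)name;
    return -1;
#endif
}
```

A call then looks like name_thread(pthread_self(), "worker-1"). On Linux the name is limited to 16 bytes including the terminating null, so longer names fail with an error code.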
What is the difference between boost::details::pool::pthread_mutex and boost::details::pool::null_mutex?
I see that in the latest Boost version, 1.42, the class boost::details::pool::pthread_mutex was removed. What should I use instead?
boost::details::pool::null_mutex is a mutex that does nothing (a lock always succeeds immediately). It's appropriate when you're not using threads. The Boost pool library selects what kind of mutex it will use to synchronize access to critical sections with a typedef for the mutex type based on the following snippet from boost\pool\detail\mutex.hpp:
#if !defined(BOOST_HAS_THREADS) || defined(BOOST_NO_MT) || defined(BOOST_POOL_NO_MT)
typedef null_mutex default_mutex;
#else
typedef boost::mutex default_mutex;
#endif
In other words, if the configuration says that no threading is involved (either for Boost as a whole, or for the pool library in particular), then the null_mutex will be used (which is basically a nop).
If threading is to be supported, then the boost::mutex type will be used, which comes from the Boost thread library (and will be a pthread-based mutex if your system uses pthreads).
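To make the idea concrete without depending on Boost internals, here is a minimal sketch of the same selection pattern using std::mutex. NullMutex, DefaultMutex, Counter and the MYLIB_NO_THREADS macro are all hypothetical names for this example, not Boost's.

```cpp
#include <mutex>

// A mutex that does nothing: locking always succeeds immediately.
struct NullMutex {
    void lock() {}
    bool try_lock() { return true; }
    void unlock() {}
};

// Pick the mutex type at compile time (MYLIB_NO_THREADS is a made-up switch).
#if defined(MYLIB_NO_THREADS)
typedef NullMutex DefaultMutex;
#else
typedef std::mutex DefaultMutex;
#endif

// Code using DefaultMutex is written once and works in both configurations.
class Counter {
    DefaultMutex mutex_;
    int value_ = 0;
public:
    int increment() {
        std::lock_guard<DefaultMutex> guard(mutex_);
        return ++value_;
    }
};
```

In a single-threaded build the guard compiles down to nothing, which is the point of a null mutex.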
In my case, one thread reads the value and wants to decide whether it needs to be changed or not,
something like this:
void set(bool status)
{
if(status == m_status)
return;
monitor.lock();
m_status = status;
}
Is this possible?
Using a synchronization object for boolean state is overkill.
On Windows you can use Interlocked Variable Access.
For a cross-platform solution, see Boost.Atomic.
std::atomic from C++11 is also a solution.
I think you need to clarify your question a bit. Is it possible? Yes. Is it necessary? Probably. Are there other ways to do it? Yes, as another answer has noted.
Don't forget to unlock when you're done with the things you want to change. And just a stylistic note, I find it much clearer to use your 'if' statement to encase the code block instead of return'ing out of the function. Like this:
void set(bool status)
{
if(status != m_status)
{
monitor.lock();
m_status = status;
monitor.unlock();
}
}
Just my opinion, of course.
Generally it's not possible to do this safely. It will work most of the time on most platforms, but it's formally undefined, and there are cases where cache coherency issues will come back to haunt you.
If you can use C++11, use std::atomic<bool> from the new <atomic> header. If not, you should use the legacy compiler-specific equivalent: Windows has the Interlocked* functions, and GCC has the __sync builtins. There is actually a cross-platform implementation of the important bits of the C++11 standard buried deep in the Boost.Interprocess library, but it's unfortunately not exposed to the user.
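As a minimal sketch of the std::atomic<bool> route, assuming the only shared state is the flag itself (StatusFlag is a hypothetical class name): exchange both writes the new value and reports the old one, so no separate check and no lock are needed.

```cpp
#include <atomic>

class StatusFlag {
    std::atomic<bool> status_{false};
public:
    // Atomically stores the new status; returns true if the value changed.
    bool set(bool status) {
        return status_.exchange(status) != status;
    }
    bool get() const { return status_.load(); }
};
```

If other data must change together with the flag, an atomic alone is not enough and a mutex around the whole update is still the right tool.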
I'm looking for a good reader/writer lock in C++. We have a use case of a single infrequent writer and many frequent readers and would like to optimize for this. Preferable I would like a cross-platform solution, however a Windows only one would be acceptable.
Since C++17 (VS2015) you can use the standard library:
#include <shared_mutex>
typedef std::shared_mutex Lock;
typedef std::unique_lock< Lock > WriteLock;
typedef std::shared_lock< Lock > ReadLock;
Lock myLock;
void ReadFunction()
{
ReadLock r_lock(myLock);
//Do reader stuff
}
void WriteFunction()
{
WriteLock w_lock(myLock);
//Do writer stuff
}
For older compiler versions and standards you can use boost to create a read-write lock:
#include <boost/thread/locks.hpp>
#include <boost/thread/shared_mutex.hpp>
typedef boost::shared_mutex Lock;
typedef boost::unique_lock< Lock > WriteLock;
typedef boost::shared_lock< Lock > ReadLock;
Newer versions of boost::thread have read/write locks (1.35.0 and later, apparently the previous versions did not work correctly).
They have the names shared_lock, unique_lock, and upgrade_lock and operate on a shared_mutex.
Using standard pre-tested, pre-built stuff is always good (for example, Boost as another answer suggested), but this is something that's not too hard to build yourself. Here's a dumb little implementation pulled out from a project of mine:
#include <pthread.h>
struct rwlock {
pthread_mutex_t lock;
pthread_cond_t read, write;
unsigned readers, writers, read_waiters, write_waiters;
};
void reader_lock(struct rwlock *self) {
pthread_mutex_lock(&self->lock);
if (self->writers || self->write_waiters) {
self->read_waiters++;
do pthread_cond_wait(&self->read, &self->lock);
while (self->writers || self->write_waiters);
self->read_waiters--;
}
self->readers++;
pthread_mutex_unlock(&self->lock);
}
void reader_unlock(struct rwlock *self) {
pthread_mutex_lock(&self->lock);
self->readers--;
if (self->write_waiters)
pthread_cond_signal(&self->write);
pthread_mutex_unlock(&self->lock);
}
void writer_lock(struct rwlock *self) {
pthread_mutex_lock(&self->lock);
if (self->readers || self->writers) {
self->write_waiters++;
do pthread_cond_wait(&self->write, &self->lock);
while (self->readers || self->writers);
self->write_waiters--;
}
self->writers = 1;
pthread_mutex_unlock(&self->lock);
}
void writer_unlock(struct rwlock *self) {
pthread_mutex_lock(&self->lock);
self->writers = 0;
if (self->write_waiters)
pthread_cond_signal(&self->write);
else if (self->read_waiters)
pthread_cond_broadcast(&self->read);
pthread_mutex_unlock(&self->lock);
}
void rwlock_init(struct rwlock *self) {
self->readers = self->writers = self->read_waiters = self->write_waiters = 0;
pthread_mutex_init(&self->lock, NULL);
pthread_cond_init(&self->read, NULL);
pthread_cond_init(&self->write, NULL);
}
pthreads is not really Windows-native, but the general idea is here. This implementation is slightly biased towards writers (a horde of writers can starve readers indefinitely); just modify writer_unlock if you'd rather the balance be the other way around.
Yes, this is C and not C++. Translation is an exercise left to the reader.
Edit
Greg Rogers pointed out that the POSIX standard does specify pthread_rwlock_*. This doesn't help if you don't have pthreads, but it stirred my mind into remembering: Pthreads-w32 should work! Instead of porting this code to non-pthreads for your own use, just use Pthreads-w32 on Windows, and native pthreads everywhere else.
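For what it's worth, the POSIX calls are easy to wrap for C++ use. The following is a small sketch, assuming a platform with pthreads (native or Pthreads-w32); RwLock is a name made up here, and error handling is reduced to the try_ variants returning bool.

```cpp
#include <pthread.h>

class RwLock {
    pthread_rwlock_t lock_;
public:
    RwLock() { pthread_rwlock_init(&lock_, nullptr); }
    ~RwLock() { pthread_rwlock_destroy(&lock_); }
    RwLock(const RwLock&) = delete;
    RwLock& operator=(const RwLock&) = delete;

    // Shared (reader) side: many threads may hold this at once.
    void lock_shared() { pthread_rwlock_rdlock(&lock_); }
    bool try_lock_shared() { return pthread_rwlock_tryrdlock(&lock_) == 0; }
    void unlock_shared() { pthread_rwlock_unlock(&lock_); }

    // Exclusive (writer) side: excludes readers and other writers.
    void lock() { pthread_rwlock_wrlock(&lock_); }
    bool try_lock() { return pthread_rwlock_trywrlock(&lock_) == 0; }
    void unlock() { pthread_rwlock_unlock(&lock_); }
};
```

The member names mirror those of std::shared_mutex on purpose, so the code reads the same way if you later switch to the standard type.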
Whatever you decide to use, benchmark your workload against simple locks, as read/write locks tend to be 3-40x slower than a simple mutex when there is no contention.
Here is some reference
C++17 supports std::shared_mutex . It is supported in MSVC++ 2015 and 2017.
Edit: The MSDN Magazine link isn't available anymore. The CodeProject article is now available on https://www.codeproject.com/Articles/32685/Testing-reader-writer-locks and sums it up pretty nicely. Also I found a new MSDN link about Compound Synchronisation Objects.
There is an article about reader-writer locks on MSDN that presents some implementations of them. It also introduces the Slim reader/writer lock, a kernel synchronisation primitive introduced with Vista. There's also a CodeProject article about comparing different implementations (including the MSDN article's ones).
Intel Thread Building Blocks also provide a couple of rw_lock variants:
http://www.threadingbuildingblocks.org/
They have a spin_rw_mutex for very short periods of contention and a queueing_rw_mutex for longer periods of contention. The former can be used in particularly performance sensitive code. The latter is more comparable in performance to that provided by Boost.Thread or directly using pthreads. But profile to make sure which one is a win for your access patterns.
Boost.Thread has supported reader-writer locks since release 1.35.0. The good thing about this is that the implementation is cross-platform, peer-reviewed, and is actually a reference implementation for the upcoming C++0x standard.
I can recommend the ACE library, which provides a multitude of locking mechanisms and is ported to various platforms.
Depending on the boundary conditions of your problem, you may find the following classes useful:
ACE_RW_Process_Mutex
ACE_Write_Guard and ACE_Read_Guard
ACE_Condition
http://www.codeproject.com/KB/threads/ReaderWriterLock.aspx
Here is a good and lightweight implementation suitable for most tasks.
Multiple-Reader, Single-Writer Synchronization Lock Class for Win32 by Glenn Slayden
http://www.glennslayden.com/code/win32/reader-writer-lock
#include <shared_mutex>
class Foo {
public:
void Write() {
std::unique_lock lock{mutex_};
// ...
}
void Read() {
std::shared_lock lock{mutex_};
// ...
}
private:
std::shared_mutex mutex_;
};
You could copy Sun's excellent ReentrantReadWriteLock. It includes features such as optional fairness, lock downgrading, and of course reentrancy.
Yes it's in Java, but you can easily read and transpose it to C++, even if you don't know any Java. The documentation I linked to contains all the behavioral properties of this implementation so you can make sure it does what you want.
If nothing else, it's a guide.