Given the situation where a Producer Thread creates an object o of an arbitrary type O that then must be read (and only read) by a Consumer Thread, which is the ideal way to accomplish this in an efficient and thread safe way in C++11?
As of now my implementation relies on a producer-consumer model, using a mutexed/conditioned work queue based on a template:
template<typename T> class WorkQueue {
    std::list<T> queue;
    std::mutex mut;
    std::condition_variable cond;
public:
    ...
};
If o's type is defined as follows:
class WorkItem {
    const int value;
public:
    WorkItem(int v) : value(v) {}
    int getValue() const {
        return value;
    }
};
and the threads produce-consume WorkItem objects in the heap like this:
WorkQueue<WorkItem*> workQueue;
...
void producerThread() {
    workQueue.add(new WorkItem(0));
}
void consumerThread() {
    WorkItem *item = workQueue.remove();
    doSomething(item->getValue());
    delete item;
}
Am I guaranteed that the heap objects will be properly readable by the consumer?
If the answer happens to be no, I'm guessing that WorkItem's members should be protected by a mutex, but that would be quite an inefficient solution, as no locks should be required after the WorkItem has been made available to all the threads. Alternatively, I'm guessing that an atomics-based approach could work better in this case.
What you propose looks OK. You might store smart pointers like std::unique_ptr<WorkItem> to gain exception safety if e.g. doSomething() throws. And you should be aware of lock-free queues, but you do not necessarily need to use one, especially if it would mean adding new library dependencies to your project. Finally, all that new memory allocation may someday be a bottleneck, in which case look into using an object pool or otherwise reusing your WorkItems. But most of those things would be premature optimizations right now.
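To make the suggestion concrete, here is one way the queue from the question might be filled in so that it can hold move-only types such as std::unique_ptr<WorkItem>. This is a sketch, not the asker's actual implementation; the add/remove names are taken from the question.

```cpp
#include <cassert>
#include <condition_variable>
#include <list>
#include <memory>
#include <mutex>

// Sketch: a mutex/condition-variable work queue that supports
// move-only element types (e.g. std::unique_ptr<WorkItem>).
template<typename T>
class WorkQueue {
    std::list<T> queue;
    std::mutex mut;
    std::condition_variable cond;
public:
    void add(T item) {
        {
            std::lock_guard<std::mutex> lk(mut);
            queue.push_back(std::move(item));
        }
        cond.notify_one();  // wake one waiting consumer
    }
    T remove() {
        std::unique_lock<std::mutex> lk(mut);
        cond.wait(lk, [this] { return !queue.empty(); });  // guard against spurious wakeups
        T item = std::move(queue.front());
        queue.pop_front();
        return item;
    }
};
```

With WorkQueue<std::unique_ptr<WorkItem>> the consumer no longer needs the explicit delete, and the item is freed even if doSomething() throws.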
Related
Suppose I have a class:
class WonnaBeMovedClass
{
public:
    WonnaBeMovedClass(WonnaBeMovedClass&& other);
    void start();
private:
    void updateSharedData();
    std::vector<int> _sharedData;
    std::thread _threadUpdate;
    std::mutex _mutexShaderData;
    //other stuff
};
WonnaBeMovedClass::WonnaBeMovedClass(WonnaBeMovedClass&& other)
{
    _sharedData = std::move(other._sharedData);
    _threadUpdate = std::move(other._threadUpdate);
    _mutexShaderData = std::move(other._mutexShaderData); //won't compile.
}
void WonnaBeMovedClass::start()
{
    _threadUpdate = std::thread(&WonnaBeMovedClass::updateSharedData, this);
}
void WonnaBeMovedClass::updateSharedData()
{
    std::lock_guard<std::mutex> lockSharedData(_mutexShaderData);
    for (auto& value : _sharedData)
    {
        ++value;
    }
}
It won't compile because a mutex cannot be moved. That doesn't make sense to me.
Then I thought that it is possible to workaround this by using pointers instead of actual variables and came up with the following:
class WonnaBeMovedClass
{
public:
    WonnaBeMovedClass(WonnaBeMovedClass&& other);
    void start();
private:
    void updateSharedData();
    std::vector<int> _sharedData;
    std::unique_ptr<std::thread> _threadUpdate; // now a pointer
    std::unique_ptr<std::mutex> _mutexShaderData; // now a pointer
    //other stuff
};
WonnaBeMovedClass::WonnaBeMovedClass(WonnaBeMovedClass&& other)
{
    _sharedData = std::move(other._sharedData);
    _threadUpdate = std::move(other._threadUpdate);
    _mutexShaderData = std::move(other._mutexShaderData); // compiles now.
}
void WonnaBeMovedClass::start()
{
    _threadUpdate = std::make_unique<std::thread>(&WonnaBeMovedClass::updateSharedData, this);
}
void WonnaBeMovedClass::updateSharedData()
{
    std::lock_guard<std::mutex> lockSharedData(*_mutexShaderData);
    for (auto& value : _sharedData)
    {
        ++value;
    }
}
So now when I:
WonnaBeMovedClass object1;
WonnaBeMovedClass object2;
//do stuff
object1 = std::move(object2);
I actually move the addresses of both the mutex and the thread.
It makes more sense now... Or not?
The thread still holds a pointer to object2, so it keeps working with object2's (now moved-from) data rather than object1's, so it still doesn't make any sense.
I may have moved the mutex, but the thread is unaware of the move. Or is it?
I am unable to find the answer so I am asking you for help.
Am I doing something completely wrong? Is copying/moving threads and mutexes just bad design, and should I rethink the architecture of the program?
Edit:
There was a question about the actual purpose of the class. It is actually a TCP/IP client (represented as a class) that:
holds the latest data from the server (several data tables, similar to std::vector);
contains methods that manage threads (update state, send/receive messages).
More than one connection could be established at a time, so somewhere in the code there is a std::vector<Client> field that represents all active connections.
Connections are determined by the configuration file.
//read configurations
...
//init clients
for (auto& configuration : _configurations)
{
    Client client(configuration);
    _activeClients.push_back(client); // this is where the compiler reminded me that I am unable to move my object (aka WonnaBeMovedClass).
}
I've changed _activeClients from std::vector<Client> to std::vector<std::unique_ptr<Client>> and modified the initialization code to create pointers instead of objects directly, which worked around my issues, but the question remained, so I decided to post it here.
Let's break the issue in two.
Moving mutexes. This cannot be done because mutexes are normally implemented in terms of OS objects which must have fixed addresses. In other words, the OS (or the runtime library, which is the same as the OS for our purposes) keeps an address of your mutex. This can be worked around by storing (smart) pointers to mutexes in your code, and moving those instead. The mutex itself doesn't move. Thread objects can be moved so there's no issue.
Moving your own data while some active code (a thread, a running function, a std::function stored somewhere, or whatever) has the address of your data and can access it. This is actually very similar to the previous case, only instead of the OS it's your own code that holds on to the data. The solution, as before, is not to move your data. Store and move a (smart) pointer to the data instead.
To summarise,
class WonnaBeMovedClass
{
public:
    WonnaBeMovedClass(WonnaBeMovedClass&& other);
    void start();
private:
    struct tdata {
        std::vector<int> _sharedData;
        std::thread _threadUpdate;
        std::mutex _mutexShaderData;
    };
    std::shared_ptr<tdata> data;
    static void updateSharedData(std::shared_ptr<tdata>);
};
void WonnaBeMovedClass::start()
{
    data->_threadUpdate = std::thread(&WonnaBeMovedClass::updateSharedData, data);
}
It makes more sense now... Or not?
Not really.
If a std::mutex gets moved, the other threads will not be aware of the mutex's new memory address! This breaks thread safety.
However, a solution with std::unique_ptr exists in Copy or Move Constructor for a class with a member std::mutex (or other non-copyable object)?
Last but not least, C++14 seems to have something to bring into play. Read more in How should I deal with mutexes in movable types in C++?
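The common pattern from the linked questions can be sketched in a few lines: the mutex itself is never moved; each object keeps its own, and the move operations lock the source (and, for assignment, both sides) while only the data is moved. The class name here is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Sketch: a movable class whose mutex stays put. The move constructor
// locks the source while stealing its data; the move assignment locks
// both mutexes deadlock-free via std::lock.
class MovableWithMutex {
public:
    MovableWithMutex() = default;

    MovableWithMutex(MovableWithMutex&& other) {
        std::lock_guard<std::mutex> lock(other._mutex);
        _data = std::move(other._data);
    }

    MovableWithMutex& operator=(MovableWithMutex&& other) {
        if (this != &other) {
            std::lock(_mutex, other._mutex);  // lock both without deadlock
            std::lock_guard<std::mutex> l1(_mutex, std::adopt_lock);
            std::lock_guard<std::mutex> l2(other._mutex, std::adopt_lock);
            _data = std::move(other._data);
        }
        return *this;
    }

    void push(int v) {
        std::lock_guard<std::mutex> lock(_mutex);
        _data.push_back(v);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _data.size();
    }

private:
    mutable std::mutex _mutex;  // never moved; each object owns its own
    std::vector<int> _data;
};
```

The price is that a moved-from object is left with an empty data set but a perfectly usable mutex, which is exactly what you want: any thread still pointing at it can lock safely and observe the (empty) state.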
WonnaBeMovedClass is a handle holding a thread and a mutex, so it is not bad design to give it move semantics (but not copy).
The second solution looks fine, but don't forget about proper resource management for your mutex (construction and destruction). I don't really understand the real-life purpose of the class, so depending on the whole solution design, it might be better to use shared_ptr instead of unique_ptr (in case multiple WonnaBeMovedClass instances can share the same mutex).
std::thread is itself a handle to a system thread, so it doesn't have to be wrapped in a pointer; the underlying resource (i.e. the OS thread handle) is managed by the standard library itself.
Note that mutexes are actually kernel objects (usually implemented as an opaque pointer, for example in the Windows API), and thus should not be modified or moved around by user code in any way.
I have a threaded class from which I would like to occasionally acquire a pointer an instance variable. I would like this access to be guarded by a mutex so that the thread is blocked from accessing this resource until the client is finished with its pointer.
My initial approach to this is to return a pair of objects: one a pointer to the resource and one a shared_ptr to a lock object on the mutex. This shared_ptr holds the only reference to the lock object so the mutex should be unlocked when it goes out of scope. Something like this:
pair<Resource*, shared_ptr<Lock> > A::getResource()
{
    Lock* lock = new Lock(&mMutex);
    return pair<Resource*, shared_ptr<Lock> >(
        &mResource,
        shared_ptr<Lock>(lock));
}
This solution is less than ideal because it requires the client to hold onto the entire pair of objects. Behaviour like this breaks the thread safety:
Resource* r = a.getResource().first;
In addition, my own implementation of this is deadlocking and I'm having difficulty determining why, so there may be other things wrong with it.
What I would like to have is a shared_ptr that contains the lock as an instance variable, binding it with the means to access the resource. This seems like something that should have an established design pattern but having done some research I'm surprised to find it quite hard to come across.
My questions are:
Is there a common implementation of this pattern?
Are there issues with putting a mutex inside a shared_ptr that I'm overlooking that prevent this pattern from being widespread?
Is there a good reason not to implement my own shared_ptr class to implement this pattern?
(NB I'm working on a codebase that uses Qt but unfortunately cannot use boost in this case. However, answers involving boost are still of general interest.)
I'm not sure if there are any standard implementations, but since I like re-implementing stuff for no reason, here's a version that should work (assuming you don't want to be able to copy such pointers):
template<class T>
class locking_ptr
{
public:
    locking_ptr(T* ptr, mutex* lock)
        : m_ptr(ptr)
        , m_mutex(lock)
    {
        m_mutex->lock();
    }
    ~locking_ptr()
    {
        if (m_mutex)
            m_mutex->unlock();
    }
    locking_ptr(locking_ptr<T>&& ptr)
        : m_ptr(ptr.m_ptr)
        , m_mutex(ptr.m_mutex)
    {
        ptr.m_ptr = nullptr;
        ptr.m_mutex = nullptr;
    }
    T* operator ->()
    {
        return m_ptr;
    }
    T const* operator ->() const
    {
        return m_ptr;
    }
private:
    // disallow copy/assignment (declared but not defined)
    locking_ptr(locking_ptr<T> const& ptr);
    locking_ptr& operator = (locking_ptr<T> const& ptr);
    T* m_ptr;
    mutex* m_mutex; // whatever implementation you use
};
You're describing a variation of the EXECUTE AROUND POINTER pattern, described by Kevlin Henney in Executing Around Sequences.
I have a prototype implementation at exec_around.h but I can't guarantee it works correctly in all cases as it's a work in progress. It includes a function mutex_around which creates an object and wraps it in a smart pointer that locks and unlocks a mutex when accessed.
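This is not the exec_around.h code, but the core of the execute-around-pointer idea can be sketched in a few lines: operator-> returns a temporary proxy that acquires the lock, and the proxy's destructor releases it at the end of the full expression. Names here are illustrative.

```cpp
#include <cassert>
#include <mutex>
#include <utility>
#include <vector>

// Sketch of the execute-around-pointer pattern: every member access
// through operator-> implicitly locks the mutex for the duration of
// the full expression, then releases it.
template <class T>
class exec_around {
    struct proxy {
        T* p;
        std::unique_lock<std::mutex> lock;  // held until the proxy dies
        T* operator->() { return p; }
    };
    T obj;
    std::mutex mtx;
public:
    template <class... Args>
    explicit exec_around(Args&&... args) : obj(std::forward<Args>(args)...) {}

    // The returned temporary proxy keeps the mutex locked while the
    // chained member access executes.
    proxy operator->() {
        return proxy{&obj, std::unique_lock<std::mutex>(mtx)};
    }
};
```

Usage looks like ordinary pointer access: with exec_around<std::vector<int>> v, the call v->push_back(1) locks the mutex, performs the push_back, and unlocks when the statement ends.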
There is another approach here. It is far less flexible and less generic, but also far simpler, and it still seems to fit your exact scenario.
shared_ptr (both the standard one and Boost's) offers a way to construct it from another shared_ptr instance, which will be used for the usage counter, plus an arbitrary pointer that will not be managed at all. On cppreference.com it is the 8th form (the aliasing constructor).
Now, normally, this form is used for conversions, like providing a shared_ptr to a base-class subobject from a shared_ptr to the derived object. They share ownership and the usage counter but (in general) have two different pointer values of different types. This form is also used to provide a shared_ptr to a member value based on a shared_ptr to the object it is a member of.
Here we can "abuse" the form to provide lock guard. Do it like this:
auto A::getResource()
{
    auto counter = std::make_shared<Lock>(&mMutex);
    std::shared_ptr<Resource> result{ counter, &mResource };
    return result;
}
The returned shared_ptr points to mResource and keeps mMutex locked for as long as it is used by anyone.
The problem with this solution is that it is now your responsibility to ensure that the mResource remains valid (in particular - it doesn't get destroyed) for that long as well. If locking mMutex is enough for that, then you are fine.
Otherwise, the above solution must be adjusted to your particular needs. For example, you might want the counter to be a simple struct that keeps both the Lock and another shared_ptr to the A object owning the mResource.
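That adjustment might look like the following sketch (Guard, A, and Resource are illustrative names): the control block holds both a shared_ptr to the owner, keeping it alive, and the lock, keeping the mutex held, for as long as any handle is outstanding.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

struct Resource { int value = 42; };

struct A {
    std::mutex mMutex;
    Resource mResource;
};

// Control block for the aliasing constructor: holds the owner alive
// AND holds its mutex locked. Member order matters: 'owner' must be
// initialized before 'lock' uses it.
struct Guard {
    std::shared_ptr<A> owner;
    std::lock_guard<std::mutex> lock;
    explicit Guard(std::shared_ptr<A> o)
        : owner(std::move(o)), lock(owner->mMutex) {}
};

std::shared_ptr<Resource> getResource(std::shared_ptr<A> a) {
    auto guard = std::make_shared<Guard>(std::move(a));
    // Aliasing constructor: shares ownership with 'guard' but points
    // at the resource inside the owner.
    return std::shared_ptr<Resource>(guard, &guard->owner->mResource);
}
```

When the last copy of the returned pointer goes away, the Guard is destroyed, which first releases the mutex and then drops the reference to the owner.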
To add to Adam Badura's answer, for a more general case using std::mutex and std::lock_guard, this worked for me:
auto A::getResource()
{
    auto counter = std::make_shared<std::lock_guard<std::mutex>>(mMutex);
    std::shared_ptr<Resource> ptr{ counter, &mResource };
    return ptr;
}
where the lifetimes of std::mutex mMutex and Resource mResource are managed by some class A.
I wrote a class whose instances may be accessed by several threads. I used a trick to remind users that they have to lock the object before using it: it involves keeping only const instances. When in need of reading or modifying sensitive data, other classes should call a method (which is const, and thus allowed) to get a non-const version of the locked object. Actually, it returns a proxy object containing a pointer to the non-const object and a scoped_lock, so the object is unlocked when the proxy goes out of scope. The proxy object also overloads operator-> so access to the object is transparent.
This way, shooting oneself in the foot by accessing unlocked objects is harder (though there is always const_cast).
"Clever tricks" should be avoided, and this smells bad anyway.
Is this design really bad ?
What else can I or should I do ?
Edit: Getters are non-const to enforce locking.
Basic problem: a non-const reference may exist elsewhere. If that gets written safely, it does not follow that it can be read safely -- you may look at an intermediate state.
Also, some const methods might (legitimately) modify hidden internal details in a thread-unsafe way.
Analyse what you're actually doing to the object and find an appropriate synchronisation mode.
If your clever container really does know enough about the objects to control all their synchronisation via proxies, then make those objects private inner classes.
This is clever, but unfortunately doomed to fail.
The problem, underlined by spraff, is that you protect against reads but not against writes.
Consider the following sequence:
unsigned getAverageSalary(Employee const& e) {
    return e.paid() / e.hired_duration();
}
What happens if we increment paid between the two function calls? We get an incoherent value.
The problem is that your scheme does not explicitly enforce locking for reads.
Consider the alternative of a Proxy pattern: the object itself is a bundle of data, all private. Only a Proxy class (a friend) can read/write its data, and when initializing the Proxy it grabs the lock (on the mutex of the object) automatically.
class Data {
    friend class Proxy;
    Mutex _mutex;
    int _bar;
};
class Proxy {
public:
    Proxy(Data& data): _lock(data._mutex), _data(data) {}
    int bar() const { return _data._bar; }
    void bar(int b) { _data._bar = b; }
private:
    Proxy(Proxy const&) = delete; // disable copy
    Lock _lock;
    Data& _data;
};
If I wanted to do what you are doing, I would do one of the following.
Method 1:
shared_mutex m; // somewhere outside the class
class A
{
private:
    int variable;
public:
    void lock() { m.lock(); }
    void unlock() { m.unlock(); }
    bool is_locked() { return m.is_locked(); }
    bool write_to_var(int newvalue)
    {
        if (!is_locked())
            return false;
        variable = newvalue;
        return true;
    }
    bool read_from_var(int *value)
    {
        if (!is_locked() || value == NULL)
            return false;
        *value = variable;
        return true;
    }
};
Method 2:
shared_mutex m; // somewhere outside the class
class A
{
private:
    int variable;
public:
    void write_to_var(int newvalue)
    {
        m.lock();
        variable = newvalue;
        m.unlock();
    }
    int read_from_var()
    {
        m.lock();
        int to_return = variable;
        m.unlock();
        return to_return;
    }
};
The first method is more efficient (not locking-unlocking all the time), however, the program may need to keep checking the output of every read and write to see if they were successful. The second method automatically handles the locking and so the programmer wouldn't even know the lock is there.
Note: This is not code for copy-paste. It shows a concept and sketches how it's done. Please don't comment saying you forgot some error checking somewhere.
This sounds a lot like Alexandrescu's idea with volatile. You're not
using the actual semantics of const, but rather exploiting the way the
type system uses it. In this regard, I would prefer Alexandrescu's use
of volatile: const has very definite and well understood semantics,
and subverting them will definitely cause confusion for anyone reading
or maintaining the code. volatile is more appropriate, as it has no
well defined semantics, and in the context of most applications, is not
used for anything else.
And rather than returning a classical proxy object, you should return a
smart pointer. You could actually use shared_ptr for this, grabbing
the lock before returning the value, and releasing it in the deleter
(rather than deleting the object); I rather fear, however, that this
would cause some confusion amongst the readers, and I would probably go
with a custom smart pointer (probably using shared_ptr with the custom
deleter in the implementation). (From your description, I suspect that
this is closer to what you had in mind anyway.)
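A minimal sketch of that shared_ptr-with-custom-deleter idea might look as follows (Holder and Resource are illustrative names): the "deleter" releases the mutex instead of deleting anything, so the lock is held for as long as any copy of the pointer lives.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

struct Resource { int value = 0; };

class Holder {
public:
    // Lock the mutex and hand out a shared_ptr whose deleter unlocks it
    // (and deletes nothing) when the last copy is destroyed.
    std::shared_ptr<Resource> getResource() {
        mMutex.lock();
        return std::shared_ptr<Resource>(&mResource,
            [this](Resource*) { mMutex.unlock(); });
    }
private:
    std::mutex mMutex;
    Resource mResource;
};
```

As the answer notes, wrapping this in a dedicated smart-pointer type with the same implementation would make the intent clearer to readers than a bare shared_ptr.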
I am trying to make my_class thread-safe like so.
class my_class
{
public:
    const std::vector<double>&
    get_data() const
    { //lock so that cannot get_data() while setting data
        lock l(m_mutex);
        return m_data;
    }
    void
    run()
    {
        vector<double> tmp;
        //some calculations on tmp.
        { //lock so that cannot get_data() while setting m_data
            lock l(m_mutex);
            m_data = tmp; //set the data
        }
    }
private:
    std::vector<double> m_data;
    mutable mutex m_mutex;
    my_class(const my_class&); //non-copyable
};
run() and get_data() may be called by different openmp threads and so I introduce a lock.
(Since am using openmp, m_mutex and lock are RAII wrappers around omp_init_lock(); etc. commands).
However, the lock in get_data() is expensive to create and destroy (it is the most expensive operation when I profile my code; I call get_data() a lot).
Is it possible to reorganise my_class to remove the lock in get_data()? Or is this lock the unavoidable cost of parallelising the code?
First step would be to look into read-write locks: this way multiple readers will not block each other.
The next step would be using lock-free or wait-free operations. There are plenty of resources online describing them better than I would be able to. Just one note: lock-free approaches deal with atomic (interlocked) operations, which means the data size needs to be small. If you go this route, you'll be atomically replacing a pointer to your vector, not the whole vector. This means your class will get a bit more complex and will deal with some pointers and memory management.
It may be cheaper to use a critical section around get_data/run functions, you will not incur additional setup/teardown overhead (as the critical section is statically initialized), but this would also synchronize other instances of the class.
As Scott Meyers and Andrei Alexandrescu outlined in this article, a naive attempt to implement the double-checked locking pattern is unsafe in C++ specifically, and in general on multi-processor systems, without using memory barriers.
I was thinking about this a little and came to a solution that avoids memory barriers and should also work 100% safely in C++. The trick is to store a copy of the pointer to the instance in thread-local storage, so each thread has to acquire the lock only the first time it accesses the singleton.
Here is a little sample code (syntax not checked; I used pthread but all other threading libs could be used):
class Foo
{
private:
    Helper *helper;
    pthread_key_t localHelper;
    pthread_mutex_t mutex;
public:
    Foo()
        : helper(NULL)
    {
        pthread_key_create(&localHelper, NULL);
        pthread_mutex_init(&mutex);
    }
    ~Foo()
    {
        pthread_key_delete(&localHelper);
        pthread_mutex_destroy(&mutex);
    }
    Helper *getHelper()
    {
        Helper *res = static_cast<Helper *>(pthread_getspecific(localHelper));
        if (res == NULL)
        {
            pthread_mutex_lock(&mutex);
            if (helper == NULL)
            {
                helper = new Helper();
            }
            res = helper;
            pthread_mutex_unlock(&mutex);
            pthread_setspecific(localHelper, res);
        }
        return res;
    }
};
What are your comments/opinions?
Do you find any flaws in the idea or the implementation?
EDIT:
Helper is the type of the singleton object (I know the name is not the best... I took it from the Java examples in the Wikipedia article about DCLP).
Foo is the Singleton container.
EDIT 2:
Because there seems to be a bit of misunderstanding about how Foo is used (it is not a static class), here is an example of its usage:
static Foo foo;
...
foo.getHelper()->doSomething();
...
The reason that Foo's members are not static is simply that I was able to create/destroy the mutex and the TLS in the constructor/destructor.
If a RAII version of a C++ mutex / TLS class is used Foo can easily be switched to be static.
You seem to be calling:
pthread_mutex_init(&mutex);
...in the Helper() constructor. But that constructor is itself called in the function getHelper() (which should be static, I think) which uses the mutex. So the mutex appears to be initialised twice or not at all.
I find the code very confusing, I must say. Double-checked locking is not that complex. Why don't you start again, and this time create a Mutex class which does the initialisation and uses RAII to release the underlying pthread mutex? Then use this Mutex class to implement your locking.
This isn't the double-checked locking pattern. Most of the potential thread-safety issues of the pattern are due to the fact that common state is read outside of a mutually exclusive lock and then re-checked inside it.
What you are doing is checking a thread local data item, and then checking the common state inside a lock. This is more like a standard single check singleton pattern with a thread local cached value optimization.
To a casual glance it does look safe, though.
Looks interesting! Clever use of thread-local storage to reduce contention.
But I wonder if this is really different from the problematic approach outlined by Meyers/Alexandrescu...?
Say you have two threads for which the singleton is uninitialized (e.g. thread local slot is empty) and they run getHelper in parallel.
Won't they get into the same race over the helper member? You're still calling operator new and you're still assigning that to a member, so the risk of rogue reordering is still there, right?
EDIT: Ah, I see now. The lock is taken around the NULL-check, so it should be safe. The thread-local replaces the "first" NULL-check of the DCLP.
Obviously a few people are misunderstanding your intent/solution. Something to think about.
I think the real question to be asked is this:
Is calling pthread_getspecific() cheaper than a memory barrier?
Are you trying to make a thread-safe singleton, or to implement Meyers & Alexandrescu's tips? The most straightforward thing is to use pthread_once, as is done below. Obviously you're not going for lock-free speed or anything (creating and destroying mutexes as you are), so the code may as well be simple and clear; this is the kind of thing pthread_once was made for. Note that the Helper pointer is static, otherwise it would be harder to guarantee that there's only one Helper object:
// Foo.h
#include <pthread.h>
class Helper {};
class Foo
{
private:
    static Helper* s_pTheHelper;
    static ::pthread_once_t once_control;
private:
    static void createHelper() { s_pTheHelper = new Helper(); }
public:
    Foo()
    { // stuff
    }
    ~Foo()
    { // stuff
    }
    static Helper* getInstance()
    {
        ::pthread_once(&once_control, Foo::createHelper);
        return s_pTheHelper;
    }
};
// Foo.cpp
// ..
Helper* Foo::s_pTheHelper = NULL;
::pthread_once_t Foo::once_control = PTHREAD_ONCE_INIT;
// ..
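For comparison only (this is not part of the pthread answer above): C++11 guarantees thread-safe initialization of function-local statics ("magic statics"), so in C++11 the whole pthread_once dance collapses to a few lines. Helper here is a stand-in type.

```cpp
#include <cassert>

// Sketch: C++11's guaranteed-once initialization of local statics
// replaces both DCLP and pthread_once for lazy singletons.
class Helper {
public:
    int id() const { return 1; }
};

class Foo {
public:
    static Helper* getInstance() {
        static Helper theHelper;  // initialized exactly once, thread-safely, on first use
        return &theHelper;
    }
};
```

Every caller, from any thread, gets the same Helper, and no explicit mutex, TLS key, or once-control is needed.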