Encapsulated boost thread_group. Questions about ids and synchronization - c++

I'm using a class that encapsulates a thread_group, and I have some questions about it.
class MyGroup {
private:
    boost::thread::id _id;
    boost::thread_group group;
    int abc;
    // other attributes
public:
    void foo();
};
In the class constructor, I launch N threads:
for (size_t i = 0; i < N; i++) {
    group.add_thread(new boost::thread(boost::bind(&MyGroup::foo, this)));
}
void foo() {
    _id = boost::this_thread::get_id();
    // more code.
    abc++; // needs to be synchronized?
}
So, here are my questions.
Do class attributes need to be synchronized?
Does every thread get a different id? For example, if I have
void bar() {
    this->_id;
}
will this result in different ids for each thread, or the same for everyone?
Thanks in advance !

Yes, access to shared data must be protected even if you use thread creation helpers such as Boost. In the end all the threads will execute the same code at the same time, and there is nothing a library can do to put protection around a variable that you own and manage.
If this->_id prints the current thread id, then yes, it will print different values as different threads access it.
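For example, the abc counter from the question could be guarded with a mutex. A minimal sketch (not the poster's full class; the lock member is mine):
#include <boost/thread.hpp>

class MyGroup {
public:
    MyGroup() : abc(0) {}
    void foo() {
        boost::lock_guard<boost::mutex> lock(_mutex);
        ++abc;   // safe now, even when called from several threads at once
    }
private:
    boost::mutex _mutex;   // protects abc and any other shared members
    int abc;
};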

I don't know what you are doing with this thread_group so this may or may not apply.
Yes, all threads will have a unique ID.
Yes, you need to protect your shared state; you can do this with synchronization, or by 'avoiding' shared state through copying or message passing.
A relevant pattern here is the 'actor' pattern.
Essentially, rather than just creating threads in your constructor, consider one of the following:
a) Have a class that derives from boost::thread and store thread-specific members there. You can then access those member variables in the thread, and they won't be global to the group.
e.g.
class MyThreadClass : public boost::thread
{
private:
    int thread_local_int;
    ...
};
b) Have a class that contains a boost::thread as a member variable:
class MyThreadClass
{
private:
    int thread_local_int;
    boost::thread t;
public:
    boost::thread& GetThread()
    {
        return t;
    }
    ...
};
Store a collection of either of these in your MyGroup class and use thread_group::add_thread to put the threads into the thread_group.
You can now be incredibly thoughtful about which state is shared in the thread_group (it should be synchronized or read-only) and which state is local to your actor (or thread) and how it's accessible.
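As a rough illustration along these lines (a simplified variant where each worker holds only its per-thread state and the thread_group owns the threads; Worker and local_count are invented names):
#include <cstddef>
#include <vector>
#include <boost/thread.hpp>
#include <boost/bind.hpp>

class Worker {
public:
    Worker() : local_count(0) {}
    void run() {
        ++local_count;   // per-instance state: no locking needed for it
    }
private:
    int local_count;
};

class MyGroup {
public:
    explicit MyGroup(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            workers.push_back(new Worker());
            group.add_thread(new boost::thread(boost::bind(&Worker::run, workers.back())));
        }
    }
    ~MyGroup() {
        group.join_all();   // the group joins and deletes its boost::thread objects
        for (std::size_t i = 0; i < workers.size(); ++i)
            delete workers[i];
    }
private:
    std::vector<Worker*> workers;   // per-thread ("actor") state
    boost::thread_group group;
};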
Note: I have a personal reluctance to using TLS because I like having some control and guarantees over the lifetimes of objects and threads, and I just find this easier when I don't use it; YMMV, and it's great for some uses...

Related

What are some C++ alternatives to static objects that could make destruction safer (or more deterministic)?

I'm working on a large code base that, for performance reasons, limits access to one or more resources. A thread pool is a good analogy to my problem - we don't want everyone in the process spinning up their own threads, so a common pool with a producer/consumer job queue exists in an attempt to limit the number of threads running at any given time.
There isn't an elegant way to make ownership of the thread pool clear so, for all intents and purposes, it is a singleton. I speak better in code than in English, so here is an example:
class ThreadPool {
public:
    static void SubmitTask(Task&& t) { instance_.SubmitTask(std::move(t)); }
private:
    ~ThreadPool() {
        std::for_each(pool_.begin(), pool_.end(), [](auto &t) {
            if (t.joinable()) t.join();
        });
    }
private:
    std::array<std::thread, 5> pool_;
    static ThreadPool instance_; // here or anonymous namespace
};
The issue with this pattern is that instance_ doesn't go out of scope until after main has returned, which typically results in races or crashes. Also, keep in mind this is analogous to my problem, so better ways to do something asynchronously aren't really what I'm after; just better ways to manage the lifecycle of static objects.
Alternatives I've thought of:
Provide an explicit Terminate function that must be called manually before leaving main.
Not using statics at all and leaving it up to the app to ensure only a single instance exists.
Not using statics at all and crashing the app if more than 1 instance is instantiated.
I also realize that a small, sharp team could probably make the above code work just fine. However, this code lives within a large organization that has many developers of various skill levels contributing to it.
You could explicitly bind the lifetime to your main function. Either add a static shutdown() method to your ThreadPool that does whatever cleanup you need and call it at the end of main().
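A minimal, self-contained illustration of the explicit-shutdown idea (not your pool; a single worker thread stands in for it, and Service/Start/Shutdown are names made up for the sketch):
#include <atomic>
#include <thread>

class Service {
public:
    static void Start()    { worker_ = std::thread([] { while (!stop_) { /* do work */ } }); }
    static void Shutdown() { stop_ = true; if (worker_.joinable()) worker_.join(); }
private:
    static std::atomic<bool> stop_;
    static std::thread worker_;
};
std::atomic<bool> Service::stop_{false};
std::thread Service::worker_;

int main() {
    Service::Start();
    // ... application work ...
    Service::Shutdown(); // the thread is joined before main returns, so nothing
                         // is left running when static destruction begins
}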
Or fully bind the lifetime via RAII:
class ThreadPool {
public:
    static ThreadPool* get() { return instance_.get(); }
    void SubmitTask(Task&& t) { ... }
    ~ThreadPool() { ... }
private:
    ThreadPool() {}
    static inline std::unique_ptr<ThreadPool> instance_;
    friend class ThreadPoolScope;
};

class ThreadPoolScope {
public:
    ThreadPoolScope() {
        assert(!ThreadPool::instance_);
        ThreadPool::instance_.reset(new ThreadPool());
    }
    ~ThreadPoolScope() {
        ThreadPool::instance_.reset();
    }
};

int main() {
    ThreadPoolScope thread_pool_scope{};
    ...
}

void some_func() {
    ThreadPool::get()->SubmitTask(...);
}
This makes destruction completely deterministic, and if you do this with multiple objects, they are automatically destroyed in the correct order.
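For instance, with two stub scopes (invented here just to show the ordering), the scopes declared in main are torn down in reverse order of construction:
#include <cstdio>

struct LoggerScope     { LoggerScope()     { std::puts("logger up"); } ~LoggerScope()     { std::puts("logger down"); } };
struct ThreadPoolScope { ThreadPoolScope() { std::puts("pool up"); }   ~ThreadPoolScope() { std::puts("pool down"); } };

int main() {
    LoggerScope logger_scope{};          // constructed first, destroyed last
    ThreadPoolScope thread_pool_scope{}; // may safely use the logger during its teardown
    // ... application work ...
}   // prints "pool down" and then "logger down"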

Synchronizing method calls on shared object from multiple threads

I am thinking about how to implement a class that will contain private data that will be eventually be modified by multiple threads through method calls. For synchronization (using the Windows API), I am planning on using a CRITICAL_SECTION object since all the threads will spawn from the same process.
Given the following design, I have a few questions.
template <typename T> class Shareable
{
private:
    const LPCRITICAL_SECTION sync; // Can be read and used by multiple threads
    T *data;
public:
    Shareable(LPCRITICAL_SECTION cs, unsigned elems) : sync{cs}, data{new T[elems]} { }
    ~Shareable() { delete[] data; }
    void sharedModify(unsigned index, T &datum) // <-- Can this be validly called by multiple
                                                // threads with synchronization being implicit?
    {
        EnterCriticalSection(sync);
        /*
            The critical section of code involving reads & writes to 'data'
        */
        LeaveCriticalSection(sync);
    }
};
// Somewhere else ...
DWORD WINAPI ThreadProc(LPVOID lpParameter)
{
    Shareable<ActualType> *ptr = static_cast<Shareable<ActualType>*>(lpParameter);
    ActualType copyable = /* initialization */;
    ptr->sharedModify(validIndex, copyable); // <-- OK, synchronized?
    return 0;
}
The way I see it, the API calls will be conducted in the context of the current thread. That is, I assume this is the same as if I had acquired the critical section object from the pointer and called the API from within ThreadProc(). However, I am worried that if the object is created and placed in the main/initial thread, there will be something funky about the API calls.
When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
Should I instead get a pointer to the critical section object and use that instead?
Is there some other synchronization mechanism that is better suited to this scenario?
When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
It's not implicit, it's explicit. There's only one CRITICAL_SECTION, and only one thread can hold it at a time.
Should I instead get a pointer to the critical section object and use that instead?
No. There's no reason to use a pointer here.
Is there some other synchronization mechanism that is better suited to this scenario?
It's hard to say without seeing more code, but this is definitely the "default" solution. It's like a singly-linked list -- you learn it first, it always works, but it's not always the best choice.
When sharedModify() is called on the same object concurrently, from multiple threads, will the synchronization be implicit, in the way I described it above?
Implicit from the caller's perspective, yes.
Should I instead get a pointer to the critical section object and use that instead?
No. In fact, I would suggest giving the Shareable object ownership of its own critical section instead of accepting one from the outside (and embracing RAII concepts to write safer code), e.g.:
template <typename T>
class Shareable
{
private:
    CRITICAL_SECTION sync;
    std::vector<T> data;

    struct SyncLocker
    {
        CRITICAL_SECTION &sync;
        SyncLocker(CRITICAL_SECTION &cs) : sync(cs) { EnterCriticalSection(&sync); }
        ~SyncLocker() { LeaveCriticalSection(&sync); }
    };

public:
    Shareable(unsigned elems) : data(elems)
    {
        InitializeCriticalSection(&sync);
    }

    Shareable(const Shareable&) = delete;
    Shareable(Shareable&&) = delete;

    ~Shareable()
    {
        {
            SyncLocker lock(sync);
            data.clear();
        }
        DeleteCriticalSection(&sync);
    }

    void sharedModify(unsigned index, const T &datum)
    {
        SyncLocker lock(sync);
        data[index] = datum;
    }

    Shareable& operator=(const Shareable&) = delete;
    Shareable& operator=(Shareable&&) = delete;
};
Is there some other synchronization mechanism that is better suited to this scenario?
That depends. Will multiple threads be accessing the same index at the same time? If not, then there is not really a need for the critical section at all. One thread can safely access one index while another thread accesses a different index.
If multiple threads need to access the same index at the same time, a critical section might still not be the best choice. Locking the entire array might be a big bottleneck if you only need to lock portions of the array at a time. Things like the Interlocked API, or Slim Read/Write locks, might make more sense. It really depends on your thread designs and what you are actually trying to protect.
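For example, if reads greatly outnumber writes, a Slim Reader/Writer lock lets many readers proceed concurrently while still serializing writers. A rough sketch (class and method names are mine, error handling omitted):
#include <windows.h>
#include <vector>

template <typename T>
class SharedArray
{
private:
    SRWLOCK lock;
    std::vector<T> data;
public:
    explicit SharedArray(unsigned elems) : data(elems)
    {
        InitializeSRWLock(&lock);   // SRW locks need no explicit destruction
    }

    T read(unsigned index)
    {
        AcquireSRWLockShared(&lock);      // many readers may hold this at once
        T copy = data[index];
        ReleaseSRWLockShared(&lock);
        return copy;
    }

    void write(unsigned index, const T &datum)
    {
        AcquireSRWLockExclusive(&lock);   // writers get exclusive access
        data[index] = datum;
        ReleaseSRWLockExclusive(&lock);
    }
};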

Communication between 2 threads C++ UNIX

I need your help with wxWidgets. I have two threads (one wxTimer and one wxThread), and I need them to communicate. I have a class that contains methods to read/write its variables (shared memory through this object).
My problem is: I instantiate this class with "new" in one thread, but I don't know what is needed in the second thread. If I instantiate it there too, the addresses of the variables are different, and I need both threads to work with the same values.
I know I need a wxSemaphore to prevent errors when both access it at the same time.
Thanks for your help!
EDIT: My code
So, here is how this relates to my code. Thank you all ;)
This is the declaration of my wxTimer in my class EvtFramePrincipal (the GUI):
In .h
EvtFramePrincipal( wxWindow* parent );
#include <wx/timer.h>
wxTimer m_timer;
In .cpp, the EvtFramePrincipal constructor:
EvtFramePrincipal::EvtFramePrincipal( wxWindow* parent )
    : FramePrincipal( parent ), m_timer(this)
{
    Connect(wxID_ANY, wxEVT_TIMER, wxTimerEventHandler(EvtFramePrincipal::OnTimer), NULL, this);
    m_timer.Start(250);
}
So the OnTimer method is called every 250 ms with this line.
My second thread is started from EvtFramePrincipal (the GUI):
in .h EvtFramePrincipal
#include "../Client.h"
Client *ClientIdle;
in .cpp EvtFramePrincipal
ClientIdle= new Client();
ClientIdle->Run();
In .h Client (Thread)
class Client : public wxThread
{
public:
    Client();
    virtual void *Entry();
    virtual void OnExit();
};
In .cpp Client (Thread)
Client::Client() : wxThread()
{
}
So far, no problem, the threads are OK?
Now I need this class to act as a messenger between my two threads:
#ifndef PARTAGE_H
#define PARTAGE_H

#include "wx/string.h"
#include <iostream>

using std::cout;
using std::endl;

class Partage
{
public:
    Partage();
    virtual ~Partage();

    bool Return_Capteur_Aval()
    { return Etat_Capteur_Aval; }
    bool Return_Capteur_Amont()
    { return Etat_Capteur_Amont; }
    bool Return_Etat_Barriere()
    { return Etat_Barriere; }
    bool Return_Ouverture()
    { return Demande_Ouverture; }
    bool Return_Fermeture()
    { return Demande_Fermeture; }
    bool Return_Appel()
    { return Appel_Gardien; }

    void Set_Ouverture(bool Etat)
    { Demande_Ouverture=Etat; }
    void Set_Fermeture(bool Etat)
    { Demande_Fermeture=Etat; }
    void Set_Capteur_Aval(bool Etat)
    { Etat_Capteur_Aval=Etat; }
    void Set_Capteur_Amont(bool Etat)
    { Etat_Capteur_Amont=Etat; }
    void Set_Barriere(bool Etat)
    { Etat_Barriere=Etat; }
    void Set_Appel(bool Etat)
    { Appel_Gardien=Etat; }
    void Set_Code(wxString valeur_code)
    { Code=valeur_code; }
    void Set_Badge(wxString numero_badge)
    { Badge=numero_badge; }
    void Set_Message(wxString message)
    {
        Message_Affiche=wxT("");
        Message_Affiche=message;
    }

    wxString Get_Message()
    { return Message_Affiche; }
    wxString Get_Code()
    { return Code; }
    wxString Get_Badge()
    { return Badge; }

protected:
private:
    bool Etat_Capteur_Aval;
    bool Etat_Capteur_Amont;
    bool Etat_Barriere;
    bool Demande_Ouverture;
    bool Demande_Fermeture;
    bool Appel_Gardien;
    wxString Code;
    wxString Badge;
    wxString Message_Affiche;
};

#endif // PARTAGE_H
So in my EvtFramePrincipal (the wxTimer side), I create this class with new. But in the other thread (the wxThread), what do I need to do to communicate?
Sorry if this is difficult to understand :/
The main thread should create the shared variable first. After that, you can create both threads and pass them a pointer to the shared variable.
That way, both of them know how to interact with the shared variable. You need to use a mutex or wxSemaphore in the methods of the shared variable.
You can use a singleton to get access to a central object.
Alternatively, create the central object before creating the threads and pass a reference to the central object to the threads.
Use a mutex in the central object to prevent simultaneous access.
Creating one central object on each thread is not an option.
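For instance, the Partage class from the question could guard each accessor with a wxMutex; only two members are shown, but the same pattern applies to the rest (a sketch, with m_mutex added by me):
#include <wx/thread.h>

class Partage
{
public:
    bool Return_Capteur_Aval()
    {
        wxMutexLocker lock(m_mutex);   // released automatically at the end of the scope
        return Etat_Capteur_Aval;
    }
    void Set_Capteur_Aval(bool Etat)
    {
        wxMutexLocker lock(m_mutex);
        Etat_Capteur_Aval = Etat;
    }
private:
    wxMutex m_mutex;                   // one mutex protects all shared members
    bool Etat_Capteur_Aval;
};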
EDIT 1: Adding more details and examples
Let's start with some assumptions. The OP indicated that
I have 2 threads (1 wxTimer and 1 wxThread)
To tell the truth, I know very little of the wxWidgets framework, but there's always the documentation. So I can see that:
wxTimer provides a timer that will execute the wxTimer::Notify() method when the timer expires. The documentation doesn't say anything about thread execution (although there's a note, "A timer can only be used from the main thread", which I'm not sure how to understand). I can guess that we should expect the Notify method to be executed in some event-loop or timer-loop thread or threads.
wxThread provides a model for thread execution that runs the wxThread::Entry() method. Running a wxThread object will actually create a thread that runs the Entry method.
So your problem is that you need the same object to be accessible in both the wxTimer::Notify() and wxThread::Entry() methods.
This object:
It's not one variable but a lot of them stored in one class
e.g.
struct SharedData {
    // NOTE: This is very simplistic.
    // Since the information here will be modified/read by
    // multiple threads, it should be protected by one or more
    // mutexes, so a class with getters/setters will probably be
    // better suited, so that access with mutexes can be enforced within the class.
    SharedData() : var2(0) { }
    std::string var1;
    int var2;
};
of which you have an instance somewhere:
std::shared_ptr<SharedData> myData=std::make_shared<SharedData>();
or perhaps in raw pointer form, or as a local variable or object attribute.
Option 1: a shared reference
You're not really using wxTimer or wxThread, but classes that inherit from them (at least wxThread::Entry() is pure virtual). In the case of wxTimer you could change the owner to a different wxEvtHandler that will receive the event, but you still need to provide an implementation.
So you can have
class MyTimer : public wxTimer {
public:
    void Notify() {
        // Your code goes here,
        // but it can access data through the local reference
    }
    void setData(const std::shared_ptr<SharedData> &data) {
        mLocalReference = data;
    }
private:
    std::shared_ptr<SharedData> mLocalReference;
};
That will need to be set:
MyTimer timer;
timer.setData(myData);
timer.StartOnce(10000); // wake me up in 10 secs.
Similarly for the Thread
class MyThread : public wxThread {
public:
    void *Entry() {
        // Your code goes here,
        // but it can access data through the local reference
        return NULL;
    }
    void setData(const std::shared_ptr<SharedData> &data) {
        mLocalReference = data;
    }
private:
    std::shared_ptr<SharedData> mLocalReference;
};
That will need to be set:
MyThread *thread=new MyThread();
thread->setData(myData);
thread->Run(); // thread starts running.
Option 2: Using a singleton.
Sometimes you cannot modify MyThread or MyTimer... or it is too difficult to route the reference to myData to the thread or timer instances... or you're just too lazy or too busy to bother (beware of your technical debt!!!)
We can tweak the SharedData into:
struct SharedData {
    std::string var1;
    int var2;

    static SharedData *instance() {
        // NOTE that some mutex is needed here
        // to prevent the case where the first initialization
        // is executed simultaneously from different threads,
        // allocating two objects, one of them leaked.
        if (!sInstance) {
            sInstance = new SharedData();
        }
        return sInstance;
    }
private:
    SharedData() : var2(0) { } // Note we've made the constructor private
    static SharedData *sInstance; // defined (and initialized to 0) in a .cpp file
};
This object (because it only allows the creation of a single object) can be accessed from
either MyTimer::Notify() or MyThread::Entry() with
SharedData::instance()->var1;
Interlude: why Singletons are evil
(or why the easy solution might bite you in the future).
What is so bad about singletons?
Why Singletons are Evil
Singletons Are Evil
My main reasons are:
There's one and only one instance... and you might think that you only need one now, but who knows what the future will hold; you've taken an easy solution to a coding problem that has far-reaching architectural consequences and that might be difficult to revert.
It does not allow dependency injection (because the concrete class is used to access the object).
Still, I don't think it is something to avoid completely. It has its uses, it can solve your problem and it might save your day.
Option 3. Some middle ground.
You could still organize your data around a central repository with methods to access different instances (or different implementations) of the data.
This central repository can be a singleton (it really is central, common and unique), but it is not the shared data itself; it is what is used to retrieve the shared data, e.g. identified by some ID (an ID might be easier to share between the threads using option 1).
Something like:
CentralRepository::instance()->getDataById(sharedId)->var1;
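A rough sketch of what such a repository could look like (the locking and the lazy creation are my assumptions; it hands out the plain SharedData struct from option 1):
#include <map>
#include <memory>
#include <mutex>

class CentralRepository {
public:
    static CentralRepository* instance() {
        static CentralRepository repo;   // created once, on first use
        return &repo;
    }
    std::shared_ptr<SharedData> getDataById(int id) {
        std::lock_guard<std::mutex> lock(mMutex);   // the map itself is shared state too
        std::shared_ptr<SharedData> &slot = mData[id];
        if (!slot) slot = std::make_shared<SharedData>();
        return slot;
    }
private:
    CentralRepository() {}
    std::mutex mMutex;
    std::map<int, std::shared_ptr<SharedData> > mData;
};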
EDIT 2: Comments after OP posted (more) code ;)
It seems that your EvtFramePrincipal object will both execute the timer callback and contain the ClientIdle pointer to a Client object (the thread)... I'd do the following:
Make the Client class contain a Partage attribute (a pointer or a smart pointer).
Make the EvtFramePrincipal contain a Partage attribute (a pointer or smart pointer). I guess this will have the lifecycle of the whole application, so the Partage object can share that lifecycle too.
Add mutex locking to all the setters and getters of the Partage class, since it can be accessed from multiple threads.
When the Client object is instantiated, set the reference to the Partage object that the EvtFramePrincipal contains.
Client can access Partage because we set its reference when it was created. When the Entry method runs in its thread, it will be able to access it.
EvtFramePrincipal can access the Partage (because it is one of its attributes), so the event handler for the timer event will be able to access it too. A rough wiring sketch follows.
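For example (a sketch only; the Sleep and the Set_Capteur_Aval call are placeholders for the real work, and names are adapted from the posted code):
class Client : public wxThread
{
public:
    Client(Partage *partage) : wxThread(), m_partage(partage) {}
    virtual void *Entry()
    {
        while (!TestDestroy())
        {
            m_partage->Set_Capteur_Aval(true);   // guarded inside Partage (see the wxMutex sketch above)
            Sleep(100);
        }
        return NULL;
    }
private:
    Partage *m_partage;   // not owned; it lives as long as the frame does
};

// In the EvtFramePrincipal constructor:
//   m_partage = new Partage();
//   ClientIdle = new Client(m_partage);
//   ClientIdle->Create();
//   ClientIdle->Run();
// and OnTimer() reads through m_partage->Return_Capteur_Aval(), etc.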

mutex lock in multi-thread can I use several mutex locks

I want to implement a function where I create 5 pairs of threads (one pair means one thread writes and the other reads, and both threads of a pair share one list; 5 lists in this scenario).
Do I need to create five mutex locks? How should I declare them? In the global area?
Do I need to create five mutex locks?
That depends on your data structures. If you have five different data objects, each accessed safely by its own associated thread pair, you'll need five; if all threads access only one data object, you'll need only one.
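For the five-list case, one straightforward layout (before considering the encapsulation suggested below) keeps each list right next to the mutex that protects it. A sketch with invented names:
#include <array>
#include <cstddef>
#include <list>
#include <mutex>

struct GuardedList
{
    std::mutex m;           // protects only the list below it
    std::list<int> items;
};

std::array<GuardedList, 5> channels;   // one element per writer/reader thread pair

void writer(std::size_t i, int value)
{
    std::lock_guard<std::mutex> lock(channels[i].m);
    channels[i].items.push_back(value);
}

bool reader(std::size_t i, int &out)
{
    std::lock_guard<std::mutex> lock(channels[i].m);
    if (channels[i].items.empty())
        return false;
    out = channels[i].items.front();
    channels[i].items.pop_front();
    return true;
}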
How to declare them? In global area?
Encapsulate your data object, the mutex, and the (writing) thread function in a class. I'd say you won't need a separate reading thread; the reader is usually the thread that calls run(), or any other thread that has access to an instance of this class.
class MyAsynchDataProvider
{
public:
    void run()
    {
        writeThread = std::thread(writeThreadFunc, this);
    }
    MyDataStruct getSafeDataCopy()
    {
        std::lock_guard<std::mutex> lock(dataGuard);
        return data;
    }
private:
    std::mutex dataGuard;
    MyDataStruct data;
    std::thread writeThread;

    static void writeThreadFunc(MyAsynchDataProvider* thisPtr)
    {
        // ...
        std::lock_guard<std::mutex> lock(thisPtr->dataGuard);
        // Write to thisPtr->data member
    }
};

Thread safe container

Here is an exemplary container class, in pseudo code:
class Container
{
public:
    Container() {}
    ~Container() {}
    void add(data new_data)
    {
        // addition of data
    }
    data get(size_t which)
    {
        // returning some data
    }
    void remove(size_t which)
    {
        // delete specified object
    }
private:
    data d;
};
How can this container be made thread safe? I have heard about mutexes - where should these mutexes be placed? Should the mutex be static for the class, or maybe in global scope? What is a good library for this task in C++?
First of all, mutexes should not be static for a class as long as you are going to use more than one instance. There are many cases where you should or shouldn't use them, so without seeing your code it's hard to say. Just remember that they are used to synchronise access to shared data, so it's wise to place them inside methods that modify or rely on the object's state. In your case I would use one mutex to protect the whole object and lock all three methods, like:
class Container
{
public:
    Container() {}
    ~Container() {}
    void add(data new_data)
    {
        lock_guard<Mutex> lock(mutex);
        // addition of data
    }
    data get(size_t which)
    {
        lock_guard<Mutex> lock(mutex);
        // getting copy of value
        // return that value
    }
    void remove(size_t which)
    {
        lock_guard<Mutex> lock(mutex);
        // delete specified object
    }
private:
    data d;
    Mutex mutex;
};
Intel Thread Building Blocks (TBB) provides a bunch of thread-safe container implementations for C++. It has been open sourced, you can download it from: http://threadingbuildingblocks.org/ver.php?fid=174 .
First: sharing mutable state between threads is hard. You should be using a library that has been audited and debugged.
With that said, there are two different functional issues:
you want a container to provide safe atomic operations
you want a container to provide safe multiple operations
The idea of multiple operations is that multiple accesses to the same container must be executed successively, under the control of a single entity. They require the caller to "hold" the mutex for the duration of the transaction so that only it changes the state.
1. Atomic operations
This one appears simple:
add a mutex to the object
at the start of each method grab a mutex with a RAII lock
Unfortunately it's also plain wrong.
The issue is re-entrancy. It is likely that some methods will call other methods on the same object. If those once again attempt to grab the mutex, you get a deadlock.
It is possible to use re-entrant mutexes. They are a bit slower, but allow the same thread to lock a given mutex as much as it wants. The number of unlocks should match the number of locks, so once again, RAII.
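In standard C++ that would be std::recursive_mutex. A short sketch:
#include <mutex>

class C
{
public:
    void foo()
    {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        bar();    // fine: the same thread may re-lock a recursive mutex
    }
    void bar()
    {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        // do something
    }
private:
    std::recursive_mutex _mutex;
};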
Another approach is to use dispatching methods:
class C {
public:
    void foo() { Lock lock(_mutex); foo_impl(); }
private:
    void foo_impl() { /* do something; may call other *_impl methods, but never foo() */ }
    Mutex _mutex;
};
The public methods are simple forwarders to private work-methods, and simply lock. Then one just has to ensure that the private methods never take the mutex...
Of course there are risks of accidentally calling a locking method from a work-method, in which case you deadlock. Read on to avoid this ;)
2. Multiple operations
The only way to achieve this is to have the caller hold the mutex.
The general method is simple:
add a mutex to the container
provide a handle on this method
cross your fingers that the caller will never forget to hold the mutex while accessing the class
I personally prefer a much saner approach.
First, I create a "bundle of data", which simply represents the class data (+ a mutex), and then I provide a Proxy that is in charge of grabbing the mutex. The data is locked down so that only the proxy may access the state.
class ContainerData {
protected:
    friend class ContainerProxy;

    Mutex _mutex;

    void foo();
    void bar();

private:
    // some data
};

class ContainerProxy {
public:
    ContainerProxy(ContainerData& data): _data(data), _lock(data._mutex) {}

    void foo() { _data.foo(); }
    void bar() { foo(); _data.bar(); }

private:
    ContainerData& _data;
    Lock _lock;
};
Note that it is perfectly safe for the Proxy to call its own methods. The mutex will be released automatically by the destructor.
The mutex can still be reentrant if multiple Proxies are desired. But really, when multiple proxies are involved, it generally turns into a mess. In debug mode, it's also possible to add a "check" that the mutex is not already held by this thread (and assert if it is).
3. Reminder
Using locks is error-prone. Deadlocks are a common cause of error and occur as soon as you have two mutexes (or one and re-entrancy). When possible, prefer using higher level alternatives.
Add the mutex as an instance variable of the class. Initialize it in the constructor, lock it at the very beginning of every method, including the destructor, and unlock it at the end of the method. Adding a global mutex for all instances of the class (a static member or one in global scope) may be a performance penalty.
There is also a very nice collection of lock-free containers (including maps) by Max Khiszinsky:
LibCDS1 Concurrent Data Structures
Here is the documentation page:
http://libcds.sourceforge.net/doc/index.html
It can be kind of intimidating to get started, because it is fully generic and requires you to register a chosen garbage collection strategy and initialize it. Of course, the threading library is configurable and you need to initialize that as well :)
See the following links for some getting started info:
initialization of CDS and the threading manager
http://sourceforge.net/projects/libcds/forums/forum/1034512/topic/4600301/
the unit tests (cd build && ./build.sh ----debug-test for a debug build)
Here is a base template for 'main':
#include <cds/threading/model.h>    // threading manager
#include <cds/gc/hzp/hzp.h>         // Hazard Pointer GC

int main()
{
    // Initialize \p CDS library
    cds::Initialize();

    // Initialize Garbage collector(s) that you use
    cds::gc::hzp::GarbageCollector::Construct();

    // Attach main thread
    // Note: it is needed if the main thread can access libcds containers
    cds::threading::Manager::attachThread();

    // Do some useful work
    ...

    // Finish main thread - detaches internal control structures
    cds::threading::Manager::detachThread();

    // Terminate GCs
    cds::gc::hzp::GarbageCollector::Destruct();

    // Terminate \p CDS library
    cds::Terminate();
}
Don't forget to attach any additional threads you are using:
#include <cds/threading/model.h>

int myThreadFunc(void *)
{
    // initialize libcds thread control structures
    cds::threading::Manager::attachThread();

    // Now, you can work with GCs and libcds containers
    ....

    // Finish working thread
    cds::threading::Manager::detachThread();
    return 0;
}
1 (not to be confused with Google's compact data structures library)