Mutex protection for Singleton resources in multithreaded env - c++

I have a server listening on a port for requests. When a request comes in, it is dispatched to a singleton class Singleton. This Singleton class has a data structure RootData.
class Singleton {
public:
void process();
void refresh();
private:
RootData mRootData;
};
In the class there are two functions. process works with mRootData to do some processing, and refresh is called periodically from another thread to refresh mRootData with the latest changes from the database.
It is required that access to mRootData be guarded by a mutex.
I have the following questions:
1] If the class is a singleton and mRootData is inside that class, is the mutex guard really necessary?
I know it is necessary for the conflict between refresh and process. But from the server, I think there will be only one call to process happening at any given time (because the class is a singleton). Please correct me if my understanding is wrong.
2] Should I protect i) the data structure OR ii) the function accessing the data structure? E.g.
i) const RootData& GetRootData()
{
ACE_Read_Guard guard(m_oMutexReadWriteLock);
return mRootData;
// Mutex is released when this function returns
}
// Similarly Write lock for SetRootData()
ii) void process()
{
ACE_Read_Guard guard(m_oMutexReadWriteLock);
// use mRootData and do processing
// GetRootData() and SetRootData() won't be mutex protected.
// Mutex is released when this function returns
}
3] If the answer to the above is i), should I return by reference or by value?
Please explain in either case.
Thanks in advance.

1] If the class is a singleton and mRootData is inside that class, is the mutex guard really necessary?
Yes it is, since one thread may call process() while another is calling refresh().
2] Should I protect i) the data structure OR ii) the function accessing the data structure?
A mutex is meant to protect a common code path, i.e. (part of) the function accessing the shared data. It is easiest to use when locking and releasing happen within the same code block. Putting them into different methods is almost an open invitation for deadlocks, since it is then up to the caller to ensure that every lock is properly released.
Update: if GetRootData and SetRootData are public functions too, there is not much point to guard them with mutexes in their current form. The problem is, you are publishing a reference to the shared data, after which it is completely out of your control what and when the callers may do with it. 100 callers from 100 different threads may store a reference to mRootData and decide to modify it at the same time! Similarly, after calling SetRootData the caller may retain the reference to the root data, and access it anytime, without you even noticing (except eventually from a data corruption or deadlock...).
You have the following options (apart from praying that the clients be nice and don't do nasty things to your poor singleton ;-)
create a deep copy of mRootData both in the getter and the setter. This keeps the data confined to the singleton, where it can be guarded with locks. OTOH callers of GetRootData get a snapshot of the data, and subsequent changes are not visible to them - this may or may not be acceptable to you.
rewrite RootData to make it thread safe, then the singleton needs to care no more about thread safety (in its current setup - if it has other data members, the picture may be different).
Update2:
or remove the getter and setter altogether (possibly together with moving data processing methods from other classes into the singleton, where these can be properly guarded by mutexes). This would be the simplest and safest, unless you absolutely need other parties to access mRootData directly.
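A minimal sketch of that last option, assuming std::mutex in place of the ACE lock. RootData here is a hypothetical one-field stand-in, and loadFromDatabase and demoSingleton are made-up names for illustration only. All access to mRootData stays inside process() and refresh(), so no reference ever leaves the singleton:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the real RootData.
struct RootData {
    int version = 0;
};

class Singleton {
public:
    static Singleton& instance() {
        static Singleton s;  // initialization is thread-safe since C++11
        return s;
    }

    // Uses mRootData entirely under the lock; no reference escapes.
    int process() {
        std::lock_guard<std::mutex> guard(mMutex);
        return mRootData.version;
    }

    // Loads the new data outside the lock, then swaps it in under the lock,
    // so process() is blocked only for the duration of the swap.
    void refresh() {
        RootData fresh = loadFromDatabase();
        std::lock_guard<std::mutex> guard(mMutex);
        std::swap(mRootData, fresh);
    }

private:
    Singleton() = default;

    // Placeholder for the database query: hands out increasing versions.
    RootData loadFromDatabase() {
        RootData d;
        d.version = mNextVersion.fetch_add(1);
        return d;
    }

    std::mutex mMutex;
    RootData mRootData;
    std::atomic<int> mNextVersion{1};
};

// Runs refresh() and process() concurrently and returns the final version.
int demoSingleton() {
    Singleton& s = Singleton::instance();
    std::thread refresher([&s] {
        for (int i = 0; i < 100; ++i) s.refresh();
    });
    std::thread worker([&s] {
        for (int i = 0; i < 100; ++i) {
            int seen = s.process();
            assert(seen >= 0);  // never observes a half-written value
            (void)seen;
        }
    });
    refresher.join();
    worker.join();
    return s.process();
}
```

Doing the slow load before taking the lock keeps the critical section short; whether that is safe for the real RootData depends on how refresh() is actually implemented.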

1.) Yes, the mutex is necessary. Although there is only one instance of the class in existence at any one time, multiple threads could still call process() on that instance at the same time (unless you design your app so that never happens).
2.) Anytime you use the value you should protect it with the mutex.
However, you don't mention a GetRootData and SetRootData in your class declaration above. Are these private (used only inside the class to access the data) or public (to allow other code to access the data directly)?
If you need to provide outside access to the data by making the GetRootData() function public, then you would need to return a copy, or your callers could then store a reference and manipulate the data after the lock has been released. Of course, then changes they made to the data wouldn't be reflected inside the singleton, which might not be what you want.
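To make the copy-returning getter concrete, here is a rough sketch using std::mutex. The RootData layout and the demoSnapshot helper are assumptions for illustration, and the singleton plumbing is left out to keep it short:

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the real RootData.
struct RootData {
    std::vector<std::string> entries;
};

class Singleton {
public:
    // Returns a copy that is made while the lock is held.
    RootData getRootData() const {
        std::lock_guard<std::mutex> guard(mMutex);
        return mRootData;
    }

    // Copies the caller's data in; the caller keeps no handle to mRootData.
    void setRootData(const RootData& data) {
        std::lock_guard<std::mutex> guard(mMutex);
        mRootData = data;
    }

private:
    mutable std::mutex mMutex;
    RootData mRootData;
};

// Shows that a snapshot is detached from later updates.
// Returns the number of entries still present in the snapshot.
int demoSnapshot() {
    Singleton s;
    s.setRootData(RootData{{"a", "b"}});
    RootData snapshot = s.getRootData();
    s.setRootData(RootData{});  // later change inside the singleton
    assert(s.getRootData().entries.empty());
    return static_cast<int>(snapshot.entries.size());
}
```

The snapshot keeps its two entries even after the singleton's data is replaced, which is exactly the "changes are not reflected" trade-off described above.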

Related

How to expose a thread-safe interface that allocate resources?

I'm trying to expose a C interface for my C++ library. This notably involves functions that allow the user to create, launch, query the status of, and then release a background task.
The task is implemented within a C++ class, whose members are protected from concurrent read/write via a std::mutex.
My issue comes when I expose a C interface for this background task. Basically I have say the following functions (assuming task_t is an opaque pointer to an actual struct containing the real task class):
task_t* mylib_task_create();
bool mylib_task_is_running(task_t* task);
void mylib_task_release(task_t* task);
My goal is to make any concurrent usage of these functions thread-safe, i.e. if one client thread calls mylib_task_is_running() at the same time that another thread calls mylib_task_release(), everything should still be fine. However, I'm not sure exactly how to achieve that.
At first I thought about adding an std::mutex to the implementation of task_t, but that means the delete statement at the end of mylib_task_release() will have to happen while the mutex is not held, which means it doesn't completely solve the problem.
I also thought about using some sort of reference counting but I still end up against the same kind of issue where the actual delete might happen right after a hypothetical retain() function is called.
I feel like there should be a (relatively) simple solution to this but I can't quite put my finger on it. How can I make it so I don't have to force the client code to protect accesses to task_t?
If task_t is being deleted, you should ensure that nobody else has a pointer to it.
If one thread is deleting task_t and the other is trying to acquire its mutex, it should be apparent that you should not have deleted the task_t.
shared_ptrs are a great help for this.
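A small sketch of the ownership idea only, not of the C API itself. Task and demoSharedOwnership are hypothetical names, and the init-capture needs C++14. Each thread holds its own shared_ptr, so "releasing" just drops one reference and the object is destroyed when the last user is done:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical task class; the real one would run background work.
class Task {
public:
    bool isRunning() const {
        std::lock_guard<std::mutex> guard(mMutex);
        return mRunning;
    }

private:
    mutable std::mutex mMutex;
    bool mRunning = true;
};

// Returns true if the Task was gone once every holder had let go of it.
bool demoSharedOwnership() {
    std::shared_ptr<Task> owner = std::make_shared<Task>();
    std::weak_ptr<Task> observer = owner;  // watches without owning

    // The thread gets its own reference, so the Task cannot die under it.
    std::thread user([held = owner]() mutable {
        std::shared_ptr<Task> local = std::move(held);
        assert(local->isRunning());
    });  // local is destroyed when the lambda body ends

    owner.reset();  // "release" from the creating thread; no explicit delete
    user.join();
    return observer.expired();
}
```

How this maps onto task_t is a design decision: one possibility is for the opaque struct to hold a shared_ptr that each thread copies before use. Two threads racing on the very same raw task_t* handle would still need an agreement about who releases it.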

Do I need mutex in constructor for field?

Let's assume I have a simple class A with one field in C++. This field is initialized in the constructor. Class A also has a method called doit() for modifying the value of this field. doit() will be called from multiple threads. If I have a mutex only in the doit() method, is this sufficient? Do I have a guarantee that I will never read an uninitialized field (because there is no lock in the constructor)?
Edit: I probably was not clear enough. Is there no issue involving processor cache or something similar? I mean, if there is no mutex for initializing memory region (i.e. my field) - is there no risk that the other thread will read some garbage value?
Your object can only be initialised once, and you won't be able to use it before it's initialised, so you don't need a mutex there. You will however need a mutex or other suitable lock in your doit() function, as you said this will be accessed across multiple threads.
Update for edited question: No, you don't need to worry about processor cache. You must construct your object first, before you can have a handle to it. Only once you have this handle can you pass it to other threads to be used. What I'm trying to say is, the spawned threads must start after the construction of the original object, it is impossible for it to happen the other way around!
It is not possible to call doit() on an object that is not created yet, so you do not need mutex in the constructor.
If doit() is the only method that accesses the field, then you should be fine.
If other methods of your class also access that field, even from a single thread, then you must use a mutex also in these methods.
You first need to construct the object before those pesky threads get their hands on it. The constructor runs in exactly one thread, on memory that no other thread has a reference to yet, so nothing needs to be done on your part. Hell, you can even create two objects of the same class in two different threads.
You can be very conservative and lock a mutex at the start of every method that uses that field, and release it at the end.
Or, if you understand how the various methods interact, you can hold the mutex only for the critical sections that use that field, i.e. the parts of the code that need to be sure the field is not altered by another thread during processing. Your method can release the lock after the critical section, do something else, then perhaps enter another critical section.
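As an illustration of the construct-first, share-later order, here is a sketch in which the field is an int and the names value() and demoConstructThenShare are assumptions. The constructor takes no lock; starting a std::thread synchronizes with the start of the thread function, so the threads see the fully constructed object:

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

class A {
public:
    explicit A(int initial) : mValue(initial) {}  // no lock: not shared yet

    void doit() {
        std::lock_guard<std::mutex> guard(mMutex);
        ++mValue;
    }

    // Any other method touching the field takes the same mutex.
    int value() const {
        std::lock_guard<std::mutex> guard(mMutex);
        return mValue;
    }

private:
    mutable std::mutex mMutex;
    int mValue;
};

// Constructs A first, then starts the threads that call doit().
int demoConstructThenShare() {
    A a(10);  // fully constructed before any thread can see it
    assert(a.value() == 10);

    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i) {
        threads.emplace_back([&a] {
            for (int j = 0; j < 1000; ++j) a.doit();
        });
    }
    for (std::thread& t : threads) t.join();
    return a.value();  // 10 + 4 * 1000
}
```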

Structuring and Synchronizing a Multithreaded Game Loop

I'm running into a mild conundrum concerning thread safety for my game loop. What I have below is 3 threads (including the main) that are meant to work together. One for event managing (main thread), one for logic, and one for the rendering. All 3 of these threads exist within their own class, as you can see below. In basic testing the structure works without problems. This system uses SFML and renders with OpenGL.
int main(){
Gamestate gs;
EventManager em(&gs);
LogicManager lm(&gs);
Renderer renderer(&gs);
lm.start();
renderer.start();
em.eventLoop();
return 0;
}
However, as you may have noticed I have a "Gamestate" class that is meant to act as a container of all the resources that need to be shared between the threads (mostly with LogicManager as a writer and Renderer as a reader. EventManager is mostly just for window events). My questions are: (1 and 2 being the most important)
1) Is this a good way of going about things? Meaning is having a "global" Gamestate class a good idea to use? Is there a better way of going about it?
2) My intention was to have Gamestate have mutexes in the getters/setters, except that doesn't work for reading because I can't return the object while it's still locked, which means I'd have to put synchronization outside of the getters/setters and make the mutexes public. It also means I'd have a bloody ton of mutexes for all the different resources. What is the most elegant way of going about this problem?
3) I have all of the threads accessing "bool run" to check if to continue their loops
while(gs->run){
....
}
run gets set to false if I receive a quit message in the EventManager. Do I need to synchronize that variable at all? Would I set it to volatile?
4) Does constantly dereferencing pointers and such have an impact on performance? eg gs->objects->entitylist.at(2)->move(); Do all those '->' and '.' cause any major slowdown?
Global state
1) Is this a good way of going about things? Meaning is having a "global" Gamestate class a good idea to use? Is there a better way of going about it?
For a game, as opposed to some reusable piece of code, I'd say a global state is good enough. You might even avoid passing gamestate pointers around, and really make it a global variable instead.
Synchronization
2) My intention was to have Gamestate have mutexes in the getters/setters, except that doesn't work for reading because I can't return the object while it's still locked, which means I'd have to put synchronization outside of the getters/setters and make the mutexes public. It also means I'd have a bloody ton of mutexes for all the different resources. What is the most elegant way of going about this problem?
I'd try to think of this in terms of transactions. Wrapping every single state change into its own mutex locking code will not only impact performance, but might lead to actually incorrect behaviour if the code gets one state element, performs some computation on it and sets the value later on, while some other code modified the same element in between. So I'd try to structure LogicManager and Renderer in such ways that all the interaction with the Gamestate occurs bundled in a few places. For the duration of that interaction, the thread should hold a mutex on the state.
If you want to enforce the use of mutexes, then you can create some construct where you have at least two classes. Let's call them GameStateData and GameStateAccess. GameStateData would contain all the state, but without providing public access to it. GameStateAccess would be a friend of GameStateData and provide access to its private data. The constructor of GameStateAccess would take a reference or pointer to the GameStateData and would lock the mutex for that data. The destructor would free the mutex. That way, your code to manipulate the state would simply be written as a block where a GameStateAccess object is in scope.
There is still a loophole, though: In cases where objects returned from this GameStateAccess class are pointers or references to mutable objects, then this setup won't keep your code from carrying such a pointer out of the scope protected by the mutex. To prevent this, either take care about how you write things, or use some custom pointer-like template class which can be cleared once the GameStateAccess goes out of scope, or make sure you only pass things by value not reference.
Example
Using C++11, the above idea for lock management could be implemented as follows:
class GameStateData {
private:
std::mutex _mtx;
int _val;
friend class GameStateAccess;
};
GameStateData global_state;
class GameStateAccess {
private:
GameStateData& _data;
std::lock_guard<std::mutex> _lock;
public:
GameStateAccess(GameStateData& data)
: _data(data), _lock(data._mtx) {}
int getValue() const { return _data._val; }
void setValue(int val) { _data._val = val; }
};
void LogicManager::performStateUpdate() {
int valueIncrement = computeValueIncrement(); // No lock for this computation
{ GameStateAccess gs(global_state); // Lock will be held during this scope
int oldValue = gs.getValue();
int newValue = oldValue + valueIncrement;
gs.setValue(newValue); // still in the same transaction
} // free lock on global state
cleanup(); // No lock held here either
}
Loop termination indicator
3) I have all of the threads accessing "bool run" to check if to continue their loops
while(gs->run){
....
}
run gets set to false if I receive a quit message in the EventManager. Do I need to synchronize that variable at all? Would I set it to volatile?
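For this application, a volatile but otherwise unsynchronized variable will often work in practice, because volatile prevents the compiler from generating code which caches that value and thus hides a modification by another thread. The C++11 standard does not guarantee it, though: volatile provides neither atomicity nor ordering between threads, and an unsynchronized write that races with a read is formally undefined behaviour.
The portable choice is to use a std::atomic<bool> for this flag.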
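A minimal sketch of the std::atomic variant. The Gamestate here is reduced to the run flag, and demoRunFlag is a hypothetical helper that stands in for the logic thread and the EventManager:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Gamestate {
    std::atomic<bool> run{true};
};

// Starts a loop thread, requests shutdown, and returns how many
// iterations the loop managed to run before it noticed.
long demoRunFlag() {
    Gamestate gs;
    long iterations = 0;

    std::thread logic([&gs, &iterations] {
        while (gs.run.load()) {
            ++iterations;  // stand-in for one logic step
            std::this_thread::yield();
        }
    });

    gs.run.store(false);  // what EventManager would do on a quit message
    logic.join();         // after join, iterations is safe to read here
    assert(!gs.run.load());
    return iterations;
}
```

The iteration count depends on scheduling and may well be zero; the point is only that the loop is guaranteed to observe the store and terminate.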
Pointer indirection overhead
4) Does constantly dereferencing pointers and such have an impact on performance? eg gs->objects->entitylist.at(2)->move(); Do all those -> and . cause any major slowdown?
It depends on the alternatives. In many cases, the compiler will be able to keep the value of e.g. gs->objects->entitylist.at(2) in the above code, if it is used repeatedly, and won't have to compute it over and over again. In general I would consider the performance penalty due to all this pointer indirection to be of minor concern, but that is hard to tell for sure.
Is it a good way of going about things? (class Gamestate)
1) Is this a good way of going about things?
Yes.
Meaning is having a "global" Gamestate class a good idea to use?
Yes, if the getter/setter are thread-safe.
Is there a better way of going about it?
No. The data is necessary for both game logic and representation. You could remove the global gamestate if you put it in a sub-routine, but this would only transport your problem to another function. A global Gamestate will also enable you to save the current state very easily.
Mutex and getters/setters
2) My intention was to have Gamestate have mutexes in the getters/setters [...]. What is the most elegant way of going about this problem?
This is called the readers-writers problem. You don't need public mutexes for this. Just keep in mind that you can have many readers at once, but only one writer, and no readers while the writer is active. You could implement a queue for the readers/writers and block additional readers until the writer has finished.
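Instead of a hand-written queue, C++17's std::shared_mutex already provides the many-readers/one-writer behaviour. A sketch under that assumption, with a hypothetical one-field Gamestate and accessors that return by value so nothing has to stay locked after the call:

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

class Gamestate {
public:
    // Readers take a shared lock and may run concurrently.
    int score() const {
        std::shared_lock<std::shared_mutex> lock(mMutex);
        return mScore;
    }

    // The writer takes an exclusive lock and blocks readers while it runs.
    void addScore(int delta) {
        std::unique_lock<std::shared_mutex> lock(mMutex);
        mScore += delta;
    }

private:
    mutable std::shared_mutex mMutex;  // private: callers never see it
    int mScore = 0;
};

// One writer (think LogicManager) and three readers (think Renderer).
int demoReadersWriter() {
    Gamestate gs;
    std::thread writer([&gs] {
        for (int i = 0; i < 1000; ++i) gs.addScore(1);
    });

    std::vector<std::thread> readers;
    for (int r = 0; r < 3; ++r) {
        readers.emplace_back([&gs] {
            int last = 0;
            for (int i = 0; i < 1000; ++i) {
                int now = gs.score();
                assert(now >= last);  // the score never goes backwards
                last = now;
            }
        });
    }

    writer.join();
    for (std::thread& t : readers) t.join();
    return gs.score();
}
```

With a single reader and a single writer, as in the question, a plain std::mutex would do just as well; the shared lock only pays off once several readers run at the same time.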
while(gs->run)
Do I need to synchronize that variable at all?
Whenever a non-synchronized access to a variable could result in an unknown state, it should be synchronized. So if run is set to false immediately after the rendering engine has started its next iteration, and the Gamestate is then destroyed, it will result in a mess. However, if gs->run is only an indicator of whether the loop should continue, it is safe.
Keep in mind that both logic and rendering engine should be stopped at the same time. If you can't shutdown both at the same time stop the rendering engine first in order to prevent a freeze.
Dereferencing pointers
4) Does constantly dereferencing pointers and such have an impact on performance?
There are two rules of optimization:
Do not optimize
Do not optimize yet.
The compiler will probably take care of this problem. You, as a programmer, should use the version which is most readable for you.

interprocess object passing

I need to have a class with one activity that is performed once per 5 seconds in its own thread. It is a web service one, so it needs an endpoint to be specified. During the object runtime the main thread can change the endpoint. This is my class:
class Worker
{
public:
void setEndpoint(const std::string& endpoint);
private:
void activity (void);
mutex endpoint_mutex;
volatile std::auto_ptr<std::string> newEndpoint;
WebServiceClient client;
};
Does the newEndpoint object need to be declared volatile? I would certainly do it if the read was in some loop (to make the compiler not optimize it out), but here I don't know.
In each run the activity() function checks for a new endpoint (if a new one is there, then passes it to the client and perform some reconnection steps) and do its work.
void Worker::activity(void)
{
endpoint_mutex.lock(); //don't consider exceptions
std::auto_ptr<std::string>& ep = const_cast<std::auto_ptr<std::string>&>(newEndpoint);
if (NULL != ep.get())
{
client.setEndpoint(*ep);
ep.reset(NULL);
endpoint_mutex.unlock();
client.doReconnectionStuff();
client.doReconnectionStuff2();
}
else
{
endpoint_mutex.unlock();
}
client.doSomeStuff();
client.doAnotherStuff();
.....
}
I lock the mutex, which means that the newEndpoint object cannot change anymore, so I cast away the volatile qualifier to be able to invoke its non-volatile methods.
The setEndpoint method (called from other threads):
void Worker::setEndpoint(const std::string& endpoint)
{
endpoint_mutex.lock(); //again - don't consider exceptions
std::auto_ptr<std::string>& ep = const_cast<std::auto_ptr<std::string>&>(newEndpoint);
ep.reset(new std::string(endpoint));
endpoint_mutex.unlock();
}
Is this thing thread safe? If not, what is the problem? Do I need the newEndpoint object to be volatile?
volatile is used in the following cases per MSDN:
The volatile keyword is a type qualifier used to declare that an
object can be modified in the program by something such as the
operating system, the hardware, or a concurrently executing thread.
Objects declared as volatile are not used in certain optimizations
because their values can change at any time. The system always reads
the current value of a volatile object at the point it is requested,
even if a previous instruction asked for a value from the same object.
Also, the value of the object is written immediately on assignment.
The question in your case is how often your newEndpoint actually changes, and who can see it change. Every read and write of it in your code happens while the mutex is held, so nothing else can fiddle with the endpoint in the middle of an access. So, per my analysis, and from what I can see in your code, this variable does not change in a way that would call for volatile.
I cannot see the call site of your class, so I don't know if you are using the same class instance 100 times or more, or if you are creating new objects.
This is the kind of analysis you need to make when asking whether something should be volatile or not.
Also, on your thread-safety, what happens in these functions:
client.doReconnectionStuff();
client.doReconnectionStuff2();
Are they using any of the shared state from your Worker class? Are they sharing and modifying any other state used by another thread? If yes, you need to do the appropriate synchronization.
If not, then you're ok.
Threading requires some thinking, and you need to ask yourself these questions. You need to look at all state and ask whether or not you're sharing it. If you're dealing with pointers, then you need to ask who owns the pointer, and whether you're ever sharing it amongst threads, accidentally or not, and act accordingly. If you pass a pointer to a function that is run in a different thread, then you're sharing the object that the pointer points to. If you then alter what it points to in this new thread, you are sharing and need to synchronize.
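For comparison, a sketch of how the Worker could look with std::unique_ptr and std::lock_guard instead of a volatile std::auto_ptr and manual lock()/unlock(). WebServiceClient is a hypothetical stub, client() and demoWorker exist only for the demonstration, and it assumes activity() is only ever called from the worker thread:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the real web service client.
struct WebServiceClient {
    std::string endpoint;
    int reconnects = 0;
    void setEndpoint(const std::string& e) { endpoint = e; }
    void doReconnectionStuff() { ++reconnects; }
};

class Worker {
public:
    // Called from other threads.
    void setEndpoint(const std::string& endpoint) {
        std::unique_ptr<std::string> fresh(new std::string(endpoint));
        std::lock_guard<std::mutex> guard(mEndpointMutex);
        mNewEndpoint = std::move(fresh);
    }

    // Called from the worker thread only, so mClient needs no lock.
    void activity() {
        std::unique_ptr<std::string> pending;
        {
            std::lock_guard<std::mutex> guard(mEndpointMutex);
            pending = std::move(mNewEndpoint);  // leaves mNewEndpoint empty
        }  // lock released here, even if an exception is thrown
        if (pending) {
            mClient.setEndpoint(*pending);
            mClient.doReconnectionStuff();
        }
        // ... periodic work with mClient goes here ...
    }

    const WebServiceClient& client() const { return mClient; }

private:
    std::mutex mEndpointMutex;
    std::unique_ptr<std::string> mNewEndpoint;  // plain member, not volatile
    WebServiceClient mClient;
};

// Single-threaded walk through the hand-over; returns the reconnect count.
int demoWorker() {
    Worker w;
    w.activity();  // nothing pending: no reconnect
    w.setEndpoint("http://example.invalid/a");
    w.setEndpoint("http://example.invalid/b");  // overwrites the pending one
    w.activity();
    assert(w.client().endpoint == "http://example.invalid/b");
    return w.client().reconnects;
}
```

Because the pending endpoint is only touched while the mutex is held, the lock and unlock operations already provide the visibility that volatile was meant to add.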

Thread Safe Access to Data Shared Between Objects

I'm something of an intermediate programmer, but relatively a novice to multi-threading.
At the moment, I'm working on an application with a structure similar to the following:
class Client
{
public:
Client();
private:
// These are all initialised/populated in the constructor.
std::vector<struct clientInfo> otherClientsInfo;
ClientUI* clientUI;
ClientConnector* clientConnector;
};
class ClientUI
{
public:
ClientUI(std::vector<struct clientInfo>* clientsInfo);
private:
// Callback which gets new client information
// from a server and pushes it into the otherClientsInfo vector.
void synchClientInfo();
std::vector<struct clientInfo>* otherClientsInfo;
};
class ClientConnector
{
public:
ClientConnector(std::vector<struct clientInfo>* clientsInfo);
private:
void connectToClients();
std::vector<struct clientInfo>* otherClientsInfo;
};
Somewhat a contrived example, I know. The program flow is this:
Client is constructed and populates otherClientsInfo and constructs clientUI and clientConnector with a pointer to otherClientsInfo.
clientUI calls synchClientInfo() anytime the server contacts it with new client information, parsing the new data and pushing it back into otherClientsInfo or removing an element.
clientConnector will access each element in otherClientsInfo when connectToClients() is called but won't alter them.
My first question is whether my assumption is right: if both ClientUI and ClientConnector access otherClientsInfo at the same time, will the program bomb out because of thread-unsafety?
If this is the case, then how would I go about making access to otherClientsInfo thread safe, as in perhaps somehow locking it while one object accesses it?
My first question is whether my assumption is right: if both ClientUI and ClientConnector access otherClientsInfo at the same time, will the program bomb out because of thread-unsafety?
Yes. Most implementations of std::vector do not allow concurrent read and modification. ( You'd know if you were using one which did )
If this is the case, then how would I go about making access to otherClientsInfo thread safe, as in perhaps somehow locking it while one object accesses it?
You would require at least a lock ( either a simple mutex or critical section or a read/write lock ) to be held whenever the vector is accessed. Since you've only one reader and writer there's no point having a read/write lock.
However, actually doing that correctly will get increasingly difficult as you are exposing the vector to the other classes: you will have to expose the locking primitive too, and remember to acquire it whenever you use the vector. It may be better to expose addClientInfo, removeClientInfo and const and non-const foreachClientInfo functions which encapsulate the locking in the Client class, rather than having disjoint bits of the data owned by the client floating around the place.
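A rough sketch of that encapsulation, using std::mutex. The ClientInfo fields, the ClientRegistry class name and demoRegistry are assumptions for illustration; only the const variant of the for-each function is shown:

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical fields; the real clientInfo is not shown in the question.
struct ClientInfo {
    int id;
    std::string address;
};

class ClientRegistry {
public:
    void addClientInfo(const ClientInfo& info) {
        std::lock_guard<std::mutex> guard(mMutex);
        mInfos.push_back(info);
    }

    void removeClientInfo(int id) {
        std::lock_guard<std::mutex> guard(mMutex);
        mInfos.erase(std::remove_if(mInfos.begin(), mInfos.end(),
                                    [id](const ClientInfo& c) { return c.id == id; }),
                     mInfos.end());
    }

    // Calls f on every element while the lock is held. f must not call back
    // into the registry, or it would deadlock on the non-recursive mutex.
    template <typename F>
    void forEachClientInfo(F f) const {
        std::lock_guard<std::mutex> guard(mMutex);
        for (const ClientInfo& c : mInfos) f(c);
    }

private:
    mutable std::mutex mMutex;  // never exposed to callers
    std::vector<ClientInfo> mInfos;
};

// Returns how many entries remain after one add/add/remove sequence.
int demoRegistry() {
    ClientRegistry registry;
    registry.addClientInfo(ClientInfo{1, "10.0.0.1"});
    registry.addClientInfo(ClientInfo{2, "10.0.0.2"});
    registry.removeClientInfo(1);

    int count = 0;
    registry.forEachClientInfo([&count](const ClientInfo& c) {
        assert(c.id == 2);
        ++count;
    });
    return count;
}
```

ClientUI and ClientConnector would then receive a pointer or reference to the registry instead of the raw vector, so neither can touch the data without going through the lock.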
See
Reader/Writer Locks in C++
and
http://msdn.microsoft.com/en-us/library/ms682530%28VS.85%29.aspx
The first one is probably a bit advanced for you. You can start with the Critical section (link 2).
I am assuming you are using Windows.
if both ClientUI and ClientConnector access otherClientsInfo at the same time, will the program bomb out because of thread-unsafety?
Yes, STL containers are not thread-safe.
If this is the case, then how would I go about making access to otherClientsInfo thread safe, as in perhaps somehow locking it while one object accesses it?
In the most simple case a mutual exclusion pattern around the access to the shared data... if you'd have multiple readers however, you would go for a more efficient pattern.
Is clientConnector called from the same thread as synchClientInfo() (even if it is all callback)?
If so, you don't need to worry about thread safety at all.
If you want to avoid simultaneous access to the same data, you can use mutexes to protect the critical section. For example, mutexes from Boost.Thread.
In order to ensure that access to the otherClientsInfo member from multiple threads is safe, you need to protect it with a mutex. I wrote an article about how to directly associate an object with a mutex in C++ over on the Dr Dobb's website:
http://www.drdobbs.com/cpp/225200269