C++: returning a value from a function that holds a lock_guard

I have a resource "resource" that is wrapped in a shared_ptr, and I want to access it from other threads.
When I do this:
// Foo.h
class Foo
{
public:
    std::shared_ptr<Setup> GetSomeThing();
    void SetSomeThing();
private:
    std::shared_ptr<Setup> resource;
    std::mutex lock;
};

// Foo.cpp
std::shared_ptr<Setup> Foo::GetSomeThing()
{
    std::lock_guard<std::mutex> lock (mutex);
    return resource;
}

void Foo::SetSomeThing()
{
    std::lock_guard<std::mutex> lock (mutex);
    resource = ...;
}
Is this all OK?
When will the returned object be created, and when will the lock be destroyed? Is there anything about this in the documentation?
Thank you!

This answer assumes (for clarity) that the two lines:
std::lock_guard<std::mutex> lock (mutex);
are both replaced with
std::lock_guard<std::mutex> guard (lock);
If multiple threads access a single instance of std::shared_ptr<> without synchronization and any of those call non-const members then a data race occurs.
That means you must ensure synchronization between SetSomeThing() and GetSomeThing().
Introducing a std::mutex and using std::lock_guard<> in the way proposed will do that. The returned copy will have been constructed before the destructor of guard is called.
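For reference, here is a minimal sketch of the corrected class under that assumption (only the guard/mutex names change relative to the question; Setup is whatever type the question uses):

#include <memory>
#include <mutex>

// Foo.h
class Foo
{
public:
    std::shared_ptr<Setup> GetSomeThing();
    void SetSomeThing();
private:
    std::shared_ptr<Setup> resource;
    std::mutex lock;
};

// Foo.cpp
std::shared_ptr<Setup> Foo::GetSomeThing()
{
    std::lock_guard<std::mutex> guard(lock);
    return resource;   // the returned copy is constructed here, while the
                       // mutex is held; guard (and with it the mutex) is
                       // released only after that copy has been made
}

void Foo::SetSomeThing()
{
    std::lock_guard<std::mutex> guard(lock);
    resource = std::make_shared<Setup>();  // or whatever new value applies
}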
Do note that this is about the same instance. The copy returned by GetSomeThing() has sufficient internal synchronization that it can be accessed (non-const, and even destructed) without synchronizing with other instances.
However none of that prevents data races on any shared Setup object (part-)owned by the std::shared_ptr<Setup>. If all access to Setup is read-only then multiple threads can access it but if any threads write to the shared data a data race occurs (without further synchronization not shown in the question).
An object like Setup in a simple application may be constructed and initialized before multiple threads are launched, and destructed after all but the main thread have terminated. In that specific case no further synchronization would be required, and even the provided lock is superfluous.
Here's a non-normative reference:
http://en.cppreference.com/w/cpp/memory/shared_ptr
Refer to the last paragraph of the opening description.
Footnote: Changing application "setup" can be far harder than simply ensuring that data races don't occur on the attributes. Threads may need to defer adopting changes or 'abandon' in-flight activity. For example, consider a graphical program in which the screen resolution is changed during a 'draw' step. Should it finish the draw and produce a canvas that is too large/small, or dump the part-drawn canvas and adopt the new resolution? Has the setup even been acquired in such a way that the draw will produce something consistent with 'before' or 'after', and not some nonsensical (possibly crashing) hybrid?

Related

How should std::atomic_...<std::shared_ptr> be used in the copy and move operations of a thread-safe class?

I have the following class that is supposed to be thread-safe and that has a std::shared_ptr member referring to some shared resource.
class ResourceHandle
{
    using Resource = /* unspecified */;
    std::shared_ptr<Resource> m_resource;
};
Multiple threads may acquire a copy of a resource handle from some central location and the central resource handle may be updated at any time. So reads and writes to the same ResourceHandle may take place concurrently.
// Centrally:
ResourceHandle rh;
// Thread 1: reads the central handle into a local copy for further processing
auto localRh = rh;
// Thread 2: creates a new resource and updates the central handle
rh = ResourceHandle{/* ... */};
Because these threads perform non-const operations on the same std::shared_ptr, according to CppReference, I should use the std::atomic_...<std::shared_ptr> specializations to manipulate the shared pointer.
If multiple threads of execution access the same shared_ptr without synchronization and any of those accesses uses a non-const member function of shared_ptr then a data race will occur; the shared_ptr overloads of atomic functions can be used to prevent the data race.
So I want to implement the copy and move operations of the ResourceHandle class using these atomic operations such that manipulating a single resource handle from multiple threads avoids all data races.
The notes section of the CppReference page on std::atomic_...<std::shared_ptr> specializations states the following:
To avoid data races, once a shared pointer is passed to any of these functions, it cannot be accessed non-atomically. In particular, you cannot dereference such a shared_ptr without first atomically loading it into another shared_ptr object, and then dereferencing through the second object.
So I probably want to use some combination of std::atomic_load and std::atomic_store, but I am unsure where and when they should be applied.
How should the copy and move operations of my ResourceHandle class be implemented to not introduce any data races?
std::shared_ptr synchronises its access to the reference count, so you don't have to worry about operations on one std::shared_ptr affecting another. However, that does not protect the pointee: if accesses to the shared Resource include at least one modification, you have a data race there. Code that shares ownership of a previous Resource will be unaffected by m_resource being reset to point to a new Resource.
You do have to synchronise access to a single std::shared_ptr if it is accessible from multiple threads. The warning you quoted (and the reason these overloads are deprecated in C++20) amounts to this: if a value is accessed atomically anywhere, every access to that value must be atomic.
You could achieve that by hiding the global std::shared_ptr behind local copies. Having ResourceHandle as a separate class makes that more difficult.
using ResourceHandle = std::shared_ptr<Resource>;
static ResourceHandle global;

ResourceHandle getResource()
{
    return std::atomic_load(&global);
}

void setResource(ResourceHandle handle)
{
    std::atomic_store(&global, handle);
}
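If you do want to keep ResourceHandle as its own class, one possible sketch of its copy and move operations using the atomic free functions might look like the following. This is my own illustration, not code from the answer; note that moving from a handle that other threads can still see is of dubious value, since the move leaves the source empty:

#include <memory>

class ResourceHandle
{
public:
    using Resource = /* unspecified */;

    ResourceHandle() = default;

    // Copy: atomically read the source. Our own member is still being
    // constructed, so no other thread can observe it yet.
    ResourceHandle(const ResourceHandle& other)
        : m_resource(std::atomic_load(&other.m_resource)) {}

    ResourceHandle& operator=(const ResourceHandle& other)
    {
        std::atomic_store(&m_resource, std::atomic_load(&other.m_resource));
        return *this;
    }

    // "Move": atomically swap the source with an empty pointer.
    ResourceHandle(ResourceHandle&& other)
        : m_resource(std::atomic_exchange(&other.m_resource,
                                          std::shared_ptr<Resource>{})) {}

    ResourceHandle& operator=(ResourceHandle&& other)
    {
        std::atomic_store(&m_resource,
                          std::atomic_exchange(&other.m_resource,
                                               std::shared_ptr<Resource>{}));
        return *this;
    }

private:
    std::shared_ptr<Resource> m_resource;
};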

Why do we keep the mutex as a member instead of declaring it before the guard every time?

Please consider this classical approach; I have simplified it to highlight the exact question:
#include <iostream>
#include <mutex>

using namespace std;

class Test
{
public:
    void modify()
    {
        std::lock_guard<std::mutex> guard(m_);
        // modify data
    }

private:
    // some private data
    std::mutex m_;
};
This is the classical approach of using std::mutex to avoid data races.
The question is: why are we keeping an extra std::mutex in our class? Why can't we just declare it every time, right before the std::lock_guard, like this?
void modify()
{
    std::mutex m_;
    std::lock_guard<std::mutex> guard(m_);
    // modify data
}
Let's say two threads are calling modify() in parallel. Each thread then gets its own, new mutex, so the guard has no effect: each guard is locking a different mutex. The resource you are trying to protect from race conditions will be exposed.
The misunderstanding comes from what the mutex is and what the lock_guard is good for.
A mutex is an object that is shared among different threads, and each thread can lock and release the mutex. That's how synchronization among different threads works. So you could work with m_.lock() and m_.unlock() as well, yet you would have to be very careful that all code paths (including exceptional exits) in your function actually unlock the mutex.
To avoid the pitfall of missing unlocks, a lock_guard is a wrapper object which locks the mutex at wrapper object creation and unlocks it at wrapper object destruction. Since the wrapper object is an object with automatic storage duration, you will never miss an unlock - that's why.
A local mutex does not make sense, as it would be local and not a shared resource. A local lock_guard makes perfect sense, as its automatic storage duration prevents missed locks/unlocks.
Hope it helps.
This all depends on the context of what you want to prevent from being executed in parallel.
A mutex will work when multiple threads try to access the same mutex object. So when 2 threads try to access and acquire the lock of a mutex object, only one of them will succeed.
Now in your second example, if two threads call modify(), each thread will have its own instance of that mutex, so nothing will stop them from running that function in parallel as if there's no mutex.
So to answer your question: It depends on the context. The mission of the design is to ensure that all threads that should not be executed in parallel will hit the same mutex object at the critical part.
Synchronization of threads involves checking whether another thread is executing the critical section. A mutex is the object that holds the state for us to check whether it was "locked" by a thread. A lock_guard, on the other hand, is a wrapper that locks the mutex on initialization and unlocks it during destruction.
Having realized that, it should be clearer why there has to be only one instance of the mutex that all the lock_guards use: they all need to check, against the same object, whether it is clear to enter the critical section. In the second snippet of your question, each function call creates a separate mutex that is seen and accessible only in its local context.
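To make this concrete, here is a small self-contained demo (my own illustration; the Counter name and iteration counts are arbitrary). Because every thread locks the same member mutex, the final count is always exact; with a local mutex per call it would be unpredictable:

#include <mutex>
#include <thread>
#include <vector>

class Counter
{
public:
    void increment()
    {
        std::lock_guard<std::mutex> guard(m_);  // every caller locks the same mutex
        ++value_;
    }
    int value() const { return value_; }
private:
    int value_ = 0;
    std::mutex m_;  // one mutex per Counter instance, shared by all callers
};

int main()
{
    Counter c;
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back([&c] {
            for (int j = 0; j < 10000; ++j)
                c.increment();
        });
    for (auto& t : threads)
        t.join();
    return c.value() == 40000 ? 0 : 1;  // always 0 with the member mutex
}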
You need a mutex at class level. Otherwise, each thread has a mutex for itself, and therefore the mutex has no effect.
If for some reason you don't want your mutex to be stored in a class attribute, you could use a static mutex as shown below.
void modify()
{
    static std::mutex myMutex;
    std::lock_guard<std::mutex> guard(myMutex);
    // modify data
}
Note that here there is only 1 mutex for all the class instances. If the mutex is stored in an attribute, you would have one mutex per class instance. Depending on your needs, you might prefer one solution or the other.

Should a critical section or mutex really be a member variable, and when should it be?

I have seen code where a mutex or critical section is declared as a member variable of the class to make it thread-safe, something like the following.
class ThreadSafeClass
{
public:
    ThreadSafeClass() { x = new int; }
    ~ThreadSafeClass() { delete x; }  // release the last allocation

    void reallocate()
    {
        std::lock_guard<std::mutex> lock(m);
        delete x;
        x = new int;
    }

    int* x;
    std::mutex m;
};
But doesn't that make it thread-safe only if the same object is shared by multiple threads? In other words, if each thread creates its own instance of this class, the instances are entirely independent, their member variables never conflict with each other, and synchronization is not even needed in that case!?
It appears to me that defining the mutex as a member variable really reduces synchronization to the cases where the same object is shared by multiple threads. It doesn't make the class any more thread-safe if each thread has its own copy (for example, if the class were to access other global objects). Is this a correct assessment?
If you can guarantee that any given object will only be accessed by one thread then a mutex is an unnecessary expense. It however must be well documented on the class's contract to prevent misuse.
PS: new and delete have their own synchronization mechanisms, so even without a lock they will create contention.
EDIT: The more you keep threads independent from each other the better (because it eliminates the need for locks). However, if your class will work heavily with a shared resource (e.g. database, file, socket, memory, etc ...) then having a per-thread instance is of little advantage so you might as well share an object between threads. Real independence is achieved by having different threads work with separate memory locations or resources.
If you will have potentially long waits on your locks, then it might be a good idea to have a single instance running in its own thread and take "jobs" from a synchronized queue.
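As a sketch of that last suggestion, a minimal synchronized job queue could look like this (my own illustration; the JobQueue name and the std::function job type are assumptions, not from the answer):

#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

class JobQueue
{
public:
    void push(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> guard(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();  // wake the worker thread
    }

    std::function<void()> pop()  // blocks until a job is available
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !jobs_.empty(); });
        std::function<void()> job = std::move(jobs_.front());
        jobs_.pop();
        return job;
    }

private:
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
};

The single owning thread loops on pop() and executes each job; every other thread only ever calls push(), so the protected resource itself needs no further locking.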

Put all database operations in a specific thread using Qt

I have a console application where, after a timeout signal, a 2D matrix (15*1200) should be parsed element by element and inserted into a database. Since the operation is time-consuming, I perform the insertion in a new thread using QtConcurrent::run.
However, due to the timeout signals, several threads may start before one has finished, so multiple accesses to the database may occur.
As a solution, I was trying to funnel all database operations into one specific thread, in other words, to assign a specific thread to the database class, but I do not know how to do so.
Your problem is a classic concurrent data-access problem. Have you tried using std::mutex? Here's how you use it:
You create some variable of type std::mutex (mutex = mutual exclusion) that's accessible by all the relevant threads.
std::mutex myLock;
and then, let's say that the function that processes the data looks like this:
void processData(const Data& myData)
{
    ProcessedData d = parseData(myData);
    insertToDatabase(d);
}
Now from what I understand, you're afraid that multiple threads will call insertToDatabase(d) simultaneously. Now to solve this issue, simply do the following:
void processData(const Data& myData)
{
    ProcessedData d = parseData(myData);
    myLock.lock();
    insertToDatabase(d);
    myLock.unlock();
}
Now with this, if another thread tries to enter that section, it will block until all other threads are finished with it. So threads are mutually excluded from making the call together.
Some caveats:
This mutex object must be the same one that all the threads see; otherwise this is useless. So either make it global (a bad idea, but it will work) or put it in the class that will make the calls.
Mutex objects are non-copyable. So if you include one in a class, you should either make the mutex object a pointer, reimplement the copy constructor of that class to avoid copying the mutex, or make your class non-copyable using delete:
class MyClass
{
    // ... stuff
    MyClass(const MyClass& src) = delete;
    // ... other stuff
};
There are fancier ways to use std::mutex, including std::lock_guard and std::unique_lock, which take ownership of the mutex and do the locking for you. These are good to use if you know that the call insertToDatabase(d); could throw an exception: in that case, with only the code I wrote above, the mutex would never be unlocked and the program would reach a deadlock.
In the example I provided, here's how you use lock_guard:
void processData(const Data& myData)
{
    ProcessedData d = parseData(myData);
    std::lock_guard<std::mutex> guard(myLock);
    insertToDatabase(d);
    // it unlocks automatically at the end of this function,
    // when the object "guard" is destroyed
}
Be aware that calling lock() twice on the same mutex from the same thread has undefined behavior.
Everything I did above is C++11.
If you're going to deal with multiple threads, I recommend that you start reading about data management with multiple threads. This is a good book.
If you insist on using Qt stuff, here's the same thing from Qt... QMutex.
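For the specific request in the question (funnelling all database work into one thread), Qt's worker-object pattern is one possible approach. The sketch below is my own illustration (DatabaseWorker and insertMatrix are invented names, not from the question); the point is that queued signal/slot connections serialize all calls through the worker thread's event loop:

#include <QObject>
#include <QThread>
#include <QVector>

class DatabaseWorker : public QObject
{
    Q_OBJECT
public slots:
    void insertMatrix(const QVector<double>& data)
    {
        // Runs in the worker thread. Queued calls are processed one at a
        // time by that thread's event loop, so no mutex is needed here.
        // ... parse `data` element by element and insert it into the DB ...
    }
};

// Setup, e.g. in main():
//     QThread dbThread;
//     DatabaseWorker worker;
//     worker.moveToThread(&dbThread);
//     dbThread.start();
//
// Any signal connected to &DatabaseWorker::insertMatrix and emitted from
// another thread is delivered as a queued call, so every insert runs on
// dbThread, one after another.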

Is boost::shared_ptr<XXX> thread-safe?

I have a question about boost::shared_ptr<T>.
There are lots of threads.
using namespace boost;

class CResource
{
    // xxxxxx
};

class CResourceBase
{
public:
    void SetResource(shared_ptr<CResource> res)
    {
        m_Res = res;
    }

    shared_ptr<CResource> GetResource()
    {
        return m_Res;
    }

private:
    shared_ptr<CResource> m_Res;
};
CResourceBase base;
//----------------------------------------------
// Thread_A:
while (true)
{
    // ...
    shared_ptr<CResource> nowResource = base.GetResource();
    nowResource->doSomeThing();
    // ...
}

// Thread_B:
shared_ptr<CResource> nowResource;
base.SetResource(nowResource);
// ...
Q1:
If Thread_A does not care whether nowResource is the newest, will this part of the code have a problem?
I mean, if Thread_B has not completed SetResource(), can Thread_A get a corrupted smart pointer from GetResource()?
Q2:
What does thread-safe mean here?
If I do not care whether the resource is the newest, can the shared_ptr<CResource> nowResource crash the program when the resource is released, or can the race corrupt the shared_ptr<CResource> itself?
boost::shared_ptr<> offers a certain level of thread safety. The reference count is manipulated in a thread safe manner (unless you configure boost to disable threading support).
So you can copy a shared_ptr around and the ref_count is maintained correctly. What you cannot do safely in multiple threads is modify the actual shared_ptr object instance itself from multiple threads (such as calling reset() on it from multiple threads). So your usage is not safe - you're modifying the actual shared_ptr instance in multiple threads - you'll need to have your own protection.
In my code, shared_ptr's are generally locals or parameters passed by value, so there's no issue. Getting them from one thread to another I generally use a thread-safe queue.
Of course none of this addresses the thread safety of accessing the object pointed to by the shared_ptr - that's also up to you.
From the boost documentation:
shared_ptr objects offer the same level of thread safety as built-in types. A shared_ptr instance can be "read" (accessed using only const operations) simultaneously by multiple threads. Different shared_ptr instances can be "written to" (accessed using mutable operations such as operator= or reset) simultaneously by multiple threads (even when these instances are copies, and share the same reference count underneath.)
Any other simultaneous accesses result in undefined behavior.
So your usage is not safe, since it involves simultaneous reads and writes of m_Res. Example 3 in the boost documentation also illustrates this.
You should use a separate mutex that guards the access to m_res in SetResource/GetResource.
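A minimal sketch of that suggestion might look like this (my own illustration; the mutex member and guard names are assumptions, not from the answer):

#include <boost/shared_ptr.hpp>
#include <boost/thread/mutex.hpp>

class CResource;  // as in the question

class CResourceBase
{
public:
    void SetResource(boost::shared_ptr<CResource> res)
    {
        boost::mutex::scoped_lock guard(m_mutex);
        m_Res = res;
    }

    boost::shared_ptr<CResource> GetResource()
    {
        boost::mutex::scoped_lock guard(m_mutex);
        return m_Res;  // the copy is made while the lock is held
    }

private:
    boost::shared_ptr<CResource> m_Res;
    boost::mutex m_mutex;
};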
Well, the tr1::shared_ptr (which is based on boost) documentation tells a different story, implying that resource management is thread-safe, whereas access to the resource itself is not:
"...
Thread Safety
C++0x-only features are: rvalue-ref/move support, allocator support, aliasing constructor, make_shared & allocate_shared. Additionally, the constructors taking auto_ptr parameters are deprecated in C++0x mode.
The Thread Safety section of the Boost shared_ptr documentation says "shared_ptr objects offer the same level of thread safety as built-in types." The implementation must ensure that concurrent updates to separate shared_ptr instances are correct even when those instances share a reference count e.g.
shared_ptr<A> a(new A);
shared_ptr<A> b(a);

// Thread 1     // Thread 2
a.reset();      b.reset();
The dynamically-allocated object must be destroyed by exactly one of the threads. Weak references make things even more interesting. The shared state used to implement shared_ptr must be transparent to the user and invariants must be preserved at all times. The key pieces of shared state are the strong and weak reference counts. Updates to these need to be atomic and visible to all threads to ensure correct cleanup of the managed resource (which is, after all, shared_ptr's job!) On multi-processor systems memory synchronisation may be needed so that reference-count updates and the destruction of the managed resource are race-free.
..."
see
http://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html#std.util.memory.shared_ptr
m_Res is not thread-safe, because it is simultaneously read and written; you need the boost atomic_store/atomic_load functions to protect it.
//--- Example 3 ---
// thread A
p = p3; // reads p3, writes p
// thread B
p3.reset(); // writes p3; undefined, simultaneous read/write
Also, your class has a potential cyclic-reference problem; in that case the shared_ptr<CResource> m_Res can't simply be a member of CResourceBase. You can use weak_ptr instead.