delete order/speed of std::lock_guard relative to other stack-allocated objects?


As far as I can tell, there is quite a bit of time between the lock_guard getting deleted and when a function (run in another thread) actually returns. See the comment below in TEST(...)
bool bDone = false;
void run_worker(Foo* f) {
f->Compute();
bDone = true;
}
TEST(FooTest,ThreadFoo) {
Foo* f = makeFoo();
std::thread worker( run_worker, f );
worker.detach();
micro_wait(100); // wait for N microseconds
f->Reset(); // should block until Compute() is done
// !!?? Why is this necessary !?!?
int k=0;
while(++k<500 && !bDone)
micro_wait(100);
EXPECT_TRUE(bDone); // Fails even with a single micro_wait(100)!
}
Is there a good explanation for when/why there can be such a time lapse between when f->Compute() finishes and bDone gets set? My suspicion is that the mutex gets unlocked while there is still work to be done cleaning up stack-based variables allocated in Compute(), but this is purely a hypothesis.
Stubs for Compute and Reset are below:
void Foo::Compute() {
std::lock_guard<std::mutex> guard(m_Mutex);
// ... allocate bunch of temporary stuff on stack, update *this
}
void Foo::Reset() {
std::lock_guard<std::mutex> guard(m_Mutex);
// ... simpler stuff, clear
}

There is no synchronization of bDone.
It's quite possible that the compiler loads bDone into a register while its value is false, and then continues to use the register-cached version instead of acquiring the updated value from memory. Alternatively, your instructions may be reordered such that bDone is set to true after the lock is released. Even without any reordering, bDone = true only runs after Compute() has returned and released the mutex, so Reset() can finish and the test can check bDone before the worker has set it.
The correct way to approach this is to use an std::atomic<bool>. The worker thread can update it with a call to bDone.store(true) and the waiting thread can read its most current value with a call to bDone.load().
If you want to read into memory ordering to help understand why an atomic is needed, you can further improve this (though for a unit test, it really doesn't matter) by using acquire and release ordering.
Aside from this, what you really should be doing is joining your worker thread. A join blocks until the thread has ended, so you can be assured that your Compute function has completed execution. If you're afraid that it may run forever (or for too long), I'd suggest using boost::thread instead of std::thread, as it provides a timed_join function, which stops waiting for the thread after a specified period of time.
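A minimal sketch of both suggestions together (an atomic flag plus a join), assuming a stand-in Foo whose Compute() just writes a value, since the real class isn't shown. The run_and_check helper is hypothetical and only exists to make the idea easy to try:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the Foo in the question; only the locking matters here.
struct Foo {
    std::mutex m_Mutex;
    int value = 0;
    void Compute() {
        std::lock_guard<std::mutex> guard(m_Mutex);
        value = 42;  // placeholder for the real work
    }
    void Reset() {
        std::lock_guard<std::mutex> guard(m_Mutex);
        value = 0;
    }
};

std::atomic<bool> bDone{false};

void run_worker(Foo* f) {
    f->Compute();
    // The release store pairs with the acquire load in run_and_check().
    bDone.store(true, std::memory_order_release);
}

// Runs the worker to completion and reports whether bDone was observed as set.
bool run_and_check() {
    Foo f;
    bDone.store(false);
    std::thread worker(run_worker, &f);
    worker.join();          // returns only after run_worker has finished
    assert(f.value == 42);  // Compute() has definitely run by now
    f.Reset();
    return bDone.load(std::memory_order_acquire);
}
```

With the join in place the polling loop from the test is no longer needed, and the atomic only matters if something still reads bDone while the worker is running.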

Related

C++ Kill a Thread In Destructor

I have a class that starts another thread that accesses some of its data at constant intervals. This means I have two threads that access the same data (the original thread and the newly created thread). This introduces the need for a mutex. All goes well until the destructor of the class is called (at the end of the program) and the memory locations are no longer valid. At this point the new thread attempts to access the data and gets an access violation error (obviously).
What I would like to do is stop the thread in the destructor, or have the thread stop once it "notices" that the class instance has been destroyed.
Here is the simplified thread code (typedefs used for brevity):
void myClass::StartThread() {
auto threadFunc = [&, this]() {
while (true) {
time_point now = steady_clock::now();
if (chro::duration_cast<chro::milliseconds>(now - this->m_lastSeedTime).count() > INTERVAL) {
std::lock_guard<std::mutex> lockGuard(this->m_mut);
this->m_lastSeedTime = now;
this->accessData();
}
}
};
std::thread thread(threadFunc);
thread.detach();
}
Of course, if I am just mishandling this in some obvious way, please let me know as well.

If you want a thread to die, you should ask it to exit. It's the only reliable way to do it cleanly.
Just change
while (true)
to
while(this->keepRunning)
and synchronize it appropriately. Either don't detach the thread (so the destructor can join it) or add some way for the thread to indicate that it has exited (so the destructor can wait for it).
Oh, and instead of spinning, the thread should probably sleep. In that case, if you don't want the destructor to also block, you need some way to interrupt the sleep: using a timed wait on a condition variable for your sleep makes this easy.
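As a rough sketch of that last point, assuming a hypothetical PeriodicWorker class (none of these names come from the question): the thread sleeps in a timed wait on a condition variable, and the destructor sets a flag, notifies, and joins, so it never has to wait out a full interval.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical class; names are illustrative, not from the question.
class PeriodicWorker {
public:
    explicit PeriodicWorker(std::chrono::milliseconds interval)
        : interval_(interval), thread_([this] { run(); }) {}

    ~PeriodicWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();  // wake the worker now instead of after the interval
        thread_.join();
    }

    int ticks() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return ticks_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns true when the predicate holds (stop requested) and
        // false when the interval elapsed without a stop request.
        while (!cv_.wait_for(lock, interval_, [this] { return stop_; })) {
            ++ticks_;  // placeholder for accessData()
        }
    }

    mutable std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_ = false;
    int ticks_ = 0;
    std::chrono::milliseconds interval_;
    std::thread thread_;  // declared last so the other members exist before it starts
};
```

Because the flag is only touched under the mutex, it can be a plain bool here; the predicate overload of wait_for also takes care of spurious wakeups.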

@Useless's answer is correct. Here is exactly how you can do it:
class myClass{
...
private:
std::thread m_thread;
std::atomic_bool m_keepRunning{true};
....
};
void myClass::StartThread() {
auto threadFunc = [&, this]() {
while (m_keepRunning) {
time_point now = steady_clock::now();
if (chro::duration_cast<chro::milliseconds>(now - this->m_lastSeedTime).count() > INTERVAL) {
std::lock_guard<std::mutex> lockGuard(this->m_mut);
if(!m_keepRunning) break; // destructor called, don't access data
this->m_lastSeedTime = now;
this->accessData();
}
}
};
m_thread = std::thread(threadFunc);
}
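myClass::~myClass()
{
m_keepRunning = false; // ask the thread to exit
// Do not unlock m_mut here: this thread does not hold it, and unlocking
// a mutex you do not own is undefined behavior. The thread releases the
// lock itself when lockGuard goes out of scope.
if(m_thread.joinable()) m_thread.join();
// do other cleaning
}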
Another point: if you always wait for INTERVAL after the previous run, the delay accumulates over time. Say your interval is 50 ms. When your CPU has too much work to do, or accessData takes too long, you won't be able to run the next iteration after exactly 50 ms. Say it takes 52 ms, which is a 2 ms delay. These delays add up over time and will affect your precision.
Instead, you could do:
time_point waitUntil = steady_clock::now() + initialWaitTime;
while(m_keepRunning){
if(steady_clock::now() >= waitUntil)
{
// ... do your work
waitUntil = waitUntil + chro::milliseconds(INTERVAL);
}
}
@Useless is also correct about the timed waiting part. Spinning will put a heavy load on your core. Instead, you should use a condition variable or a timed_mutex. But the advice above is still valid: instead of sleep_for, go for sleep_until.
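To illustrate the sleep_until variant, here is a small sketch with a hypothetical run_ticks helper. Each deadline is computed from the previous deadline rather than from the current time, so a late wake-up does not push the later ticks back:

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

using Clock = std::chrono::steady_clock;

// Runs `iterations` ticks spaced `interval` apart and returns the deadlines used.
std::vector<Clock::time_point> run_ticks(int iterations,
                                         std::chrono::milliseconds interval) {
    std::vector<Clock::time_point> deadlines;
    Clock::time_point next = Clock::now() + interval;
    for (int i = 0; i < iterations; ++i) {
        std::this_thread::sleep_until(next);  // sleeps instead of spinning
        deadlines.push_back(next);            // placeholder for the periodic work
        next += interval;                     // advance from the deadline, not from now()
    }
    return deadlines;
}
```

In a thread that also has to be stoppable, the same deadline can be passed to condition_variable::wait_until instead of sleep_until.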

Killing threads does not work. The problem is that if you do kill a thread, it could be in the middle of a multiple step operation that should be performed as an atomic operation, leaving your program in an invalid state. Instead, signal the other thread to commit suicide, and wait for it to die.

std::atomic_bool for cancellation flag: is std::memory_order_relaxed the correct memory order?

I have a thread that reads from a socket and generates data. After every operation, the thread checks a std::atomic_bool flag to see if it must exit early.
In order to cancel the operation, I set the cancellation flag to true, then call join() on the worker thread object.
The code of the thread and the cancellation function looks something like this:
std::thread work_thread;
std::atomic_bool cancel_requested{false};
void thread_func()
{
while(! cancel_requested.load(std::memory_order_relaxed))
process_next_element();
}
void cancel()
{
cancel_requested.store(true, std::memory_order_relaxed);
work_thread.join();
}
Is std::memory_order_relaxed the correct memory order for this use of an atomic variable?

As long as there is no dependency between cancel_requested flag and anything else, you should be safe.
The code as shown looks OK, assuming you use cancel_requested only to expedite the shutdown, but also have a provision for an orderly shutdown, such as a sentinel entry in the queue (and of course that the queue itself is synchronized).
Which means your code actually looks like this:
std::thread work_thread;
std::atomic_bool cancel_requested{false};
std::mutex work_queue_mutex;
std::condition_variable work_queue_filled_cond;
std::queue<Element> work_queue; // Element: placeholder for the item type, which the question does not show
void thread_func()
{
while(! cancel_requested.load(std::memory_order_relaxed))
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
work_queue_filled_cond.wait(lock, []{ return !work_queue.empty(); });
auto element = work_queue.front();
work_queue.pop();
lock.unlock();
if (element == exit_sentinel)
break;
process_next_element(element);
}
}
void cancel()
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
work_queue.push(exit_sentinel); // std::queue has push(), not push_back()
work_queue_filled_cond.notify_one();
lock.unlock();
cancel_requested.store(true, std::memory_order_relaxed);
work_thread.join();
}
And if we're that far, then cancel_requested may just as well become a regular variable, the code even becomes simpler.
std::thread work_thread;
bool cancel_requested = false;
std::mutex work_queue_mutex;
std::condition_variable work_queue_filled_cond;
std::queue<Element> work_queue; // Element: placeholder for the item type, which the question does not show
void thread_func()
{
while(true)
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
work_queue_filled_cond.wait(lock, []{ return cancel_requested || !work_queue.empty(); });
if (cancel_requested)
break;
auto element = work_queue.front();
work_queue.pop();
lock.unlock();
process_next_element(element);
}
}
void cancel()
{
std::unique_lock<std::mutex> lock(work_queue_mutex);
cancel_requested = true;
work_queue_filled_cond.notify_one();
lock.unlock();
work_thread.join();
}
memory_order_relaxed is generally hard to reason about, because it blurs the general notion of sequentially executing code. So its usefulness is very, very limited, as Herb Sutter explains in his "atomic<> Weapons" talk.
Note std::thread::join() by itself acts as a memory barrier between the two threads.

Whether this code is correct depends on a lot of things. Most of all it depends on what exactly you mean by "correct". As far as I can tell, the bits of code that you show don't invoke undefined behavior (assuming your work_thread and cancel_requested are not actually initialized in the order your snippet above suggests as you would then have the thread potentially reading the uninitialized value of the atomic). If all you need to do is change the value of that flag and have the thread eventually see the new value at some point independent of whatever else may be going on, then std::memory_order_relaxed is sufficient.
However, I see that your worker thread calls a process_next_element() function. That suggests that there is some mechanism through which the worker thread receives elements to process. I don't see any way for the thread to exit when all elements have been processed. What does process_next_element() do when there's no next element available right away? Does it just return immediately? In that case you've got yourself a busy wait for more input or cancellation, which will work but is probably not ideal.
Or does process_next_element() internally call some function that blocks until an element becomes available? If that is the case, then cancelling the thread would have to involve first setting the cancellation flag and then doing whatever is needed to make sure the call your thread is potentially blocked on returns. In this case, it's potentially essential that the thread can never miss the updated cancellation flag after the blocking call returns. Otherwise, the call could return, the thread could go back into the loop, still read the old cancellation flag and then call process_next_element() again. If process_next_element() is guaranteed to just return again, then you're fine. If that is not the case, you have a deadlock.
So I believe it technically depends on what exactly process_next_element() does. One could imagine an implementation of process_next_element() where you would need more than relaxed memory order. However, if you already have a mechanism for fetching new elements to process, why even use a separate cancellation flag? You could simply handle cancellation through that same mechanism, e.g., by having it return a next element with a special value, or no element at all, to signal cancellation of processing and cause the thread to return instead of relying on a separate flag.
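Signalling cancellation through the element-fetching mechanism itself could look roughly like the sketch below. ClosableQueue and consume_all are hypothetical names, the element type is assumed to be int, and std::optional needs C++17. Closing the queue makes pop() return no element, which ends the worker loop without a separate flag:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical queue: pop() blocks until an element arrives or close() is called.
template <class T>
class ClosableQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        cv_.notify_one();
    }

    // Cancels processing: pending elements are dropped and waiters are woken.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Returns std::nullopt once the queue has been closed.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (closed_) return std::nullopt;
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Worker loop: sums elements until the queue is closed.
int consume_all(ClosableQueue<int>& queue) {
    int sum = 0;
    while (auto element = queue.pop()) sum += *element;
    return sum;
}
```

Since closed_ is read and written under the same mutex as the queue, the memory-order question from the original code does not come up at all.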

Swapping mutex locks

I'm having trouble with properly "swapping" locks. Consider this situation:
bool HidDevice::wait(const std::function<bool(const Info&)>& predicate)
{
/* A method scoped lock. */
std::unique_lock waitLock(this->waitMutex, std::defer_lock);
/* A scoped, general access, lock. */
{
std::lock_guard lock(this->mutex);
bool exitEarly = false;
/* do some checks... */
if (exitEarly)
return false;
/* Only one thread at a time can execute this method, however
other threads can execute other methods or abort this one. Thus,
general access mutex "this->mutex" should be unlocked (to allow threads
to call other methods) while at the same time, "this->waitMutex" should
be locked to prevent multiple executions of code below. */
waitLock.lock(); // How do I release "this->mutex" here?
}
/* do some stuff... */
/* The main problem is with this event-based OS function. It can
only be called once with the data I provide, therefore I need to
have 2 locks - one blocks multiple method calls (the usual stuff)
and "waitLock" makes sure that only one instance of "osBlockingFunction"
is running at a time. Since this is a thread-blocking function,
"this->mutex" must be unlocked at this point. */
bool result = osBlockingFunction(...);
/* In methods, such as "close", "this->waitMutex" and others are then used
to make sure that thread blocking methods have returned and I can safely
modify related data. */
/* do some more stuff... */
return result;
}
How could I solve this "swapping" problem without overly complicating code? I could unlock this->mutex before locking another, however I'm afraid that in that nanosecond, a race condition might occur.
Edit:
Imagine that 3 threads are calling wait method. The first one will lock this->mutex, then this->waitMutex and then will unlock this->mutex. The second one will lock this->mutex and will have to wait for this->waitMutex to be available. It will not unlock this->mutex. The third one will get stuck on locking this->mutex.
I would like to get the last 2 threads to wait for this->waitMutex to be available.
Edit 2:
Expanded example with osBlockingFunction.

It smells like the design/implementation should be a bit different, with a std::condition_variable cv in HidDevice::wait and only one mutex. Since, as you write, "other threads can execute other methods or abort this one", those methods would call cv.notify_one to "abort" this wait. cv.wait atomically enters the wait and unlocks the mutex; on cv.notify it exits the wait and locks the mutex again. That way HidDevice::wait becomes simpler:
bool HidDevice::wait(const std::function<bool(const Info&)>& predicate)
{
std::unique_lock<std::mutex> lock(this->m_Mutex); // Only one mutex.
m_bEarlyExit = false;
this->cv.wait(lock, spurious wake-up check);
if (m_bEarlyExit) // A bool data-member for abort.
return false;
/* do some stuff... */
}
My assumption (from the name of the function) is that at /* do some checks... */ the thread waits until some condition becomes true.
Aborting the wait is then the responsibility of another HidDevice function, called by another thread:
void HidDevice::do_some_checks() /* do some checks... */
{
if ( some checks )
{
if ( other checks )
m_bEarlyExit = true;
this->cv.notify_one();
}
}
Something similar to that.

I recommend creating a little "unlocker" facility. This is a mutex wrapper with inverted semantics. On lock it unlocks and vice-versa:
template <class Lock>
class unlocker
{
Lock& locked_;
public:
unlocker(Lock& lk) : locked_{lk} {}
void lock() {locked_.unlock();}
bool try_lock() {locked_.unlock(); return true;}
void unlock() {locked_.lock();}
};
Now in place of:
waitLock.lock(); // How do I release "this->mutex" here?
You can instead say:
unlocker temp{lock};
std::lock(waitLock, temp);
where lock is a unique_lock instead of a lock_guard holding mutex.
This will lock waitLock and unlock mutex as if by one uninterruptible instruction.
And now, after coding all of that, I can reason that it can be transformed into:
waitLock.lock();
lock.unlock(); // lock must be a unique_lock to do this
Whether the first version is more or less readable is a matter of opinion. The first version is easier to reason about (once one knows what std::lock does). But the second one is simpler. But with the second, the reader has to think more carefully about the correctness.
Update
Just read the edit in the question. This solution does not fix the problem in the edit: The second thread will block the third (and following threads) from making progress in any code that requires mutex but not waitMutex, until the first thread releases waitMutex.
So in this sense, my answer is technically correct, but does not satisfy the desired performance characteristics. I'll leave it up for informational purposes.

How to properly abort a thread with the use of a condition_variable?

I have a class with some methods that should be thread safe, i.e. multiple threads should be able to operate on the class object state. One of the methods spawns a new thread that, every 10 seconds, updates a field. Because this thread can be long-running, I'd like to be able to abort it properly.
I have implemented a solution that uses std::condition_variable::wait_for() to wait for an abort signal inside the thread, but I am not sure whether my solution is optimal, or even correct at all.
class A
{
unsigned int value; // A value that will be updated every 10 s in another thread
bool is_being_updated; // true while value is being updated in another thread
std::thread t;
bool aborted; // true = thread should abort
mutable std::mutex m1;
mutable std::mutex m2;
std::condition_variable cv;
public:
A();
~A();
void begin_update(); // Creates a thread that periodically updates value
void abort(); // Aborts the updating thread
unsigned int get_value() const;
void set_value(unsigned int);
};
This is how I implemented the methods:
A::A() : value(0), is_being_updated(false), aborted(false) { }
A::~A()
{
// Not sure if this is thread safe?
if(t.joinable()) t.join();
}
// Updates this->value every 10 seconds
void A::begin_update()
{
std::lock_guard<std::mutex> lck(m1);
if (is_being_updated) return; // Don't allow begin_update() while updating
is_being_updated = true;
if (aborted) aborted = false;
// Create a thread that will update value periodically
t = std::thread([this] {
std::unique_lock<std::mutex> update_lock(m2);
for(int i=0; i < 10; i++)
{
cv.wait_for(update_lock, std::chrono::seconds(10), [this]{ return aborted; });
if (!aborted)
{
std::lock_guard<std::mutex> lck(m1);
this->value++; // Update value
}
else
{
break; // Break on thread abort
}
}
// Locking here would cause indefinite blocking ...
// std::lock_guard<std::mutex> lck(m1);
if(is_being_updated) is_being_updated = false;
});
}
// Aborts the thread created in begin_update()
void A::abort()
{
std::lock_guard<std::mutex> lck(m1);
is_being_updated = false;
this->value = 0; // Reset value
{
std::lock_guard<std::mutex> update_lock(m2);
aborted = true;
}
cv.notify_one(); // Signal abort ...
if(t.joinable()) t.join(); // Wait for the thread to finish
}
unsigned int A::get_value() const
{
std::lock_guard<std::mutex> lck(m1);
return this->value;
}
void A::set_value(unsigned int v)
{
std::lock_guard<std::mutex> lck(m1);
if (is_being_updated) return; // Cannot set value while thread is updating it
this->value = v;
}
This seems to work fine, but I'm uncertain about it being correct. My concerns are the following:
1) Is my destructor safe? Suppose that the updating thread has not been aborted and is still doing its job while the A object goes out of scope. A switch to a different thread now happens while the dtor's t.join() still hasn't finished, and the switched-to thread calls begin_update() on the same object. Is something like this possible? Should I introduce e.g. an extra is_being_destructed flag that I would set to true inside the destructor and that all other methods should check for being false before they can proceed? Or can no such undesired scenario happen?
2) Inside the thread, at the end, I'm setting is_being_updated = false without a lock, despite the variable being shared state. This can mean that other threads won't see its correct value, e.g. even after the thread is done, some other thread may still see the value as is_being_updated == true instead of false. I cannot lock the mutex, however, because abort() may have already locked it, meaning that the call will block indefinitely. I'm not sure about the best way to solve this, other than perhaps making is_being_updated atomic. Would that work?
3) I've read about spurious wakeups, but am not sure if the code should do anything extra to handle them. As far as I understand, the answer is no, and no problems are to be expected in this regard.
Is my thinking here correct? Did I miss anything else that I should have in mind?

This stuff is always hard to check, so don't be afraid to question me if you think I misunderstand.
Short answer, no, it's not thread safe.
As long as the thread that has scope of A is the one calling abort (and doesn't forget to call abort), you won't experience a race condition, as A::abort() will block until the thread is joined. Under these assumptions, the join in your destructor is pointless.
If abort is called by a thread that doesn't own A, then it's definitely possible for the thread to be joined twice, which is bad. Using .joinable() to decide whether to join a thread is a big red flag.
Please remove one of your if(t.joinable()) t.join(); (I'm leaning towards the one in the destructor) and change the other to just t.join().
As you said, you can make is_being_updated atomic. That's a great solution.
Here's another solution. You can signal without holding the lock. (It's actually better form in general, as it helps reduce lock contention, since the first thing the woken thread needs to do is reacquire its mutex.)
void A::abort()
{
{
std::lock(m1, m2); // deadlock-proof
std::lock_guard<std::mutex> lck(m1, std::adopt_lock);
std::lock_guard<std::mutex> update_lock(m2, std::adopt_lock);
is_being_updated = false;
this->value = 0; // Reset value
aborted = true;
}
cv.notify_one(); // Signal abort ...
t.join(); // Wait for the thread to finish
}
You're good. The way you wrote the wait, it only returns if aborted == true or 10 seconds have elapsed.

1) I think this problem is inherent in your design; as it is, a bool flag will not fix the problem. Maybe A shouldn't go out of scope until all the threads stop using it, in which case it should reside in a managed pointer like shared_ptr.
2) You should be using atomics for your bools and also for value; this would avoid having to take a lock for incrementing the value and for returning it.
3) As I said in the comments the lambda in the cv handles the spurious wakeups.

The biggest bit of code smell is using a full thread to update a variable every 10 seconds: a heavy-weight OS thread with megabytes to gigabytes of address space to do one task every 10 seconds.
What's more, it is updating a value without anyone being able to see the change.
You already have a get_value wrapping accessor. Simply store the start point when you want to start counting. When you call get_value calculate the time since the start point. Divide by 10 seconds. Use that to calculate the returned value.
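A small sketch of that calculation, assuming a hypothetical free function value_at and the same schedule as the thread in the question (one increment per 10 seconds, at most 10 increments):

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Value the updating thread in the question would have produced by `now`:
// one increment per elapsed 10 s period, at most 10 increments in total.
unsigned int value_at(Clock::time_point start, Clock::time_point now,
                      unsigned int initial) {
    // Dividing two durations yields a plain integer count (integer division).
    long long periods = (now - start) / std::chrono::seconds(10);
    if (periods < 0) periods = 0;
    if (periods > 10) periods = 10;
    return initial + static_cast<unsigned int>(periods);
}
```

get_value would then call this with the stored start point and Clock::now(); if several threads can change the start point, that field still needs a mutex or similar protection.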
In a real application, you'd have a timer system that lets you trigger events (either in a thread pool, or in a message pump) every period of time. You'd use that instead of a dedicated thread to do something like this, and you'd make sure that modifying that value was observable (allowed people to subscribe to changes in it). Then your abort would consist of deregistering the timer instead of stopping a thread.
Your system is a horrible mixture of the two, using threads for no good reason.

Is this code thread-safe?

I'm refactoring some time consuming function so that it can be called from a thread, but I'm having trouble wrapping my head around the issue (not very familiar with thread programming).
At any point, the user can cancel and the function will stop. I do not want to kill the thread as soon as the user cancels since it could cause some data integrity problems. Instead, in several places in the function, I will check if the function has been cancelled and, if so, exit. I will only do that where I know it's safe to exit.
The whole code of the function will be within a mutex. This is the pseudo-code I have in mind:
SomeClass::SomeClass() {
cancelled_ = false;
}
void SomeClass::cancelBigSearch() {
cancelled_ = true;
}
void SomeClass::bigSearch() {
mutex.lock();
// ...
// Some code
// ...
// Safe to exit at this point
if (cancelled_) {
mutex.unlock();
cancelled_ = false;
return;
}
// ...
// Some more code
// ...
if (cancelled_) {
mutex.unlock();
cancelled_ = false;
return;
}
// ...
// Again more code
// ...
if (cancelled_) {
mutex.unlock();
cancelled_ = false;
return;
}
mutex.unlock();
}
So when the user starts a search, a new thread calls bigSearch(). If the user cancels, cancelBigSearch() is called and a cancelled_ flag is set. Then, when bigSearch() reaches a point where it's safe to exit, it will exit.
Any idea if this is all thread-safe?

You should lock access to cancelled_ with another mutex, so checking and setting do not happen simultaneously. Other than that, I think your approach is OK.
Update: Also, make sure no exceptions can be thrown from SomeClass::bigSearch(), otherwise the mutex might remain in a locked state. To make sure that all return paths unlock the mutex, you might want to surround the processing parts of the code with if (!cancelled_) and return only at the very end of the method (where you have the one unlock() call on the mutex).
Better yet, wrap the mutex in an RAII (Resource Acquisition Is Initialization) object, so no matter how the function ends (exception or otherwise), the mutex is guaranteed to be unlocked.

Yes, this is thread safe. But:
Processors can have separate caches, each holding its own copy of cancelled_; mutex synchronization functions typically perform the proper cache synchronization.
Compiler-generated code can make invalid assumptions about your data's locality, which can lead to cancelled_ not being updated in time. Some platform-specific commands can help here, or you can simply use other mechanisms.
All of this can lead to a thread that isn't cancelled as promptly as you wish.
Your usage pattern is simple "signaling": you need to transfer a signal to the thread. Signal patterns allow the same trigger (signal) to be raised multiple times and cleared later.
This can be simulated using:
atomic operations
mutex protected variables
signal synchronization primitives

It's not thread-safe, because one thread could read cancelled_ at the same time another thread writes to it, which is a data race, which is undefined behaviour.
As others suggested, either use an atomic type for cancelled_ or protect it with another mutex.
You should also use RAII types to lock the mutexes.
e.g.
void SomeClass::cancelBigSearch() {
std::lock_guard<std::mutex> lock(cxlMutex_);
cancelled_ = true;
}
bool SomeClass::cancelled() {
std::lock_guard<std::mutex> lock(cxlMutex_);
if (cancelled_) {
// reset to false, to avoid caller having to lock mutex again to reset it
cancelled_ = false;
return true;
}
return false;
}
void SomeClass::bigSearch() {
std::lock_guard<std::mutex> lock(mutex);
// ...
// Some code
// ...
// Safe to exit at this point
if (cancelled())
return;
// ...
// Some more code
// ...
if (cancelled())
return;
// ...
// Again more code
// ...
if (cancelled())
return;
}
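For comparison, the atomic alternative mentioned above might look like the following sketch. It is a reduced, hypothetical version of SomeClass: bigSearch takes a step count and returns a bool purely so the behaviour is easy to observe, which is not how the question declares it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical reduction of SomeClass from the question.
class SomeClass {
public:
    void cancelBigSearch() { cancelled_.store(true); }

    // Returns true if the search ran to completion, false if it was cancelled.
    bool bigSearch(int steps) {
        std::lock_guard<std::mutex> lock(mutex_);  // released on every return path
        for (int i = 0; i < steps; ++i) {
            // ... one chunk of work ...
            if (cancelled()) return false;  // safe exit point
        }
        return true;
    }

private:
    // Reads and clears the flag in one atomic step, so no second mutex is needed.
    bool cancelled() { return cancelled_.exchange(false); }

    std::mutex mutex_;
    std::atomic<bool> cancelled_{false};
};
```

As in the mutex version, a cancel request that arrives while no search is running stays set and cancels the next search; whether that is wanted depends on the application.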
