How to write an automated test for thread safety

I have a class which is not thread safe:
class Foo {
/* Abstract base class, code which is not thread safe */
};
Moreover, if you have foo1 and foo2 objects, you cannot call foo1->someFunc() until foo2->anotherFunc() has returned (with two threads, such overlapping calls can happen). This is the situation and it can't be changed (a Foo subclass is actually a wrapper for a Python script).
In order to prevent unwanted calls I've created the following -
class FooWrapper {
public:
FooWrapper(Foo* foo, FooWrappersMutex* mutex);
/* Wrapped functions from Foo */
};
Internally, FooWrapper wraps calls to the Foo functions with the shared mutex.
I want to test FooWrapper for thread safety. My biggest problem is that threads are managed by the operating system, which means I have little control over their execution. What I would like to test is the following scenario:
Thread 1 calls fooWrapper1->someFunc() and blocks while inside the function
Thread 2 calls fooWrapper2->anotherFunc() and returns immediately (since someFunc() is still executing)
Thread 1 finishes the execution
What is the simplest way to test a scenario like this automatically?
I'm using Qt on Win32, although I would prefer a solution which is at least as cross-platform as Qt is.

You might want to check out CHESS: A Systematic Testing Tool for Concurrent Software by Microsoft Research. It is a testing framework for multithreaded programs (both .NET and native code).
If I understood that correctly, it replaces the operating system's threading libraries with its own, so that it can control thread switching. Then it analyzes the program to figure out every possible way that the execution streams of the threads can interleave and it re-runs the test suite for every possible interleaving.

Instead of just checking whether a particular thread has finished, why not create a fake Foo to be invoked by your wrapper, in which the functions record the times at which they actually started and completed? Then your yield thread need only wait long enough to be able to distinguish between the recorded times. In your test you can assert that another_func's start time is after some_func's start time and its completion time is before some_func's completion time. Since your fake class is only recording the times, this should be sufficient to guarantee that the wrapper class is working properly.
EDIT: You know, of course, that what your Foo object does could be an anti-pattern, namely Sequential Coupling. Depending what it does, you may be able to handle it by simply having the second method do nothing if the first method has not yet been called. Using the example from the Sequential Coupling link, this would be similar to having the car do nothing when the accelerator pedal is pressed, if the car has not yet been started. If doing nothing is not appropriate, you could either wait and try again later, initiate the "start sequence" in the current thread, or handle it as an error. All of these things could be enforced by your wrapper as well and would probably be easier to test.
You also may need to be careful to make sure that the same method doesn't get invoked twice in sequence if an intervening call to another method is required.
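A minimal sketch of the fake-that-records-times idea, written against the standard library rather than Qt. FakeFoo, LockingWrapper and callsWereSerialized are hypothetical names, and the wrapper is assumed to take a blocking lock on the shared mutex. Under that assumption the property to check is that the two recorded intervals never overlap; a wrapper that rejects calls while the mutex is busy would need a different assertion.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical fake: each call only records when it started and finished.
struct FakeFoo {
    Clock::time_point someStart, someEnd, anotherStart, anotherEnd;

    void someFunc() {
        someStart = Clock::now();
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        someEnd = Clock::now();
    }
    void anotherFunc() {
        anotherStart = Clock::now();
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        anotherEnd = Clock::now();
    }
};

// Hypothetical stand-in for FooWrapper (assumption: blocking lock on a shared mutex).
class LockingWrapper {
public:
    LockingWrapper(FakeFoo& foo, std::mutex& shared) : foo_(foo), shared_(shared) {}
    void someFunc() {
        std::lock_guard<std::mutex> lock(shared_);
        foo_.someFunc();
    }
    void anotherFunc() {
        std::lock_guard<std::mutex> lock(shared_);
        foo_.anotherFunc();
    }
private:
    FakeFoo& foo_;
    std::mutex& shared_;
};

// Runs both calls on separate threads and reports whether the recorded
// intervals are disjoint, whichever thread happened to win the mutex.
bool callsWereSerialized() {
    std::mutex shared;
    FakeFoo foo1, foo2;
    LockingWrapper wrapper1(foo1, shared), wrapper2(foo2, shared);

    std::thread t1([&] { wrapper1.someFunc(); });
    std::thread t2([&] { wrapper2.anotherFunc(); });
    t1.join();
    t2.join();

    return foo1.someEnd <= foo2.anotherStart || foo2.anotherEnd <= foo1.someStart;
}
```

Because the timestamps are taken while the mutex is held, the check does not depend on how the scheduler orders the two threads.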

Intel Threadchecker.
If I recall correctly the tool checks your code for theoretically possible data races.
The point is you don't need to run your code to check whether it's correct or not.

When you start multithreading, your code becomes, by definition, non-deterministic, so testing for thread safety is, in the general case, impossible.
But to your very specific question: if you insert long delays inside Foo so that each Foo method takes a very long time, then you can do what you ask. That is, the probability of the first thread returning before the second thread enters the call becomes essentially zero.
But what is it that you're really trying to accomplish? What is this test supposed to test? If you are trying to validate that the FooWrappersMutex class works correctly, this won't do it.

So far I've written the following code. Sometimes it works and sometimes the test fails, since the Sleep is not always long enough for all the threads to run.
//! Give some time to the other threads
static void YieldThread()
{
#ifdef _WIN32
Sleep(10);
#endif //_WIN32
}
class FooWithMutex: public Foo {
public:
QMutex m_mutex;
virtual void someFunc()
{
QMutexLocker locker(&m_mutex); // named, so the lock is held until the function returns
}
virtual void anotherFunc()
{
QMutexLocker locker(&m_mutex);
}
};
class ThreadThatCallsFooFunc1: public QThread {
public:
ThreadThatCallsFooFunc1( FooWrapper& fooWrapper )
: m_fooWrapper(fooWrapper) {}
virtual void run()
{
m_fooWrapper.someFunc();
}
private:
FooWrapper& m_fooWrapper;
};
class ThreadThatCallsFooFunc2: public QThread {
public:
ThreadThatCallsFooFunc2( FooWrapper& fooWrapper )
: m_fooWrapper(fooWrapper) {}
virtual void run()
{
m_fooWrapper.anotherFunc();
}
private:
FooWrapper& m_fooWrapper;
};
TEST(ScriptPluginWrapperTest, CallsFromMultipleThreads)
{
// FooWithMutex inherits the abstract Foo and adds
// mutex lock/unlock on each function.
FooWithMutex fooWithMutex;
FooWrapper fooWrapper( &fooWithMutex );
ThreadThatCallsFooFunc1 thread1(fooWrapper);
ThreadThatCallsFooFunc2 thread2(fooWrapper);
fooWithMutex.m_mutex.lock();
thread1.start(); // Should block
YieldThread();
ASSERT_FALSE( thread1.isFinished() );
thread2.start(); // Should finish immediately
YieldThread();
ASSERT_TRUE( thread2.isFinished() );
fooWithMutex.m_mutex.unlock();
YieldThread();
EXPECT_TRUE( thread1.isFinished() );
}
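One way to remove the dependence on Sleep is to let the fake signal when it has been entered and then wait until the test releases it. The sketch below uses std::promise and std::future in place of Qt. BlockingFoo, TryLockWrapper and runScenario are hypothetical names, and the wrapper is assumed to return false instead of waiting when the shared mutex is busy, which is what the "should finish immediately" expectation above seems to imply.

```cpp
#include <cassert>
#include <future>
#include <mutex>

// Hypothetical fake: someFunc() announces that it is running, then parks
// until the test releases it.
struct BlockingFoo {
    std::promise<void> entered;
    std::shared_future<void> release;

    void someFunc() {
        entered.set_value();
        release.wait();
    }
    void anotherFunc() {}
};

// Hypothetical wrapper (assumption: it refuses the call when the mutex is busy).
class TryLockWrapper {
public:
    TryLockWrapper(BlockingFoo& foo, std::mutex& shared) : foo_(foo), shared_(shared) {}
    bool someFunc() {
        std::unique_lock<std::mutex> lock(shared_, std::try_to_lock);
        if (!lock.owns_lock()) return false;
        foo_.someFunc();
        return true;
    }
    bool anotherFunc() {
        std::unique_lock<std::mutex> lock(shared_, std::try_to_lock);
        if (!lock.owns_lock()) return false;
        foo_.anotherFunc();
        return true;
    }
private:
    BlockingFoo& foo_;
    std::mutex& shared_;
};

struct ScenarioResult {
    bool secondRanWhileBusy;  // did anotherFunc() get through while someFunc() was running?
    bool firstCompleted;      // did someFunc() eventually run to completion?
};

ScenarioResult runScenario() {
    std::mutex shared;
    std::promise<void> releaseFirst;
    BlockingFoo foo1, foo2;
    foo1.release = releaseFirst.get_future().share();
    std::future<void> firstEntered = foo1.entered.get_future();
    TryLockWrapper wrapper1(foo1, shared), wrapper2(foo2, shared);

    // Thread 1: enters someFunc() and stays there, holding the shared mutex.
    std::future<bool> first =
        std::async(std::launch::async, [&] { return wrapper1.someFunc(); });
    firstEntered.wait();  // no sleep: we know thread 1 is inside the call

    // "Thread 2" (here the test thread) tries the other call while thread 1 is parked.
    bool secondRanWhileBusy = wrapper2.anotherFunc();

    releaseFirst.set_value();  // let thread 1 finish
    bool firstCompleted = first.get();
    return {secondRanWhileBusy, firstCompleted};
}
```

In a test this would be checked with something like ASSERT_FALSE(result.secondRanWhileBusy) and EXPECT_TRUE(result.firstCompleted). Each step waits on an explicit signal, so the outcome does not depend on how long the scheduler takes.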

Jinx to the rescue
http://www.corensic.com/

Related

Group member functions to all require implicit mutex lock first?

I have a "Device" class representing the connection of a peripheral hardware device. Scores of member functions ("device functions") are called on each Device object by clients.
class Device {
public:
std::timed_mutex mutex_;
void DeviceFunction1();
void DeviceFunction2();
void DeviceFunction3();
void DeviceFunction4();
// void DeviceFunctionXXX(); lots and lots of device functions
// other stuff
// ...
};
The Device class has a member std::timed_mutex mutex_ which must be locked by each of the device functions prior to communicating with the device, to prevent communication with the device simultaneously from concurrent threads.
An obvious but repetitive and cumbersome approach is to copy/paste the mutex_.try_lock() code at the top of the execution of each device function.
void Device::DeviceFunction1() {
mutex_.try_lock(); // this is repeated in ALL functions
// communicate with device
// other stuff
// ...
}
However, I'm wondering if there is a C++ construct or design pattern or paradigm which can be used to "group" these functions in such a way that the mutex_.try_lock() call is "implicit" for all functions in the group.
In other words: in a similar fashion that a derived class can implicitly call common code in a base class constructor, I'd like to do something similar with functions calls (instead of class inheritance).
Any recommendations?

First of all, if the mutex must be locked before you do anything else, then you should call mutex_.lock(), or at least not ignore the fact that try_lock may actually fail to lock the mutex. Also, manually placing calls to lock and unlock a mutex is extremely error-prone and can be much harder to get right than you might think. Don't do it. Use, e.g., an std::lock_guard instead.
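As a rough illustration of not ignoring a failed lock attempt, the sketch below uses std::unique_lock with a timeout, so the lock is released automatically and a busy device is reported to the caller. TimedDevice, the timeout parameter, the bool return value and the command counter are assumptions made for the example, not part of the original class.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

class TimedDevice {
public:
    // Returns false if the device stayed busy for the whole timeout.
    bool DeviceFunction1(std::chrono::milliseconds timeout) {
        // This constructor calls try_lock_for(timeout); the destructor unlocks if needed.
        std::unique_lock<std::timed_mutex> lock(mutex_, timeout);
        if (!lock.owns_lock())
            return false;
        ++commands_sent_;  // stand-in for talking to the device
        return true;
    }

    int commandsSent() {
        std::lock_guard<std::timed_mutex> lock(mutex_);
        return commands_sent_;
    }

    std::timed_mutex mutex_;

private:
    int commands_sent_ = 0;
};

// Holds the mutex on the calling thread while another thread attempts a call.
bool callSucceedsWhileBusy() {
    TimedDevice device;
    bool ok = true;
    std::lock_guard<std::timed_mutex> hold(device.mutex_);
    std::thread other([&] { ok = device.DeviceFunction1(std::chrono::milliseconds(20)); });
    other.join();
    return ok;
}
```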
The fact that you're using an std::timed_mutex suggests that what's actually going on in your real code may be a bit more involved (why else would you be using an std::timed_mutex?). Assuming that what you're really doing is something more complex than just calling try_lock and ignoring its return value, consider encapsulating your complex locking procedure, whatever it may be, in a custom lock guard type, e.g.:
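class the_locking_dance
{
// static, since it runs before any member exists; returning the
// non-movable lock_guard relies on C++17 guaranteed copy elision
static auto do_the_locking_dance(std::timed_mutex& mutex)
{
using namespace std::chrono_literals; // for 100ms
while (!mutex.try_lock_for(100ms))
/* do whatever it is that you wanna do */;
return std::lock_guard { mutex, std::adopt_lock };
}
std::lock_guard<std::timed_mutex> guard;
public:
the_locking_dance(std::timed_mutex& mutex)
: guard(do_the_locking_dance(mutex))
{
}
};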
and then create a local variable
the_locking_dance guard(mutex_);
to acquire and hold on to your lock. This will also automatically release the lock upon exit from a block.
Apart from all that, note that what you're doing here is, most likely, not a good idea in general. The real question is: why are there so many different methods that all need to be protected by the same mutex to begin with? Do you really have to support an arbitrary number of threads you know nothing about, which arbitrarily may do arbitrary things with the same device object at arbitrary times in arbitrary order? If not, then why are you building your Device abstraction to support this use case? Is there really no better interface that you could design for your application scenario, knowing what the threads are actually supposed to be doing? Do you really have to do such fine-grained locking? Consider how inefficient it is with your current abstraction to, e.g., call multiple device functions in a row, as that requires constantly locking and unlocking this mutex again and again all over the place…
All that being said, there may be a way to improve the locking frequency while, at the same time, addressing your original question:
I'm wondering if there is a C++ construct or design pattern or paradigm which can be used to "group" these functions in such a way that the mutex_.try_lock() call is "implicit" for all functions in the group.
You could group these functions by exposing them not as methods of a Device object directly, but as methods of yet another lock guard type, for example
class Device
{
…
void DeviceFunction1();
void DeviceFunction2();
void DeviceFunction3();
void DeviceFunction4();
public:
class DeviceFunctionSet1
{
Device& device;
the_locking_dance guard;
public:
DeviceFunctionSet1(Device& device)
: device(device), guard(device.mutex_)
{
}
void DeviceFunction1() { device.DeviceFunction1(); }
void DeviceFunction2() { device.DeviceFunction2(); }
};
class DeviceFunctionSet2
{
Device& device;
the_locking_dance guard;
public:
DeviceFunctionSet2(Device& device)
: device(device), guard(device.mutex_)
{
}
void DeviceFunction3() { device.DeviceFunction3(); }
void DeviceFunction4() { device.DeviceFunction4(); }
};
};
Now, to get access to the methods of your device within a given block scope, you first acquire the respective DeviceFunctionSet and then you can call the methods:
{
Device::DeviceFunctionSet1 dev(my_device);
dev.DeviceFunction1();
dev.DeviceFunction2();
}
The nice thing about this is that the locking happens once for an entire group of functions (which will, hopefully, somewhat logically belong together as a group of functions used to achieve a particular task with your Device) automatically and you can also never forget to unlock the mutex…
Even with this, however, the most important thing is to not just build a generic "thread-safe Device". These things are usually neither efficient nor really useful. Build an abstraction that reflects the way multiple threads are supposed to cooperate using a Device in your particular application. Everything else is second to that. But without knowing anything about what your application actually is, there's not really anything more that could be said about that…

How to structure an object in C++ that does asynchronous background tasks

I'd like to have an object in C++ that does an asynchronous task in the background, but can still be requested to be freed by code using it without breaking things.
Let's say I have an object like this:
class MyObj {
std::thread* asyncTask;
bool result;
public:
MyObj() {
asyncTask = new std::thread([this](){
result = doSomething();
});
};
bool isResult() {
return result;
};
};
How would you go about making sure that the object can still be freed without terminating the process(due to thread still joinable/running at time of destruction)? I've thought about something involving delaying the destructor with a thread running counter, but that doesn't seem like the right solution. Part of the complexity is that the thread needs to normally access elements of the class, so it can't just detach either.

The only way to do this in general is to create a new process to handle the task (expensive, lots of marshalling and busywork), or have the thread cooperate.
A thread that cooperates regularly checks if it should abort. When it detects it should abort, it does so. It has to do this even when it is blocking on some resource.
For simple tasks, this is simple. For general tasks, next to impossible.
C++ compilers basically assume the threads get to act single threaded unless you go and explicitly synchronize operations. This permits certain important optimizations. The cost is that the state of a C++ thread need not make any sense at any point; so killing it or suspending it externally cannot be made safe (without cooperation).
In short, write your doSomething with cooperation and abort in mind.
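A minimal sketch of such cooperation, assuming the work can be split into short steps: the worker checks an atomic stop flag between steps, and the destructor sets the flag and joins. CancellableTask, its step loop and the 1 ms sleep standing in for real work are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

class CancellableTask {
public:
    explicit CancellableTask(int totalSteps)
        : worker_([this, totalSteps] { run(totalSteps); }) {}

    // Ask the worker to stop, then wait for it; the object is never
    // destroyed while its thread is still touching it.
    ~CancellableTask() {
        stop_.store(true);
        worker_.join();
    }

    bool finished() const { return finished_.load(); }
    int stepsDone() const { return steps_.load(); }

private:
    void run(int totalSteps) {
        for (int i = 0; i < totalSteps; ++i) {
            if (stop_.load()) return;  // cooperative abort point
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // one unit of "work"
            steps_.fetch_add(1);
        }
        finished_.store(true);
    }

    std::atomic<bool> stop_{false};
    std::atomic<bool> finished_{false};
    std::atomic<int> steps_{0};
    std::thread worker_;  // declared last so the flags exist before the thread starts
};

// Starts a long task, lets it run briefly, and destroys it early.
int stepsBeforeDestruction(int totalSteps) {
    CancellableTask task(totalSteps);
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    return task.stepsDone();
}
```

The destructor can only return as quickly as the longest stretch between two checks of the flag, which is why blocking calls inside the task are the hard part.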

How do some Qt functions not block?

When I call QNetworkAccessManager::get() or QNetworkAccessManager::post() or many other methods the program flow continues on after the call and if I want further interaction, like getting what was received from the server, I need to use signals/slots. Do these functions run in their own thread? But the times I have used threads I have had to call something like MyClass::start() which doesn't happen when I call get() or post().
The times I have built a threaded class, the only way to start a function in the class was through MyClass::start() and MyClass::run(). But right now I have a class that has all sorts of functions in it that get called, and these functions should run in the background so that the main application can later receive a signal from those functions.
Hypothetically I'd have something like this
class MyClass
{
public:
void func1();
void func2();
};
MyClass::func1()
{
// move off into other thread
// do stuff
emit signal1(data1)
}
MyClass::func2()
{
// move off into other thread
// do stuff
emit signal2(data2)
}
I should be able to access MyClass::func1 or func2 directly, which would be cumbersome if the only way to access them were through MyClass::start().
I hope this makes sense, I'm more a php person, these things are a bit foreign to me.
In sum, I'm looking to have a class, with multiple public functions all of which can be called on their own thread. I think. Maybe I'm on the wrong track.

understanding a qthread subclass's run method and thread context

I have an encoder class with lots of methods. It is a subclass of QThread. I am new to multithreading and am trying to understand how this class is threading its methods.
As I understand it, to thread a method it has to be in a subclass of QThread, and the run() method of that subclass implements the threaded code for the class. The thread starts only when start() is called on an object of this class.
Question: firstly, what do you infer from this run() implementation?
void Encoder::run(void)
{
VERBOSE(VB_DEBUG, "Encoder::run");
if (WILL_PRINT(VB_DEBUG))
print_stats_timer_id = QObject::startTimer(kEncoderDebugInterval);
health_check_timer_id = QObject::startTimer(kEncoderHealthCheckInterval);
if (init())
exec();
else
VERBOSE(VB_ERROR, "Encoder::run -- failed to initialize encoder");
QObject::killTimer(health_check_timer_id);
if (print_stats_timer_id)
QObject::killTimer(print_stats_timer_id);
cleanup();
}
Question: what does thread context mean in relation to its methods?
Also, question: what would happen if a method of this class is called before this class's thread has started?

The class you have written creates a thread and starts QObject timers. It then calls a user-defined init() function and, if that succeeds, the QThread::exec() function.
My guess is that you intended that exec() would be a user defined function where the actual work is to occur. Be aware that QThread::exec() processes the thread's Qt Event Queue.
Also, on some platforms you may get an "Error creating timer from thread" warning message. I've encountered this error on Windows with code that executed fine on Linux.
Also, be aware that your timer will never occur if you do not call the QThread::exec() function or QApplication::processEvents() from within your thread.
Thread context in Qt is the same as in any other threading model. That is, all memory is shared between the threaded code (entered through your run() function) and any other context which calls into your object. If this object may ever be executing in a thread and accessed from outside of that thread, you must protect the shared data.
Because all data is shared between thread contexts (it's a shared-memory multiprocessing model), there is no problem with calling functions before, after, or during thread execution, provided that:
The object is fully constructed before you call any method. This is not special to threads, necessarily, unless the object is created in a thread.
Any data member is protected with a mutex lock (I alluded to this when answering your second question). QMutexLocker is a handy stack-based RAII way of dealing with mutex locks in Qt.
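The second condition can be sketched with the standard library, where std::mutex and std::lock_guard play the roles of QMutex and QMutexLocker. Stats and countFromTwoThreads are hypothetical names; the point is only that every access to the shared member goes through a scoped lock, whichever thread makes the call.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

class Stats {
public:
    void recordFrame() {
        std::lock_guard<std::mutex> lock(mutex_);  // released when lock goes out of scope
        ++frames_;
    }
    int frames() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return frames_;
    }
private:
    mutable std::mutex mutex_;  // mutable so that const readers can lock it too
    int frames_ = 0;
};

// The same method is called from a worker thread and from the calling thread.
int countFromTwoThreads(int perThread) {
    Stats stats;
    auto work = [&] {
        for (int i = 0; i < perThread; ++i) stats.recordFrame();
    };
    std::thread worker(work);  // plays the role of the code inside run()
    work();                    // calls arriving from the "outside" thread
    worker.join();
    return stats.frames();
}
```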
I believe I fully answered your question here, so I'll go ahead and link to RAII and threading articles I have written on another site, just for further reference.
Edit: specificity about threading scenarios:
class MyThreadClass : public QThread
{
public:
MyThreadClass(const boost::shared_ptr<SomeOtherClass> &t_object)
: m_object(t_object) {}
void doSomething()
{
// Depending on how this method was called (from main, from internal thread)
// will determine which thread this runs on, potentially complicating thread
// safety issues.
m_object->someThing();
}
void run()
{
// I'm now in a thread!
m_object->someFunction(); // oops! The call to someFunction is occurring from
// a thread, this means that SomeOtherClass must be
// threadsafe with mutex guards around shared
// (object level) data.
// do some other stuff
}
private:
boost::shared_ptr<SomeOtherClass> m_object;
};
int main()
{
MyThreadClass thread(someobjectfromsomewhere);
thread.start(); // MyThreadClass is now running
thread.doSomething(); // The call to doSomething occurs from main's thread.
// This means 2 threads are using "thread", main
// and "thread"'s thread.
// The call to thread.doSomething hits Thread.m_object, which means that
// now multiple threads are also accessing m_object ("thread" and "main").
// This can all get very messy very quickly. It's best to tightly control
// how many threads are hitting an object, and how
}
NOTE: It would be a good idea to investigate QFuture, which is designed to handle this kind of asynchronous task, like the encoder you are looking at. QFuture will avoid some of the potential threading issues of shared data and deadlocks.

C++ using this pointer in constructors

In C++, during a class constructor, I started a new thread with this pointer as a parameter which will be used in the thread extensively (say, calling member functions). Is that a bad thing to do? Why and what are the consequences?
My thread start process is at the end of the constructor.

The consequence is that the thread can start and its code will begin executing on a not yet fully initialized object, which is bad enough in itself.
If you are thinking 'well, it will be the last statement in the constructor, so the object will be just about as constructed as it gets...', think again: someone might derive from that class, and the derived part of the object will not be constructed yet.
The compiler may also decide to reorder instructions, so it might actually pass the this pointer before executing other parts of the constructor... multithreading is tricky.

Main consequence is that the thread might start running (and using your pointer) before the constructor has completed, so the object may not be in a defined/usable state. Likewise, depending how the thread is stopped it might continue running after the destructor has started and so the object again may not be in a usable state.
This is especially problematic if your class is a base class, since the derived class constructor won't even start running until after your constructor exits, and the derived class destructor will have completed before yours starts. Also, virtual function calls don't do what you might think before derived classes are constructed and after they're destructed: virtual calls "ignore" classes whose part of the object doesn't exist.
Example:
void* thread_fn(void* input); // defined below
struct BaseThread {
BaseThread() {
// thread and attr are assumed to be set up elsewhere
pthread_create(thread, attr, thread_fn, static_cast<void*>(this));
}
virtual ~BaseThread() {
// maybe stop thread somehow, reap it
}
virtual void id() { std::cout << "base\n"; }
};
struct DerivedThread : BaseThread {
virtual void id() { std::cout << "derived\n"; }
};
void* thread_fn(void* input) {
(static_cast<BaseThread*>(input))->id();
return 0;
}
Now if you create a DerivedThread, it's at best a race between the thread that constructs it and the new thread to determine which version of id() gets called. Something worse could happen; you'd need to look quite closely at your threading API and compiler.
The usual way to not have to worry about this is just to give your thread class a start() function, which the user calls after constructing it.

Depends on what you do after starting the thread. If you perform initialization work after the thread has started, then it could use data that is not properly initialized.
You can reduce the risks by using a factory method that first creates an object, then starts the thread.
But I think the greatest flaw in the design is that, for me at least, a constructor that does more than "construction" seems quite confusing.
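The factory-method idea might look roughly like this, assuming std::thread rather than a platform API. BaseThread::create, seenId and idSeenByThread are hypothetical names. The constructors do no thread work, and the factory starts the thread only after the most derived constructor has finished, so the virtual call in the thread reaches the override.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <thread>

class BaseThread {
public:
    virtual ~BaseThread() { join(); }
    virtual std::string id() const { return "base"; }

    // Callers should join before the object is destroyed, so the thread
    // never runs against a partially destroyed derived object.
    void join() {
        if (thread_.joinable()) thread_.join();
    }
    const std::string& seenId() const { return seen_; }  // read only after join()

    // Factory: construct the complete object first, then start the thread.
    template <class T>
    static std::unique_ptr<T> create() {
        std::unique_ptr<T> obj(new T());
        BaseThread& base = *obj;
        base.start();
        return obj;
    }

protected:
    BaseThread() = default;

private:
    void start() {
        thread_ = std::thread([this] { seen_ = id(); });
    }

    std::string seen_;
    std::thread thread_;
};

class DerivedThread : public BaseThread {
public:
    std::string id() const override { return "derived"; }
};

std::string idSeenByThread() {
    auto t = BaseThread::create<DerivedThread>();
    t->join();
    return t->seenId();
}
```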

It can be potentially dangerous.
During construction of a base class, any calls to virtual functions will not dispatch to overrides in more derived classes that haven't yet been completely constructed; once the more derived classes have been constructed, this changes.
If the thread that you kick-off calls a virtual function and it is indeterminate where this happens in relation to the completion of the construction of the class then you are likely to get unpredictable behaviour; perhaps a crash.
Without virtual functions, if the thread only uses methods and data of the parts of the class that have been constructed completely the behaviour is likely to be predictable.
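The dispatch rule itself can be seen without any threads. In this small sketch (Base, Derived and seenInCtor are illustrative names), a virtual call made from the base constructor reaches the base version, while the same call on the finished object reaches the override.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seenInCtor;

    // While Base's constructor runs, the Derived part does not exist yet,
    // so this call resolves to Base::name().
    Base() { seenInCtor = name(); }
    virtual ~Base() = default;
    virtual std::string name() const { return "base"; }
};

struct Derived : Base {
    std::string name() const override { return "derived"; }
};
```

A thread started from the Base constructor may make its call at either moment, which is why the version it sees is unpredictable.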

I'd say that, as a general rule, you should avoid doing this. But you can certainly get away with it in many circumstances. I think there are basically two things that can go wrong:
The new thread might try to access the object before the constructor finishes initializing it. You can work around this by making sure all initialization is complete before you start the thread. But what if someone inherits from your class? You have no control over what their constructor will do.
What happens if your thread fails to start? There isn't really a clean way to handle errors in a constructor. You can throw an exception, but this is perilous since it means that your object's destructor will not get called. If you elect not to throw an exception, then you're stuck writing code in your various methods to check if things were initialized properly.
Generally speaking, if you have complex, error-prone initialization to perform, then it's best to do it in a method rather than the constructor.

Basically, what you need is two-phase construction: You want to start your thread only after the object is fully constructed. John Dibling answered a similar (not a duplicate) question yesterday exhaustively discussing two-phase construction. You might want to have a look at it.
Note, however, that this still leaves the problem that the thread might be started before a derived class' constructor is done. (Derived classes' constructors are called after those of their base classes.)
So in the end the safest thing is probably to manually start the thread:
class Thread {
public:
Thread();
virtual ~Thread();
void start();
// ...
};
class MyThread : public Thread {
public:
MyThread() : Thread() {}
// ...
};
void f()
{
MyThread thrd;
thrd.start();
// ...
}

It's fine, as long as you can start using that pointer right away. If you require the rest of the constructor to complete initialization before the new thread can use the pointer, then you need to do some synchronization.
