Testing concurrent data structure - c++

What are some methods for testing concurrent data structures to make sure the data structs behave correctly when accessed from multiple threads ?

All of the other answers have focused on actually testing the code by putting it through its paces and actually running it in one form or another or politely saying "don't do it yourself, use an existing library".
This is great and all, but IMO, the most important (practical tests are important too) test is to look at the code line by line and for every line of code ask "what happens if I get interrupted by another thread here?" Imagine another thread, running just about any of the other lines/functions during this interruption. Do things still stay consistent? When competing for resources, does the other thread[s] block or spin?
This is what we did in school when learning about concurrency and it is a surprisingly effective approach. Bottom line, I feel that taking the time to prove to yourself that things are consistent and work as expected in all states is the first technique you should use when dealing with this stuff.

Concurrent systems are probabilistic and errors are often difficult to replicate. Therefore you need to run various input/output cases, each tested over time (hours, days, etc) in order to detect possible errors.
Tests for concurrent data structure involves examining the container's state before and after expected events such as insert and delete.

Use a pre-existing, pre-tested library that meets your needs if possible.
Make sure that the code has appropriate self-consistency checks (preferably fast sanity checks), and run your code on as many different types of hardware as possible to help narrow down interesting timing problems.
Have multiple people peer review the code, preferably without a pre-explanation of how it's supposed to work. That way they have to grok the code which should help catch more bugs.
Set up a bunch of threads that do nothing but random operations on the data structures and check for consistency at some rate.

Start with the assumption that your calls to access/modify data are not thread safe and use locks to ensure only a single thread can access/modify any part of the data at a time. Only after you can prove to yourself that a specific type of access is safe outside of the lock by multiple threads at once should you move that code outside of the lock.
Assume worst case scenarios, e.g. that your code will stop right in the middle of some pointer manipulation or another critical point, and that another thread will encounter that data in mid-transition. If that would have a bad result, leave it within the lock.

I normally test these kinds of things by interjecting sleep() calls at appropriate places in the distributed threads/processes.
For instance, to test a lock, put sleep(2) in all your threads at the point of contention, and spawn two threads roughly 1 second apart. The first one should obtain the lock, and the second should have to wait for it.
Most race conditions can be tested by extending this method, but if your system has too many components it may be difficult or impossible to know every possible condition that needs to be tested.

Run your concurrent threads for one or a few days and look what happens. (Sounds strange, but finding out race conditions is such a complex topic that simply trying it is the best approach).

Related

Multithreading: when to call mutex.lock?

So, I have a ton ob objects, each having several fields, including a c-array, which are modified within their "Update()" method. Now I create several threads, each updating a section of these objects. As far as I understand calling lock() before calling the update function would be useless, since this would essentially cause the updates being called in a sequential order just like they would be without multithreading. Now, there objects have pointers, cross referencing to each other. Do I need to call lock every time ANY field is modified, or just before specific operations (like delete, re-initializing arrays, etc?)
Do I need to call lock every time ANY field is modified, or just before specific operations (like delete, re-initializing arrays, etc?)
Neither. You need to have a lock even to read, to make sure another thread isn't part way through modifying the data you're reading. You might want to use a many reader / one writer lock. I suggest you start by having a single lock (whether a simple mutex or the more elaborate multi-reader/writer lock) and get the code working so you can profile it and see whether you actually need more fine-grained locking, then you'll have a bit more experience and understanding of options and advice about how to manage that.
If you do need fine-grained locking, then the trick is to think about where the locks logically belong - for example - there could be one per object. You'll then need to learn about techniques for avoiding deadlocks. You should do some background reading too.
It depends on the consequences of the data changes you want to make. If each thread is, for example, changing well defined sub-blocks of data and each sub-block is entirely independent of all other sub-blocks then it might make sense to have a mutex per sub-block.
That would allow one thread to deal with one set of sub-blocks whilst another gets a different subset to process.
Having threads make changes without gaining a mutex lock first is going to lead to inconsistencies at best...
If the data and processing isn't subdivisible that way then you would probably have to start thinking about how you might handle whole objects in parallel, ie adopt a coarser granularity and one mutex per object. This is perhaps more likely to be possible - different objects are supposed to be independent of each other, so it should in theory be possible to process their data in parallel.
However the unavoidable truth is that some computer jobs require fast single thread performance. For that one starts seriously needing the right sort of supercomputer and perhaps some jolly long pipelines.

How to (unit) test if a function is lock free ?

I would like to add several unit tests to my code, also as I load plug ins I don't always have access to the code I'm running.
The test I would really like to check is if the function I'm calling is lock free ?
Is there any hook, or way to test if between a point A and B in my program there was a call to a non lock free function ?
Another less complicated function is how to hook all calls to locking functions (like locks, system calls ...). I know how to hook calls to malloc on windows but nothing else.
Thank you for your help
You can't.
You could substitute a different implementation of pthread_lock but code could make direct calls to e.g. futex, and if you replace that the code could still call it directly with syscall(SYS_futex,...). You could profile the code or use something like strace to detect all such calls, but that still wouldn't tell you if the code implements its own custom spinlock in assembly.
I'm pretty sure you can't do that without instrumenting the locks, or something similar.
One could come up with a lot of scenarios where the call of a locking function causes different behaviour in testing [possibly only when "special test-mode for identifying testing" is enabled] than in production code - for example, add a sleep for 100ms into the lock method, and try to use another locked function and compare the time with "no competiton for the lock.
Or we could keep a count of calls to lock, and see if the count before and after the function is the same (or has increased by the expected amount, if the function is supposed to call lock a certain number of times).
But a generic way that isn't intrusive into the locking mechanism, I'm pretty sure it's impossible.
Of course, code-review and clear documentation as to what code calls locks and which doesn't would also be useful - and good reviewers that spot errors.
As the others have already answered it is not possible to test whether the algorithm is lock-free or not. However, it is possible to test that it behaves consistently in a multi-threaded environment. My experience in this area is only using a lock-free queue (which I wrote myself, but based on an academic paper) so my tests are based around a queue which may or may not be useful to you.
I used multiple threads to test to hammer the queue.
Thread Safety: the queue must not crash under heavy loads
Speed: how does the response times vary under a heavy load
Consistency: the queue mustn't loose items.
In my test, I also varied the number of readers and writers. The queue will behave differently depending on the ratio of readers to writers. More readers than writers will generally result in a nearly empty queue, whereas the inverse will result in a queue that continually expands until the writers stop writing.
Point 2 might be of interest to you as you can you can generally tell if the algorithm is lock-free or not based on the variance of response times under a heavy load. If response times remain fast under a heavy load then you can infer that the algorithm is lock-free. Or at least if it isn't it behaves as it if is.

Multithreading: a blocking wait with timeout

I'm using TinyThread++ to get clean and simple platform independent control over threading features in my project. I just came upon a situation where I'd like to have responsive synchronized message passing without pegging the CPU, while allowing a thread to continue to do a bit of work on the side while it is idle. Sure, I could simply spawn a third thread to do this "other work" but all I'm missing is a condition variable wait(int ms) type function rather than the wait() that already works great. The idea is that I'd like for it to block only for up to ms milliseconds, so it will be able to time out and perform some actions periodically (during which the thread will not be actively waiting on the condition variable). The idea is that even though it's nice to have the thread sitting there waiting to pounce on any incoming messages, if I give it some task to do on the side which takes only 50 microseconds to execute, and I only need to run that once every second, it definitely shouldn't push me to make yet another thread (and message queue and other resources) to get it done.
Does any of this make sense? I'm looking for suggestions on how i might go about implementing this. I'm hoping adding a couple of lines to the TinyThread code can provide me with this functionality.
Well the source code for the wait function isn't very complicated so making the required modificiations looks simple enough:
The linux implementation relies on the pthread_cond_wait function
which can trivially be changed to the pthread_cond_timedwait
function. Do read the documentation carefully in case I forgot about any minutias.
On the windows side of things, it's a little more
complicated and I'm no expert on multithreading on windows. That
being said, if there's a timed version of the _wait function (I'm pretty sure there is),
changing that should work just fine. Again, read over the documentation carefully before doing any modifications.
Now before you go off and do these modifications, I don't think what you're trying to do is a good idea. The main advantage of using threads is to conceptually seperate different tasks. Trying to do multiple things in a single thread is a bit like trying to do multiple things in a single function: it complicates the design and makes things harder to debug. So unless the overhead of creating a new thread is provably too great or unless the resulting code remains simple and easy to understand, I'd split it up into multiple threads.
Finally, I get the feeling that you might not be aware that condition variables can return spuriously (returns without anybody having done any signalling or returns when the condition is still false). So just in case, I'd suggest reviewing the usage examples and making sure you understand why those loops are there.

Is checking current thread inside a function ok?

Is it ok to check the current thread inside a function?
For example if some non-thread safe data structure is only altered by one thread, and there is a function which is called by multiple threads, it would be useful to have separate code paths depending on the current thread. If the current thread is the one that alters the data structure, it is ok to alter the data structure directly in the function. However, if the current thread is some other thread, the actual altering would have to be delayed, so that it is performed when it is safe to perform the operation.
Or, would it be better to use some boolean which is given as a parameter to the function to separate the different code paths?
Or do something totally different?
What do you think?
You are not making all too much sense. You said a non-thread safe data structure is only ever altered by one thread, but in the next sentence you talk about delaying any changes made to that data structure by other threads. Make up your mind.
In general, I'd suggest wrapping the access to the data structure up with a critical section, or mutex.
It's possible to use such animals as reader/writer locks to differentiate between readers and writers of datastructures but the performance advantage for typical cases usually wont merit the additional complexity associated with their use.
From the way your question is stated, I'm guessing you're fairly new to multithreaded development. I highly suggest sticking with the simplist and most commonly used approaches for ensuring data integrity (most books/articles you readon the issue will mention the same uses for mutexes/critical sections). Multithreaded development is extremely easy to get wrong and can be difficult to debug. Also, what seems like the "optimal" solution very often doesn't buy you the huge performance benefit you might think. It's usually best to implement the simplist approach that will work then worry about optimizing it after the fact.
There is a trick that could work in case, as you said, the other threads will only make changes only once in a while, although it is still rather hackish:
make sure your "master" thread can't be interrupted by the other ones (higher priority, non fair scheduling)
check your thread
if "master", just change
if other, put off scheduling, if needed by putting off interrupts, make change, reinstall scheduling
really test to see whether there are no issues in your setup.
As you can see, if requirements change a little bit, this could turn out worse than using normal locks.
As mentioned, the simplest solution when two threads need access to the same data is to use some synchronization mechanism (i.e. critical section or mutex).
If you already have synchronization in your design try to reuse it (if possible) instead of adding more. For example, if the main thread receives its work from a synchronized queue you might be able to have thread 2 queue the data structure update. The main thread will pick up the request and can update it without additional synchronization.
The queuing concept can be hidden from the rest of the design through the Active Object pattern. The activ object may also be able to publish the data structure changes through the Observer pattern to other interested threads.

Thread related issues and debugging them

This is my follow up to the previous post on memory management issues. The following are the issues I know.
1)data races (atomicity violations and data corruption)
2)ordering problems
3)misusing of locks leading to dead locks
4)heisenbugs
Any other issues with multi threading ? How to solve them ?
Eric's list of four issues is pretty much spot on. But debugging these issues is tough.
For deadlock, I've always favored "leveled locks". Essentially you give each type of lock a level number. And then require that a thread aquire locks that are monotonic.
To do leveled locks, you can declare a structure like this:
typedef struct {
os_mutex actual_lock;
int level;
my_lock *prev_lock_in_thread;
} my_lock_struct;
static __tls my_lock_struct *last_lock_in_thread;
void my_lock_aquire(int level, *my_lock_struct lock) {
if (last_lock_in_thread != NULL) assert(last_lock_in_thread->level < level)
os_lock_acquire(lock->actual_lock)
lock->level = level
lock->prev_lock_in_thread = last_lock_in_thread
last_lock_in_thread = lock
}
What's cool about leveled locks is the possibility of deadlock causes an assertion. And with some extra magic with FUNC and LINE you know exactly what badness your thread did.
For data races and lack of synchronization, the current situation is pretty poor. There are static tools that try to identify issues. But false positives are high.
The company I work for ( http://www.corensic.com ) has a new product called Jinx that actively looks for cases where race conditions can be exposed. This is done by using virtualization technology to control the interleaving of threads on the various CPUs and zooming in on communication between CPUs.
Check it out. You probably have a few more days to download the Beta for free.
Jinx is particularly good at finding bugs in lock free data structures. It also does very well at finding other race conditions. What's cool is that there are no false positives. If your code testing gets close to a race condition, Jinx helps the code go down the bad path. But if the bad path doesn't exist, you won't be given false warnings.
Unfortunately there's no good pill that helps automatically solve most/all threading issues. Even unit tests that work so well on single-threaded pieces of code may never detect an extremely subtle race condition.
One thing that will help is keeping the thread-interaction data encapsulated in objects. The smaller the interface/scope of the object, the easier it will be to detect errors in review (and possibly testing, but race conditions can be a pain to detect in test cases). By keeping a simple interface that can be used, clients that use the interface will also be correct just by default. By building up a bigger system from lots of smaller pieces (only a handful of which actually do thread-interaction), you can go a long way towards averting threading errors in the first place.
The four most common problems with theading are
1-Deadlock
2-Livelock
3-Race Conditions
4-Starvation
How to solve [issues with multi threading]?
A good way to "debug" MT applications is through logging. A good logging library with extensive filtering options makes it easier. Of course, logging itself influences the timing, so you still can have "heisenbugs", but it's much less likely than when you're actuall breaking into the debugger.
Prepare and plan for that. Include a good logging facility into your application from the start.
Make your threads as simple as possible.
Try not to use global variables. Global constants (actual constants that never change) is fine. When you do need to use global or shared variables you need to protect them with some type of mutex/lock (semaphore, monitor, ...).
Make sure that you actually understand what how your mutexes work. There are a few different implementations which can work differently.
Try to organize your code so that the critical sections (places where you hold some type of lock(s) ) are as quick as possible. Be aware that some functions may block (sleep or wait on something and keep the OS from allowing that thread to continue running for some time). Do not use these while holding any locks (unless absolutely necessary or during debugging as it can sometimes show other bugs).
Try to understand what more threads actually does for you. Blindly throwing more threads at a problem is very often going to make things worse. Different threads compete for the CPU and for locks.
Deadlock avoidance requires planning. Try to avoid having to acquire more than one lock at a time. If this is unavoidable decide on an ordering you will use to acquire and release the locks for all threads. Make sure you know what deadlock really means.
Debugging multi-threaded or distributed applications is difficult. If you can do most of the debugging in a single threaded environment (maybe even just forcing other threads to sleep) then you can try to eliminate non-threading centric bugs before jumping into multi-threaded debugging.
Always think about what the other threads might be up to. Comment this in your code. If you are doing something a certain way because you know that at that time no other thread should be accessing a certain resource write a big comment saying so.
You may want to wrap calls to mutex locks/unlocks in other functions like:
int my_lock_get(lock_type lock, const char * file, unsigned line, const char * msg) {
thread_id_type me = this_thread();
logf("%u\t%s (%u)\t%s:%u\t%s\t%s\n", time_now(), thread_name(me), me, "get", msg);
lock_get(lock);
logf("%u\t%s (%u)\t%s:%u\t%s\t%s\n", time_now(), thread_name(me), me, "in", msg);
}
And a similar version for unlock. Note, the functions and types used in this are all made up and not overly based on any one API.
Using something like this you can come back if there is an error and use a perl script or something like it to run queries on your logs to examine where things went wrong (matching up locks and unlocks, for instance).
Note that your print or logging functionality may need to have locks around it as well. Many libraries already have this built in, but not all do. These locks need to not use the printing version of the lock_[get|release] functions or you'll have infinite recursion.
Beware of global variables even if
they are const, in particular in
C++. Only POD that are statically
initialized "à la" C are good here.
As soon as a run-time constructor
comes into play, be extremely
careful. AFAIR initialization order
of variables with static linkage that are in
different compilation units are
called in an undefined order. Maybe
C++ classes that initialize all
their members properly and have an
empty function body, could be ok
nowadays, but I once had a bad
experience with that, too.
This is one of the reason why on the
POSIX side pthread_mutex_t is much
easier to program than sem_t: it
has a static initializer
PTHREAD_MUTEX_INITIALIZER.
Keep critical sections as short as
possible, for two reasons: it might
be more efficient at the end, but
more importantly it is easier to
maintain and to debug.
A critical section should never be
longer that a screen, including the
locking and unlocking that is needed
to protect it, and including the
comments and assertions that help
the reader to understand what is
happening.
Start implementing critical sections
very rigidly maybe with one global
lock for them all, and relax the
constraints afterwards.
Logging might is difficult if many
threads start to write at the same
time. If every thread does a
reasonable amount of work try to
have them each write a file of their
own, such that they don't interlock
each other.
But beware, logging changes behavior
of code. This can be bad when bugs
disappear, or beneficial when bugs
appear that you otherwise wouldn't
have noticed.
To make a post-mortem analysis of
such a mess you have to have
accurate timestamps on each line
such that all the files can be
merged and give you a coherent view
of the execution.
-> Add priority inversion to that list.
As another poster eluded to, log files are wonderful things. For deadlocks, using a LogLock instead of a Lock can help pinpoint when you entities stop working. That is, once you know you've got a deadlock, the log will tell you when and where locks were instantiated and released. This can be enormously helpful in tracking these things down.
I've found that race conditions when using an Actor model following the same message->confirm->confirm received style seem to disappear. That said, YMMV.