I have a curious situation (at least for me :D) in C++.
My code is:
static void startThread(Object* r) {
    while (true)
    {
        while (!r->commands->empty())
        {
            doSomething();
        }
    }
}
I start this function as a thread using Boost; commands in r is a queue, which I fill from another thread...
The problem is that if I fill the queue first and then start this thread, everything works fine... But if I run startThread first and fill up the commands queue after that, it does not work... doSomething() will not run...
However, if I modify startThread:
static void startThread(Object* r) {
    while (true)
    {
        std::cout << "c" << std::endl;
        while (!r->commands->empty())
        {
            doSomething();
        }
    }
}
I just added the cout... and it works... Can anybody explain why it works with the cout and not without? Or does anybody have an idea what could be wrong?
Maybe the compiler is doing some kind of optimization? I don't think so... :(
Thanks
But if I run startThread first and fill up the commands queue after that, it does not work... doSomething() will not run
Of course not! What did you expect? Your queue is empty, so !r->commands->empty() will be false.
I just added the cout... and it works
You got lucky. cout is comparatively slow, so your main thread had a chance to fill the queue before the inner while test was executed for the first time.
So why does the thread not see an updated version of r->commands after it has been filled by the main thread? Because nothing in your code indicates that your variable is going to change from the outside, so the compiler assumes that it doesn’t.
In fact, the compiler sees that nothing in the loop can change what r points to or its contents, so it can simply remove the redundant checks from the inner loop. When working with multithreaded code, you explicitly need to tell C++ that a variable can be changed from a different context, using atomic memory access or another synchronization mechanism.
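For illustration, here is a minimal sketch of one way to do that with a mutex guarding the queue; the element type, the mutex member and the pop() call are assumptions added for the example, not part of the original Object:

#include <mutex>
#include <queue>

// Minimal sketch: element type, the mutex member and the pop() inside the
// loop are assumptions added for illustration.
struct Object {
    std::queue<int>* commands;
    std::mutex m;          // assumed: guards every access to *commands
};

static void doSomething() { /* process one command */ }

static void startThread(Object* r) {
    while (true) {
        std::lock_guard<std::mutex> lock(r->m);   // synchronized view of the queue
        while (!r->commands->empty()) {
            doSomething();
            r->commands->pop();                   // assumed: consume the command here
        }
    }
}

The thread that pushes into r->commands has to take the same mutex; with that in place the compiler can no longer cache a stale result of empty().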
When you first run the thread and then fill up the queue, it is logical that the inner loop is not entered, since the test !r->commands->empty() is false at that point. After you add the cout statement it appears to work, because printing the output takes some time, and meanwhile the other thread fills up the queue, so the condition becomes true. But it is not good programming to rely on such timing in a multi-threading environment.
There are two inter-related issues:
You are not forcing a reload of r->commands or r->commands->empty(), so your compiler, diligent as it is in its search for the pinnacle of performance, caches the result. Adding more code might make the compiler drop this optimisation if it can no longer prove that the caching is still valid.
You have a data race, so your program has undefined behavior. (I am assuming doSomething() removes an element and some other thread adds elements.)
1.10 Multi-threaded executions and data races § 21
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. [ Note: It can be shown that programs that correctly use mutexes and memory_order_seq_cst operations to prevent all data races and use no other synchronization operations behave as if the operations executed by their constituent threads were simply interleaved, with each value computation of an object being taken from the last side effect on that object in that interleaving. This is normally referred to as “sequential consistency”. However, this applies only to data-race-free programs, and data-race-free programs cannot observe most program transformations that do not change single-threaded program semantics. In fact, most single-threaded program transformations continue to be allowed, since any program that behaves differently as a result must perform an undefined operation. —end note ]
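To connect this back to the code in the question: a mutex plus a condition variable is the usual data-race-free way to wait for work without spinning. A minimal sketch, where the int element type and the missing shutdown path are simplifications of my own:

#include <condition_variable>
#include <mutex>
#include <queue>

std::queue<int> commands;
std::mutex m;
std::condition_variable cv;

void consumer() {
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return !commands.empty(); });  // sleeps instead of spinning
        int cmd = commands.front();
        commands.pop();
        lock.unlock();
        // process cmd here
        (void)cmd;
    }
}

void producer(int cmd) {
    {
        std::lock_guard<std::mutex> lock(m);
        commands.push(cmd);
    }
    cv.notify_one();   // wake the consumer after the push is visible
}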
Related
I am building a very simple program as an exercise.
The idea is to compute the total size of a directory by recursively iterating over all its contents, and summing the sizes of all files contained in the directory (and its subdirectories).
To show to a user that the program is still working, this computation is performed on another thread, while the main thread prints a dot . once every second.
Now the main thread of course needs to know when it should stop printing dots and can look up a result.
It is possible to use e.g. a std::atomic<bool> done(false); and pass this to the thread that will perform the computation, which will set it to true once it is finished. But I am wondering if in this simple case (one thread writes once completed, one thread reads periodically until nonzero) it is necessary to use atomic data types for this. Obviously if multiple threads might write to it, it needs to be protected. But in this case, there's only one writing thread and one reading thread.
Is it necessary to use an atomic data type here, or is it overkill and could a normal data type be used instead?
Yes, it's necessary.
The issue is that the different cores of the processor can have different views of the "same" data, notably data that's been cached within the CPU. The atomic part ensures that these caches are properly flushed so that you can safely do what you are trying to do.
Otherwise, it's quite possible that the other thread will never actually see the flag change from the first thread.
Yes it is necessary. The rule is that if two threads could potentially be accessing the same memory at the same time, and at least one of the threads is a writer, then you have a data race. Any execution of a program with a data race has undefined behavior.
Relevant quotes from the C++14 standard:
1.10/23
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.
1.10/6
Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one accesses or modifies the same memory location.
Yes, it is necessary. Otherwise it is not guaranteed that changes to the bool in one thread will be observable in the other thread. In fact, if the compiler sees that the bool variable is, apparently, not ever used again in the execution thread that sets it, it might completely optimize away the code that sets the value of the bool.
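For reference, a minimal sketch of the flag pattern being discussed; the directory-size computation is replaced by a placeholder value, so computeSize() here is hypothetical. The release store pairs with the acquire load and also makes the result visible to the main thread:

#include <atomic>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <thread>

std::atomic<bool> done{false};
std::uintmax_t totalSize = 0;

void computeSize() {
    totalSize = 42;                                  // stand-in for the real computation
    done.store(true, std::memory_order_release);     // also publishes totalSize
}

int main() {
    std::thread worker(computeSize);
    while (!done.load(std::memory_order_acquire)) {  // pairs with the release store
        std::cout << '.' << std::flush;
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
    worker.join();
    std::cout << "\ntotal size: " << totalSize << '\n';
}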
I have a list that I want different threads to grab elements from. In order to avoid locking the mutex guarding the list when it's empty, I check empty() before locking.
It's okay if the call to list::empty() isn't right 100% of the time. I only want to avoid crashing or disrupting concurrent list::push() and list::pop() calls.
Am I safe to assume VC++ and Gnu GCC will only sometimes get empty() wrong and nothing worse?
if (list.empty() == false) { // unprotected by mutex, okay if incorrect sometimes
    mutex.lock();
    if (list.empty() == false) { // check again while locked to be certain
        element = list.back();
        list.pop_back();
    }
    mutex.unlock();
}
It's okay if the call to list::empty() isn't right 100% of the time.
No, it is not okay. If you check if the list is empty outside of some synchronization mechanism (locking the mutex) then you have a data race. Having a data race means you have undefined behavior. Having undefined behavior means we can no longer reason about the program and any output you get is "correct".
If you value your sanity, you'll take the performance hit and lock the mutex before checking. That said, the list might not even be the correct container for you. If you can let us know exactly what you are doing with it, we might be able to suggest a better container.
There is a read and a write (most probably of the size member of std::list, if we assume it is named like that) that are not synchronized with regard to each other. Imagine that one thread calls empty() (in your outer if()) while the other thread has entered the inner if() and is executing pop_back(). You are then reading a variable that is possibly being modified. This is undefined behaviour.
As an example of how things could go wrong:
A sufficiently smart compiler could see that mutex.lock() cannot possibly change the list.empty() return value and thus skip the inner if check completely, eventually leading to a pop_back on a list that had its last element removed after the first if.
Why can it do that? There is no synchronization in list.empty(), thus if it were changed concurrently that would constitute a data race. The standard says that programs shall not have data races, so the compiler will take that for granted (otherwise it could perform almost no optimizations whatsoever). Hence it can assume a single-threaded perspective on the unsynchronized list.empty() and conclude that it must remain constant.
This is only one of several optimizations (or hardware behaviors) that could break your code.
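A minimal sketch of the safe variant, where emptiness is only ever observed while the mutex is held (the int element type is an assumption; the names mirror the question's snippet):

#include <list>
#include <mutex>

std::list<int> list_;
std::mutex mutex_;

bool try_pop(int& element) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (list_.empty())
        return false;          // no unsynchronized peek, so no data race
    element = list_.back();
    list_.pop_back();
    return true;
}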
I want to test some Object's function for thread safety in a race condition. In order to test this, I would like to call a function simultaneously from two (or more) different threads. How can I write code that guarantees the function calls will occur at the same time, or at least close enough that it has the desired effect?
The best you can do is hammer heavily at the code and check all the little signs you may get of an issue. If there's a race-condition, you should be able to write code that will eventually trigger it. Consider:
#include <thread>
#include <assert.h>

int x = 0;

void foo()
{
    while (true)
    {
        x = x + 1;
        x = x - 1;
        assert(x == 0);
    }
}

int main()
{
    std::thread t(foo);
    std::thread t2(foo);
    t.join();
    t2.join();
}
Everywhere I test it, it asserts pretty quickly. I could then add critical sections until the assert is gone.
In fact, there's no guarantee that it ever will assert, but I've used this technique repeatedly on large-scale production code. You may just need to hammer at your code for a long while to be sure.
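For comparison, here is the same test with a critical section added, which is roughly what "add critical sections until the assert is gone" looks like in practice (the loop still runs forever, as in the original):

#include <assert.h>
#include <mutex>
#include <thread>

int x = 0;
std::mutex m;

void foo() {
    while (true) {
        std::lock_guard<std::mutex> lock(m);   // serializes the read-modify-write sequence
        x = x + 1;
        x = x - 1;
        assert(x == 0);
    }
}

int main() {
    std::thread t(foo);
    std::thread t2(foo);
    t.join();
    t2.join();
}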
Have a struct containing an array of integers initialized to zero, perhaps 300-500 kB long. Then, from two threads, copy two other structs into it (one filled with 1s, the other with 2s), and only afterwards check from the main thread, via an atomic flag, that both copies have finished, so you know the racy region is over.
This has a good chance of exposing the race, and you may see a mix of 1s, 2s (and even 0s?) in the array, which tells you it happened.
But once you delete all the control machinery such as the atomics, the reshaped code is still undefined behavior and may behave differently.
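A rough sketch of that idea; the size, the patterns and the final check are arbitrary choices of mine, and the copy is intentionally racy (undefined behavior), so any observed mix of values is only a hint, not a guarantee:

#include <cstring>
#include <iostream>
#include <thread>
#include <vector>

constexpr std::size_t N = 100000;
std::vector<int> shared_buf(N, 0);   // starts as all zeros

void fill_with(int value) {
    std::vector<int> src(N, value);
    std::memcpy(shared_buf.data(), src.data(), N * sizeof(int));  // races with the other thread
}

int main() {
    std::thread t1(fill_with, 1);
    std::thread t2(fill_with, 2);
    t1.join();
    t2.join();
    bool saw1 = false, saw2 = false;
    for (int v : shared_buf) { saw1 = saw1 || v == 1; saw2 = saw2 || v == 2; }
    if (saw1 && saw2)
        std::cout << "observed a mix of 1s and 2s\n";
}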
A great way to do this is by inserting well-timed sleep calls. You can use this, for example, to force combinations of events in an order you want to test (Thread 1 does something, then Thread 2 does something, then Thread 1 does something else). A downside is that you have to have an idea of where to put the sleep calls. After doing this for a little bit you should start to get a feel for it, but some good intuition helps in the beginning.
You may be able to conditionally call sleep or hit a breakpoint from a specific thread if you can get a handle to the thread id.
Also, I'm pretty sure that Visual Studio and (I think) GDB allow you to freeze some threads and/or run specific ones.
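For example, a sketch of forcing one particular interleaving with sleeps; the timings and the printed steps are placeholders for calls into the object under test:

#include <chrono>
#include <iostream>
#include <thread>

void thread1() {
    std::cout << "T1: step A\n";
    std::this_thread::sleep_for(std::chrono::milliseconds(200));  // open a window for T2
    std::cout << "T1: step B\n";
}

void thread2() {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));  // land between A and B
    std::cout << "T2: step in between\n";
}

int main() {
    std::thread a(thread1), b(thread2);
    a.join();
    b.join();
}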
What is the worst thing that could happen when using a normal bool flag for controlling when one thread stops what it is doing? The peculiarity is that the exact time at which the thread stops is not very important at all; it just plays back some media, and it might even be a half-second late in reacting for all I care. It has a simple while (!restart) loop:
while (!restart) // bool restart
{
    // do something
}
and the other thread changes some settings and then sets restart to true:
someSetting = newSetting;
restart = 1;
Since the playback loop runs thousands of times per second, I'm worried that using an atomic bool might increase latency. I understand that this is "undefined behavior", but how does that manifest itself? If the bool is 54r*wx]% at some point, so what? Can I get runtime errors? The bool changes to a comprehensible value EVENTUALLY, doesn't it? (The code works currently, btw.) In another post, someone suggested that the flag might never change at all, since the threads have separate caches - this sounds iffy to me; surely the compiler must make sure that shared variables are changed even if a data race exists? Or is it possible that the order of execution for the controlling thread might change and someSetting might be changed after restart? Again, that just sounds creepy; why would a compiler allow that to happen?
I have considered setting a counter inside the loop and checking an atomic bool flag only every thousand times through. But I don't want to do that unless I really must.
UB doesn't mean that your code doesn't work; it just means that the behaviour of your code isn't specified by the standard. You must use std::atomic to make your code standard-compliant; with memory_order_relaxed you can do so without actually changing the behaviour:
std::atomic<int> restart = ...;

while (!restart.load(std::memory_order_relaxed))
{
    // do something
}

and in another thread:

someSetting = newSetting;
restart.store(1, std::memory_order_relaxed);
On typical platforms, this code will emit the same instructions as yours.