Why is there a race condition in this multithreading snippet? (C++)

I have this C++ code that uses multithreading, but I am unsure why I am getting the output that I see:
void Fun(int* var) {
    int myID;
    myID = *var;
    std::cout << "Thread ID: " << myID << std::endl;
}
int main()
{
    using ThreadVector = std::vector<std::thread>;
    ThreadVector tv;
    std::cout << std::thread::hardware_concurrency() << std::endl;
    for (int i = 0; i < 3 ; ++i)
    {
        auto th = std::thread(&Fun, &i);
        tv.push_back(std::move(th));
    }
    for (auto& elem : tv)
    {
        elem.join();
    }
}
I am wondering if there is a race condition for the i variable, and if so, how does it interleave? I tried to compile it and I constantly got the Thread ID printout as 3, but I was surprised because I thought the variable had to be global in order to be accessed by the various new threads?
This is what I thought would happen: thread 1 is created, Fun starts to run in thread 1 with myID = 0, the main thread continues running and increments i, the 2nd thread is created and the myID for that would be myID = 1, and so on. So the printout would be the myID in increments, i.e. 1, 2, 3.
I know that I can solve this with std::lock_guard but I am just wondering how is the interleaving (LOAD, INCREMENT,STORE) happening that causes this race condition for the i variable.
Any help is appreciated, thank you!

I am wondering if there is a race condition for the i variable
Yes, most definitely. The parent thread writes to i, which is a non-atomic variable, and the child threads read it, without any intervening synchronization. That's the exact definition of a data race in C++.
and if so, how does it interleave?
Data races in C++ cause undefined behavior, and any behavior you may observe does not have to be explainable by interleaving.
I tried to compile it and I constantly got the Thread ID printout as 3, but I was surprised because I thought the variable had to be global in order to be accessed by the various new threads?
No, it doesn't have to be global. Threads can access variables which are local to other threads if they are somehow passed a pointer or reference to such a variable.
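As a minimal sketch of that point, the snippet below hands a child thread a reference to a variable that is local to the calling thread. The names store_answer and run_local_sharing_demo are made up for illustration. The access is safe here only because the caller does not touch the variable until after join().

```cpp
#include <cassert>
#include <functional>  // std::ref
#include <thread>

// Hypothetical helper: writes through a reference to a variable owned by the caller.
void store_answer(int& out) { out = 42; }

// Returns the value that the child thread wrote into a local of this function.
int run_local_sharing_demo() {
    int result = 0;  // local to the calling thread, not global
    std::thread worker(store_answer, std::ref(result));
    worker.join();   // join() synchronizes: the write is visible and no race remains
    assert(result == 42);
    return result;
}
```

Calling run_local_sharing_demo() from main returns 42. If the caller read or wrote result before the join, the program would have the same kind of data race as the code in the question.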
This is what I thought would happen: thread 1 is created, Fun starts to run in thread 1 with myID = 0, the main thread continues running and increments i, the 2nd thread is created and the myID for that would be myID = 1, and so on. So the printout would be the myID in increments, i.e. 1, 2, 3.
Well, nothing at all in your program forces those events to occur (or become observable) in that order, so there is really no basis for expecting that they will. It's entirely possible, for instance, that the three threads all get started but don't get a chance to actually run until after the loop in main has completed, at which point i has the value 3. (Or rather, the memory where i used to be located has that value: i is now out of scope and its lifetime has ended, so the threads read through a dangling pointer. That is a separate bug, and nothing in your program prevents it.)

This is the version of the code that would not exhibit a data race:
#include <iostream>
#include <thread>
#include <vector>
// since `id` is passed by value, each thread will work on its own copy and no
// data race is possible
void fun(int id) { std::cout << "thread id: " << id << "\n"; }
int main() {
    std::vector<std::thread> threads;
    for (auto id = 0; id < 3; ++id) {
        threads.emplace_back(fun, id);
    }
    for (auto& thread : threads) {
        thread.join();
    }
}
Since each thread receives a copy of the variable id, there is no race (except for the scrambled output due to unsynchronized std::cout, but I assume that's not part of this discussion).
Variables do not need to be global for their use in multiple threads. In fact, global variables often make it more difficult or even practically impossible to write multithreaded code, since there is no guarantee that every read and write will be appropriately synchronized.

Related

Thread usage counter C++

In a C++ class, how can I limit the number of calls/uses of a certain function for each thread?
For example, each thread is allowed to use a certain data setter only 3 times.
You just have to count how often the method has been called for each thread and then react accordingly:
void Foo::set(int x) {
    static std::map<std::thread::id,unsigned> counter;
    auto counts = ++counter[std::this_thread::get_id()];
    if (counts > max_counts) return;
    x_member = x;
}
This is just to outline the basic idea. I am not so sure about the static map, and I am not even sure whether it is a good idea to let the method itself implement the counter. I would rather put this elsewhere; e.g. each thread could get a CountedFoo instance that holds a reference to the actual Foo object, and the CountedFoo enforces the maximum number of calls.
PS: And of course, don't forget to use some synchronisation when multiple threads are calling the method concurrently (for the sake of brevity I did not include any mutex or similar in the above code).
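The synchronisation that was left out could look roughly like the sketch below. It is only one possible arrangement: the mutex member, the value 3 for max_counts, the bool return value and the helper functions are assumptions made for illustration, not part of the original Foo.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <thread>

// Sketch of Foo with a per-thread call quota; one mutex guards the map and the member.
class Foo {
public:
    // Returns true if the value was stored, false once this thread used up its quota.
    bool set(int x) {
        std::lock_guard<std::mutex> lock(mutex_);
        unsigned& calls = counter_[std::this_thread::get_id()];
        if (calls >= max_counts) return false;
        ++calls;
        x_member = x;
        return true;
    }
    int get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return x_member;
    }
private:
    static constexpr unsigned max_counts = 3;  // assumed limit
    mutable std::mutex mutex_;
    std::map<std::thread::id, unsigned> counter_;
    int x_member = 0;
};

// Calls set() five times from a fresh thread and returns how many calls were accepted.
int accepted_calls_in_new_thread(Foo& foo) {
    int accepted = 0;
    std::thread t([&] {
        for (int i = 0; i < 5; ++i)
            if (foo.set(i)) ++accepted;
    });
    t.join();
    return accepted;
}

// Usage sketch: the first three calls succeed, the remaining two are rejected.
void demo_quota() {
    Foo foo;
    assert(accepted_calls_in_new_thread(foo) == 3);
    assert(foo.get() == 2);  // values 0, 1, 2 were stored; 3 and 4 were refused
}
```

One caveat with keying on std::thread::id: the value of an id may be reused after a thread has finished, so a long-running program that creates many short-lived threads would need to remove entries when threads exit.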
Using std::map to store thread ids, as suggested by @formerlyknownas_463035818, would probably be the most robust solution, but the synchronization might prove more complex.
The fastest solution to this issue is to use thread_local, which gives each thread its own copy of the counter. Here is a working example that might prove useful.
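#include <chrono>
#include <cstdlib>   // std::rand
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
thread_local unsigned int N_Calls = 0;  // each thread gets its own counter
std::mutex mtx;                         // serializes access to std::cout
void controlledIncreese(const std::string& thread_name){
    while (N_Calls < 3) {
        ++N_Calls;
        // std::rand() is used for brevity; it is not guaranteed to be thread-safe
        std::this_thread::sleep_for(std::chrono::seconds(std::rand() % 2));
        std::lock_guard<std::mutex> lock(mtx);
        std::cout << "Call for thread " << thread_name << ": " << N_Calls << '\n';
    }
}
int main(){
    std::thread first_t(controlledIncreese, "first"), second_t(controlledIncreese, "second");
    first_t.join();
    second_t.join();
}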
Since both threads print through std::cout while holding the same mutex, the output lines are serialized, so this specific example is not very exciting, but it does show a simple working approach to counting calls per thread.

Unexpected behavior when std::thread.detach is called

I've been trying to develop a better understanding of C++ threading, and to that end I have written the following example:
#include <functional>
#include <iostream>
#include <thread>
class Test {
public:
    Test() { x = 5; }
    void act() {
        std::cout << "1" << std::endl;
        std::thread worker(&Test::changex, this);
        worker.detach();
        std::cout << "2" << std::endl;
    }
private:
    void changex() {
        std::cout << "3" << std::endl;
        x = 10;
        std::cout << "4" << std::endl;
    }
    int x;
};
int main() {
    Test t;
    t.act();
    return 0;
}
I expected to get the following output when this is compiled with g++ and linked with -pthread:
1
2
3
4
as the cout calls are in that order. However, the output is inconsistent. 1 and 2 are always printed in order, but sometimes the 3 and/or 4 are either omitted or printed twice, i.e. 12, 123, 1234, or 12344.
My working theory is that the main thread exits before the worker thread begins working or completes, thus resulting in the omission of output. I can immediately think of a solution to this problem: create a global boolean variable that signifies when the worker thread has completed, and have the main thread wait for it to change state before exiting. This alleviates the issue.
However, this feels to me like a highly messy approach that likely has a cleaner solution, especially for an issue like this that likely comes up often in threading.
Just some general advice that holds both for raw pthreads in C++ and for pthreads wrapped in std::thread: the best way to get readable, comprehensible and debuggable behavior is to make thread synchronization and lifetime management explicit. That is, avoid pthread_kill and pthread_cancel, and in most cases avoid detaching threads and join them explicitly instead.
One design pattern I like is using an atomic flag, such as a std::atomic<bool>. When the main thread wants to quit, it sets the flag to true. The worker threads typically do their work in a loop and check the flag reasonably often, e.g. once per lap of the loop. When they find that main has ordered them to quit, they clean up and return. The main thread then joins all the workers.
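A minimal sketch of this pattern follows. The function name run_until_stopped and the iteration counter are made up for illustration; a real worker would do useful work in each lap instead of just counting.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Starts n_workers threads that loop until asked to stop; returns the total laps done.
long run_until_stopped(int n_workers) {
    assert(n_workers > 0);
    std::atomic<bool> stop{false};
    std::atomic<long> laps{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < n_workers; ++i) {
        workers.emplace_back([&] {
            do {
                ++laps;                     // stand-in for one lap of real work
                std::this_thread::yield();
            } while (!stop.load());         // check the flag once per lap
            // clean-up would go here before the thread returns
        });
    }
    // The main thread would normally do its own work here.
    stop.store(true);                       // order the workers to quit
    for (auto& w : workers) w.join();       // wait until every worker has returned
    return laps.load();
}
```

Because every worker is joined before the function returns, nothing outlives the flag or the counter, which is the explicit lifetime management the advice above argues for.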
There are some special cases that require extra care, for example when one worker is stuck in a blocking syscall and/or C library function. Usually, the platform provides ways of getting out of such blocking calls without resorting to e.g. pthread_cancel, since thread cancellation works very badly with C++. One example of how to avoid blocking is the Linux manpage for getaddrinfo_a, i.e. asynchronous network address translation.
One additional nice design pattern applies when workers are sleeping in e.g. select(). You can then add an extra control pipe between main and the worker. Main signals the worker to quit by writing one byte to the pipe, thus waking up the worker if it is sleeping in select().
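The control-pipe idea can be sketched as follows, assuming a POSIX system (pipe, select, read and write are POSIX calls, not standard C++). The function name wake_worker_via_pipe is hypothetical, and a real worker would also add its sockets or other descriptors to the set it passes to select().

```cpp
#include <cassert>
#include <sys/select.h>
#include <thread>
#include <unistd.h>

// Returns true if the worker was woken from select() by a byte on the control pipe.
bool wake_worker_via_pipe() {
    int fds[2];                        // fds[0] is the read end, fds[1] the write end
    if (pipe(fds) != 0) return false;
    assert(fds[0] < FD_SETSIZE);       // select() can only watch descriptors below FD_SETSIZE
    bool woken = false;
    std::thread worker([&] {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(fds[0], &readable);
        // Blocks with no timeout until the control pipe becomes readable.
        int ready = select(fds[0] + 1, &readable, nullptr, nullptr, nullptr);
        if (ready > 0 && FD_ISSET(fds[0], &readable)) {
            char byte;
            woken = (read(fds[0], &byte, 1) == 1);  // consume the quit byte
        }
    });
    char quit = 'q';
    ssize_t written = write(fds[1], &quit, 1);      // main tells the worker to quit
    worker.join();                                  // after join, reading `woken` is safe
    close(fds[0]);
    close(fds[1]);
    return written == 1 && woken;
}
```

The byte value itself carries no meaning in this sketch; its arrival is the signal. It also does not matter whether the byte is written before or after the worker reaches select(), since the pipe stays readable until the byte is consumed.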
Example of how the join could be made explicit, by keeping the thread as a class member:
#include <functional>
#include <iostream>
#include <thread>
class Test {
    std::thread worker; // worker is now a member
public:
    Test() { x = 5; } // worker deliberately left without a function to run.
    ~Test()
    {
        if (worker.joinable()) // worker can be joined (act was called successfully)
        {
            worker.join(); // wait for worker thread to exit.
            // Note destructor cannot complete if thread cannot be exited.
            // Some extra brains needed here for production code.
        }
    }
    void act() {
        std::cout << "1" << std::endl;
        worker = std::thread(&Test::changex, this); // give worker some work
        std::cout << "2" << std::endl;
    }
    // rest unchanged.
private:
    void changex() {
        std::cout << "3" << std::endl;
        x = 10;
        std::cout << "4" << std::endl;
    }
    int x;
};
int main() {
    Test t;
    t.act();
    return 0;
} // t destroyed here. Destruction halts and waits for thread.

How do I print in a new thread without threads interrupting lines? (particularly c++)

I've worked a decent amount with threading in C on Linux, and now I'm trying to do the same with C++ on Windows, but I'm having trouble with printing to the standard output. In the function the thread carries out I have:
void print_number(void* x){
    int num = *(static_cast<int*> (x));
    std::cout << "The number is " << num << std::endl;
}
wrapped in a loop that creates three threads. The problem is that although everything gets printed, the threads seem to interrupt each other between each of the "<<"'s.
For example, the last time I ran it I got
The number is The number is 2The number is 3
1
when I was hoping for each on a separate line. I'm guessing that each thread is able to write to the standard output after another has written a single section between "<<"s. In C, this wasn't a problem because the buffer wasn't flushed until everything I needed to write was there, but I don't think that's the case now. Is this a case where a mutex is needed?
In C++, we would first of all prefer to take the argument as an int* rather than a void*. And then, we can just lock. In C++11:
std::mutex mtx; // somewhere, in case you have other print functions
                // that you want to control
void print_number(int* num) {
    std::unique_lock<std::mutex> lk{mtx}; // RAII. Unlocks when lk goes out of scope
    std::cout << "The number is " << *num << std::endl;
}
If not C++11, there's boost::mutex and boost::mutex::scoped_lock that work the same way and do the same thing.
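Your C example should not be relied on in general: a single printf call is typically atomic with respect to other stdio calls (POSIX, for instance, requires stdio functions to lock the stream), but a line assembled from several separate calls can be interleaved in just the same way.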
This is indeed a case for a mutex. I typically declare it as a function-local static. E.g.:
void atomic_print(/*args*/) {
    static MyMutex mutex;
    mutex.acquire();
    printf(/*with the args*/);
    mutex.release();
}
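With the standard library, the same idea might look like the sketch below. It swaps the placeholder MyMutex for std::mutex and the manual acquire/release for std::lock_guard; the function name locked_print and its single string parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <string>

// Prints one whole line while holding a function-local static mutex.
// Returns the number of characters that printf reports having written.
int locked_print(const std::string& text) {
    // Sketch-only precondition: one call prints exactly one line.
    assert(text.find('\n') == std::string::npos);
    static std::mutex mutex;                  // initialized once, thread-safely, since C++11
    std::lock_guard<std::mutex> lock(mutex);  // released on every exit path
    return std::printf("%s\n", text.c_str());
}
```

The lock_guard releases the mutex even if the function exits early, which the manual acquire/release pair does not guarantee. As with any mutex, this only helps if every piece of code that prints goes through locked_print (or locks the same mutex).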

Boost Mutex Scoped Lock

I was reading through a Boost Mutex tutorial on drdobbs.com, and found this piece of code:
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/bind.hpp>
#include <iostream>
boost::mutex io_mutex;
void count(int id)
{
    for (int i = 0; i < 10; ++i)
    {
        boost::mutex::scoped_lock lock(io_mutex);
        std::cout << id << ": " << i << std::endl;
    }
}
int main(int argc, char* argv[])
{
    boost::thread thrd1(boost::bind(&count, 1));
    boost::thread thrd2(boost::bind(&count, 2));
    thrd1.join();
    thrd2.join();
    return 0;
}
Now I understand the point of a Mutex is to prevent two threads from accessing the same resource at the same time, but I don't see the correlation between io_mutex and std::cout. Does this code just lock everything within the scope until the scope is finished?
Now I understand the point of a Mutex is to prevent two threads from accessing the same resource at the same time, but I don't see the correlation between io_mutex and std::cout.
std::cout is a global object, so you can see that as a shared resource. If you access it concurrently from several threads, those accesses must be synchronized somehow, to avoid data races and undefined behavior.
Perhaps it will be easier for you to notice that concurrent access occurs by considering that:
std::cout << x
Is actually equivalent to:
::operator << (std::cout, x)
Which means you are calling a function that operates on the std::cout object, and you are doing so from different threads at the same time. std::cout must be protected somehow. But that's not the only reason why the scoped_lock is there (keep reading).
Does this code just lock everything within the scope until the scope is finished?
Yes, it locks io_mutex until the lock object itself goes out of scope (being a typical RAII wrapper), which happens at the end of each iteration of your for loop.
Why is it needed? Well, although in C++11 individual insertions into cout are guaranteed to be thread-safe, subsequent, separate insertions may be interleaved when several threads are outputting something.
Keep in mind that each insertion through operator << is a separate function call, as if you were doing:
    std::cout << id;
    std::cout << ": ";
    std::cout << i;
    std::cout << std::endl;
The fact that operator << returns the stream object allows you to chain the above function calls in a single expression (as you have done in your program), but the fact that you are having several separate function calls still holds.
Now looking at the above snippet, it is more evident that the purpose of this scoped lock is to make sure that each message of the form:
<id> ": " <index> <endl>
Gets printed without its parts being interleaved with parts from other messages.
Also, in C++03 (where insertions into cout are not guaranteed to be thread-safe), the lock will protect the cout object itself from being accessed concurrently.
A mutex has nothing to do with anything else in the program (except a condition variable), at least at a higher level. A mutex has two effects: it controls program flow, preventing multiple threads from executing the same block of code simultaneously, and it ensures memory synchronization. The important issue here is that mutexes aren't associated with resources, and don't by themselves prevent two threads from accessing the same resource at the same time. A mutex defines a critical section of code, which can only be entered by one thread at a time. If all of the use of a particular resource is done in critical sections controlled by the same mutex, then the resource is effectively protected by the mutex. But the relationship is established by the coder, by ensuring that all use does take place in the critical sections.
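One way for the coder to establish that relationship is to bundle the resource with its mutex, so that the resource cannot be reached without locking. The sketch below is only an illustration of that idea; the Guarded class and the count_with_guard function are made-up names, not a standard facility.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical wrapper: value_ is private, so the only way to reach it is
// through with_lock(), which always runs the callback inside the critical section.
template <typename T>
class Guarded {
public:
    explicit Guarded(T initial) : value_(std::move(initial)) {}
    template <typename F>
    auto with_lock(F&& f) -> decltype(f(std::declval<T&>())) {
        std::lock_guard<std::mutex> lock(mutex_);
        return f(value_);
    }
private:
    std::mutex mutex_;
    T value_;
};

// Increments a guarded counter from several threads and returns the final value.
int count_with_guard(int n_threads, int per_thread) {
    assert(n_threads >= 0 && per_thread >= 0);
    Guarded<int> counter(0);
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i)
                counter.with_lock([](int& c) { ++c; });
        });
    }
    for (auto& th : threads) th.join();
    return counter.with_lock([](int& c) { return c; });
}
```

With this arrangement, count_with_guard(4, 1000) returns 4000 because no increment can bypass the mutex. The wrapper can still be defeated, for example by a callback that stores a pointer to the value and uses it later, so the discipline remains the coder's responsibility.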

C++ - Threads without coordinating mechanism like mutex_Lock

I attended an interview two days ago. The interviewer was good at C++, but not at multithreading. He asked me to write multithreaded code with two threads, where one thread prints 1,3,5,... and the other prints 2,4,6,..., such that the combined output is 1,2,3,4,5,.... So I gave the code below (pseudocode):
mutex_Lock LOCK;
int last=2;
int last_Value = 0;
void function_Thread_1()
{
    while(1)
    {
        mutex_Lock(&LOCK);
        if(last == 2)
        {
            cout << ++last_Value << endl;
            last = 1;
        }
        mutex_Unlock(&LOCK);
    }
}
void function_Thread_2()
{
    while(1)
    {
        mutex_Lock(&LOCK);
        if(last == 1)
        {
            cout << ++last_Value << endl;
            last = 2;
        }
        mutex_Unlock(&LOCK);
    }
}
After this, he said: "These threads will work correctly even without those locks. Those locks will reduce the efficiency." My point was that without the lock there can be a situation where one thread checks last (== 1 or 2) at the same time as the other thread tries to change the value to 2 or 1. So my conclusion is that it might work without the lock, but that this is not a correct/standard way. Now I want to know who is correct, and on what basis?
Without the lock, running the two functions concurrently would be undefined behaviour, because there's a data race in the access of last and last_Value. Moreover (though not causing UB), the printing would be unpredictable.
With the lock, the program becomes essentially single-threaded, and is probably slower than the naive single-threaded code. But that's just in the nature of the problem (i.e. to produce a serialized sequence of events).
I think the interviewer might have thought about using atomic variables.
Each instantiation and full specialization of the std::atomic template defines an atomic type. Objects of atomic types are the only C++ objects that are free from data races; that is, if one thread writes to an atomic object while another thread reads from it, the behavior is well-defined.
In addition, accesses to atomic objects may establish inter-thread synchronization and order non-atomic memory accesses as specified by std::memory_order.
[Source]
By this I mean that the only thing you should change is to remove the locks and declare the last variable as std::atomic<int> last = 2; instead of int last = 2;
This should make it safe to access the last variable concurrently.
Out of curiosity I have edited your code a bit, and ran it on my Windows machine:
#include <iostream>
#include <atomic>
#include <thread>
#include <Windows.h>
std::atomic<int> last=2;
std::atomic<int> last_Value = 0;
std::atomic<bool> running = true;
void function_Thread_1()
{
    while(running)
    {
        if(last == 2)
        {
            last_Value = last_Value + 1;
            std::cout << last_Value << std::endl;
            last = 1;
        }
    }
}
void function_Thread_2()
{
    while(running)
    {
        if(last == 1)
        {
            last_Value = last_Value + 1;
            std::cout << last_Value << std::endl;
            last = 2;
        }
    }
}
int main()
{
    std::thread a(function_Thread_1);
    std::thread b(function_Thread_2);
    while(last_Value != 6){}//we want to print 1 to 6
    running = false;//inform threads we are about to stop
    a.join();
    b.join();//join
    while(!GetAsyncKeyState('Q')){}//wait for 'Q' press
    return 0;
}
and the output is always:
1
2
3
4
5
6
Ideone refuses to run this code (compilation errors)..
Edit: But here is a working linux version :) (thanks to soon)
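For readers without the linked version, here is a portable sketch of the same atomics-only idea. It is not the code behind the link: the names turn, next and alternate_with_atomics are made up, the numbers are collected in a vector instead of being printed so that the order can be checked, and the stop condition lives inside the workers rather than in main. It also uses brace initialization, because the form std::atomic<int> last=2; is only accepted from C++17 onwards, which may be one reason for the compilation errors.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Alternates two threads using one atomic (default sequentially consistent ordering).
// Returns the numbers in the order in which they were produced.
std::vector<int> alternate_with_atomics(int limit) {
    assert(limit >= 0);
    std::atomic<int> turn{1};   // which thread may run next: 1 or 2
    int next = 1;               // plain int: only touched by the thread whose turn it is
    std::vector<int> produced;  // same rule as for next

    auto worker = [&](int me, int other) {
        for (;;) {
            while (turn.load() != me) std::this_thread::yield();  // busy-wait for our turn
            if (next > limit) {      // finished: pass the turn on so the other thread can exit too
                turn.store(other);
                return;
            }
            produced.push_back(next++);
            turn.store(other);       // hand over; this also publishes next and produced
        }
    };
    std::thread a(worker, 1, 2), b(worker, 2, 1);
    a.join();
    b.join();
    return produced;
}
```

Calling alternate_with_atomics(6) yields 1, 2, 3, 4, 5, 6. Putting the limit check inside the workers avoids depending on how quickly main notices the last value, but the threads still spin while waiting, so this is not an efficient design.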
The interviewer doesn't know what he is talking about. Without the locks you get races on both last and last_Value. The compiler could, for example, reorder the assignment to last before the print and increment of last_Value, which could lead to the other thread executing on stale data. Furthermore, you could get interleaved output, meaning things like two numbers not being separated by a line break.
Another thing that could go wrong is that the compiler might decide not to reload last and (less importantly) last_Value on each iteration, since they can't (safely) change between those iterations anyway (data races are illegal under the C++11 standard and aren't acknowledged in previous standards). This means that the code suggested by the interviewer actually has a good chance of producing infinite loops that do absolutely nothing.
While it is possible to make that code correct without mutexes, that absolutely requires atomic operations with appropriate ordering constraints (release semantics on the assignment to last, and acquire semantics on the load of last inside the if statement).
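The release/acquire pairing can be illustrated in isolation with a one-way handoff. The sketch below is deliberately smaller than the two-thread printing problem, and the names payload, ready and handoff_with_release_acquire are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Passes a plain int from one thread to another using release/acquire ordering.
int handoff_with_release_acquire() {
    int payload = 0;                  // deliberately non-atomic
    std::atomic<bool> ready{false};
    int seen = -1;
    std::thread consumer([&] {
        // Acquire load: once it observes true, everything written before the
        // matching release store is visible to this thread.
        while (!ready.load(std::memory_order_acquire)) std::this_thread::yield();
        seen = payload;               // safe: ordered after the producer's write
    });
    payload = 42;                                   // ordinary write
    ready.store(true, std::memory_order_release);   // release store: publishes payload
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

In the printing problem, last plays the role of ready and last_Value the role of payload: each thread would store to last with release semantics after updating last_Value, and load last with acquire semantics before touching last_Value.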
Of course your solution does lower efficiency by effectively serializing the whole execution. However, since the runtime is almost completely spent inside the stream output operation, which is almost certainly synchronized internally by locks, your solution doesn't lower the efficiency any more than it already is. Waiting on the lock in your code might actually be faster than busy-waiting, depending on the available resources (the non-locking version using atomics would absolutely tank when executed on a single-core machine).
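If busy-waiting is the concern, a condition variable lets each thread sleep until it is its turn. This is a different technique from both the plain-mutex loop in the question and the atomics version, shown here only as a sketch; the function name alternate_with_condvar is made up, and the numbers are collected in a vector rather than printed so that the order can be checked.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Alternates two threads with a mutex and a condition variable (no spinning).
// Returns the numbers in the order in which they were produced.
std::vector<int> alternate_with_condvar(int limit) {
    assert(limit >= 0);
    std::mutex m;
    std::condition_variable cv;
    int turn = 1;               // which thread may run next; guarded by m
    int next = 1;               // next number to produce; guarded by m
    std::vector<int> produced;  // guarded by m

    auto worker = [&](int me, int other) {
        std::unique_lock<std::mutex> lock(m);
        for (;;) {
            cv.wait(lock, [&] { return turn == me; });  // sleeps until it is our turn
            bool done = next > limit;
            if (!done) produced.push_back(next++);
            turn = other;        // hand over, also when done, so the other thread can exit
            cv.notify_one();
            if (done) return;    // the unique_lock unlocks m on return
        }
    };
    std::thread a(worker, 1, 2), b(worker, 2, 1);
    a.join();
    b.join();
    return produced;
}
```

The execution is still fully serialized, as the problem demands, but a waiting thread is blocked rather than consuming a core, which matters most on the single-core case mentioned above.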