How do i sync variables across threads? C++ - c++

Lets say i have A) Global variable B) Local variable but another thread has a ptr to it
Lets say i have this code
thread2(&localVar);//now thread2 can modify it
localVar=0;
globalVar=0;
while(1){
mutex.lock();
cout << (globalVar && localVar ? "Both true" : "fail");
mutex.unlock();
Sleep(1000)
}
Is this correct and safe? I can't remember. If it is my question is how does C++ know that localVar and globalVar may have been modified? If you say its because of mutex lock/unlock then my question is why? When calling any functions does C++ believe variables may have been modified and need to be reloaded into the register?
If this isn't safe then what makes it unsafe? (I suspect if it isnt then only localVar is unsafe), how do i correct it?

You are making a lot more complicated then it really is. Your responsibility is to make sure no conflicting accesses to a variable ever happen.
Definition: two accesses to a given memory location conflict iff they can happen concurrently and at least one is a write access.
So:
two writes to the same memory location = bad
a read and a write to the same memory location = bad
To ensure this never happens, you can use mutex, or you design your program so that variables shared between threads are only read. There are many other possible strategies.
If I understand your incomplete code correctly: localVar is modified concurrently in two threads. The behavior is not defined. Anything can happen.

It this correct and safe?
No.
Fix: Initialize localVar and globalVar before creating your second thread.
That you've exposed a localVar (on local stack? local var suggests on stack in scope) to another thread is troublesome. If this localVar is on the stack, then just be sure that you don't let it drop out of scope.
Are you using the same mutex on the other thread, to protect read/write access? If yes, then this is moving towards safe.
It's not a question of "does C++ know that localVar and globalVar may have been modified?": C++ doesn't know. For this situation, as described, it's all about using the mutex to properly protect all access to all shared values. The mutex, if used by both threads, allows only ONE thread at a time to get past it, into the protected code. Any other thread trying to lock the mutex will "stop & wait" until the lock is released.
So you're on the right track by surrounding your reads of the pair of values by the mutex.
When calling any functions does C++ believe variables may have been modified and need to be reloaded into the register?
As a general rule, yes.

I think that first time your get address of the local variable it become volatile and all reads and writes is performing with the memory.

Related

Does a mutex lock itself, or the memory positions in question?

Let's say we've got a global variable, and a global non-member function.
int GlobalVariable = 0;
void GlobalFunction();
and we have
std::mutex MutexObject;
then inside one of the threads, we have this block of code:
{
std::lock_guard<std::mutex> lock(MutexObject);
GlobalVairable++;
GlobalFunction()
}
now, inside another thread running in parallel, what happens if we do thing like this:
{
//std::lock_guard<std::mutex> lock(MutexObject);
GlobalVairable++;
GlobalFunction()
}
So the question is, does a mutex lock only itself from getting owned while being owned by another thread, not caring in the process about what is being tried to be accessed in the critical code? or does the compiler, or in run-time, the OS actually designate the memory location being accessed in the critical code as blocked for now by MutexObject?
My guess is the former, but I need to hear from an experienced programmer; Thanks for taking the time to read my question.
It’s the former. There’s no relationship between the mutex and the objects you’re protecting with the mutex. (In general, it's not possible for the compiler to deduce exactly which objects a given block of code will modify.) The magic behind the mutex comes entirely from the temporal ordering guarantees it makes: that everything the thread does before releasing the mutex is visible to the next thread after it’s grabbed the mutex. But the two threads both need to actively use the mutex for that to happen.
A system which actually cares about what memory locations a thread has accessed and modified, and builds safely on top of that, is “transactional memory”. It’s not widely used.

Shared Variables in C++11

So I took an OS class last semester and we had a concurrency/threading project. It was an airport sim that landed planes / had them take off into the direction that the wind was blowing. We had to do it in Java. So now that finals are over and I'm bored, I'm trying to do it in C++11. In Java I used a synchronized variable for the wind (0 - 360) in main and passed it to the 3 threads I was using. My question is: Can you do that in C++11? It's a basic reader/writer, one thread writes/updates the wind, the other 2 (takeoff/land) read.
I got it working by having a global wind variable in my "threads.cpp" implementation file. But is there a way to pass a variable to as many threads as I want and all of them keep up with it? Or is it actually better for me to just use the global variable and not pass anything?(why/why not?) I was looking at std::ref() but that didn't work.
EDIT: I'm already using mutex and lock_guard. I'm just trying to figure out how to pass and keep a variable up to date in all threads. Right now it only updates in the write thread.
You can use a std::mutex with std::lock_guard to synchronize access to the shared data. Or if the shared data fits in an integer, you can use std::atomic<int> without locking.
If you want to avoid global variables, simply pass the address of the shared state to the thread functions when you launch them. For example:
void thread_entry1(std::atomic<int>* val) {}
void thread_entry2(std::atomic<int>* val) {}
std::atomic<int> shared_value;
std::thread t1(thread_entry1, &shared_value);
std::thread t2(thread_entry2, &shared_value);
Using std::mutex and std::lock_guard mimicks what a Java synchronized variable does (only in Java this happens secretly without you knowing, in C++ you do it explicitly).
However, having one producer (there is just one direction of wind) and otherwise only consumers, it suffices to write to a e.g. std::atomic<int> variable with relaxed ordering, and to read from that variable from each consumer, again with relaxed ordering. Unless you have the requirement that the global view of all airplanes are consistent (but then you would have to run a lockstep simulation, which makes threading nonsensical), there is no need for synchronization, you only have to make sure that any value that any airplane reads at any time is eventually correct and that no garbled intermediate results can occur. In other words, you need an atomic update.
Relaxed memory ordering is sufficient too, since if all you read is one value, you do not need any happens-before guarantees.
An atomic update (or rather, atomic write) is at least an order of magnitude, if not more, faster. Atomic reads and writes with relaxed ordering are indeed plain normal reads and writes on many (most) mainstream architectures.
The variable needs not be global, you can as well keep it in the main thread's simultion loop's scope and pass a reference (or pointer) to the threads.
You might want to create say, the wind object, on the heap with new through an std::shared_ptr. Pass this pointer to all interested threads and use a std::mutex and std::lock_guard to change it.

double checked locking pattern in c++ concurrent programming

I am reading concurrency programming in c++ and came across this piece of code. the book mentioned the potential for nasty race conditions.
void undefined_behaviour_with_double_checked_locking(){
if(!resource_ptr){ //<1>
std::lock_guard<std::mutex> lk(resource_mutex);
if(!resource_ptr){ //<2>
resource_ptr.reset(new some_resource); //<3>
}
}
resource_ptr->do_something(); //<4>
}
here is the quote of explanation from the book. however, i just cant come up with a real example. I wonder if anyone here could help me out.
Unfortunately, this pattern is infamous for a reason: it has the
potential for nasty race conditions, because the read outside the lock
<1> isn’t synchronized with the write done by another thread inside
the lock <3>. This therefore creates a race condition that covers not
just the pointer itself but also the object pointed to; even if a
thread sees the pointer written by another thread, it might not see
the newly created instance of some_resource, resulting in the call to
do_something() <4> operating on incorrect values.
You don't show what resource_ptr is but from the explanation the reasoning seems to be that "!resource_ptr" (outside the lock) and "resource_ptr.reset" (inside the lock) are not atmoic and are not synchronized with each other.
The use case would be:
thread1 comes into the method, sees that resource_ptr is not
populated, enters the lock and is in the middle of the
resource_ptr.reset.
thread2 comes into the method and is when
checking !resource_ptr may see it as set but resource_ptr may not be
fully configured for use.
thread2 falls through to execute "resource_ptr->do_something()" and may see resource_ptr in an inconsistent state and bad things may happen.
I recommend you read this: http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf.
Anyway, the gist of it is: the compiler is free to reorder operations as long as they appear to be executed in the program's order in a single threaded situation. On top of that, some CPU architectures take the same liberties with their instruction execution order. So, technically resource_ptr could be modified to point to newly allocated memory before some_resource's constructor has finished. Another thread could at that time see that resource_ptr is not null and attempt to use the not-yet-fully-constructed instance.
The use of a smart pointer instead of a raw pointer might make this less likely, but it doesn't rule it out afaik.
The potential problem is that the write to resource_ptr isn't atomic (inside the reset call). Assuming that resource_ptr is a global or static variable that (/ or otherwise) starts initialized with the value NULL before we get here, it will never cause a thread to fall-through unless the object some_resource is already fully allocated and constructed, however - say that the pointer to this new object is 0x123456789, then it is theoretically possible that resource_ptr has, for example, the value 0x12340000 when another thread does the if (!resource_ptr) test, falls through and uses that value (especially more likely when using aliasing). If resource_ptr is an atomic variable then this code would be fine.
If a program can guarantee that the first time this code is called there is only one thread running (ie, the first call will be from main() before any other thread is created) then this will work fine too, because once initialized, the if test will just always fall through, resulting in only read accesses to resource_ptr while more than one thread is running. In that case you don't need the lock inside the if block though, and you are not allowed to ever write to resource_ptr anywhere else.

Boost, mutex concept

I am new to multi-threading programming, and confused about how Mutex works. In the Boost::Thread manual, it states:
Mutexes guarantee that only one thread can lock a given mutex. If a code section is surrounded by a mutex locking and unlocking, it's guaranteed that only a thread at a time executes that section of code. When that thread unlocks the mutex, other threads can enter to that code region:
My understanding is that Mutex is used to protect a section of code from being executed by multiple threads at the same time, NOT protect the memory address of a variable. It's hard for me to grasp the concept, what happen if I have 2 different functions trying to write to the same memory address.
Is there something like this in Boost library:
lock a memory address of a variable, e.g., double x, lock (x); So
that other threads with a different function can not write to x.
do something with x, e.g., x = x + rand();
unlock (x)
Thanks.
The mutex itself only ensures that only one thread of execution can lock the mutex at any given time. It's up to you to ensure that modification of the associated variable happens only while the mutex is locked.
C++ does give you a way to do that a little more easily than in something like C. In C, it's pretty much up to you to write the code correctly, ensuring that anywhere you modify the variable, you first lock the mutex (and, of course, unlock it when you're done).
In C++, it's pretty easy to encapsulate it all into a class with some operator overloading:
class protected_int {
int value; // this is the value we're going to share between threads
mutex m;
public:
operator int() { return value; } // we'll assume no lock needed to read
protected_int &operator=(int new_value) {
lock(m);
value = new_value;
unlock(m);
return *this;
}
};
Obviously I'm simplifying that a lot (to the point that it's probably useless as it stands), but hopefully you get the idea, which is that most of the code just treats the protected_int object as if it were a normal variable.
When you do that, however, the mutex is automatically locked every time you assign a value to it, and unlocked immediately thereafter. Of course, that's pretty much the simplest possible case -- in many cases, you need to do something like lock the mutex, modify two (or more) variables in unison, then unlock. Regardless of the complexity, however, the idea remains that you centralize all the code that does the modification in one place, so you don't have to worry about locking the mutex in the rest of the code. Where you do have two or more variables together like that, you generally will have to lock the mutex to read, not just to write -- otherwise you can easily get an incorrect value where one of the variables has been modified but the other hasn't.
No, there is nothing in boost(or elsewhere) that will lock memory like that.
You have to protect the code that access the memory you want protected.
what happen if I have 2 different functions trying to write to the same
memory address.
Assuming you mean 2 functions executing in different threads, both functions should lock the same mutex, so only one of the threads can write to the variable at a given time.
Any other code that accesses (either reads or writes) the same variable will also have to lock the same mutex, failure to do so will result in indeterministic behavior.
It is possible to do non-blocking atomic operations on certain types using Boost.Atomic. These operations are non-blocking and generally much faster than a mutex. For example, to add something atomically you can do:
boost::atomic<int> n = 10;
n.fetch_add(5, boost:memory_order_acq_rel);
This code atomically adds 5 to n.
In order to protect a memory address shared by multiple threads in two different functions, both functions have to use the same mutex ... otherwise you will run into a scenario where threads in either function can indiscriminately access the same "protected" memory region.
So boost::mutex works just fine for the scenario you describe, but you just have to make sure that for a given resource you're protecting, all paths to that resource lock the exact same instance of the boost::mutex object.
I think the detail you're missing is that a "code section" is an arbitrary section of code. It can be two functions, half a function, a single line, or whatever.
So the portions of your 2 different functions that hold the same mutex when they access the shared data, are "a code section surrounded by a mutex locking and unlocking" so therefore "it's guaranteed that only a thread at a time executes that section of code".
Also, this is explaining one property of mutexes. It is not claiming this is the only property they have.
Your understanding is correct with respect to mutexes. They protect the section of code between the locking and unlocking.
As per what happens when two threads write to the same location of memory, they are serialized. One thread writes its value, the other thread writes to it. The problem with this is that you don't know which thread will write first (or last), so the code is not deterministic.
Finally, to protect a variable itself, you can find a near concept in atomic variables. Atomic variables are variables that are protected by either the compiler or the hardware, and can be modified atomically. That is, the three phases you comment (read, modify, write) happen atomically. Take a look at Boost atomic_count.

How does Multiple C++ Threads execute on a class method

let's say we have a c++ class like:
class MyClass
{
void processArray( <an array of 255 integers> )
{
int i ;
for (i=0;i<255;i++)
{
// do something with values in the array
}
}
}
and one instance of the class like:
MyClass myInstance ;
and 2 threads which call the processArray method of that instance (depending on how system executes threads, probably in a completely irregular order). There is no mutex lock used in that scope so both threads can enter.
My question is what happens to the i ? Does each thread scope has it's own "i" or would each entering thread modify i in the for loop, causing i to be changing weirdly all the time.
i is allocated on the stack. Since each thread has its own separate stack, each thread gets its own copy of i.
Be careful. In the example provided the method processArray seems to be reentrant (it's not clear what happens in // do something with values in the array). If so, no race occurs while two or more threads invoke it simultaneously and therefore it's safe to call it without any locking mechanism.
To enforce this, you could mark both the instance and the method with the volatile qualifier, to let users know that no lock is required.
It has been published an interesting article of Andrei Alexandrescu about volatile qualifier and how it can be used to write correct multithreaded classes. The article is published here:
http://www.ddj.com/cpp/184403766
Since i is a local variable it is stored on the thread's own private stack. Hence, you do not need to protect i with a critical section.
As Adam said, i is a variable stored on the stack and the arguments are passed in so this is safe. When you have to be careful and apply mutexes or other synchronization mechanisms is if you were accessing shared member variables in the same instance of the class or global variables in the program (even scoped statics).