Multithreading. Do I need critical sections for read-only access? - c++

I have a bunch of threads. They should access a singleton containing configuration data which is initialized once when the singleton is created, that is, on the first access. After that, all further actions on the singleton are read-only. Do I need critical sections in this case?

It appears that because the data is created lazily on first access, the pointer or the reference to your singleton is read-write. This means that you do need a critical section.
In fact, the desire to avoid a critical section while keeping the lazy initialization in this situation has been so universally strong that it led to the creation of the double-checked locking antipattern.
On the other hand, if you were to initialize your singleton eagerly before the reads, you would be able to avoid a critical section for accessing an immutable object through a constant pointer / reference.
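If the lazy initialization has to stay, one option since C++11 is std::call_once, which provides the synchronization without a hand-written double-checked lock. The sketch below is only an illustration: Config, get_config, read_from_threads and the counter init_calls are hypothetical names, not part of the question.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical configuration type, for illustration only.
struct Config {
    std::string name;
    int workers;
};

namespace {
std::once_flag config_once;    // records whether initialization already ran
Config* config_ptr = nullptr;  // written exactly once, inside call_once
int init_calls = 0;            // counts how often the initializer ran
}

// Lazy accessor: the first caller initializes, concurrent callers wait.
const Config& get_config() {
    std::call_once(config_once, [] {
        static Config storage{"default", 4};
        config_ptr = &storage;
        ++init_calls;
    });
    return *config_ptr;
}

// Reads the configuration from several threads and returns how many
// times the initializer ran.
int read_from_threads(int thread_count) {
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([] { (void)get_config().workers; });
    }
    for (auto& t : threads) t.join();
    return init_calls;
}
```

Calling read_from_threads(8) should report a single initialization, however the threads interleave.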

As I understand your question, your singleton is lazily initialized: it is initialized only on the first read.
The subsequent reads are thread safe. But what about concurrent reads during initialization?
If you have a situation like this:
SomeConfig& SomeConfig::getInstance()
{
    static SomeConfig instance;
    return instance;
}
Then it depends on your compiler. According to this post in C++03 it was implementation dependent if this static initialization is thread safe.
For C++11 it is thread safe - see answers to this post, citation:
such a variable is initialized the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization. [...] If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.
It is worth noting that read-only access to global variables is thread safe.
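A small sketch of the C++11 guarantee quoted above: several threads call getInstance at once, yet the constructor runs only once. The constructions counter, the timeout setting and construct_concurrently are assumptions added for the demonstration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> constructions{0};  // counts constructor calls (demo only)

class SomeConfig {
public:
    static SomeConfig& getInstance() {
        static SomeConfig instance;  // C++11: initialized exactly once
        return instance;
    }
    int timeout() const { return timeout_; }

private:
    SomeConfig() : timeout_(30) { ++constructions; }
    int timeout_;  // hypothetical setting
};

// Calls getInstance from several threads and returns the number of
// times the constructor ran.
int construct_concurrently(int thread_count) {
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([] { (void)SomeConfig::getInstance().timeout(); });
    }
    for (auto& t : threads) t.join();
    return constructions.load();
}
```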

No. If you're simply reading this data after it is fully initialized and the data never changes then there is no possibility of a race condition occurring.
However, if the data is written to/modified in any way then you will need to synchronize access to it, i.e., lock the data before writing.

If you only ever read some shared data, and never write, you do not need to synchronize access.
You only need to synchronize when a shared piece of data is both read and written at potentially the same time.

The official rule in the spec is that a data race occurs when one thread can write to a variable while another thread concurrently reads or writes the same variable.
If you can prove that the initialization has to take place before any of the readers can read, then you do not need synchronization. This is most often accomplished by initializing before creating (or synchronizing) the threads, or by using static storage variables, for which C++11 guarantees some synchronization.
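As a minimal sketch of "initialize before creating the threads": constructing a std::thread synchronizes with the start of the new thread, so data that is fully initialized beforehand can be read without locks. Settings and sum_ports_seen are hypothetical names for this example.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical read-only configuration.
struct Settings {
    std::string host;
    int port;
};

// Initializes the settings first, then starts the readers. Each reader
// stores the port it saw in its own slot; the slots are summed afterwards.
int sum_ports_seen(int thread_count) {
    const Settings settings{"localhost", 8080};  // done before any thread exists
    std::vector<int> seen(thread_count, 0);      // one slot per thread
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&settings, &seen, i] { seen[i] = settings.port; });
    }
    for (auto& t : threads) t.join();
    int sum = 0;
    for (int p : seen) sum += p;
    return sum;
}
```

The only writes are to distinct elements of seen, so no two threads ever touch the same memory location for writing.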

Related

OpenMP race condition while reading from pointer

I know that reading from a shared variable in OpenMP does not cause a race condition, because every thread has its own copy of that variable.
But if the shared variable is a pointer (e.g. to a container), then every thread only gets a copy of the pointer.
If I now read from the location the pointer is pointing to (my container), can there be race conditions or does OpenMP somehow take care of this?
Is it better to share a copy of the container itself, instead of a pointer to it, among threads?
Just reading from a variable cannot produce a race condition: it doesn't matter whether the variable is shared or not. To produce a race condition you need at least one thread modifying a variable while another thread reads or modifies the same instance at the same time.
Then, assuming that your threads are reading and modifying a certain variable, if you make this variable shared you will still have a race condition since all the threads share the same instance. I guess that in your first paragraph you wanted to say private, as @ilotXXI pointed out.
As for your question about privatizing a pointer: if two or more instances of that pointer point to the same data and the threads modify it, you will have a race condition (each thread has a private version of the pointer but not a private version of the data).
Note that changing from one data-sharing clause to another may change the behavior of your application. Thus, in general, when you are parallelizing an application, what you have to do first is to analyze which kind of data accesses your application is performing. Once you know that, you have to think which data-sharing clauses and which synchronization constructs (if needed) you should use to keep the original behavior of your application.

Possible bug with simultaneous write to the same memory region in boost::mutex constructor

As was previously discussed in this question, a pre-C++11 implementation can execute code in such a way that several threads simultaneously call the constructor for the same object with static storage duration.
In the boost::mutex implementation there's an initialize function that is called from its constructor and contains the following code:
void initialize()
{
active_count=0;
event=0;
}
Well, it seems to me that this is UB, since we can have a situation where several threads simultaneously write 0 to the same memory region, isn't it?
Yes. If you simultaneously construct a mutex in the same memory location from different threads then you're invoking UB.
Of course, that scenario is really really hard to achieve.
Update
Ok, it has become apparent that the question is about initialization of function-local statics pre c++11. Although this has nothing to do with boost::mutex, per se, I can confirm that, indeed, boost::mutex construction is also unsafe for cases like that.
(Mutexes that coordinate access to a shared resource typically need to precede access to the resource. When the mutex itself is a shared resource before it is created, you're doing it wrong.
You need existing synchronization to coordinate access to the area in which you construct new mutexes, if you even need to do something like this.)

are c++ pointers to user-defined objects thread safe for reading?

I can't find the answer but it's a simple question:
Is it safe for two threads to read the value of a pointer to a user-defined object in c++ at the same time with no locks or any other shenanigans?
Yes. Actually it is safe to read any values (of builtin type) concurrently.
Data races can only occur, if a value is modified concurrently with some other thread using it. The key statements from the Standard for this are:
A data race is defined in §1.10/21:
The execution of a program contains a data race if it contains two
conflicting actions in different threads, at least one of which is not
atomic, and neither happens before the other.
where conflicting is defined in §1.10/4:
Two expression evaluations conflict if one of them modifies a memory
location (1.7) and the other one accesses or modifies the same memory
location.
So you must use suitable synchronization between those reads and any writes.
It is always safe to read values from multiple threads. It's only when you're also writing to the data that you need to manage concurrent accesses.
The only possible issue for read-only data is ensuring that the value has, in fact, been initialized when the reading is done. If you initialize the value before you start your threads you'll be fine.
It is generally not thread-safe if the variable gets modified in one of the threads.
By thread-safe I suppose you mean to ask whether they have atomic writes. In C++03 this is not true, as C++03 doesn't really know about threads. In C++11 you have std::atomic, which is specialized for pointers.
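For the case where the pointer itself is written while other threads read it, here is a hedged sketch using std::atomic with a pointer type: one thread publishes a pointer with a release store and a reader picks it up with an acquire load. Widget, shared_widget and publish_and_read are hypothetical names.

```cpp
#include <atomic>
#include <thread>

// Hypothetical user-defined type.
struct Widget {
    int id;
};

std::atomic<Widget*> shared_widget{nullptr};

// Publishes a pointer from one thread and reads it from another.
// Returns the id the reader observed through the pointer.
int publish_and_read() {
    Widget widget{7};
    int seen = 0;
    std::thread reader([&seen] {
        Widget* p = nullptr;
        // Spin until the pointer has been published.
        while ((p = shared_widget.load(std::memory_order_acquire)) == nullptr) {
            std::this_thread::yield();
        }
        seen = p->id;  // safe: the release store below happens before this read
    });
    shared_widget.store(&widget, std::memory_order_release);
    reader.join();
    shared_widget.store(nullptr);  // widget is about to go out of scope
    return seen;
}
```

Once the pointer is published and never changed again, any number of threads may read it, and the object behind it, without further synchronization.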

Boost, mutex concept

I am new to multi-threading programming, and confused about how Mutex works. In the Boost::Thread manual, it states:
Mutexes guarantee that only one thread can lock a given mutex. If a code section is surrounded by a mutex locking and unlocking, it's guaranteed that only a thread at a time executes that section of code. When that thread unlocks the mutex, other threads can enter to that code region:
My understanding is that a mutex is used to protect a section of code from being executed by multiple threads at the same time, NOT to protect the memory address of a variable. It's hard for me to grasp the concept: what happens if I have 2 different functions trying to write to the same memory address?
Is there something like this in Boost library:
lock a memory address of a variable, e.g., double x, lock (x); so that other threads with a different function cannot write to x.
do something with x, e.g., x = x + rand();
unlock (x)
Thanks.
The mutex itself only ensures that only one thread of execution can lock the mutex at any given time. It's up to you to ensure that modification of the associated variable happens only while the mutex is locked.
C++ does give you a way to do that a little more easily than in something like C. In C, it's pretty much up to you to write the code correctly, ensuring that anywhere you modify the variable, you first lock the mutex (and, of course, unlock it when you're done).
In C++, it's pretty easy to encapsulate it all into a class with some operator overloading:
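#include <mutex>

class protected_int {
    int value = 0;  // this is the value we're going to share between threads
    std::mutex m;   // guards writes to value
public:
    // Simplification: we'll assume no lock is needed to read. Strictly, a read
    // that overlaps a write is a data race, so real code should lock here too.
    operator int() const { return value; }
    protected_int &operator=(int new_value) {
        std::lock_guard<std::mutex> guard(m);  // unlocked when guard is destroyed
        value = new_value;
        return *this;
    }
};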
Obviously I'm simplifying that a lot (to the point that it's probably useless as it stands), but hopefully you get the idea, which is that most of the code just treats the protected_int object as if it were a normal variable.
When you do that, however, the mutex is automatically locked every time you assign a value to it, and unlocked immediately thereafter. Of course, that's pretty much the simplest possible case -- in many cases, you need to do something like lock the mutex, modify two (or more) variables in unison, then unlock. Regardless of the complexity, however, the idea remains that you centralize all the code that does the modification in one place, so you don't have to worry about locking the mutex in the rest of the code. Where you do have two or more variables together like that, you generally will have to lock the mutex to read, not just to write -- otherwise you can easily get an incorrect value where one of the variables has been modified but the other hasn't.
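A sketch of the two-variable case described above, where reads must take the lock as well so that a reader never sees one value updated and the other not. The class protected_range and the helper count_torn_reads are assumptions made for this illustration, written with std::mutex.

```cpp
#include <mutex>
#include <thread>
#include <utility>

// Two values that must always change together.
class protected_range {
    int low_ = 0;
    int high_ = 0;
    mutable std::mutex m_;  // mutable so that get() can lock while const

public:
    void set(int low, int high) {
        std::lock_guard<std::mutex> guard(m_);
        low_ = low;
        high_ = high;
    }
    // Reading locks too, so the pair is always consistent.
    std::pair<int, int> get() const {
        std::lock_guard<std::mutex> guard(m_);
        return {low_, high_};
    }
};

// A writer keeps high == low + 10 while this thread reads repeatedly.
// Returns how many reads saw a pair that breaks that invariant.
int count_torn_reads(int iterations) {
    protected_range range;
    range.set(0, 10);
    std::thread writer([&range, iterations] {
        for (int i = 1; i <= iterations; ++i) range.set(i, i + 10);
    });
    int torn = 0;
    for (int i = 0; i < iterations; ++i) {
        std::pair<int, int> r = range.get();
        if (r.second - r.first != 10) ++torn;
    }
    writer.join();
    return torn;
}
```

All locking lives inside the class, so the calling code never touches the mutex directly.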
No, there is nothing in Boost (or elsewhere) that will lock memory like that.
You have to protect the code that accesses the memory you want protected.
what happens if I have 2 different functions trying to write to the same memory address.
Assuming you mean 2 functions executing in different threads, both functions should lock the same mutex, so only one of the threads can write to the variable at a given time.
Any other code that accesses (either reads or writes) the same variable will also have to lock the same mutex; failure to do so is a data race and results in nondeterministic behavior.
It is possible to do non-blocking atomic operations on certain types using Boost.Atomic. These operations are non-blocking and generally much faster than a mutex. For example, to add something atomically you can do:
boost::atomic<int> n(10);
n.fetch_add(5, boost::memory_order_acq_rel);
This code atomically adds 5 to n.
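The same idea can be sketched with the C++11 standard library, which offers std::atomic with a matching fetch_add. This swaps Boost.Atomic for std::atomic so the example has no external dependency; atomic_sum is a hypothetical helper.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Starts at 10 and lets several threads add 5 repeatedly, without a mutex.
// Returns the final value.
int atomic_sum(int thread_count, int adds_per_thread) {
    std::atomic<int> n(10);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&n, adds_per_thread] {
            for (int j = 0; j < adds_per_thread; ++j) {
                n.fetch_add(5, std::memory_order_acq_rel);  // atomic read-modify-write
            }
        });
    }
    for (auto& t : threads) t.join();
    return n.load();
}
```

With 4 threads doing 1000 additions each, no increment is lost and the result is 10 + 4 * 1000 * 5.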
In order to protect a memory address shared by multiple threads in two different functions, both functions have to use the same mutex ... otherwise you will run into a scenario where threads in either function can indiscriminately access the same "protected" memory region.
So boost::mutex works just fine for the scenario you describe, but you just have to make sure that for a given resource you're protecting, all paths to that resource lock the exact same instance of the boost::mutex object.
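A minimal sketch of two different functions writing to the same variable under one shared mutex. It uses std::mutex rather than boost::mutex to stay dependency-free, and the names shared_x, x_mutex, add_one, subtract_one and run_both are all assumptions for the example.

```cpp
#include <mutex>
#include <thread>

namespace {
std::mutex x_mutex;     // the single mutex instance that guards shared_x
double shared_x = 0.0;  // the variable both functions write to
}

// Two different functions, both locking the same mutex before touching shared_x.
void add_one() {
    std::lock_guard<std::mutex> guard(x_mutex);
    shared_x = shared_x + 1.0;
}

void subtract_one() {
    std::lock_guard<std::mutex> guard(x_mutex);
    shared_x = shared_x - 1.0;
}

// Runs each function in its own thread and returns the final value.
double run_both(int iterations) {
    {
        std::lock_guard<std::mutex> guard(x_mutex);
        shared_x = 0.0;
    }
    std::thread a([iterations] { for (int i = 0; i < iterations; ++i) add_one(); });
    std::thread b([iterations] { for (int i = 0; i < iterations; ++i) subtract_one(); });
    a.join();
    b.join();
    std::lock_guard<std::mutex> guard(x_mutex);
    return shared_x;
}
```

If either function used its own separate mutex, updates could be lost and the result would no longer reliably come back to zero.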
I think the detail you're missing is that a "code section" is an arbitrary section of code. It can be two functions, half a function, a single line, or whatever.
So the portions of your 2 different functions that hold the same mutex when they access the shared data, are "a code section surrounded by a mutex locking and unlocking" so therefore "it's guaranteed that only a thread at a time executes that section of code".
Also, this is explaining one property of mutexes. It is not claiming this is the only property they have.
Your understanding is correct with respect to mutexes. They protect the section of code between the locking and unlocking.
As for what happens when two threads write to the same location of memory without synchronization: on typical hardware the writes are serialized. One thread writes its value, then the other thread writes to it. The problem with this is that you don't know which thread will write first (or last), so the code is not deterministic. Formally, under C++11 such unsynchronized writes are a data race, which is undefined behavior.
Finally, to protect a variable itself, the nearest concept is atomic variables. Atomic variables are variables that are protected by either the compiler or the hardware, and can be modified atomically. That is, the three phases you mention (read, modify, write) happen atomically. Take a look at Boost atomic_count.

How does Multiple C++ Threads execute on a class method

let's say we have a c++ class like:
class MyClass
{
public:
    void processArray(const int (&values)[255])  // an array of 255 integers
    {
        int i;
        for (i = 0; i < 255; i++)
        {
            // do something with values in the array
        }
    }
};
and one instance of the class like:
MyClass myInstance ;
and 2 threads which call the processArray method of that instance (depending on how system executes threads, probably in a completely irregular order). There is no mutex lock used in that scope so both threads can enter.
My question is what happens to i? Does each thread have its own "i", or would each entering thread modify i in the for loop, causing i to change weirdly all the time?
i is allocated on the stack. Since each thread has its own separate stack, each thread gets its own copy of i.
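To illustrate, here is a sketch in which two threads call the same method on the same instance. The loop body (summing the array), the parameter type and the helper both_threads_agree are assumptions, since the question leaves them open.

```cpp
#include <thread>

class MyClass {
public:
    // i and total are locals, so every calling thread gets its own copies.
    int processArray(const int (&values)[255]) const {
        int total = 0;
        int i;
        for (i = 0; i < 255; i++) {
            total += values[i];  // only reads the shared array
        }
        return total;
    }
};

// Calls processArray on one instance from two threads at once and checks
// that both computed the full sum 0 + 1 + ... + 254 = 32385.
bool both_threads_agree() {
    int values[255];
    for (int k = 0; k < 255; ++k) values[k] = k;
    MyClass myInstance;
    int first = 0;
    int second = 0;
    std::thread t1([&] { first = myInstance.processArray(values); });
    std::thread t2([&] { second = myInstance.processArray(values); });
    t1.join();
    t2.join();
    return first == 32385 && second == 32385;
}
```

If the threads shared a single i, each would skip elements and the sums would come out wrong; with per-thread locals both results are complete.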
Be careful. In the example provided the method processArray seems to be reentrant (it's not clear what happens in // do something with values in the array). If so, no race occurs while two or more threads invoke it simultaneously and therefore it's safe to call it without any locking mechanism.
To enforce this, you could mark both the instance and the method with the volatile qualifier, to let users know that no lock is required.
Andrei Alexandrescu published an interesting article about the volatile qualifier and how it can be used to write correct multithreaded classes. The article is available here:
http://www.ddj.com/cpp/184403766
Since i is a local variable it is stored on the thread's own private stack. Hence, you do not need to protect i with a critical section.
As Adam said, i is a variable stored on the stack and the arguments are passed in, so this is safe. You do have to be careful and apply mutexes or other synchronization mechanisms when you access shared member variables in the same instance of the class, or global variables in the program (even scoped statics).