Does any of C++ smart pointers avoid data race in strict sense? - c++

For consumer/producer model there is a built-in mechanism to avoid data race - queue.
But for global flag there seems not yet a ready-to-go type to avoid data race rather than attaching a mutex to each global flag as simple as boolean or int type.
I came across shared pointer. Is it true that as one pointer operates on that variable, another is prohibited from accessing it?
Or will unique pointer promise no data race?
e.g. scenario:
One thread updates the number of visits on serving a new visitor, while another thread periodically reads that number out (might be copy behavior) and save it to log. They will be accessing the same memory on the heap that stores that number, and race condition is that they are accessing it at the same time from different cpu cores, which would cause a crash.

For consumer/producer model there is a built-in mechanism to avoid data race - queue.
The standard library has no thread-safe queue. std::queue and others cannot be used without explicit synchronization in multiple threads.
I came across shared pointer. Is it true that as one pointer operates on that variable, another is prohibited from accessing it?
std::shared_ptr (or any other standard library smart pointer) does not in any way prevent multiple threads accessing the managed object unsynchronized. std::shared_ptr only guarantees that destruction of the managed object is thread-safe.
Or will unique pointer promise no data race?
std::unique_ptr cannot be copied, so you cannot have multiple std::unique_ptr or threads managing the object. None of the smart pointers guarantee that access to the smart pointer object itself is free of data races.
One thread updates the number of visits on serving a new visitor, while another thread periodically reads that number out (might be copy behavior) and save it to log. They will be accessing the same memory on the heap that stores that number, and race condition is that they are accessing it at the same time from different cpu cores, which would cause a crash.
That can simply be a std::atomic<int> or similar. Unsynchronized access to a std::atomic is allowed. There can of course still be race conditions if you rely on a particular order in which the access should happen, but in your example that doesn't seem to be the case. However, in contrast to non-atomic objects, there will be at least no undefined behavior due to the unsynchronized access (data race).

Related

multithread std::shared_ptr read / write

I have a double free bug using std::shared_ptr and trying to get know why. I am using shared_ptr in multithread environment , one thread sometimes replaces some element in a global array
std::shared_ptr<Bucket> globalTable[100]; // global elements storage
using:
globalTable[idx].reset(newBucket);
and the other thread reads this table sometimes using :
std::shared_ptr<Bucket> bkt(globalTable[pIdx]);
// do calculations with bkt-> items
After this I am receiving double-free error, and AddressSanitizer says that the second code tries to free an object that was destroyed by the first one . How it is possible ? As I know shared_ptr must be completly thread safe.
Reset does not guarantee you thread saefty.
Assignments and reference counting are thread safe as explained here
To satisfy thread safety requirements, the reference counters are
typically incremented using an equivalent of std::atomic::fetch_add
with std::memory_order_relaxed (decrementing requires stronger
ordering to safely destroy the control block).
If multiple threads access same shared_ptr you can have a race condition.
If multiple threads of execution access the same shared_ptr without
synchronization and any of those accesses uses a non-const member
function of shared_ptr then a data race will occur; the shared_ptr
overloads of atomic functions can be used to prevent the data race.
Your function reset is non const so falls on that category. You need to use mutex or another synchronization mechanism.
http://en.cppreference.com/w/cpp/memory/shared_ptr
Not all operations on a std::shared_ptr are thread-safe.
Specifically, the reference-counts are managed atomically, but it's your responsibility to make sure the std::shared_ptr instance you access is not concurrently modified.
You fail that responsibility, resulting in a data-race and the expected undefined behavior, manifesting as a double-free in your case.

Synchronization of particular shared memory write operations (MPI)

To keep things simple and in order to concentrate on the core of my problem, let's assume that a memory location, addressed locally by a pointer variable ptr, is shared among several processes. I in particular use MPI shared memory windows in C/++ to allocate and share the memory. To be concrete, let's say ptr references a floating point variable, so locally we have
float* ptr;
Now assume that all processes attempt to write the same value const float f to ptr, i.e.
*ptr = f;
My question is: Does this operation require synchronization, or can it be executed concurrently, given the fact that all processes attempt to modify the bytes in the same way, i.e. given the fact that f has the same value for every process. My question therefore boils down to: For concurrent write operations to e.g. floating point variables, is there the possibility that the race condition results in an inconsistent byte pattern, although every process attempts to modify the memory in the same way. I.e. if I know for sure that every process writes the same data, can I then omit synchronization?
Yes, you must synchronize the shared memory. the fact that the modifying threads reside in different processes has no meaning, it is still data race (writing to a shared memory from different threads).
do note that there are other problems that synchronization objects solve, like visibility and memory reordering, what is written to the shared memory is irrelevant.
currently, the standard does not define the idea of a process (only thread), and does not provide any means of synchronizing between processes easily.
you allocate a std::mutex in a shared memory and use that as you synchronization primitive, or rely on a win32 inter-process synchronization primitives like a mutex, semaphore or event.
alternatively, if you only want to synchronize a primitive, you can allocate a std::atomic<T> on a shared memory and use that as your synchronized primitive.
In C++, if multiple processes write to the same memory location without proper use of synchronization primitives or atomic operations, undefined behavior occurs. (That is, it might work, it might not work, the computer might catch on fire.)
In practice, on your computer, it's basically certain to work the way you think it should work. It actually is plausible that on some architectures things don't go the way you expect, though: If the CPU cannot read/write a block of memory as small as your shared value, or if the storage of the shared value crosses an alignment boundary, such a write can actually involve a read as well, and that read-modify-write can have the effect of reverting or corrupting other changes to memory.
The easiest way to get what you want is simply to do the write as a "relaxed" atomic operation:
std::atomic_store_explicit(ptr, f, std::memory_order_relaxed);
That ensures that the write is "atomic" in the sense of not causing a data race, and won't incur any overhead except on architectures where there would be potential problems with *ptr = f.

OpenMP race condition while reading from pointer

I know that reading from a shared variable in OpenMP does not cause a race condition, because every thread has it's own copy of that variable.
But if the shared variable is a pointer (e.g. to a container), then every thread only gets a copy of the pointer.
If I now read from the location the pointer is pointing to (my container), can there be race conditons or does OpenMP somehow take care of this?
Is it better to share a copy of the container itself, instead of a pointer to it, among threads?
Just reading from a variable cannot produce a race condition: it doesn't matter whether the variable is shared or not. To produce a race condition you need to have two or more threads trying to modify the same instance of a variable at the same time.
Then, assuming that your threads are reading and modifying a certain variable, if you make this variable shared you will still have a race condition since all the threads share the same instance. I guess that in your first paragraph you wanted to say private, as #ilotXXI pointed out.
About your question about privatizing a pointer, if two o more instances of that pointer point to the same data and they modify it, you will have a race condition (each thread has a private version of the pointer but not a private version of the data).
Note that changing from one data-sharing clause to another may change the behavior of your application. Thus, in general, when you are parallelizing an application, what you have to do first is to analyze which kind of data accesses your application is performing. Once you know that, you have to think which data-sharing clauses and which synchronization constructs (if needed) you should use to keep the original behavior of your application.

Should I use different mutexes for different objects?

I am new to threading . Correct me if I am wrong that mutex locks the access to a shared data structure so that it cannot be used by other threads until it is unlocked . So, lets consider that there are 2 or more shared data structures . So , should I make different mutex objects for different data structures ? If no ,then how std::mutex will know which object it should lock ? What If I have to lock more than 1 objects at the same time ?
There are several points in your question that can be made more precise. Perhaps clearing this will solve things for you.
To begin with, a mutex, by itself, does not lock access to anything. It is basically something that your code can lock and unlock, and some "magic" ensures that only one thread can lock it at a time.
If, by convention, you decide that any code accessing some data structure foo will first begin by locking a mutex foo_mutex, then it will have the effect of protecting this data structure.
So, having said that, regarding your questions:
It depends on whether the two data structures need to be accessed together or not (e.g., can updating one without the other leave the system in an inconsistent state). If so, you should lock them with a single mutex. If not, you can improve parallelism by using two.
The mutex does not lock anything. It is you who decide by convention whether you can access 1, 2, or a million data structures, while holding it.
If you always needs to access both structures then it could be considered as a single resource so only a single lock is needed.
If you sometimes, even just once, need to access one of the structures independently then they can no longer be considered a single resource and you might need two locks. Of course, a single lock could still be sufficient, but then that lock would lock both resources at once, prohibiting other threads from accessing any of the structures.
Mutex does not "know" anything other than about itself. The lock is performed on mutex itself.
If there are two objects (or pieces of code) that need synchronized access (but can be accessed at the same time) then you have the liberty to use just one mutex for both or one for each. If you use one mutex they will not be accessed at the same time from two different threads.
If it cannot happen that access to one object is required while accessing the other object then you can use two mutexes, one for each. But if it can happen that one object must be accessed while the thread already holds another mutex then care must be taken that code never can reach a deadlock, where two threads hold one mutex each, and both at the same time wait that the other mutex is released.

are c++ pointers to user-defined objects thread safe for reading?

I can't find the answer but it's a simple question:
Is it safe for two threads to read the value of a pointer to a user-defined object in c++ at the same time with no locks or any other shenanigans?
Yes. Actually it is safe to read any values (of builtin type) concurrently.
Data races can only occur, if a value is modified concurrently with some other thread using it. The key statements from the Standard for this are:
A data race is defined in §1.10/21:
The execution of a program contains a data race if it contains two
conflicting actions in different threads, at least one of which is not
atomic, and neither happens before the other.
where conflicting is defined in §1.10/4:
Two expression evaluations conflict if one of them modifies a memory
location (1.7) and the other one accesses or modifies the same memory
location.
So you must use suitable synchronization between those reads and any writes.
It is always safe to read values from multiple threads. It's only when you're also writing to the data that you need to manage concurrent accesses.
The only possible issue for read-only data is ensuring that the value has, in fact, been initialized when the reading is done. If you initialize the value before you start your threads you'll be fine.
It is generally not thread-safe if the variable gets modified in one of the threads.
By thread-safe I suppose you mean to ask whether they have atomic writes. In C++03 this is not true, as C++03 doesn't really know about threads. In C++11 you have std::atomic, which is specialized for pointers.