I wanted to know whether it is safe to assume that if multiple threads are accessing a single static container (boost::unordered_map), there is no need to lock access to the container as long as the threads are only reading data from it and no writing is done.
When multiple threads are only reading and performing no write operation, you do not need to synchronize access.
Paragraph 1.10 of the C++11 Standard defines conflicting operations with respect to data races as:
Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one accesses or modifies the same memory location.
And then of course, per 1.10/21:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. [...]
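For illustration, here is a minimal sketch of that read-only scenario, using std::unordered_map as a stand-in for boost::unordered_map (the keys and values are made up). The container is fully populated before any reader thread starts, so every concurrent access is a pure read and no mutex is needed:

    #include <iostream>
    #include <string>
    #include <thread>
    #include <unordered_map>
    #include <vector>

    int main() {
        // Fully populated before any reader thread is started and never
        // modified afterwards, so every later access is a pure read.
        static const std::unordered_map<std::string, int> table = {
            {"alpha", 1}, {"beta", 2}, {"gamma", 3}};

        std::vector<std::thread> readers;
        for (int i = 0; i < 4; ++i) {
            readers.emplace_back([] {
                // Concurrent find() calls on a container that no thread
                // modifies do not conflict, so no locking is required.
                auto it = table.find("beta");
                if (it != table.end())
                    std::cout << it->second << '\n';
            });
        }
        for (auto& t : readers) t.join();
    }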
I am building a very simple program as an exercise.
The idea is to compute the total size of a directory by recursively iterating over all its contents, and summing the sizes of all files contained in the directory (and its subdirectories).
To show the user that the program is still working, this computation is performed on another thread, while the main thread prints a dot (.) once every second.
Now the main thread of course needs to know when it should stop printing dots and can look up a result.
It is possible to use e.g. a std::atomic<bool> done(false); and pass this to the thread that will perform the computation, which will set it to true once it is finished. But I am wondering if in this simple case (one thread writes once completed, one thread reads periodically until nonzero) it is necessary to use atomic data types for this. Obviously if multiple threads might write to it, it needs to be protected. But in this case, there's only one writing thread and one reading thread.
Is it necessary to use an atomic data type here, or is it overkill and could a normal data type be used instead?
Yes, it's necessary.
The issue is that the different cores of the processor can have different views of the "same" data, notably data that's been cached within the CPU. The atomic part ensures that these caches are properly flushed so that you can safely do what you are trying to do.
Otherwise, it's quite possible that the other thread will never actually see the flag change from the first thread.
Yes, it is necessary. The rule is that if two threads could potentially be accessing the same memory at the same time, and at least one of the threads is a writer, then you have a data race. Any execution of a program with a data race has undefined behavior.
Relevant quotes from the C++14 standard:
1.10/23
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.
1.10/6
Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one accesses or modifies the same memory location.
Yes, it is necessary. Otherwise it is not guaranteed that changes to the bool in one thread will be observable in the other thread. In fact, if the compiler sees that the bool variable is, apparently, not ever used again in the execution thread that sets it, it might completely optimize away the code that sets the value of the bool.
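For reference, a minimal sketch of the pattern described in the question, with std::atomic<bool> as the completion flag (the sleep stands in for the real directory-size computation). The explicit release/acquire ordering is an assumption on my part; the default sequentially consistent store and load would work just as well:

    #include <atomic>
    #include <chrono>
    #include <iostream>
    #include <thread>

    int main() {
        std::atomic<bool> done(false);

        // Worker: stands in for the directory-size computation.
        std::thread worker([&done] {
            std::this_thread::sleep_for(std::chrono::seconds(3)); // pretend to work
            done.store(true, std::memory_order_release);          // publish completion
        });

        // Main thread: print a progress dot until the worker signals completion.
        while (!done.load(std::memory_order_acquire)) {
            std::cout << '.' << std::flush;
            std::this_thread::sleep_for(std::chrono::seconds(1));
        }
        std::cout << "\ndone\n";

        worker.join();
    }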
What should I be concerned about, as far as thread safety and undefined behavior go, in a situation where multiple threads are reading from a single source that is constant?
I am working on a signal-processing model that allows for parallel execution of independent processes. These processes may share an input buffer, but the process that fills the input buffer will always be complete before the next stage of possibly parallel processes will execute.
Do I need to worry about thread-safety issues in this situation, and what could I do about it?
I would like to note that a lock-free solution would be best, if possible.
but the process that fills the input buffer will always be complete before the next stage of possibly parallel processes will execute
If this is guaranteed, then there is no problem with multiple threads reading const objects.
I don't have the official standard, so the following is from draft N4296:
17.6.5.9 Data race avoidance
3 A C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.
4 [ Note: This means, for example, that implementations can’t use a static object for internal purposes without synchronization because it could cause a data race even in programs that do not explicitly share objects between threads. —end note ]
Here is the Herb Sutter video where I first learned about the meaning of const in the C++11 standard. (see around 7:00 to 10:30)
No, you are OK. Multiple reads from the same constant source are OK and do not pose any risks in all threading models I know of (namely, Posix and Windows).
However,
but the process that fills the input buffer will always be complete
What are the guarantees here? How do you really know this is the case? Do you have any synchronization in place?
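To make that "fill completes before the readers start" guarantee concrete, here is a minimal sketch assuming the stages are separated by a thread join (a barrier, latch, or future would work equally well); the buffer contents and chunk sizes are made up for illustration. Joining the producer is the synchronization: it establishes the happens-before edge, after which the readers can share the buffer lock-free because nothing modifies it any more:

    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<float> input(1024);

        // Stage 1: a single producer fills the input buffer.
        std::thread producer([&input] {
            std::iota(input.begin(), input.end(), 0.0f);
        });
        producer.join();  // the join is the synchronization: every write above
                          // happens-before everything the code below does

        // Stage 2: several readers share the (now constant) buffer without locks.
        // Each reader writes only to its own slot of 'sums', so those writes
        // touch distinct memory locations and do not conflict either.
        std::vector<float> sums(4, 0.0f);
        std::vector<std::thread> readers;
        for (int t = 0; t < 4; ++t) {
            readers.emplace_back([&input, &sums, t] {
                auto begin = input.begin() + t * 256;
                sums[t] = std::accumulate(begin, begin + 256, 0.0f);
            });
        }
        for (auto& r : readers) r.join();
        std::cout << "total: " << sums[0] + sums[1] + sums[2] + sums[3] << '\n';
    }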
I am new to threading. Correct me if I am wrong: a mutex locks access to a shared data structure so that it cannot be used by other threads until it is unlocked. Now let's consider that there are two or more shared data structures. Should I make different mutex objects for different data structures? If not, how will std::mutex know which object it should lock? And what if I have to lock more than one object at the same time?
There are several points in your question that can be made more precise. Perhaps clearing these up will solve things for you.
To begin with, a mutex, by itself, does not lock access to anything. It is basically something that your code can lock and unlock, and some "magic" ensures that only one thread can lock it at a time.
If, by convention, you decide that any code accessing some data structure foo will first begin by locking a mutex foo_mutex, then it will have the effect of protecting this data structure.
So, having said that, regarding your questions:
It depends on whether the two data structures need to be accessed together or not (e.g., can updating one without the other leave the system in an inconsistent state). If so, you should lock them with a single mutex. If not, you can improve parallelism by using two.
The mutex does not lock anything. It is you who decide by convention whether you can access 1, 2, or a million data structures, while holding it.
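As a minimal sketch of that convention (the queue names are made up): each structure gets its own mutex, and every function that touches a structure locks the corresponding mutex first, so the two structures can still be used in parallel by different threads:

    #include <mutex>
    #include <vector>

    std::vector<int> queue_a;
    std::mutex       mutex_a;   // by convention, guards queue_a and nothing else

    std::vector<int> queue_b;
    std::mutex       mutex_b;   // by convention, guards queue_b and nothing else

    void push_a(int v) {
        std::lock_guard<std::mutex> lock(mutex_a);  // other users of queue_a wait here
        queue_a.push_back(v);
    }

    void push_b(int v) {
        std::lock_guard<std::mutex> lock(mutex_b);  // independent of mutex_a, so the two
        queue_b.push_back(v);                       // structures can be used in parallel
    }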
If you always need to access both structures together, then they can be considered a single resource, so only a single lock is needed.
If you sometimes, even just once, need to access one of the structures independently then they can no longer be considered a single resource and you might need two locks. Of course, a single lock could still be sufficient, but then that lock would lock both resources at once, prohibiting other threads from accessing any of the structures.
A mutex does not "know" anything other than itself. The lock is performed on the mutex itself.
If there are two objects (or pieces of code) that need synchronized access (but can be accessed at the same time) then you have the liberty to use just one mutex for both or one for each. If you use one mutex they will not be accessed at the same time from two different threads.
If it cannot happen that access to one object is required while accessing the other, then you can use two mutexes, one for each. But if one object must sometimes be accessed while the thread already holds the other mutex, then care must be taken that the code can never reach a deadlock, where two threads each hold one mutex and both wait for the other mutex to be released.
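When both structures really must be held at once, here is a sketch of the deadlock-safe way to take both locks (the queue names are again made up). std::scoped_lock is C++17; with C++11 only, std::lock(mutex_a, mutex_b) followed by two std::lock_guard objects with std::adopt_lock achieves the same thing:

    #include <mutex>
    #include <vector>

    std::vector<int> queue_a;
    std::mutex       mutex_a;
    std::vector<int> queue_b;
    std::mutex       mutex_b;

    // Needs both structures in a consistent state, so both mutexes are taken.
    void move_one_item() {
        // std::scoped_lock acquires both mutexes with a deadlock-avoidance
        // algorithm, so the acquisition order never matters.
        std::scoped_lock lock(mutex_a, mutex_b);
        if (!queue_a.empty()) {
            queue_b.push_back(queue_a.back());
            queue_a.pop_back();
        }
    }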
I can't find the answer but it's a simple question:
Is it safe for two threads to read the value of a pointer to a user-defined object in C++ at the same time, with no locks or any other shenanigans?
Yes. Actually, it is safe to read any value (of built-in type) concurrently.
Data races can only occur if a value is modified concurrently with some other thread using it. The key statements from the Standard for this are:
A data race is defined in §1.10/21:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other.
where conflicting is defined in §1.10/4:
Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one accesses or modifies the same memory location.
So you must use suitable synchronization between those reads and any writes.
It is always safe to read values from multiple threads. It's only when you're also writing to the data that you need to manage concurrent accesses.
The only possible issue for read-only data is ensuring that the value has, in fact, been initialized when the reading is done. If you initialize the value before you start your threads you'll be fine.
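A minimal sketch of that initialize-before-starting-threads case (the Config type and values are made up): the pointer and the object it points to are set up before the threads are created, and thread creation itself provides the necessary happens-before, so both threads can read the same pointer with no atomics or locks:

    #include <iostream>
    #include <thread>

    struct Config {           // made-up type, stands in for the user-defined object
        int    threshold;
        double gain;
    };

    int main() {
        const Config  cfg{42, 1.5};
        const Config* shared = &cfg;   // initialized before any thread starts,
                                       // never written again afterwards

        // Both threads read the same pointer (and the object behind it) with
        // no locks; starting a thread happens-after everything done before it.
        std::thread t1([&shared] { std::cout << shared->threshold << '\n'; });
        std::thread t2([&shared] { std::cout << shared->gain << '\n'; });
        t1.join();
        t2.join();
    }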
It is generally not thread-safe if the variable gets modified in one of the threads.
By thread-safe I suppose you mean to ask whether they have atomic writes. In C++03 this is not true, as C++03 doesn't really know about threads. In C++11 you have std::atomic, which is specialized for pointers.
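If the pointer does get written by one thread while another reads it, here is a sketch of the std::atomic pointer specialization mentioned above (the message string is made up). The release store / acquire load pair also makes the pointed-to data visible to the reader:

    #include <atomic>
    #include <iostream>
    #include <string>
    #include <thread>

    std::atomic<std::string*> latest{nullptr};   // written by one thread, read by another

    int main() {
        std::thread writer([] {
            static std::string msg = "ready";               // outlives both threads
            latest.store(&msg, std::memory_order_release);  // atomic write of the pointer
        });

        std::thread reader([] {
            std::string* p = nullptr;
            while ((p = latest.load(std::memory_order_acquire)) == nullptr) {
                // spin until the writer has published the pointer
            }
            std::cout << *p << '\n';   // the acquire load also makes *p's contents visible
        });

        writer.join();
        reader.join();
    }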