C++ volatile required when spinning on boost::shared_ptr operator bool()? [duplicate] - c++

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
When to use volatile with multi threading?
I have two threads referencing the same boost::shared_ptr:
boost::shared_ptr<Widget> shared;
On thread is spinning, waiting for the other thread to reset the boost::shared_ptr:
while(shared)
boost::thread::yield();
And at some point the other thread will call:
shared.reset();
My question is whether or not I need to declare the shared pointer as volatile to prevent the compiler from optimizing the call to shared.operator bool() out of the loop and never detecting the change? I know that if I were simply looping on a variable, waiting for it to reach 0 I would need volatile, but I'm not sure if boost::shared_ptr is implemented in such a way that it is not necessary here.
EDIT: I'm fully aware that condition variables can be used to solve this problem in a different way. But in this case, the busy loop is very uncommon and contending for the lock on the condition variable is an overhead we would rather not incur.

Rant 1:
That code probably won't do what you think it will. When you write code like that, you're introducing a data race into your code. This is almost certainly a bug that will result in your program non-deterministically failing.
Data structures (including shared_ptr) are generally not meant to be accessed concurrently. Do not modify the same structure at the same time in more than one thread. That could corrupt the structure. Do not modify it in one thread and read it in another thread. The reader could see inconsistent data. Probably multiple threads can read it at the same time.
If you think you really want to do some of the above, find out if the data structure allows some of these behaviors in a section probably titled "Thread Safety." If it does allow them, take a second look at whether your performance really needs this, and then use it. (The documentation on shared_ptr does NOT allow what you're doing.)
Rant 2:
Now, for a higher-level concern, you probably shouldn't be doing thread synchronization by waiting for a pointer to be set to NULL. Really, look at condition variables, barriers, or futures as a way of getting one thread to wait until another is finished with something. It's a nicer interface, and whoever looks at your code next (that includes you in 6 months) will thank you.
I know you're concerned about the performance cost of real synchronization. Don't worry about this. It'll be fine. If you're worried about lock contention, use barriers or futures, which don't have a big shared lock to contend for.
Caveat: there is a time for writing code that avoids locks at all cost. But unless you're looking at profiler data that says your synch ops are too slow for your target workload, this isn't the time.
Rant 3:
I hope that shared in your example is global. Otherwise, you have multiple threads with local references to the same shared_ptr that points to the real object you're interested in. It kind of defeats the purpose of having a reference-counted pointer. Just please tell me it's global.

What you actually should do is to use condition variables. Busy waits are evil.
Edit: also depending on your task, futures may be even cleaner way to achieve what you want.

Related

Reader/Writer: multiple heavy readers, only 1 write per day

I have a huge tbb::concurrent_unordered_map that gets "read" heavily by multiple (~60) threads concurrently.
Once per day I need to clear it (either fully, or selectively). Erasing is obviously not thread safe in tbb implementation, so some synchronisation needs to be in place to prevent UB.
However I know for a fact that the "write" will only happen once per day (the exact time is unknown).
I've been looking at std::shared_mutexto allow concurrent reads but I am afraid that even in an uncontended scenario might slow things significantly.
Is there a better solution for this?
Perhaps checking a std::atomic<bool> before locking on the mutex?
Thanks in advance.
It might require a bit of extra work on maintaining it, but you can use copy-on-write scheme.
Keep the map in a singleton within a shared pointer.
For "read" operations, have users thread-safely copy the shared pointer and use it for as long as they want.
For "write" operations, create a new instance map in a new shared pointer, fill it with whatever you want and replace the old version it in the singleton.
This way "read" users will still see the old version and can use it safely. Just make sure they occasionally get the newest version from the singleton. Perhaps, even give them a handle that automatically updates the shared pointer once a second or something.
This works in case you don't need the threads to synchronously update all at once.
Another scheme, you create atomic boolean to indicate when an update is incoming, and just make all threads pause their operations on the map when it is true. Once they all stopped you perform the update and let them resume their operation.
This is a perfect job for a read/write lock.
In C++ this can be implemented by having a shared_mutex, then using a unique_lock to lock it for writing, and a shared_lock to lock it for reading. See this post for a example code.
The end effect is that readers will never block on eachother, reads can all happen at the same time, but if the writer has the lock, everything will block to let the writing operation proceed.
If the writing takes a long time, so long that the once-per-day delay is unacceptable, then you can have the writer create and populate a new copy of the data without taking a lock, then take the write end of the lock and swap the data:
Readers:
Lock mutex with a shared_lock
Do stuff
Unlock
Repeat
Writer:
Create new copy of data
Lock mutex with a unique_lock
Swap data quickly
Unlock
Repeat tomorrow
A shared_lock on a shared_mutex will be fast. You could use a double check locking strategy but I would not do that until you do performance testing and also probably take a look at the source for shared_lock, because I suspect it does something similar already, and a double-check on the read end might just add overhead unnecessarily. Also I don't have enough coffee in me yet to work out double check locking in a read/write lock scenario.
There is a threading construct called a spin lock as well, but it's really just an encapsulation of a double-checked lock that repeats the "check" until it clears. It's a good construct but again you'll want performance analyses and a look at the shared_lock + shared_mutex source, because they might spin already. A good implementation of a spin lock can be found here, it covers some common gotchas. You might have to tweak it to get a read/write spinlock.
Generally speaking though, it's best to use existing constructs during the initial implementation at the very least as a clearly coded proof-of-concept. Then if you know that you're seeing too much read contention, you can optimize from there. But you need the general strategy down first, and 91 times out of a hundred, it's good enough. In this case, no matter what, some manifestation of a read/write lock is what you're going to end up with.

C/C++ arrays with threads - do I need to use mutexes or locks?

I am new to using threads and have read a lot about how data is shared and protected. But I have also not really got a good grasp of using mutexes and locks to protect data.
Below is a description of the problem I will be working on. The important thing to note is that it will be time-critical, so I need to reduce overheads as much as possible.
I have two fixed-size double arrays.
The first array will provide data for subsequent calculations.
Threads will read values from it, but it will never be modified. An element may be read at some time by any of the threads.
The second array will be used to store the results of the calculations performed by the threads. An element of this array will only ever be updated by one thread, and probably only once when the result value
is written to it.
My questions then:
Do I really need to use a mutex in a thread each time I access the data from the read-only array? If so, could you explain why?
Do I need to use a mutex in a thread when it writes to the result array even though this will be the only thread that ever writes to this element?
Should I use atomic data types, and will there be any significant time overhead if I do?
Many answers to this type of question seem to be - no, you don't need the mutex if your variables are aligned. Would my array elements in this example be aligned, or is there some way to ensure they are?
The code will be implemented on 64bit Linux. I am planning on using Boost libraries for multithreading.
I have been mulling this over and looking all over the web for days, and once posted, the answer and clear explanations came back in literally seconds. There is an "accepted answer," but all the answers and comments were equally helpful.
Do I really need to use a mutex in a thread each time I access the data from the read-only array? If so could you explain why?
No. Because the data is never modified, there cannot be synchronization problem.
Do I need to use a mutex in a thread when it writes to the result array even though this will be the only thread that ever writes to this element?
Depends.
If any other thread is going to read the element, you need synchronization.
If any thread may modify the size of the vector, you need synchronization.
In any case, take care of not writing into adjacent memory locations by different threads a lot. That could destroy the performance. See "false sharing". Considering, you probably don't have a lot of cores and therefore not a lot of threads and you say write is done only once, this is probably not going to be a significant problem though.
Should I use atomic data types and will there be any significant time over head if I do?
If you use locks (mutex), atomic variables are not necessary (and they do have overhead). If you need no synchronization, atomic variables are not necessary. If you need synchronization, then atomic variables can be used to avoid locks in some cases. In which cases can you use atomics instead of locks... is more complicated and beyond the scope of this question I think.
Given the description of your situation in the comments, it seems that no synchronization is required at all and therefore no atomics nor locks.
...Would my array elements in this example be aligned, or is there some way to ensure they are?
As pointed out by Arvid, you can request specific alignment using the alginas keyword which was introduced in c++11. Pre c++11, you may resort to compiler specific extensions: https://gcc.gnu.org/onlinedocs/gcc-5.1.0/gcc/Variable-Attributes.html
Under the two conditions given, there's no need for mutexes. Remember every use of a mutex (or any synchronization construct) is a performance overhead. So you want to avoid them as much as possible (without compromising correct code, of course).
No. Mutexes are not needed since threads are only reading the array.
No. Since each thread only writes to a distinct memory location, no race condition is possible.
No. There's no need for atomic access to objects here. In fact, using atomic objects could affect the performance negatively as it prevents the optimization possibilities such as re-ordering operations.
The only time you need to use Locks is when data is modified on a shared resource. Eg if some threads where used to write data and some used to read data (in both cases from the same resource) then you only need a lock for when writing is done. This is to prevent whats known as "race".
There is good information of race on google for when you make programs that manipulate data on a shared resource.
You are on the right track.
1) For the first array (read only) , you do not need to utilize a mutex lock for it. Since the threads are just reading not altering the data there is no way a thread can corrupt the data for another thread
2) I'm a little confused by this question. If you know that thread 1 will only write an element to array slot 1 and thread 2 will only write to array slot 2 then you do not need a mutex lock. However I'm not sure how your achieving this property. If my above statement is not correct for your situation you would definitely need a mutex lock.
3) Given the definition of atomic:
Atomic types are types that encapsulate a value whose access is guaranteed to not cause data races and can be used to synchronize memory accesses among different threads.
Key note, a mutex lock is atomic meaning that there is only 1 assembly instruction needed to grab/release a lock. If it required 2 assembly instructions to grab/release a lock, a lock would not be thread safe. For example, if thread 1 attempted to grab a lock and was switched to thread 2, thread 2 would grab the lock.
Use of atomic data types would decrease your overhead but not significantly.
4) I'm not sure how you can assure your variables are lined. Since threads can switch at any moment in your program (Your OS determines when a thread switches)
Hope this helps

When to use mutexes?

I've been playing around with gtkmm and multi-threaded GUIs and stumbled into the concept of a mutex. From what I've been able to gather, it serves the purpose of locking access to a variable for a single thread in order to avoid concurrency issues. This I understand, seems rather natural, however I still don't get how and when one should use a mutex. I've seen several uses where the mutex is only locked to access particular variables (e.g.like this tutorial). For which type of variables/data should a mutex be used?
PS: Most of the answers I've found on this subject are rather technical, and since I am far from an expert on this I was looking more for a conceptual answer.
If you have data that is accessed from more than a single thread, you probably need a mutex. You usually see something like
theMutex.lock()
do_something_with_data()
theMutex.unlock()
or a better idiom in c++ would be:
{
MutexGuard m(theMutex)
do_something_with_data()
}
where MutexGuard c'tor does the lock() and d'tor does the unlock()
This general rule has a few exceptions
if the data you are using can be accessed in an atomic manner, you don't need a lock. In Visual Studio you have functions like InterlockedIncrement() that do this. gcc has it's own facilities to do this.
If you are accessing the data to only ever read it and never change it, it's usually safe to do without locking. but if even a single thread does any change to the data, all the other threads need to make sure they don't try to read the data while it is being changed. You can also read about Reader-Writer lock for this kind of situations.
variables that are changed among multiple threads. So data that is not modified (immutable) or data that is not shared does not need

Thread safe programming

I keep hearing about thread safe. What is that exactly and how and where can I learn to program thread safe code?
Also, assume I have 2 threads, one that writes to a structure and another one that reads from it. Is that dangerous in any way? Is there anything I should look for? I don't think it is a problem. Both threads will not (well can't ) be accessing the struct at the exact same time..
Also, can someone please tell me how in this example : https://stackoverflow.com/a/5125493/1248779 we are doing a better job in concurrency issues. I don't get it.
It's a very deep topic. At the heart threads are usually about making things go fast by using multiple cores at the same time; or about doing long operations in the background when you don't have a good way to interleave the operation with a 'primary' thread. The latter being very common in UI programming.
Your scenario is one of the classic trouble spots, and one of the first people run into. It's vary rare to have a struct where the members are truly independent. It's very common to want to modify multiple values in the structure to maintain consistency. Without any precautions it is very possible to modify the first value, then have the other thread read the struct and operate on it before the second value has been written.
Simple example would be a 'point' struct for 2d graphics. You'd like to move the point from [2,2] to [5,6]. If you had a different thread drawing a line to that point you could end up drawing to [5,2] very easily.
This is the tip of the iceberg really. There are lots of great books, but learning this space usually goes something like this:
Uh oh, I just read from that thing in an inconsistent state.
Uh oh, I just modified that thing from 2 threads and now it's garbage.
Yay! I learned about locks
Whoa, I have a lot of locks and everything seems to just hang sometimes when I have lots of them locking in nested code.
Hrm. I need to stop doing this locking on the fly, I seem to be missing a lot of places; so I should encapsulate them in a data structure.
That data structure thing was great, but now I seem to be locking all the time and my code is just as slow as a single thread.
condition variables are weird
It's fast because I got clever with how I lock things. Hrm. Sometimes data corrupts.
Whoa.... InterlockedWhatDidYouSay?
Hey, look no lock, I do this thing called a spin lock.
Condition variables. Hrm... I see.
You know what, how about I just start thinking about how to operate on this stuff in completely independent ways, pipelineing my operations, and having as few cross thread dependencies as possible...
Obviously it's not all about condition variables. But there are many problems that can be solved with threading, and probably almost as many ways to do it, and even more ways to do it wrong.
Thread-safety is one aspect of a larger set of issues under the general heading of "Concurrent Programming". I'd suggest reading around that subject.
Your assumption that two threads cannot access the struct at the same time is not good. First: today we have multi-core machines, so two threads can be running at exactly the same time. Second: even on a single core machine the slices of time given to any other thread are unpredicatable. You have to anticipate that ant any arbitrary time the "other" thread might be processing. See my "window of opportunity" example below.
The concept of thread-safety is exactly to answer the question "is this dangerous in any way". The key question is whether it's possible for code running in one thread to get an inconsistent view of some data, that inconsistency happening because while it was running another thread was in the middle of changing data.
In your example, one thread is reading a structure and at the same time another is writing. Suppose that there are two related fields:
{ foreground: red; background: black }
and the writer is in the process of changing those
foreground = black;
<=== window of opportunity
background = red;
If the reader reads the values at just that window of opportunity then it sees a "nonsense" combination
{ foreground: black; background: black }
This essence of this pattern is that for a brief time, while we are making a change, the system becomes inconsistent and readers should not use the values. As soon as we finish our changes it becomes safe to read again.
Hence we use the CriticalSection APIs mentioned by Stefan to prevent a thread seeing an inconsistent state.
what is that exactly?
Briefly, a program that may be executed in a concurrent context without errors related to concurrency.
If ThreadA and ThreadB read and/or write data without errors and use proper synchronization, then the program may be threadsafe. It's a design choice -- making an object threadsafe can be accomplished a number of ways, and more complex types may be threadsafe using combinations of these techniques.
and how and where can I learn to program thread safe code?
boost/libs/thread/ would likely be a good introduction. The topic is quite complex.
The C++11 standard library provides implementations for locks, atomics and threads -- any well written programs which use these would be a good read. The standard library was modeled after boost's implementation.
also, assume I have 2 threads one that writes to a structure and another one that reads from it. Is that dangerous in any way? is there anything I should look for?
Yes, it can be dangerous and/or may produce incorrect results. Just imagine that a thread may run out of its time at any point, and then another thread could then read or modify that structure -- if you have not protected it, it may be in the middle of an update. A common solution is a lock, which can be used to prevent another thread from accessing shared resources during reads/writes.
When writing multithreaded C++ programs on WIN32 platforms, you need to protect certain shared objects so that only one thread can access them at any given time from different threads. You can use 5 system functions to achieve this. They are InitializeCriticalSection, EnterCriticalSection, TryEnterCriticalSection, LeaveCriticalSection, and DeleteCriticalSection.
Also maybe this links can help:
how to make an application thread safe?
http://www.codeproject.com/Articles/1779/Making-your-C-code-thread-safe
Thread safety is a simple concept: is it "safe" to perform operation A on one thread whilst another thread is performing operation B, which may or may not be the same as operation A. This can be extended to cover many threads. In this context, "safe" means:
No undefined behaviour
All invariants of the data structures are guaranteed to be observed by the threads
The actual operations A and B are important. If two threads both read a plain int variable, then this is fine. However, if any thread may write to that variable, and there is no synchronization to ensure that the read and write cannot happen together, then you have a data race, which is undefined behaviour, and this is not thread safe.
This applies equally to the scenario you asked about: unless you have taken special precautions, then it is not safe to have one thread read from a structure at the same time as another thread writes to it. If you can guarantee that the threads cannot access the data structure at the same time, through some form of synchronization such as a mutex, critical section, semaphore or event, then there is not a problem.
You can use things like mutexes and critical sections to prevent concurrent access to some data, so that the writing thread is the only thread accessing the data when it is writing, and the reading thread is the only thread accessing the data when it is reading, thus providing the guarantee I just mentioned. This therefore avoids the undefined behaviour mentioned above.
However, you still need to ensure that your code is safe in the wider context: if you need to modify more than one variable then you need to hold the lock on the mutex across the whole operation rather than for each individual access, otherwise you may find that the invariants of your data structure may not be observed by other threads.
It is also possible that a data structure may be thread safe for some operations but not others. For example, a single-producer single-consumer queue will be OK if one thread is pushing items on the queue and another is popping items off the queue, but will break if two threads are pushing items, or two threads are popping items.
In the example you reference, the point is that global variables are implicitly shared between all threads, and therefore all accesses must be protected by some form of synchronization (such as a mutex) if any thread can modify them. On the other hand, if you have a separate copy of the data for each thread, then that thread can modify its copy without worrying about concurrent access from any other thread, and no synchronization is required. Of course, you always need synchronization if two or more threads are going to operate on the same data.
My book, C++ Concurrency in Action covers what it means for things to be thread safe, how to design thread safe data structures, and the C++ synchronization primitives used for the purpose, such as std::mutex.
Threads safe is when a certain block of code is protected from being accessed by more than one thread. Meaning that the data manipulated always stays in a consistent state.
A common example is the producer consumer problem where one thread reads from a data structure while another thread writes to the same data structure : Detailed explanation
To answer the second part of the question: Imagine two threads both accessing std::vector<int> data:
//first thread
if (data.size() > 0)
{
std::cout << data[0]; //fails if data.size() == 0
}
//second thread
if (rand() % 5 == 0)
{
data.clear();
}
else
{
data.push_back(1);
}
Run these threads in parallel and your program will crash because std::cout << data[0]; might be executed directly after data.clear();.
You need to know that at any point of your thread code, the thread might be interrupted, e.g. after checking that (data.size() > 0), and another thread could become active. Although the first thread looks correct in a single threaded app, it's not in a multi-threaded program.

Is checking current thread inside a function ok?

Is it ok to check the current thread inside a function?
For example if some non-thread safe data structure is only altered by one thread, and there is a function which is called by multiple threads, it would be useful to have separate code paths depending on the current thread. If the current thread is the one that alters the data structure, it is ok to alter the data structure directly in the function. However, if the current thread is some other thread, the actual altering would have to be delayed, so that it is performed when it is safe to perform the operation.
Or, would it be better to use some boolean which is given as a parameter to the function to separate the different code paths?
Or do something totally different?
What do you think?
You are not making all too much sense. You said a non-thread safe data structure is only ever altered by one thread, but in the next sentence you talk about delaying any changes made to that data structure by other threads. Make up your mind.
In general, I'd suggest wrapping the access to the data structure up with a critical section, or mutex.
It's possible to use such animals as reader/writer locks to differentiate between readers and writers of datastructures but the performance advantage for typical cases usually wont merit the additional complexity associated with their use.
From the way your question is stated, I'm guessing you're fairly new to multithreaded development. I highly suggest sticking with the simplist and most commonly used approaches for ensuring data integrity (most books/articles you readon the issue will mention the same uses for mutexes/critical sections). Multithreaded development is extremely easy to get wrong and can be difficult to debug. Also, what seems like the "optimal" solution very often doesn't buy you the huge performance benefit you might think. It's usually best to implement the simplist approach that will work then worry about optimizing it after the fact.
There is a trick that could work in case, as you said, the other threads will only make changes only once in a while, although it is still rather hackish:
make sure your "master" thread can't be interrupted by the other ones (higher priority, non fair scheduling)
check your thread
if "master", just change
if other, put off scheduling, if needed by putting off interrupts, make change, reinstall scheduling
really test to see whether there are no issues in your setup.
As you can see, if requirements change a little bit, this could turn out worse than using normal locks.
As mentioned, the simplest solution when two threads need access to the same data is to use some synchronization mechanism (i.e. critical section or mutex).
If you already have synchronization in your design try to reuse it (if possible) instead of adding more. For example, if the main thread receives its work from a synchronized queue you might be able to have thread 2 queue the data structure update. The main thread will pick up the request and can update it without additional synchronization.
The queuing concept can be hidden from the rest of the design through the Active Object pattern. The activ object may also be able to publish the data structure changes through the Observer pattern to other interested threads.