Locked write, unlocked read - c++

Algorithmic question
How to allow only the following types of threaded operations on an object?
multiple simultaneous reads, no writes
single write, no reads
Example: wrapper for STL container allowing efficient search from multiple threads. For simplicity, let assume no iterator can be accessed from outside of the wrapper in question.
Let's assume we have semaphores and mutexes at out disposal.
I know that boost libraries has this concept implemented. I'd like to understand how is this usually done.

Use boost::shared_mutex to handle frequent read, infrequent write access patterns.
As you've noted, STL containers are 'leaky' in that you can retrieve an iterator which has to be treated as either an implicit ongoing operation for write (if non-const) or read (if const), until the iterator goes out of scope. Writes to the container by other threads while you hold such an iterator can invalidate it. Your wrapper would have to be carefully designed to handle this case and keep the wrapper class efficient.

You want a "multiple-reader / single-writer" mutex : Boost.Thread provides one.

Related

Is erasing and inserting in a single linked list thread safe?

Using std::forward_list are there any data races when erasing and inserting? For example I have one thread that does nothing but add new elements at the end of the list, and I have another thread that walks the (same) list and can erase elements from it.
From what I know of linked lists, each element holds a pointer to the next element, so if I erase the last element, at the same time that I am inserting a new element, would this cause a data race or do these containers work differently (or do they handle that possibility)?
If it is a data race, is there a (simple and fast) way to avoid this? (Note: The thread that inserts is the most speed critical of the two.)
There are thread-safety guarantees for the standard C++ library containers but they tend not to be of the kind people would consider thread-safety guarantees (that is, however, an error of people expecting the wrong thing). The thread-safety guarantees of standard library containers are roughly (the relevant section 17.6.5.9 [res.on.data.races]):
You can have as many readers of a container as you want. What exactly qualifies as reader is a bit subtly but roughly amounts to users of const member functions plus using a few non-const members to only read the data (the thread safety of the read data isn't any of the containers concern, i.e., 23.2.2 [container.requirements.dataraces] specifies that the elements can be changed without the containers introducing data races).
If there is one writer of a container, there shall be no other readers or writes of the container in another thread.
That is, reading one end of a container and writing the other end is not thread safe! In fact, even if the actual container changes don't affect the reader immediately, you always need synchronization of some form when communicating a piece of data from one thread to another thread. That is, even if you can guarantee that the consumer doesn't erase() the node the producer currently insert()s, there would be a data race.
No, neither forward_list nor any other STL containers are thread-safe for writes. You must provide synchronization so that no other threads read or write to the container while a write is occurring. Only simultaneous reads are safe.
The simplest way to do this is to use a mutex to lock access to the container while an insert is occurring. Doing this in a portable way requires C++ 11 (std::mutex) or platform-specific features (mutexes in Windows, perhaps pthreads in Linux/Unix).
Unless you're using a version of the STL that explicitly states it is thread-safe then no, the containers are not thread safe.
It's rare to make general purpose containers thread safe by default, as it imposses a performance hit on users who don't require thread safe access to the container, and this is by far the normal usage pattern.
If thread safety is an issue for you then you'll need to surround your code with locks, or use a data structure that is designed specifically designed for multi threaded access.
std containers are not meant to be thread safe.
You should carefully protect them for modify operations.

Why map is not multithread safe in C++?

I met this problem when I tried to solve an concurrency issue in my code. In the original code, we only use a unique lock to lock the write operation on a cache which is a stl map. But there is no restrictions on read operation to the cache. So I was thinking add a shared lock to the read operation and keep the unique lock to the write. But someone told me that it's not safe to do multithreading on a map due to some internal caching issue that it itself does.
Can someone explain the reason in details? What does the internal caching do?
The implementations of std::map must all meet the usual
guarantees: if all your do is read, then there is no need for
external synchrionization, but as soon as one thread modifies,
all accesses must be synchronized.
It's not clear to me what you mean by "shared lock"; there is no
such thing in the standard. But if any one thread is writing,
you must ensure that no other threads may read at the same time.
(Something like Posix' pthread_rwlock could be used, but
there's nothing similar in the standard, at least not that I can
find off hand.)
Since C++11 at least, a const operation on a standard library class is guaranteed to be thread safe (assuming const operations on objects stored in it are thread safe).
All const member functions of std types can be safely called from multiple threads in C++11 without explicit synchronization. In fact, any type that is ever used in conjunction with the standard library (e.g. as a template parameter to a container) must fulfill this guarantee.
Clarificazion: The standard guarantees that your program will have the desired behaviour as long as you never cause a write and any other access to the same data location without a synchronization point in between. The rationale behind this is that modern CPUs don't have strict sequentially consistent memory models, which would limit scalability and performance. Under the hood, your compiler and standard library will emit appropriate memory fences at places where stronger memory orderings are needed.
I really don't see why there would be any caching issue...
If I refer to the stl definition of a map, it should be implemented as a binary search tree.
A binary search tree is simply a tree with a pool of key-value nodes. Those nodes are sorted following the natural order of their keys and, to avoid any problem, keys must be unique. So no internal caching is needed at all.
As no internal caching is required, read operations are safe in multi-threading context. But it's not the same story for write operations, for those you must provide your own synchronization mechanism as for any non-threading-aware data structure.
Just be aware that you must also forbid any read operations when a write operation is performed by a thread, because this write operation can result in a slow and complete rebalancing of the binary tree, i.e. a quick read operation during a long write operation would return a wrong result.

C++ Thread safe vector.erase

I wrote a threaded Renderer for SFML which takes pointers to drawable objects and stores them in a vector to be draw each frame. Starting out adding objects to the vector and removing objects to the vector would frequently cause Segmentation faults (SIGSEGV). To try and combat this, I would add objects that needed to be removed/added to a queue to be removed later (before drawing the frame). This seemed to fix it, but lately I have noticed that if I add many objects at one time (or add/remove them fast enough) I will get the same SIGSEGV.
Should I be using locks when I add/remove from the vector?
You need to understand the thread-safety guarantees the C++ standard (and implementations of C++2003 for possibly concurrent systems) give. The standard containers are a thread-safe in the following sense:
It is OK to have multiple concurrent threads reading the same container.
If there is one thread modifying a container there shall be no concurrent threads reading or writing the same container.
Different containers are independent of each other.
Many people misunderstand thread-safety of container to mean that these rules are imposed by the container implementation: they are not! It is your responsibility to obey these rules.
The reason these aren't, and actually can't, be imposed by the containers is that they don't have an interface suitable for this. Consider for example the following trivial piece of code:
if (!c.empty() {
auto value = c.back();
// do something with the read value
}
The container can control the access to the calls to empty() and back(). However, between these calls it necessarily needs to release any sort of synchronization facilities, i.e. by the time the thread tries to read c.back() the container may be empty again! There are essentially two ways to deal with this problem:
You need to use external locking if there is possibility that a concurrent thread may be changing the container to span the entire range of accesses which are interdependent in some form.
You change the interface of the containers to become monitors. However, the container interface isn't at all suitable to be changed in this direction because monitors essentially only support "fire and forget" style of interfaces.
Both strategies have their advantages and the standard library containers are clearly supporting the first style, i.e. they require external locking when using concurrently with a potential of at least one thread modifying the container. They don't require any kind of locking (neither internal or external) if there is ever only one thread using them in the first place. This is actually the scenario they were designed for. The thread-safety guarantees given for them are in place to guarantee that there are no internal facilities used which are not thread-safe, say one per-object iterator object or a memory allocation facility shared by multiple threads without being thread-safe, etc.
To answer the original question: yes, you need to use external synchronization, e.g. in the form of mutex locks, if you modify the container in one thread and read it in another thread.
Should I be using locks when I add/remove from the vector?
Yes. If you're using the vector from two threads at the same time and you reallocate, then the backing allocation may be swapped out and freed behind the other thread's feet. The other thread would be reading/writing to freed memory, or memory in use for another unrelated allocation.

Thread safe C++ std::set that supports add, remove and iterators from multple threads

I'm looking for something similar to the CopyOnWriteSet in Java, a set that supports add, remove and some type of iterators from multiple threads.
there isn't one that I know of, the closest is in thread building blocks which has concurrent_unordered_map
The STL containers allow concurrent read access from multiple threads as long as you don't aren't doing concurrent modification. Often it isn't necessary to iterate while adding / removing.
The guidance about providing a simple wrapper class is sane, I would start with something like the code snippet below protecting the methods that you really need concurrent access to and then providing 'unsafe' access to the base std::set so folks can opt into the other methods that aren't safe. If necessary you can protect access as well to acquiring iterators and putting them back, but this is tricky (still less so than writing your own lock free set or your own fully synchronized set).
I work on the parallel pattern library so I'm using critical_section from VS2010 beta boost::mutex works great too and the RAII pattern of using a lock_guard is almost necessary regardless of how you choose to do this:
template <class T>
class synchronized_set
{
//boost::mutex is good here too
critical_section cs;
public:
typedef set<T> std_set_type;
set<T> unsafe_set;
bool try_insert(...)
{
//boost has a lock_guard
lock_guard<critical_section> guard(cs);
}
};
Why not just use a shared mutex to protect concurrent access? Be sure to use RAII to lock and unlock the mutex:
{
Mutex::Lock lock(mutex);
// std::set manipulation goes here
}
where Mutex::Lock is a class that locks the mutex in the constructor and unlocks it in the destructor, and mutex is a mutex object that is shared by all threads. Mutex is just a wrapper class that hides whatever specific OS primitive you are using.
I've always thought that concurrency and set behavior are orthogonal concepts, so it's better to have them in separate classes. In my experiences, classes that try to be thread safe themselves aren't very flexible or all that useful.
You don't want internal locking, as your invariants will often require multiple operations on the data structure, and internal locking only prevents the steps happening at the same time, whereas you need to keep the steps from different macro-operations from interleaving.
You can also take a look at ACE library which has all thread safe containers you might ever need.
All I can think of is to use OpenMP for parallelization, derive a set class from std's and put a shell around each critial set operation that declares that operation critical using #pragma omp critical.
Qt's QSet class uses implicit sharing (copy on write semantics) and similar methods with std::set, you can look its implementation, Qt is lgpl.
Thread safety and copy on write semantics are not the same thing. That being said...
If you're really after copy-on-write semantics the Adobe Source Libraries has a copy_on_write template that adds these semantics to whatever you instantiate it with.

Do I need to lock STL list with mutex in push_back pop_front scenario?

I have a thread push-backing to STL list and another thread pop-fronting from the list. Do I need to lock the list with mutex in such case?
From SGI's STL on Thread Safety:
If multiple threads access a single container, and at least one thread may potentially write, then the user is responsible for ensuring mutual exclusion between the threads during the container accesses.
Since both your threads modify the list, I guess you have to lock it.
Most STL implementations are thread safe in the sens that you can access several instances of a list type from several threads without locking. But you MUST lock when you are accessing the same instance of your list.
Have a look on this for more informations : thread safty in sgi stl
Probably. These operations are not simple enough to be atomic, so they'll only be thread-safe if the implementation explicitly performs the necessary locking.
However, the C++ standard does not specify whether these operations should be thread-safe, so it is up to the individual implementation to decide that. Check the docs. (Or let us know which implementation you're using)
There is no guarantee that a STL implementation is thread-safe, and since it costs performance I would guess that most aren't. You should definitely use a mutex.
Since the stl pop / push operations are AFAIK non-atomic you do have to use a mutex.