I'm currently trying to wrap my head around the problem of thread-safety using C++ STL containers. I recently tried to implement a thread safe std::vector by using a std::mutex as a member variable, just to then realize that although I could make member functions thread-safe by locking the lock, I couldn't make lib functions like std::sort thread-safe, since they only get the begin()/end() iterators, which is a result of the fundamental split between containers and algorithms in the STL in general.
So then I thought, if I can't use locks, how about software transactional memory (STM)?
So now I'm stuck with this:
#include <atomic>
#include <cstdlib>
#include <iostream>
#include <thread>
#include <vector>
#define LIMIT 10
std::atomic<bool> start{false};
std::vector<char> vec;
void thread(char c)
{
while (!start)
std::this_thread::yield();
for (int i = 0; i < LIMIT; ++i) {
__transaction_atomic {
vec.push_back(c);
}
}
}
int main()
{
std::thread t1(thread, '*');
std::thread t2(thread, '#');
start.store(true);
t1.join();
t2.join();
for (auto i = vec.begin(); i != vec.end(); ++i)
std::cout << *i;
std::cout << std::endl;
return EXIT_SUCCESS;
}
Which I compile with:
g++ -std=c++11 -fgnu-tm -Wall
using g++ 4.8.2 and which gives me the following error:
error: unsafe function call to push_back within atomic transaction
Which I kinda get, since push_back or sort or whatever isn't declared transaction_safe but which leaves me with the following questions:
a) How can I fix that error?
b) If I can't fix that error, then what are these transactional blocks usually used for?
c) How would one implement a lock-free thread-safe vector?!
Thanks in advance!
Edit:
Thanks for the answers so far but they don't really scratch my itch. Let me give you an example:
Imagine I have a global vector and access to this vector shall be shared amongst multiple threads. All threads try to do sorted inserts, so they generate a random number and try to insert this number into the vector in a sorted manner, so the vector stays sorted all the time (including duplicates ofc). To do the sorted insert they use std::lower_bound to find the "index" where to insert and then do the insert using vector.insert().
If I write a wrapper for the std::vector that contains a std::mutex as a member, then I can write wrapper functions, e.g. insert, which locks the mutex using std::lock_guard and then does the actual std::vector.insert() call. But std::lower_bound doesn't give a damn about the member mutex. Which is a feature, not a bug afaik.
This leaves my threads in quite a pickle because other threads can change the vector while someone's doing his lower_bound thing.
The only fix I can think of: forget the wrapper and have a global mutex for the vector instead. Whenever anybody wants to do anything on/with/to this vector, he needs that lock.
THAT'S the problem. What alternatives are there to using this global mutex?
And THAT'S where software transactional memory came to mind.
So now: how to use STMs on STL containers? (and a), b), c) from above).
I believe that the only way you can make an STL container 100% thread safe is to wrap it in your own object (keeping the actual container private) and use appropriate locking (mutexes, whatever) in your object in order to prevent multi-thread access to the STL container.
This is the moral equivalent of just locking a mutex in the caller around every container operation.
In order to make the container truly thread safe, you'd have to muck about with the container code, which there's no provision for.
Edit: One more note - be careful about the interface you give to your wrapper object. You can't very well go handing out references to stored objects, as that would allow the caller to get around the locking of the wrapper. So you can't just duplicate vector's interface with mutexes and expect things to work.
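To make the wrapper idea concrete for the sorted-insert case from the question, here is a minimal sketch. The class name SortedVector and the helper insert_from_threads are made up for illustration. The point is that std::lower_bound and insert run under the same lock, and that the wrapper hands out a copy rather than references or iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: the vector stays private and every public
// operation is one complete, locked unit of work.
class SortedVector {
public:
    // The search and the insert happen under the same lock, so no other
    // thread can change the vector between the two steps.
    void sorted_insert(int value) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto pos = std::lower_bound(data_.begin(), data_.end(), value);
        data_.insert(pos, value);
    }

    // Returns a copy instead of a reference, so callers cannot get
    // around the lock.
    std::vector<int> snapshot() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return data_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> data_;
};

// Example driver (illustrative): several threads insert concurrently.
std::vector<int> insert_from_threads(int thread_count, int per_thread) {
    assert(thread_count > 0);
    SortedVector sorted;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&sorted, t, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                sorted.sorted_insert((i * 7 + t * 3) % 50);
        });
    }
    for (auto& w : workers) w.join();
    return sorted.snapshot();
}
```

Calling insert_from_threads(4, 100) should give back 400 values in sorted order. This is still one big lock per operation, so it does not scale better than the global mutex; it only makes the locking hard to forget.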
I'm not sure I understand why you cannot use mutexes. If you lock the mutex each time you are accessing the vector then no matter what operation you are doing you are certain that only a single thread at a time is using it. There is certainly space for improvement depending on your needs for the safe vector, but mutexes should be perfectly viable.
lock mutex -> call std::sort or whatever you need -> unlock mutex
If, on the other hand, what you want is to use std::sort on your class, then again it is a matter of providing thread-safe access and reading methods through the iterators of your container, as those are the ones that std::sort needs to use anyway in order to sort a vector, since it is not a friend of containers or anything of the sort.
You can use simple mutexes to make your class thread safe. As stated in another answer, you need to use a mutex to lock the vector before use and then unlock after use.
CAUTION! All of the STL functions can throw exceptions. If you use simple mutexes, you will have a problem if any function throws because the mutex will not be released. To avoid this problem, wrap the mutex in a class that releases it in the destructor. This is a good programming practice to learn about: http://c2.com/cgi/wiki?ResourceAcquisitionIsInitialization
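As a small illustration of that caution, here is a sketch using std::lock_guard as the class that releases the mutex in its destructor. The names shared_vec, read_at and count_failed_reads are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stdexcept>
#include <vector>

std::mutex vec_mutex;
std::vector<int> shared_vec;  // left empty on purpose

// at() throws std::out_of_range for a bad index. The lock_guard
// destructor runs during stack unwinding and releases the mutex.
int read_at(std::size_t index) {
    std::lock_guard<std::mutex> guard(vec_mutex);
    return shared_vec.at(index);
}

// Calls read_at() repeatedly with a bad index and counts the exceptions.
// If an earlier call had left the mutex locked, the next call would
// never return.
int count_failed_reads(int attempts) {
    assert(attempts >= 0);
    int failures = 0;
    for (int i = 0; i < attempts; ++i) {
        try {
            read_at(42);
        } catch (const std::out_of_range&) {
            ++failures;
        }
    }
    return failures;
}
```

With explicit lock()/unlock() calls around at(), the first exception would skip the unlock() and the second call would block on a mutex that is never released.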
I need to fill one std::vector from different threads.
Is this code correct? Or should I add a mutex to my code?
#include <functional>
#include <list>
#include <thread>
#include <vector>
void func(int i, std::vector<float>& vec)
{
vec[i] = i;
}
int main()
{
std::vector<float> vec(6);
std::list<std::thread> threads;
for (int i = 0; i < 6; i++)
{
threads.push_back(std::thread(func, i, std::ref(vec)));
}
for (auto iter = threads.begin(); iter != threads.end(); iter++)
{
(*iter).join();
}
}
I tested my code and it works fine. Are there any pitfalls? Is this code thread safe?
What about reading std::vector data from different threads?
Related question:
Is std::vector thread-safe and concurrent by default? Why or why not?.
It's thread-safe because you're not modifying the vector size, and not attempting to write to the same memory location in different threads.
To future proof this answer for anyone who doesn't drill down into the link:
1. It's not thread safe merely because they're using the [] operator. It's thread safe because each thread is explicitly modifying a different location in memory.
2. If all threads were only reading the same location using [], that would be thread safe.
3. If all threads were writing to the same location, using [] won't stop them from messing with each other.
4. I think if this were production code, at LEAST a comment describing why this is thread safe would be called for. Not sure of any compile time way to prevent someone from shooting themselves in the foot if they modify this function.
On point #4, we want to communicate to future users of this code that:
No we're not guarding this standard library container even though that should be your gut reaction, and
Yes we've analyzed it and it's safe.
The easy way is to stick a comment in there, but there's a saying:
The compiler doesn't read comments and neither do I.
-Bjarne Stroustrup
I think some kind of [[attributes]] should be the way to do this? Although the built-ins don't seem to support any kind of thread safety checking.
Clang appears to provide Thread Safety Analysis:
The analysis is still under active development, but it is mature enough to be deployed in an industrial setting.
Assuming you implement other functions that would require a std::mutex to be responsible for your std::vector:
std::mutex _mu;
std::vector<int> _vec GUARDED_BY(_mu);
then you can explicitly add the NO_THREAD_SAFETY_ANALYSIS attribute to turn off the safety checking for this one specific function. I think it's best to combine this with a comment:
// I know this doesn't look safe but it is as long as
// the caller always launches it with different values of `i`
void foo(int i, std::vector<int>& vec) NO_THREAD_SAFETY_ANALYSIS;
The use of GUARDED_BY tells me, in the future, that you are thinking about thread safety. The use of NO_THREAD_SAFETY_ANALYSIS shows me that you have determined this function is okay to use - especially when other functions that modify your vector are not marked NO_THREAD_SAFETY_ANALYSIS.
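Here is a sketch of how those two annotations might sit together in one class. The macro definitions follow the pattern shown in the Clang documentation and expand to nothing on other compilers. The class SlotTable and its members are hypothetical. Whether the analysis recognizes std::mutex and std::lock_guard depends on your standard library being annotated, so treat this as an outline, not a verified -Wthread-safety setup.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Annotation macros in the style of the Clang docs; no-ops elsewhere.
#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))
#else
#define THREAD_ANNOTATION(x)
#endif
#define GUARDED_BY(x) THREAD_ANNOTATION(guarded_by(x))
#define NO_THREAD_SAFETY_ANALYSIS THREAD_ANNOTATION(no_thread_safety_analysis)

// Hypothetical class: a fixed set of slots filled by different threads.
class SlotTable {
public:
    explicit SlotTable(std::size_t n) : slots_(n) {}

    // Checked path: changes the vector itself, so it takes the lock.
    void resize(std::size_t n) {
        std::lock_guard<std::mutex> guard(mu_);
        slots_.resize(n);
    }

    // Checked path: reads under the lock.
    int get_slot(std::size_t i) {
        std::lock_guard<std::mutex> guard(mu_);
        return slots_[i];
    }

    // Unchecked on purpose: safe only while every caller uses a different
    // index `i` and nobody calls resize() concurrently.
    void set_slot(std::size_t i, int value) NO_THREAD_SAFETY_ANALYSIS {
        assert(i < slots_.size());
        slots_[i] = value;
    }

private:
    std::mutex mu_;
    std::vector<int> slots_ GUARDED_BY(mu_);
};
```

The one function that opts out stands out next to the ones that lock, which is roughly the signal the comment alone was supposed to give.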
Yes. It's thread safe because you are neither writing to the vector object itself from different threads nor writing to the same object in the vector's underlying array.
When you construct the vector you allocate space for 6 elements, and they are zero-initialized for POD types like int. These elements are placed in an array owned and managed by the vector, and the vector exposes them via iterators and operator[].
So when you edit an element in the vector you don't edit the vector itself, so you don't have to protect the vector with a mutex.
You will need a mutex if you are modifying the vector itself or the same element from different threads.
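The two cases can be put side by side in a short sketch. The function names are made up. The first function writes to distinct, pre-allocated elements without a lock; the second changes the vector itself with push_back, so each call is guarded.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Case 1: each thread writes only its own pre-allocated element.
// The vector object itself (size, capacity, buffer) is never modified.
std::vector<int> fill_distinct_slots(std::size_t n) {
    assert(n > 0);
    std::vector<int> slots(n);  // size is fixed before the threads start
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < n; ++i)
        workers.emplace_back([&slots, i] { slots[i] = static_cast<int>(i * i); });
    for (auto& w : workers) w.join();
    return slots;
}

// Case 2: push_back modifies the vector itself (and may reallocate),
// so every call has to be serialized.
std::vector<int> append_with_mutex(std::size_t n) {
    std::vector<int> values;
    std::mutex values_mutex;
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < n; ++i) {
        workers.emplace_back([&values, &values_mutex, i] {
            std::lock_guard<std::mutex> guard(values_mutex);
            values.push_back(static_cast<int>(i));
        });
    }
    for (auto& w : workers) w.join();
    return values;
}
```

In the second case the order of the elements depends on thread scheduling; only the count is predictable.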
I have a C++11 program that does some computations and uses a std::unordered_map to cache results of those computations. The program uses multiple threads and they use a shared unordered_map to store and share the results of the computations.
Based on my reading of unordered_map and STL container specs, as well as unordered_map thread safety, it seems that an unordered_map, shared by multiple threads, can handle one thread writing at a time, but many readers at a time.
Therefore, I'm using a std::mutex to wrap my insert() calls to the map, so that at most only one thread is inserting at a time.
However, my find() calls do not have a mutex as, from my reading, it seems that many threads should be able to read at once. However, I'm occasionally getting data races (as detected by TSAN), manifesting themselves in a SEGV. The data race clearly points to the insert() and find() calls that I mentioned above.
When I wrap the find() calls in a mutex, the problem goes away. However, I don't want to serialize the concurrent reads, as I'm trying to make this program as fast as possible. (FYI: I'm running using gcc 5.4.)
Why is this happening? Is my understanding of the concurrency guarantees of std::unordered_map incorrect?
You still need a mutex for your readers to keep the writers out, but you need a shared one. C++14 has a std::shared_timed_mutex that you can use along with scoped locks std::unique_lock and std::shared_lock like this:
using mutex_type = std::shared_timed_mutex;
using read_only_lock = std::shared_lock<mutex_type>;
using updatable_lock = std::unique_lock<mutex_type>;
mutex_type mtx;
std::unordered_map<int, std::string> m;
// code to update map
{
updatable_lock lock(mtx);
m[1] = "one";
}
// code to read from map
{
read_only_lock lock(mtx);
std::cout << m.at(1) << '\n'; // at() never inserts; operator[] could modify the map under a shared lock
}
There are several problems with that approach.
First, std::unordered_map has two overloads of find: one which is const, and one which is not. I'd dare to say that I don't believe the non-const version of find will mutate the map, but still, for the compiler, invoking a non-const method from multiple threads is a data race, and some compilers actually use undefined behavior for nasty optimizations.
So, first thing: you need to make sure that when multiple threads invoke std::unordered_map::find they do it with the const version. That can be achieved by referencing the map through a const reference and then invoking find from there.
Second, you miss the part that many threads may invoke const find on your map, but other threads cannot invoke a non-const method on the object at the same time! I can definitely imagine many threads calling find and some calling insert at the same time, causing a data race. Imagine, for example, that insert makes the map's internal buffer reallocate while some other thread iterates it to find the wanted pair.
A solution to that is to use a shared mutex (std::shared_timed_mutex in C++14, std::shared_mutex in C++17), which has exclusive and shared locking modes. When a thread calls find, it locks the mutex in shared mode; when a thread calls insert, it locks it in exclusive mode.
If your compiler does not support a shared mutex, you can use platform-specific synchronization objects, like pthread_rwlock_t on Linux and SRWLock on Windows.
Another possibility is to use a concurrent hash map, like concurrent_hash_map from Intel's Threading Building Blocks library, or concurrent_unordered_map in the MSVC Concurrency Runtime. These implementations are designed for concurrent access, so they stay thread-safe and fast at the same time.
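Putting the first two points together, a cache along these lines might look like the sketch below. It assumes C++14 for std::shared_timed_mutex, and the class name ResultCache is made up. The read path is a const member function, so map_.find resolves to the const overload, and the value is copied out while the shared lock is still held.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical cache: many concurrent readers, one writer at a time.
class ResultCache {
public:
    // Writers take the mutex exclusively.
    void insert(int key, const std::string& value) {
        std::unique_lock<std::shared_timed_mutex> lock(mutex_);
        map_.emplace(key, value);
    }

    // Readers share the mutex. Returns false on a miss; on a hit the
    // value is copied into `out` before the lock is released.
    bool find(int key, std::string& out) const {
        std::shared_lock<std::shared_timed_mutex> lock(mutex_);
        auto it = map_.find(key);  // const overload, never inserts
        if (it == map_.end())
            return false;
        out = it->second;
        return true;
    }

private:
    mutable std::shared_timed_mutex mutex_;
    std::unordered_map<int, std::string> map_;
};
```

Returning a copy instead of an iterator or reference matters here: an iterator handed out of find() would outlive the lock and could be invalidated by the next insert.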
Does anyone know where I can find an implementation that wraps a std::map and makes it thread safe? When I say thread safe I mean that it offers only serial access to the map, one thread at a time. Optimally, this map should use only the standard-library and / or boost constructs.
This does not meet the criteria that you have specified, but you could have a look at the TBB containers. There is the so-called concurrent_hash_map, which allows multiple threads to access the data in the map concurrently. There are some details, but everything is nicely documented and can give you an idea of the "concurrent container". Depending on your needs this might be totally inappropriate...
The boost shared_mutex would provide the best multiple reader/single writer approach to wrapping a standard map given your constraints. I don't know of any "pre-built" implementations that marry these two since the task is generally trivial.
It is generally not a good idea for collection classes to provide thread-safety, because they cannot know how they are being used. You will be much better served by implementing your own locking mechanisms in the higher level constructs that use the collections.
You might look at Thread Safe Template Library
Try this library
http://www.codeproject.com/KB/threads/lwsync.aspx
It is implemented in a modern c++ policy based approach.
Here is some cut from the link to show the idea with the 'vector' case
typedef lwsync::critical_resource<std::vector<int> > sync_vector_t;
sync_vector_t vec;
// some thread:
{
// Critical resource can be naturally used with STL containers.
sync_vector_t::const_accessor vec_access = vec.const_access();
for(std::vector<int>::const_iterator where = vec_access->begin();
where != vec_access->end();
++where
)
std::cout << *where << std::endl;
}
sync_vector_t::accessor some_vector_action()
{
sync_vector_t::accessor vec_access = vec.access();
vec_access->push_back(10);
return vec_access;
// Access is escalated from within a some_vector_action() scope
// So that one can make some other action with vector before it becomes
// unlocked.
}
{
sync_vector_t::accessor vec_access = some_vector_action();
vec_access->push_back(20);
// Elements 10 and 20 will be placed in vector sequentially.
// Any other action with vector cannot be processed between those two
// push_back's.
}
I came up with this (which I'm sure can be improved to take more than two arguments):
template<class T1, class T2>
class combine : public T1, public T2
{
public:
/// We always need a virtual destructor.
virtual ~combine() { }
};
This allows you to do:
// Combine an std::mutex and std::map<std::string, std::string> into
// a single instance.
combine<std::mutex, std::map<std::string, std::string>> lockableMap;
// Lock the map within scope to modify the map in a thread-safe way.
{
// Lock the map.
std::lock_guard<std::mutex> locked(lockableMap);
// Modify the map.
lockableMap["Person 1"] = "Jack";
lockableMap["Person 2"] = "Jill";
}
If you wish to use an std::recursive_mutex and an std::set, that would also work.
There is a proposal here (by me - shameless plug) that wraps objects (including STL containers) for efficient (zero-cost) thread safe access:
https://github.com/isocpp/CppCoreGuidelines/issues/924
The basic idea is very simple. There are just a few wrapper classes used to enforce read/write locking and, at the same time, presenting either a const (for read-only) or non-const (for read-write) view of the wrapped object.
The idea is to make it compile-time impossible to improperly access a resource shared between threads.
Implementation code can be found here:
https://github.com/galik/GSL/blob/lockable-objects/include/gsl/gsl_lockable
This is up to the application to implement. A "thread-safe" map would make individual calls into the map thread-safe, but many operations need to be made thread-safe across calls. The application that uses the map should associate a mutex with the map, and use that mutex to coordinate accesses to it.
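A sketch of what "thread-safe across calls" means in practice. The names stock, move_units and total_after_transfers are invented for the example. Moving units between two keys takes a find and two updates, and the application-level mutex covers all of them as one step, which a map that only locks individual calls could not do.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// The application owns both the map and the mutex associated with it.
std::map<std::string, int> stock;
std::mutex stock_mutex;

// One logical operation, several map calls. All of them run under one
// lock, so no thread can see the map between the two updates.
bool move_units(const std::string& from, const std::string& to, int n) {
    std::lock_guard<std::mutex> guard(stock_mutex);
    auto it = stock.find(from);
    if (it == stock.end() || it->second < n)
        return false;
    it->second -= n;
    stock[to] += n;
    return true;
}

// Example driver (illustrative): threads shuffle units back and forth
// and the function returns the combined total afterwards.
int total_after_transfers(int thread_count, int moves_per_thread) {
    assert(thread_count > 0);
    {
        std::lock_guard<std::mutex> guard(stock_mutex);
        stock.clear();
        stock["a"] = 100;
        stock["b"] = 100;
    }
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([t, moves_per_thread] {
            for (int i = 0; i < moves_per_thread; ++i) {
                if ((i + t) % 2 == 0)
                    move_units("a", "b", 3);
                else
                    move_units("b", "a", 3);
            }
        });
    }
    for (auto& w : workers) w.join();
    std::lock_guard<std::mutex> guard(stock_mutex);
    return stock["a"] + stock["b"];
}
```

However the threads interleave, the combined total stays at 200, because every transfer is all-or-nothing under the lock.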
Trying to make thread-safe containers was a mistake in Java, and it would be a mistake in C++.
I'm aware that I need to use a mutex when I perform operations on a single STL container from multiple threads. However, I want to know if there are any exceptions to this rule. Please consider the simplified scenario I'm trying to implement.
I have multiple threads adding elements to a container, and the operation is surrounded with mutex lock/unlock. The threads then somehow notify (e.g. using eventfd on Linux) a single thread dedicated to dispatching elements from this container. What I want to do is access the first element in the container without using the mutex. Sample code based on deque, but note that I can use any container with queue capability:
std::mutex locker;
std::deque<int> int_queue;
int fd; // eventfd
eventfd_t buffer;
bool some_condition;
Thread 1, 2, 3, etc.
locker.lock ();
int_queue.push_back (1);
locker.unlock ();
eventfd_write (fd, 1);
Thread dedicated to dispatch elements:
while (true)
{
bool some_condition (true);
locker.lock ();
if (int_queue.empty () == false)
{
locker.unlock ();
}
else
{
locker.unlock ();
eventfd_read (fd, &buffer);
}
while (some_condition)
{
int& data (int_queue.front ());
some_condition = some_operation (data); // [1]
}
locker.lock ();
int_queue.pop_front ();
locker.unlock ();
}
[1] I will do some_operation() on a single element many times, that's why I want to avoid mutex locking here. It's too expensive.
I want to know if this code can lead to any synchronisation problems or something.
What you need is reference stability. I.e. you can use containers this way if the reference to the first element is not invalidated when the container is push_back'd. And even then, you'll want to obtain the reference to the front element under the lock.
I'm more familiar with std::condition_variable for the event notification, so I'll use that:
#include <mutex>
#include <condition_variable>
#include <deque>
std::mutex locker;
std::deque<int> int_queue;
std::condition_variable cv;
void thread_1_2_3()
{
// use lock_guard instead of explicit lock/unlock
// for exception safety
std::lock_guard<std::mutex> lk(locker);
int_queue.push_back(1);
cv.notify_one();
}
void dispatch()
{
while (true)
{
bool some_condition = true;
std::unique_lock<std::mutex> lk(locker);
while (int_queue.empty())
cv.wait(lk);
// get reference to front under lock
int& data = int_queue.front();
lk.unlock();
// now use the reference without worry
while (some_condition)
some_condition = some_operation(data);
lk.lock();
int_queue.pop_front();
}
}
23.3.3.4 [deque.modifiers] says this about push_back:
An insertion at either end of the deque invalidates all the iterators
to the deque, but has no effect on the validity of references to
elements of the deque.
That is the key to allowing you to hang onto that reference outside of the lock. If thread_1_2_3 starts inserting or erasing in the middle, then you can no longer hang on to this reference.
You can't use a vector this way. But you could use a list this way. Check each container you want to use this way for reference stability.
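If you want to see the reference-stability guarantee in action, here is a small single-threaded check. The helper name front_stays_put is made up. It records the address of the front element, pushes many more elements at the back, and compares addresses afterwards. The standard guarantees this for std::deque and std::list; std::vector is left out on purpose, because after a reallocation the old address is dangling and should not even be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <list>

// Returns true if the front element is still at the same address after
// `extra_pushes` further push_back calls.
template <class Container>
bool front_stays_put(std::size_t extra_pushes) {
    assert(extra_pushes > 0);
    Container c;
    c.push_back(0);
    const int* before = &c.front();
    for (std::size_t i = 0; i < extra_pushes; ++i)
        c.push_back(static_cast<int>(i));
    return before == &c.front();
}
```

front_stays_put<std::deque<int>>(100000) and front_stays_put<std::list<int>>(1000) should both return true. This only demonstrates the guarantee itself; the locking in dispatch() above is still needed to obtain the reference safely.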
I can't really see through your question or your code, but in general, the containers in the standard C++ library offer you a loose guarantee that concurrent access at different elements is thread-safe. Be sure to understand the implications and limitations of that, though: If you have a random-access container, or iterators to elements, and you only use those to read or change an element value, then as long as you're doing that at different elements, the result should be well-defined. What isn't OK is changing the container itself, so any erase or insert operations have to be serialized (e.g. by locking access to the entire container), and be sure to understand your container's iterator and reference invalidation rules when you do that.
For individual containers you might be able to say a bit more - for example, insert/erase in a tree-based container, and insert/erase in the middle of a random-access container almost certainly requires a global lock. In a vector/deque you'll need to reacquire iterators. In a list, you might get away with performing insertions concurrently at distinct locations.
Any global operations like size() and empty() need to be serialized as well.
For this particular example this is not safe
int& data (int_queue.front ());
You take a reference to the first element; it could be moved by another thread adding an element to the queue and forcing it to reallocate (deques are typically implemented as "wrap-around" arrays). If you copy the value as opposed to taking a reference, depending on the implementation you might get away with it. If you want to be able to do this, a std::deque doesn't come with any standard "exceptions" to this rule. It's certainly possible to write a data structure similar to a deque where this would be safe, but a deque is not guaranteed to be written like that (and is unlikely to be).
Why do you want to do this? Why does the consumer thread not extract the object within the lock, and then process it out of band?
Assuming that what you want to avoid is having to copy the object outside of the container, a simpler easier to maintain approach could be dynamically allocating the objects, using a container of (smart) pointers and extracting it within the lock (minimal cost). Then you no longer need to consider thread safety issues.
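A minimal sketch of that approach, assuming a hypothetical work item type Job and a hypothetical JobQueue class. The queue holds std::unique_ptr<Job>, so extracting an element under the lock is just a pointer move, and all processing happens outside the lock on an object no other thread can reach.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

struct Job { int payload; };  // hypothetical work item

class JobQueue {
public:
    void push(std::unique_ptr<Job> job) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            jobs_.push_back(std::move(job));
        }
        ready_.notify_one();
    }

    // Blocks until a job is available, then moves it out under the lock.
    std::unique_ptr<Job> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !jobs_.empty(); });
        std::unique_ptr<Job> job = std::move(jobs_.front());
        jobs_.pop_front();
        return job;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::unique_ptr<Job>> jobs_;
};

// Example driver (illustrative): one producer thread, the caller consumes.
int produce_and_consume(int job_count) {
    assert(job_count >= 0);
    JobQueue queue;
    std::thread producer([&queue, job_count] {
        for (int i = 0; i < job_count; ++i)
            queue.push(std::unique_ptr<Job>(new Job{i}));
    });
    int sum = 0;
    for (int i = 0; i < job_count; ++i) {
        std::unique_ptr<Job> job = queue.pop();  // lock held only inside pop()
        sum += job->payload;                     // processing needs no lock
    }
    producer.join();
    return sum;
}
```

Because the consumer owns the job outright once pop() returns, the same queue also works with more than one consumer thread.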
Note that even if you might be able to pull this off in this particular scenario, you cannot use more than one consumer thread. I would recommend against the approach and just find a different approach where you can meet your requirements without walking over the bleeding edge. Multithreading is hard to do right, and very hard to debug or even detect that there is an issue. By adhering to common patterns you make your code easier to reason about and to maintain.
If you do want a lock-free queue, I also recommend you look at http://drdobbs.com/cpp/210604448?pgno=2
I'm looking for something similar to the CopyOnWriteSet in Java, a set that supports add, remove and some type of iterators from multiple threads.
There isn't one that I know of; the closest is in Threading Building Blocks, which has concurrent_unordered_map.
The STL containers allow concurrent read access from multiple threads as long as you aren't doing concurrent modification. Often it isn't necessary to iterate while adding / removing.
The guidance about providing a simple wrapper class is sane, I would start with something like the code snippet below protecting the methods that you really need concurrent access to and then providing 'unsafe' access to the base std::set so folks can opt into the other methods that aren't safe. If necessary you can protect access as well to acquiring iterators and putting them back, but this is tricky (still less so than writing your own lock free set or your own fully synchronized set).
I work on the Parallel Pattern Library, so I'm using critical_section from the VS2010 beta. boost::mutex works great too, and the RAII pattern of using a lock_guard is almost necessary regardless of how you choose to do this:
template <class T>
class synchronized_set
{
//boost::mutex is good here too
critical_section cs;
public:
typedef set<T> std_set_type;
set<T> unsafe_set;
bool try_insert(...)
{
//boost has a lock_guard
lock_guard<critical_section> guard(cs);
}
};
Why not just use a shared mutex to protect concurrent access? Be sure to use RAII to lock and unlock the mutex:
{
Mutex::Lock lock(mutex);
// std::set manipulation goes here
}
where Mutex::Lock is a class that locks the mutex in the constructor and unlocks it in the destructor, and mutex is a mutex object that is shared by all threads. Mutex is just a wrapper class that hides whatever specific OS primitive you are using.
I've always thought that concurrency and set behavior are orthogonal concepts, so it's better to have them in separate classes. In my experiences, classes that try to be thread safe themselves aren't very flexible or all that useful.
You don't want internal locking, as your invariants will often require multiple operations on the data structure, and internal locking only prevents the steps happening at the same time, whereas you need to keep the steps from different macro-operations from interleaving.
You can also take a look at ACE library which has all thread safe containers you might ever need.
All I can think of is to use OpenMP for parallelization, derive a set class from std's and put a shell around each critical set operation that declares that operation critical using #pragma omp critical.
Qt's QSet class uses implicit sharing (copy-on-write semantics) and has methods similar to std::set. You can look at its implementation; Qt is LGPL.
Thread safety and copy on write semantics are not the same thing. That being said...
If you're really after copy-on-write semantics the Adobe Source Libraries has a copy_on_write template that adds these semantics to whatever you instantiate it with.
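For what it's worth, a snapshot-style set in the spirit of what the question asks for can be sketched with just the standard library. This is not Adobe's copy_on_write and not a port of the Java class; the name CowSet is made up. Writers copy the whole set under a mutex and swap in the new version, and readers take a shared_ptr to an immutable snapshot that they can iterate without holding any lock.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <utility>

// Hypothetical copy-on-write set. Every write costs a full copy, so this
// only pays off when reads and iteration vastly outnumber writes.
class CowSet {
public:
    CowSet() : current_(std::make_shared<const std::set<int>>()) {}

    // Readers get an immutable snapshot; later writes do not affect it.
    std::shared_ptr<const std::set<int>> snapshot() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return current_;
    }

    void add(int value) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto copy = std::make_shared<std::set<int>>(*current_);
        copy->insert(value);
        current_ = std::move(copy);
    }

    void remove(int value) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto copy = std::make_shared<std::set<int>>(*current_);
        copy->erase(value);
        current_ = std::move(copy);
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const std::set<int>> current_;
};
```

A thread that calls snapshot() and then iterates keeps seeing the set as it was at that moment, even if other threads add or remove elements in the meantime.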