Are STL Map or HashMaps thread safe? - c++

Can I use a map or hashmap in a multithreaded program without needing a lock?
i.e. are they thread safe?
I'm wanting to potentially add and delete from the map at the same time.
There seems to be a lot of conflicting information out there.
By the way, I'm using the STL library that comes with GCC under Ubuntu 10.04
EDIT: Just like the rest of the internet, I seem to be getting conflicting answers?

You can safely perform simultaneous read operations, i.e. call const member functions. But you can't do any simultaneous operations if one of then involves writing, i.e. call of non-const member functions should be unique for the container and can't be mixed with any other calls.
i.e. you can't change the container from multiple threads. So you need to use lock/rw-lock
to make the access safe.

No.
Honest. No.
edit
Ok, I'll qualify it.
You can have any number of threads reading the same map. This makes sense because reading it doesn't have any side-effects, so it can't matter whether anyone else is also doing it.
However, if you want to write to it, then you need to get exclusive access, which means preventing any other threads from writing or reading until you're done.
Your original question was about adding and removing in parallel. Since these are both writes, the answer to whether they're thread-safe is a simple, unambiguous "no".

TBB is a free open-source library that provides thread-safe associative containers. (http://www.threadingbuildingblocks.org/)

The most commonly used model for STL containers' thread safety is the SGI one:
The SGI implementation of STL is thread-safe only in the sense that
simultaneous accesses to distinct
containers are safe, and simultaneous
read accesses to to shared containers
are safe.
but in the end it's up to the STL library authors - AFAIK the standard says nothing about STL's thread-safety.
But according to the docs GNU's stdc++ implementation follows it (as of gcc 3.0+), if a number of conditions are met.
HIH

The answer (like most threading problems) is it will work most of the time. Unfortunately if you catch the map while it's resizing then you're going to end up in trouble. So no.
To get the best performance you'll need a multi stage lock. Firstly a read lock which allows accessors which can't modify the map and which can be held by multiple threads (more than one thread reading items is ok). Secondly a write lock which is exclusive which allows modification of the map in ways that could be unsafe (add, delete etc..).
edit Reader-writer locks are good but whether they're better than standard mutex depends on the usage pattern. I can't recommend either without knowing more. Profile both and see which best fits your needs.

Related

In c++, is vector thread safe when writing from one thread and read from another? [duplicate]

I have multiple threads simultaneously calling push_back() on a shared object of std::vector. Is std::vector thread safe? Or do I need to implement the mechanism myself to make it thread safe?
I want to avoid doing extra "locking and freeing" work because I'm a library user rather than a library designer. I hope to look for existing thread-safe solutions for vector. How about boost::vector, which was newly introduced from boost 1.48.0 onward. Is it thread safe?
The C++ standard makes certain threading guarantees for all the classes in the standard C++ library. These guarantees may not be what you'd expect them to be but for all standard C++ library classes certain thread safety guarantees are made. Make sure you read the guarantees made, though, as the threading guarantees of standard C++ containers don't usually align with what you would want them to be. For some classes different, usually stronger, guarantees are made and the answer below specifically applies to the containers. The containers essentially have the following thread-safety guarantees:
there can be multiple concurrent readers of the same container
if there is one writer, there shall be no more writers and no readers
These are typically not what people would want as thread-safety guarantees but are very reasonable given the interface of the standard containers: they are intended to be used efficiently in the absence of multiple accessing threads. Adding any sort of locking for their methods would interfere with this. Beyond this, the interface of the containers isn't really useful for any form of internal locking: generally multiple methods are used and the accesses depend on the outcome of previous accesses. For example, after having checked that a container isn't empty() an element might be accessed. However, with internal locking there is no guarantee that the object is still in the container when it is actually accessed.
To meet the requirements which give the above guarantees you will probably have to use some form of external locking for concurrently accessed containers. I don't know about the boost containers but if they have an interface similar to that of the standard containers I would suspect that they have exactly the same guarantees.
The guarantees and requirements are given in 17.6.4.10 [res.on.objects] paragraph 1:
The behavior of a program is undefined if calls to standard library functions from different threads may introduce a data race. The conditions under which this may occur are specified in 17.6.5.9. [ Note: Modifying an object of a standard library type that is shared between threads risks undefined behavior unless objects of that type are explicitly specified as being sharable without data races or the user supplies a locking mechanism. —endnote]
... and 17.6.5.9 [res.on.data.races]. This section essentially details the more informal description in the not.
I have multiple threads simultaneously calling push_back() on a shared object of std::vector. Is std::vector thread safe?
This is unsafe.
Or do I need to implement the mechanism myself to make it thread safe?
Yes.
I want to avoid doing extra "locking and freeing" work because I'm a library user rather than a library designer. I hope to look for existing thread-safe solutions for vector.
Well, vector's interface isn't optimal for concurrent use. It is fine if the client has access to a lock, but for for the interface to abstract locking for each operation -- no. In fact, vector's interface cannot guarantee thread safety without an external lock (assuming you need operations which also mutate).
How about boost::vector, which was newly introduced from boost 1.48.0 onward. Is it thread safe?
Docs state:
//! boost::container::vector is similar to std::vector but it's compatible
//! with shared memory and memory mapped files.
I have multiple threads simultaneously calling push_back() on a shared object of std::vector. ... I hope to look for existing thread-safe solutions for vector.
Take a look at concurrent_vector in Intel's TBB. Strictly speaking, it's quite different from std::vector internally and is not fully compatible by API, but still might be suitable. You might find some details of its design and functionality in the blogs of TBB developers.

STL containers and threads (concurrent writes) in Linux

I am looking for the optimal strategy to use STL containers (like std::map and std::vector) and pthreads.
What is the canonical way to go? A simple example:
std::map<string, vector<string>> myMap;
How do we guarantee concurrency?
mutex_lock;
write at myMap;
mutex_unlock;
Additionally, I would like to know if pthreads and STL face performance issues when used together.
System: Liunx, g++, pthreads, no boost, no Intel TBB
The C++03 Standard does not talk about concurrency at all, So the concurrency aspect is left out as an implementation detail for compilers. So the documentation that comes with your compiler is where one should look to for answers related to concurrency.
Most of the STL implementations are not thread safe as such.
Since STL containers do not provide any explicit Thread safety, So yes you will have to use your own synchronization mechanism. And while you are at it You should use RAII rather than manage the synchronization resource(mutex unlock etc) manually.
You can refer the Documentations here:
MSDN:
If a single object is being written to by one thread, then all reads and writes to that object on the same or other threads must be protected. For example, given an object A, if thread 1 is writing to A, then thread 2 must be prevented from reading from or writing to A.
GCC Documentation says:
We currently use the SGI STL definition of thread safety, which states:
The SGI implementation of STL is thread-safe only in the sense that simultaneous accesses to distinct containers are safe, and simultaneous read accesses to to shared containers are safe. If multiple threads access a single container, and at least one thread may potentially write, then the user is responsible for ensuring mutual exclusion between the threads during the container accesses.
Point to Note: GCC's Standard Library is a derivative of SGI's STL code.
The canonical way to provide concurrency is to hold a lock while accessing the collection.
That works in 90% of the cases where access to the collection isn't performance-critical anyway. If you're accessing a shared collection so much that locking around it harms performance, you should rethink your design. (And odds are, your design is okay and it won't affect performance anywhere near as much as you might suspect.)
You should take a look at intel thread building blocks tbb ( http://threadingbuildingblocks.org/ ). They have a few very optimized data structures that handle concurrency internally using non-blocking strategies.

Standard way to make STL objects threadsafe?

I need several STL containers, threadsafe.
Basically I was thinking I just need 2 methods added to each of the STL container objects,
.lock()
.unlock()
I could also break it into
.lockForReading()
.unlockForReading()
.lockForWriting()
.unlockForWriting()
The way that would work is any number of locks for parallel reading are acceptable, but if there's a lock for writing then reading AND writing are blocked.
An attempt to lock for writing waits until the lockForReading semaphore drops to 0.
Is there a standard way to do this?
Is how I'm planning on doing this wrong or shortsighted?
This is really kind of bad. External code will not recognize or understand your threading semantics, and the ease of availability of aliases to objects in the containers makes them poor thread-safe interfaces.
Thread-safety occurs at design time. You can't solve thread safety by throwing locks at the problem. You solve thread safety by not having two threads writing to the same data at the same time- in the general case, of course. However, it is not the responsibility of a specific object to handle thread safety, except direct threading synchronization primitives.
You can have concurrent containers, designed to allow concurrent use. However, their interfaces are vastly different to what's offered by the Standard containers. Less aliases to objects in the container, for example, and each individual operation is encapsulated.
The standard way to do this is acquire the lock in a constructor, and release it in the destructor. This is more commonly know as Resource Acquisition Is Initialization, or RAII. I strongly suggest you use this methodology rather than
.lock()
.unlock()
Which is not exception safe. You can easily forget to unlock the mutex prior to throwing, resulting in a deadlock the next time a lock is attempted.
There are several synchronization types in the Boost.Thread library that will be useful to you, notably boost::mutex::scoped_lock. Rather than add lock() and unlock() methods to whatever container you wish to access from multiple threads, I suggest you use a boost:mutex or equivalent and instantiate a boost::mutex::scoped_lock whenever accessing the container.
Is there a standard way to do this?
No, and there's a reason for that.
Is how I'm planning on doing this
wrong or shortsighted?
It's not necessarily wrong to want to synchronize access to a single container object, but the interface of the container class is very often the wrong place to put the synchronization (like DeadMG says: object aliases, etc.).
Personally I think both TBB and stuff like concurrent_vector may either be overkill or still the wrong tools for a "simple" synchronization problem.
I find that ofttimes just adding a (private) Lock object (to the class holding the container) and wrapping up the 2 or 3 access patterns to the one container object will suffice and will be much easier to grasp and maintain for others down the road.
Sam: You don't want a .lock() method because something could go awry that prevents calling the .unlock() method at the end of the block, but if .unlock() is called as a consequence of object destruction of a stack allocated variable then any kind of early return from the function that calls .lock() will be guaranteed to free the lock.
DeadMG:
Intel's Threading Building Blocks (open source) may be what you're looking for.
There's also Microsoft's concurrent_vector and concurrent_queue, which already comes with Visual Studio 2010.

Is C++ STL thread-safe for distinct containers (using STLport implementation)?

I'm using Android 2.2, which comes with a version of STLport. For some reason, it was configured to be non-thread safe. This was done using a #define _NOTHREADS in a configuration header file.
When I constructed and initialized distinct non-shared containers (e.g. strings) from different pthreads, I was getting memory corruption.
With _NOTHREADS, it looks like some low-level code in STL inside allocator.cpp doesn't do proper locking. It seems analogous to C not providing thread safety for malloc.
Does anyone know why STL might be built with _NOTHREADS by default on Android? By turning this off, I'm wondering if there may be a side effect. One thing I can think of is slightly degraded performance, but I don't see much of a choice given I'm using lots of threading.
The SGI STL
The SGI STL is the grandmother of all of the other STL implementations.
See the SGI STL docs.
The SGI implementation of STL is
thread-safe only in the sense that
simultaneous accesses to distinct
containers are safe, and simultaneous
read accesses to to shared containers
are safe. If multiple threads access a
single container, and at least one
thread may potentially write, then the
user is responsible for ensuring
mutual exclusion between the threads
during the container accesses.
G++
libstdc++ docs
We currently use the SGI STL definition of thread safety.
STLPort
STLPort docs
Please refer to SGI site for detailed
document on thread safety. Basic
points are:
simultaneous read access to the same container from within separate
threads is safe;
simultaneous access to distinct containers (not shared between
threads) is safe;
user must provide synchronization for all accesses if
any thread may modify shared
container.
General Information about the C++ Standard
The current C++ standard doesn't address concurrency issues at all, so at least for now there's no requirement that applies to all implementations.
A meaningful answer can only really apply to a specific implementation (STLPort, in this case). STLPort is basically a version of the original SGI STL implementation with improvements to its portability, so you'd probably want to start with the documentation about thread safety in the original SGI version.
Of course, the reason it is without re-entrancy is for performance: speed, less memory use, less resource uses. Presumably this is because the assumption is most client programs won't be multi-threaded.
Sun WorkShop 5.0
This is a bit old but quite informative. The bottom line is that STL only provides locks on allocators.
Strings however are referenced counted objects but any changes to their reference count is done atomically. This is only true however when passing strings around by value. Two threads holding the same reference to a single string object will need to do their own locking.
When you use e.g. std::string or similar objects, and change them from different threads, you are sharing same object between threads. To make any of the member functions you can call on string reentrant, it would mean that no other thread can affect our mem-function, and our function cannot affect any other calls in other threads. The truth is exactly the opposite, since you are sharing the same object via this pointer, which is implicitly given when calling member objects. To illustrate, equivalent to this call
std::string a;
a.insert( ... );
without using OOP syntax would be:
std::string a;
insert( &a, ... );
So, you are implicitly violating requirement that no resource is shared between function calls. You can see more here.
Hope this helps.

question about STL thread-safe and STL debugging

I have two questions about STL
1) why STL is not thread-safe? Is there any structure that is thread-safe?
2) How to debug STL using GDB? In GDB, how can I print a vector?
Container data structures almost always require synchronization (e.g. a mutex) to prevent race conditions. Since threading is not support by the C++ standard (pre C++0x), these could not be added to the STL. Also, synchronization is very expensive for cases where it is not needed. STL containers may be used in multi-threaded applications as long as you perform this synchronization manually. Alternatively, you may create your own thread-safe containers that are compatible with STL algorithms like this thread-safe circular queue.
A vector contains a contiguous block of memory. So, it can be displayed in the same way as a regular array once you find the pointer to this memory block. The exact details depend on the STL implementation you use.
The standard c++ containers are not thread safe because you most likely actually want higher level locking than just the containers themselves. In other words you are likely to want two or more operations to be safe together.
For example, if you have multiple threads running:
v.push_back(0);
v.push_back(1);
You wont get a nice vector of alternating 0's and 1's, they could be jumbled. You would need to lock around both commands to get what you want.
STL is not thread-safe because a lot of people don't need thread safety, and because that introduces threading context into classes that otherwise have no need to know anything about the concept of threads.
You can encapsulate access to containers and provide your own thread safety (or other restrictions imposed by your specific design and implementation.)
Because there are still single-threaded programs.
Take a look here.