I have multiple preforked server processes which accept requests to modify a shared STL C++ list on a server. Each process simply pushes a new element at the end of the list and returns the iterator.
I'm not sure how each process should attempt to acquire a lock on the list. Should the lock cover the entire object, or are STL lists capable of handling concurrency on their own, given that we're just pushing an element onto the end of the list?
Assuming you meant threads rather than processes, you can share the STL containers, but you need to be careful with respect to synchronization. The STL containers are thread safe to some extent, but you need to understand the thread safety guarantees given:
One container can be used by multiple readers concurrently.
If there is one writer for a container, there shall be neither concurrent readers nor concurrent writers.
The guarantees are per container, i.e., different containers can concurrently be used by threads without need of synchronization between them.
The reason for these restrictions is that the interface for the containers is geared towards efficient use within one thread, and you don't want to impede the processing of an unshared container just because it could potentially be shared across threads. Also, the container interface isn't suitable for any sort of container-maintained concurrency mechanism. For example, just because v.empty() returned false doesn't mean that v.pop() will work, because the container can be empty by then: if there were internal synchronization, any lock would have been released once empty() returned, and the container could be changed by the time pop() is called.
It is relatively easy to create a queue to be used for communication between different threads. It would use a std::mutex and a suitable instantiation of std::condition_variable. I think something like this has been proposed for inclusion into the standard, but it isn't part of the standard C++ library yet. Note, however, that such a class would not return an iterator to the inserted element, because by the time you'd access it the element may be gone again, and it would be questionable what the iterator would be used for anyway.
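A minimal sketch of the idea, assuming C++11; ts_queue and all its members are illustrative names, not a standard or proposed class:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// ts_queue is an illustrative name, not a standard class.
template <typename T>
class ts_queue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        cond_.notify_one();              // wake one waiting consumer
    }

    T pop() {                            // blocks until an element is available
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;                    // returns the value, not an iterator
    }

private:
    std::queue<T>           queue_;
    std::mutex              mutex_;
    std::condition_variable cond_;
};
```

Note that pop() hands back the value by move rather than exposing an iterator or reference into the queue, for exactly the reason given above.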
The mechanism for doing this kind of synchronisation between multiple processes requires that the developer deal with several issues. Firstly, whatever is being shared between the processes needs to be set up outside of them. In practice this usually means the use of shared memory.
Then these processes need to coordinate their access to the shared memory. After all, if one process starts to work on a shared data structure but gets swapped out before completing the operation, it will leave the data inconsistent.
This synchronisation can be done using operating system constructs, such as semaphores on Linux, which allow competing processes to coordinate.
See This for Linux-based IPC details.
See This for Windows-based IPC details.
For reference you can use the Boost.Interprocess documentation; the library provides a platform-independent implementation of IPC mechanisms.
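As a rough illustration of what that looks like with Boost.Interprocess (a minimal sketch; shared_data, "DemoSegment" and "SharedCounter" are illustrative names I made up, not part of the library):

```cpp
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>

namespace bip = boost::interprocess;

// Data placed in shared memory; the mutex lives next to the value it protects.
struct shared_data {
    bip::interprocess_mutex mutex;
    int                     value = 0;
};

int main() {
    // Each process opens (or creates) the same named segment.
    bip::managed_shared_memory segment(bip::open_or_create, "DemoSegment", 65536);
    shared_data* data = segment.find_or_construct<shared_data>("SharedCounter")();

    {
        // The interprocess mutex coordinates competing processes.
        bip::scoped_lock<bip::interprocess_mutex> lock(data->mutex);
        ++data->value;
    }
    // A real program would remove the segment once the last process is done,
    // e.g. via bip::shared_memory_object::remove("DemoSegment").
}
```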
The standard library containers offer no automagic protection against concurrent modifications, so you need a global lock for every access of the shared container.
You even have to be careful with iterators or references to list elements, since you may not know when the corresponding element has been removed from the list.
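For the original question, the simplest approach is exactly that: one mutex guarding every operation on the shared list. A minimal sketch, assuming all the workers are threads within one process (names are illustrative):

```cpp
#include <iterator>
#include <list>
#include <mutex>

std::list<int> shared_list;   // the list shared between threads
std::mutex     list_mutex;    // global lock guarding every access to it

std::list<int>::iterator push_and_get_iterator(int value) {
    std::lock_guard<std::mutex> lock(list_mutex);
    shared_list.push_back(value);
    return std::prev(shared_list.end());   // iterator to the newly inserted element
}
```

In a std::list the returned iterator stays valid until that element is erased, but any later dereference or traversal must also be done while holding the lock.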
Related
I have multiple threads simultaneously calling push_back() on a shared object of std::vector. Is std::vector thread safe? Or do I need to implement the mechanism myself to make it thread safe?
I want to avoid doing extra "locking and freeing" work because I'm a library user rather than a library designer. I hope to look for existing thread-safe solutions for vector. How about boost::vector, which was newly introduced in Boost 1.48.0? Is it thread safe?
The C++ standard makes certain threading guarantees for all the classes in the standard C++ library. These guarantees may not be what you'd expect them to be, and they don't usually align with what you would want them to be, so make sure you read them carefully. For some classes different, usually stronger, guarantees are made; the answer below applies specifically to the containers. The containers essentially have the following thread-safety guarantees:
there can be multiple concurrent readers of the same container
if there is one writer, there shall be no more writers and no readers
These are typically not what people would want as thread-safety guarantees, but they are very reasonable given the interface of the standard containers: they are intended to be used efficiently in the absence of multiple accessing threads, and adding any sort of locking to their methods would interfere with this. Beyond this, the interface of the containers isn't really useful for any form of internal locking: generally multiple methods are used and the accesses depend on the outcome of previous accesses. For example, after having checked with empty() that a container isn't empty, an element might be accessed. However, with internal locking there is no guarantee that the object is still in the container when it is actually accessed.
To meet the requirements which give the above guarantees you will probably have to use some form of external locking for concurrently accessed containers. I don't know about the boost containers but if they have an interface similar to that of the standard containers I would suspect that they have exactly the same guarantees.
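To make the check-then-act problem concrete: the external lock has to cover the whole sequence, not the individual calls. A minimal sketch (names are illustrative):

```cpp
#include <mutex>
#include <vector>

std::vector<int> shared_vec;   // illustrative shared container
std::mutex       vec_mutex;    // external lock guarding it

// Even if empty() and back() were each internally locked, another thread
// could clear the vector between the two calls. The check and the access
// therefore have to sit inside one critical section:
bool try_read_back(int& out) {
    std::lock_guard<std::mutex> lock(vec_mutex);
    if (shared_vec.empty())
        return false;
    out = shared_vec.back();   // safe: nothing can mutate while we hold the lock
    return true;
}
```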
The guarantees and requirements are given in 17.6.4.10 [res.on.objects] paragraph 1:
The behavior of a program is undefined if calls to standard library functions from different threads may introduce a data race. The conditions under which this may occur are specified in 17.6.5.9. [ Note: Modifying an object of a standard library type that is shared between threads risks undefined behavior unless objects of that type are explicitly specified as being sharable without data races or the user supplies a locking mechanism. — end note ]
... and 17.6.5.9 [res.on.data.races]. This section essentially details the more informal description in the note.
I have multiple threads simultaneously calling push_back() on a shared object of std::vector. Is std::vector thread safe?
This is unsafe.
Or do I need to implement the mechanism myself to make it thread safe?
Yes.
I want to avoid doing extra "locking and freeing" work because I'm a library user rather than a library designer. I hope to look for existing thread-safe solutions for vector.
Well, vector's interface isn't optimal for concurrent use. It is fine if the client has access to a lock, but having the interface abstract away locking for each operation -- no. In fact, vector's interface cannot guarantee thread safety without an external lock (assuming you need operations which also mutate).
How about boost::vector, which was newly introduced in Boost 1.48.0? Is it thread safe?
Docs state:
//! boost::container::vector is similar to std::vector but it's compatible
//! with shared memory and memory mapped files.
I have multiple threads simultaneously calling push_back() on a shared object of std::vector. ... I hope to look for existing thread-safe solutions for vector.
Take a look at concurrent_vector in Intel's TBB. Strictly speaking, it's quite different from std::vector internally and is not fully API-compatible, but it still might be suitable. You might find some details of its design and functionality in the blogs of the TBB developers.
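A minimal sketch of what that looks like, assuming TBB is installed and the program is linked against it (e.g. -ltbb):

```cpp
#include <tbb/concurrent_vector.h>

#include <thread>
#include <vector>

int main() {
    tbb::concurrent_vector<int> values;   // safe to grow from several threads

    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t)
        workers.emplace_back([&values, t] {
            for (int i = 0; i < 1000; ++i)
                values.push_back(t * 1000 + i);   // no external lock needed for push_back
        });
    for (auto& w : workers)
        w.join();
    // values.size() == 4000; the order of elements across threads is unspecified.
}
```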
I'm developing a multi-threaded plugin for a single-threaded application (which has a non-thread-safe API).
My current plugin has two threads: the main one, which is the application's thread, and another one which is used for processing data for the main thread. Long story short, the first one creates objects, gives them an ID, inserts them into a map, and sometimes even accesses and deletes them (if the application says so); the second one reads data from that map and alters the objects.
My question is: What techniques can I use in order to make my plugin thread-safe?
First, you have to identify where race conditions may exist. Then, you will have to use some mechanism to assure that the shared data is accessed in a safe way, hence achieving Thread Safety.
For your particular case, it seems the race condition will be on the shared map and possibly the objects (map's values) it contains as well (if it's possible that both threads attempt to alter the same object simultaneously).
My suggestion is that you use a well tested thread safe map implementation, and then if needed add the extra "protection" for the map's values themselves. This way you ensure the map is always in a consistent state for both threads, and if both threads attempt to modify the same object data (map's values), the data won't be corrupted or left inconsistent.
For the map itself, you can search for "Concurrent Hash Map" or "Atomic Hash Map" data structures for C++ and see if they are of good quality and are available for your compiler/platform. Good examples are Intel's TBB concurrent_hash_map or Facebook's folly AtomicHashMap. They both have advantages and disadvantages and you will have to analyze what's best for your situation.
As for the objects the map contains, you can use plain mutexes (simple: lock, modify data, unlock), atomic operations (trickier, only for simple data types), or other methods, once more depending on your compiler/platform and speed requirements.
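A rough sketch of the "plain mutexes" option, with one mutex for the map structure and one per object for the object's data (all names here are illustrative, not taken from any particular library):

```cpp
#include <map>
#include <memory>
#include <mutex>

// One mutex guards the map structure, one per object guards the object's data.
struct Object {
    std::mutex mutex;   // guards 'data'
    int        data = 0;
};

std::map<int, std::shared_ptr<Object>> objects;    // the shared map
std::mutex                             map_mutex;  // guards insert/erase/find on the map

void update_object(int id, int new_value) {
    std::shared_ptr<Object> obj;
    {
        std::lock_guard<std::mutex> map_lock(map_mutex);
        auto it = objects.find(id);
        if (it == objects.end())
            return;
        obj = it->second;   // keep the object alive after the map lock is released
    }
    std::lock_guard<std::mutex> obj_lock(obj->mutex);   // lock, modify data, unlock
    obj->data = new_value;
}
```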
Hope this helps!
I found contradictory information on the web:
http://www.sgi.com/tech/stl/thread_safety.html
The SGI implementation of STL is thread-safe only in the sense that simultaneous accesses to distinct containers are safe, and simultaneous read accesses to shared containers are safe. If multiple threads access a single container, and at least one thread may potentially write, then the user is responsible for ensuring mutual exclusion between the threads during the container accesses.
http://gcc.gnu.org/onlinedocs/libstdc++/manual/using_concurrency.html
The user-code must guard against concurrent method calls which may access any particular library object's state. Typically, the application programmer may infer what object locks must be held based on the objects referenced in a method call. Without getting into great detail, here is an example which requires user-level locks:

All library objects are safe to use in a multithreaded program as long as each thread carefully locks out access by any other thread while it uses any object visible to another thread, i.e., treat library objects like any other shared resource. In general, this requirement includes both read and write access to objects; unless otherwise documented as safe, do not assume that two threads may access a shared standard library object at the same time.
I bolded the important part - maybe I don't understand what they mean by that; when I read "object's state" I think of STL containers.
How I understand this:
Both documents say the same thing in a different manner. The MS STL implementation (actually the Dinkumware one) says almost the same as your quoted SGI doc. They mean that they did nothing to make STL objects (e.g. containers) thread-safe, most probably because this would add overhead that is unnecessary in many single-threaded applications. Any object is thread-safe in their terms in the sense that you can read it from multiple threads.
The docs also guarantee that STL objects are not modified under the hood by some background threads.
FWIW I updated the libstdc++ docs a while ago; it now says (emphasis mine):
The user code must guard against concurrent function calls which access any particular library object's state when one or more of those accesses modifies the state.
The information you cite is not contradictory. STL libraries should be safe to use in a multi-threaded environment (actually, I've worked with one implementation where that was not the case), but it is the user's burden to synchronize access to library objects. For instance, if you create a set of ints in one thread and another set of ints in another thread and you don't share either of them among threads, you should be able to use them without synchronization; if you share an instance of a set among threads, it's up to you to synchronize the access to the set.
STL is no more. It is superseded by the C++ Standard Library. If you use ISO C++ and the Standard Library, you should read (a) the Standard and (b) the documentation that comes with your implementation of C++.
The SGI STL documentation is mostly of historical interest, unless for some reason you actually use the SGI STL.
Can I use a map or hashmap in a multithreaded program without needing a lock?
i.e. are they thread safe?
I'm wanting to potentially add and delete from the map at the same time.
There seems to be a lot of conflicting information out there.
By the way, I'm using the STL library that comes with GCC under Ubuntu 10.04
EDIT: Just like the rest of the internet, I seem to be getting conflicting answers?
You can safely perform simultaneous read operations, i.e. call const member functions. But you can't do any simultaneous operations if one of them involves writing, i.e. a call to a non-const member function must be the only access to the container and can't be mixed with any other calls.
In other words, you can't change the container from multiple threads, so you need to use a lock or read-write lock to make the access safe.
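A minimal sketch of the read-write-lock variant, assuming C++17's std::shared_mutex (names are illustrative):

```cpp
#include <map>
#include <shared_mutex>
#include <string>
#include <utility>

std::map<int, std::string> shared_map;   // illustrative shared container
std::shared_mutex          map_mutex;    // read-write lock guarding it

std::string read_value(int key) {
    std::shared_lock<std::shared_mutex> lock(map_mutex);   // many concurrent readers
    auto it = shared_map.find(key);
    return it != shared_map.end() ? it->second : std::string{};
}

void write_value(int key, std::string value) {
    std::unique_lock<std::shared_mutex> lock(map_mutex);   // exclusive for writers
    shared_map[key] = std::move(value);
}
```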
No.
Honest. No.
edit
Ok, I'll qualify it.
You can have any number of threads reading the same map. This makes sense because reading it doesn't have any side-effects, so it can't matter whether anyone else is also doing it.
However, if you want to write to it, then you need to get exclusive access, which means preventing any other threads from writing or reading until you're done.
Your original question was about adding and removing in parallel. Since these are both writes, the answer to whether they're thread-safe is a simple, unambiguous "no".
TBB is a free open-source library that provides thread-safe associative containers. (http://www.threadingbuildingblocks.org/)
The most commonly used model for STL containers' thread safety is the SGI one:
The SGI implementation of STL is thread-safe only in the sense that simultaneous accesses to distinct containers are safe, and simultaneous read accesses to shared containers are safe.
but in the end it's up to the STL library authors - AFAIK the pre-C++11 standard says nothing about STL's thread-safety.
But according to the docs, GNU's libstdc++ implementation follows it (as of GCC 3.0+), provided a number of conditions are met.
HIH
The answer (like with most threading problems) is that it will work most of the time. Unfortunately, if you catch the map while it's resizing, you're going to end up in trouble. So no.
To get the best performance you'll need a multi-stage lock. First, a read lock, which allows accesses that can't modify the map and which can be held by multiple threads (more than one thread reading items is OK). Second, a write lock, which is exclusive and allows modification of the map in ways that could be unsafe (add, delete, etc.).
edit: Reader-writer locks are good, but whether they're better than a standard mutex depends on the usage pattern. I can't recommend either without knowing more. Profile both and see which best fits your needs.
Hi guys, I want to know what the difference is between thread-safe data and thread-safe containers.
Thread safe data:
Generally refers to data which is protected using mutexes, semaphores or other similar constructs.
Data is considered thread safe if measures have been put in place to ensure that:
It can be modified from multiple threads in a controlled manner, ensuring the resultant data structure doesn't become corrupt and doesn't lead to race conditions in the code.
It can be read in a reliable fashion without the data becoming corrupt during the read process. This is especially important with STL-style containers which use iterators.
Mutexes generally work by blocking access to other threads while one thread is modifying shared data. This is also known as a critical section, and RAII is a common design pattern used in conjunction with critical sections.
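For example, a minimal sketch of an RAII critical section using std::lock_guard (names are illustrative):

```cpp
#include <mutex>
#include <vector>

std::vector<int> shared_values;   // shared data protected by the mutex below
std::mutex       values_mutex;

void append(int value) {
    // RAII critical section: the mutex is locked when 'lock' is constructed
    // and unlocked automatically when it goes out of scope, even if
    // push_back throws, so the lock can never be leaked.
    std::lock_guard<std::mutex> lock(values_mutex);
    shared_values.push_back(value);
}
```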
Depending on the CPU type, some primitive data types (e.g. int) and operations (increment) might not need mutex protection (e.g. if they resolve down to an atomic instruction in machine language). However:
It is bad practice to make any assumptions about CPU architecture.
You should always code defensively to ensure code will remain thread-safe regardless of the target platform.
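One portable way to avoid such assumptions is std::atomic (C++11), which guarantees the operation is atomic on every supported platform; a minimal sketch:

```cpp
#include <atomic>
#include <thread>
#include <vector>

int main() {
    std::atomic<int> counter{0};   // atomic increments, no mutex needed

    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t)
        workers.emplace_back([&counter] {
            for (int i = 0; i < 10000; ++i)
                ++counter;         // guaranteed atomic regardless of the target CPU
        });
    for (auto& w : workers)
        w.join();
    // counter.load() == 40000 here.
}
```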
Thread safe containers:
Are containers which have measures in place to ensure that any changes made to them occur in a thread-safe manner.
For example, a thread safe container may allow items to be inserted or removed using a specific set of public methods which ensure that any code which uses it is thread-safe.
In other words, the container class provides the mutex protection as a service to the caller, and the user doesn't have to roll their own.
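A minimal sketch of that idea, assuming C++17; safe_vector is a made-up name, not a standard or library class:

```cpp
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// The locking is a service provided by the container, not a burden on the caller.
template <typename T>
class safe_vector {
public:
    void push_back(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_.push_back(std::move(value));
    }

    // Emptiness check and removal happen under one lock, avoiding the
    // empty()/pop() race that separate calls would reintroduce.
    std::optional<T> try_pop_back() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (data_.empty())
            return std::nullopt;
        T value = std::move(data_.back());
        data_.pop_back();
        return value;
    }

private:
    std::vector<T> data_;
    std::mutex     mutex_;
};
```

Note how try_pop_back() combines the check and the removal in one locked operation; offering combined operations like this, rather than separate empty()/pop() calls, is what makes a container genuinely thread-safe to use.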