I have a
std::map<int, std::bitset<256>> m;
After construction no new keys will be inserted and no keys will be removed. Can I safely assign the bitset in one thread while reading it in other threads without using a mutex?
// thread 1:
std::bitset<256> a = getBitset();
m[1] = a;
// thread 2:
std::bitset<256> b = m[1];
// or alternatively
std::bitset<256> c = m.at(1);
I think that the program will not crash, but a data race could occur in the bitset. A data race would be acceptable if the read delivered a combination of the old and the new bitset.
No, it is not safe.
operator= of the bitset is a modifying operation, and as such is not guaranteed to be free of a data race if the bitset is simultaneously accessed in another thread. And in practice, it will almost surely cause a data race, since it needs to write to the object. This is not specific to std::bitset, but generally true for pretty much all non-empty non-atomic standard library types (and most other non-empty types, as well).
Any data race will result in undefined behavior. There are no partial updates. In that sense, it should never be acceptable.
If you want to be able to do this, either wrap the bitset in a struct with a mutex to protect access to the bitset, or use something like an atomic shared_ptr that allows atomically exchanging the old bitset with a new one and delaying destruction of the old one until all references are gone.
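For the shared-pointer route, here is a minimal sketch, assuming C++20's std::atomic<std::shared_ptr> (pre-C++20 code can use the std::atomic_load/std::atomic_store free functions on a plain shared_ptr instead); publish and snapshot are illustrative names, not part of any library:
#include <atomic>
#include <bitset>
#include <memory>

std::atomic<std::shared_ptr<const std::bitset<256>>> slot;

// Writer: build a complete replacement, then swap it in atomically.
void publish(const std::bitset<256>& bits)
{
    slot.store(std::make_shared<const std::bitset<256>>(bits));
}

// Reader: take a snapshot; the old bitset is destroyed only after
// the last reader releases its shared_ptr.
std::shared_ptr<const std::bitset<256>> snapshot()
{
    return slot.load();
}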
Although I don't think it is guaranteed, std::bitset may also be trivially copyable. You could check that with a static_assert on std::is_trivially_copyable_v, and if it holds, you could also use a std::atomic or std::atomic_ref, which allows accessing the bitset atomically (i.e. free of data races) and will probably resort to a lock only if the underlying architecture doesn't support atomic access to an object of that size.
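A sketch of that approach, assuming C++20 for std::atomic_ref (the names write/read are just for illustration):
#include <atomic>
#include <bitset>
#include <type_traits>

static_assert(std::is_trivially_copyable_v<std::bitset<256>>,
              "std::atomic_ref requires a trivially copyable type");

std::bitset<256> shared;  // the object both threads access

void write(const std::bitset<256>& bits)
{
    // At 256 bits this will almost certainly be lock-based internally.
    std::atomic_ref<std::bitset<256>> ref(shared);
    ref.store(bits);
}

std::bitset<256> read()
{
    std::atomic_ref<std::bitset<256>> ref(shared);
    return ref.load();
}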
Also note that std::bitset<256> b = m[1]; and m[1] = a; also cause undefined behavior, because operator[] of std::map is also a modifying operation, and is not specified to be free of data races, even when it doesn't insert a new element into the map. You would need to use e.g. the find() member function instead.
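For example (this avoids mutating the map, though reading the bitset itself still needs synchronization against the writer):
auto it = m.find(1);
if (it != m.end()) {
    std::bitset<256> b = it->second; // copy under whatever synchronization you use
}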
I have a class, used for data storage, of which there is only a single instance.
The caller is message driven and has become too large and is a prime candidate for refactoring, such that each message is handled by a separate thread. However, these could then compete to read/write the data.
If I were using mutexes (mutices?), I would only use them on write operations. I don't think that matters here, as the data are atomic, not the functions which access the data.
Is there any easy way to make all of the data atomic? Currently it consists of simple types, vectors and objects of other classes. If I have to add std::atomic<> to every sub-field, I may as well use mutexes.
std::atomic requires the type to be trivially copyable. Since you say std::vector is involved, that makes std::atomic impossible to use, either on the whole structure or on the std::vector itself.
The purpose of std::atomic is to be able to atomically replace the whole value of the object. You cannot do something like access individual members or so on.
From the limited context you gave in your question, I think std::mutex is the correct approach. Each object that should be independently accessible should have its own mutex protecting it.
Also note that the mutex generally needs to protect writes and reads, since a read happening unsynchronized with a write is a data race and causes undefined behavior, not only unsynchronized writes.
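A minimal sketch of that approach; the class and field names are placeholders for your actual data:
#include <mutex>
#include <vector>

class DataStore {
    mutable std::mutex m;      // mutable so const readers can lock it
    std::vector<int> samples;
    double scale = 1.0;

public:
    std::vector<int> readSamples() const {
        std::lock_guard<std::mutex> lock(m);
        return samples;        // hand back a copy taken under the lock
    }
    void writeSamples(std::vector<int> v) {
        std::lock_guard<std::mutex> lock(m);
        samples = std::move(v);
    }
    double readScale() const {
        std::lock_guard<std::mutex> lock(m);
        return scale;
    }
};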
I need a FIFO structure that supports indexing. Each element is an array of data that is saved off a device I'm reading from. The FIFO has a constant size, and at start-up each element is zeroed out.
Here's some pseudo code to help understand the issue:
Thread A (Device Reader):
1. Lock the structure.
2. Pop oldest element off of FIFO (don't need it).
3. Read next array of data (note this is a fixed size array) from the device.
4. Push new data array onto the FIFO.
5. Unlock.
Thread B (Data Request From Caller):
1. Lock the structure.
2. Determine request type.
3. if (request = one array) memcpy over the latest array saved (LIFO).
4. else memcpy over the whole FIFO to the user as a giant array (caller uses arrays).
5. Unlock.
Note that the FIFO shouldn't be changed in Thread B, the caller should just get a copy, so data structures where pop is destructive wouldn't necessarily work without an intermediate copy.
My code also has a boost dependency already and I am using a lockfree spsc_queue elsewhere. With that said, I don't see how this queue would work for me here given the need to work as a LIFO in some cases and also the need to memcpy over the entire FIFO at times.
I also considered a plain std::vector, but I'm worried about performance when I'm constantly pushing and popping.
One point not clear in the question is the compiler target, whether or not the solution is restricted to partial C++11 support (like VS2012), or full support (like VS2015). You mentioned boost dependency, which lends similar features to older compilers, so I'll rely on that and speak generally about options on the assumption that boost may provide what a pre-C++11 compiler may not, or you may elect C++11 features like the now standardized mutex, lock, threads and shared_ptr.
There's no doubt in my mind that the primary tool for the FIFO (which, as you stated, may occasionally need LIFO operation) is the std::deque. Even though the deque supports reasonably efficient dynamic expansion and shrinking of storage, contrary to your primary requirement of a static size, its main feature is the ability to function as both FIFO and LIFO with good performance in ways vectors can't easily match. Internally, most implementations provide what may be analogized as a collection of smaller vectors, marshalled by the deque to function as if a single vector container (for subscripting) while allowing double-ended pushing and popping with efficient memory management. It can be tempting to use a vector with a circular-buffer technique for fixed sizes, but any performance improvement is minimal, and deque is known to be reliable.
Your point regarding destructive pops isn't entirely clear to me; it could mean several things. std::deque offers back and front as a peek at what's at the ends of the deque, without destruction. In fact, they're the only way to look, because deque's pop_front and pop_back only remove elements; they don't provide access to the element being popped. Taking an element and popping it is a two-step process on std::deque. An alternate meaning is that a read-only requester needs to pop strictly as a means of navigation, not destruction, which is not really a pop but a traversal. As long as the structure is under lock, that is easily managed with iterators or indexes. Or, it could mean you need an independent copy of the queue.
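To illustrate the two-step take-and-pop:
#include <deque>

std::deque<int> q{1, 2, 3};

int oldest = q.front(); // peek: non-destructive
q.pop_front();          // remove: pop returns nothing
int newest = q.back();  // LIFO-style peek at the other end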
Assuming some structure representing device data:
struct DevDat { .... };
I'm immediately faced with that curious question: should this not be a generic solution? It doesn't matter for the sake of discussion, but it seems the intent is an odd combination of application-specific operation and a generalized thread-safe stack "machine", so I'll suggest a generic solution which is easily translated otherwise (that is, I suggest template classes, but you could easily choose non-templates if preferred). These pseudo-code examples are sparse, just illustrating container layout ideas and proposed concepts.
#include <deque>
#include <mutex>

class SafeStackBase
{
protected:
    std::mutex sync;
};

template <typename Element>
class SafeStack : public SafeStackBase
{
public:
    typedef std::deque<Element> DeQue;

private:
    DeQue que;
};
SafeStack could handle any kind of data in the stack, so that detail is left for Element declaration, which I illustrate with typedefs:
typedef std::vector< DevDat > DevArray;
typedef std::shared_ptr< DevArray > DevArrayPtr;
typedef SafeStack< DevArrayPtr > DeviceQue;
Note I'm proposing vector instead of array because I don't like the idea of having to choose a fixed size, but std::array is an option, obviously.
The SafeStackBase is intended for code and data that isn't aware of the user's data type, which is why the mutex is stored there. It could easily be part of the template class, but the practice of placing non-type-aware data and code in a non-template base helps reduce code bloat when possible (functions which don't use Element, for example, need not be expanded in template instantiations).

I suggest the DevArrayPtr so that the arrays can be "plucked out" of the queue without copying the arrays, then shared and distributed outside the structure under shared_ptr's shared ownership. This is a matter of illustration, and does not adequately deal with questions regarding the content of those arrays. That could be managed by DevDat, which could marshal reading of the array data while limiting writing to an authorized friend (a write accessor strategy), such that Thread B (a reader only) is not carelessly able to modify the content. In this way it's possible to provide these arrays without copying data: just return a copy of the DevArrayPtr for communal access to the entire array. This also supports returning a container of DevArrayPtrs, supporting Thread B point 4 (copy the whole FIFO to the user), as in:
typedef std::vector< DevArrayPtr > QueArrayVec;
typedef std::deque< DevArrayPtr > QueArrayDeque;
typedef std::array< DevArrayPtr, 12 > QueArrays;
The point is that you can return any container you like, which is merely an array of pointers to the internal DevArray objects, letting DevDat control read/write authorization by requiring some authorization object for writing. And if this copy should be operable as a FIFO without potential interference with Thread A's write ownership, QueArrayDeque provides the full feature set as an independent FIFO/LIFO structure.
This brings up an observation about Thread A. There you state lock is step 1, while unlock is step 5, but I submit that only steps 2 and 4 are really required under lock. Step 3 can take time, and even if you assume that is a short time, it's not as short as a pop followed by a push. The point is that the lock is really about controlling the FIFO/LIFO queue structure, and not about reading data from the device. As such, that data can be fashioned into DevArray, which is THEN provided to SafeStack to be pop/pushed under lock.
Assume code inside SafeStack:
typedef std::lock_guard<std::mutex> Lock; // I use typedefs a lot

void StuffIt(const Element& e)
{
    Lock l(sync);
    que.pop_front();  // discard the oldest entry
    que.push_back(e); // append the new one
}
StuffIt does that simple, generic job of popping the front and pushing the back, under lock. Since it takes a const Element &, step 3 of Thread A is already done. Since Element, as I suggest, is a DevArrayPtr, this is used with:
DeviceQue dq;
auto p = std::make_shared<DevArray>();
dq.StuffIt( p );
How the DevArray is populated is up to its constructor or some function; the point is that a shared_ptr is used to transport it.
This brings up a more generic point about SafeStack. Obviously there is some potential for standard access functions, which could mimic std::deque, but the primary job for SafeStack is to lock/unlock for access control, and do something while under lock. To that end, I submit a generic functor is sufficient to generalize the notion. The preferred mechanics, especially with respect to boost, are up to you, but something like (code inside SafeStack):
bool LockedFunc(std::function<bool(DeQue&)> f)
{
    Lock l(sync);
    return f(que); // run the caller's functor while the lock is held
}
Or whatever mechanics you like for calling a functor taking a DeQue as a parameter. This means you could fashion callbacks with complete access to the deque (and its interface) while under lock, or provide functors or lambdas which perform specific tasks under lock.
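For example, assuming LockedFunc is a public member, Thread B's LIFO request (point 3) could be a lambda that peeks at the newest entry without removing it:
DeviceQue dq;
DevArrayPtr latest;

dq.LockedFunc([&latest](DeviceQue::DeQue& q)
{
    if (!q.empty())
        latest = q.back(); // copies the shared_ptr, not the array
    return !q.empty();
});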
The design point is to make SafeStack small, focused on that minimal task of doing a few things under lock, taking most any kind of data in the queue. Then, using that last point, provide the array under shared_ptr to provide the service of Thread B steps 3 and 4.
To be clear about that, keep in mind that copying a shared_ptr is similar to copying a simple POD type, like an int, with respect to containers. That is, one could loop through the elements of the DeQue, copying those elements into another container, with the same code that would do it for a container of integers (remember, it's a member function of a template; the type is generic). The resulting work copies only pointers, which is far less effort than copying entire arrays of data.
Now, step 4 isn't QUITE clear to me. It appears to say that you need to return a DevArray which is the accumulated content of all entries in the queue. That's trivial to arrange, but it might work a little better with a vector (as that's dynamically expandable), but as long as the std::array has sufficient room, it's certainly possible.
However, the only real difference between such an array and the queue's native "array of arrays" is how it is traversed (and counted). Returning one Element (step 3) is quick, but since step 4 is indicated under lock, that's a bit more than most locked functions should really do if they don't have to.
I'd suggest SafeStack should be able to provide a copy of que (a DeQue typedef), which is quick. Then, outside of the lock, Thread B has a copy of the DeQue (a std::deque< DevArrayPtr >) to fashion into its own "giant array".
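A hypothetical SafeStack member doing just that:
DeQue Snapshot()
{
    Lock l(sync);
    return que; // copies only the DevArrayPtrs, not the arrays they point to
}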
Now, more about that array. To this point I've not adequately dealt with marshalling it. I've just suggested that DevDat does that, but this may not be adequate. Certainly the content of the std::array or std::vector conveying a collection of DevDats could be written. Perhaps that deserves its own outer structure. I'll leave that to you, because the point I've made is that SafeStack is now focused on its small task (lock/access/unlock) and can take anything which can be owned by a shared_ptr (or PODs and copyable objects). In the same way SafeStack is an outer shell marshalling a std::deque with a mutex, some similar outer shell could marshal read-only access to the std::vector or std::array of DevDats, with a kind of write accessor used by Thread A. That could be as simple as something that only allows construction of the std::array to create its content, after which read-only access is all that's provided.
I would suggest you use boost::circular_buffer, which is a fixed-size container that supports random-access iteration and constant-time insert and erase at the beginning and end. You can use it as a FIFO with push_back(), read back() for the latest data saved, and iterate over the whole container via begin(), end() or using operator[].
One difference: at start-up the elements are not zeroed out. The container is empty at first, and insertion increases its size until it reaches the maximum size, which in my opinion is an even more convenient interface.
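A minimal sketch (the element type and capacity are placeholders):
#include <boost/circular_buffer.hpp>
#include <array>

typedef std::array<int, 64> DataArray;      // stand-in for your device array

boost::circular_buffer<DataArray> fifo(12); // fixed capacity, initially empty

void onDeviceRead(const DataArray& fresh)
{
    fifo.push_back(fresh); // once full, this overwrites the oldest element
}

// Latest array saved (LIFO-style): fifo.back()
// Whole FIFO, oldest to newest:   for (const DataArray& a : fifo) { ... }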
I have a flyweight pattern working in serial where the factory uses std::map to store and provide access to the created objects. The factory returns an iterator that points to the object in the map. The objects in the factory are constants, so they will not be updated once inserted, unless they are erased.
I would like to make the factory concurrent using tbb::concurrent_hash_map, but I am unsure what the return should be. I could use an iterator (should it be const_iterator?), but the documentation says that all iterators are invalidated when something does a find or insert in the concurrent_hash_map. So I could use a const_accessor since only read-only access is needed, but then this is different from the serial implementation (iterator vs accessor).
Which one is better to use? Should consistency in types (i.e. both iterators) be important? Both serial and threaded compile-time options need to be there.
If you do not erase elements simultaneously with other threads accessing the map, you may use tbb::concurrent_unordered_map instead. This is also a hash-based associative container, but with simpler and more STL-like API. It does not invalidate iterators by insert and find, but as a tradeoff, it does not allow concurrent removal of elements.
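A sketch, with Flyweight standing in for your immutable object type:
#include <tbb/concurrent_unordered_map.h>
#include <string>

typedef std::string Flyweight; // placeholder for your constant object type
tbb::concurrent_unordered_map<int, Flyweight> cache;

const Flyweight* lookup(int key)
{
    // find() may run concurrently with insert(); iterators stay valid.
    auto it = cache.find(key);
    return it != cache.end() ? &it->second : nullptr;
}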
If you do need to remove elements concurrently, the only choice with TBB is to use tbb::concurrent_hash_map with accessors.
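A sketch of the accessor style, again with Flyweight as a placeholder type:
#include <tbb/concurrent_hash_map.h>

typedef tbb::concurrent_hash_map<int, Flyweight> Cache;
Cache cache;

bool lookup(int key, Flyweight& out)
{
    // const_accessor holds a read lock on the element while in scope,
    // so a concurrent erase() cannot destroy it out from under us.
    Cache::const_accessor ca;
    if (!cache.find(ca, key))
        return false;
    out = ca->second;
    return true;
}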
I also suggest discussing your use case at the TBB forum.
So we created a map. We want to do some_type blah = map_variable[some_not_inserted_yet_value]; this would add a new item to the map if one was not previously created. So I wonder if reading is really thread-safe with std::map, or if it is only possible to thread-safely try { ...find(...)->second... }?
The idea that calling find(...)->second is thread-safe is very dependent on your view of thread safety. If you simply mean that it won't crash, then as long as no one is mutating the map at the same time you're reading it, I suppose you're okay.
That said, indeed, no matter what your minimum thread safety requirements are, calling the operator[] method is inherently not thread-safe as it can mutate the collection.
If a method has no const overload, it means it can mutate the object, so unless the documentation indicates methods are thread-safe, the method is very unlikely to be.
Then again, a const method might not be thread-safe as well, because your object could depend on non-const global state or have mutable fields, so you'll want to be very, very careful if you use unsynchronized classes as if they were.
If you're 100% sure that the map contains the key, then it is technically thread-safe if all other threads are also only invoking read-only methods on the map. Note however, that there is no const version of map<k,v>::operator[](const k&).
The correct way to access the map in a thread-safe fashion is indeed:
map<k,v>::const_iterator match = mymap.find(key);
if (match != mymap.end()) {
    // found item.
}
As stated before, this only applies if all concurrent access is read-only. One way this can be guaranteed is to use a readers-writers lock.
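A sketch with C++17's std::shared_mutex (boost::shared_mutex works similarly on older compilers); the function names are illustrative:
#include <map>
#include <shared_mutex>
#include <string>

std::map<int, std::string> mymap;
std::shared_mutex rw;

std::string lookup(int key)
{
    std::shared_lock<std::shared_mutex> lock(rw); // many readers at once
    auto match = mymap.find(key);
    return match != mymap.end() ? match->second : std::string();
}

void store(int key, const std::string& value)
{
    std::unique_lock<std::shared_mutex> lock(rw); // exclusive: blocks readers
    mymap[key] = value;
}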
Note that in C++03, there is no mention of threads in the standard, so even that is not guaranteed to be thread-safe. Make sure to check your implementation's documentation.
Standard library containers have no notion of thread safety. You have to synchronize concurrent read/write access to the container yourself.
try has nothing to do with multithreading. It's used for exception handling.
find does not throw an exception if the key is not found. If the key is not found, find returns the map's end() iterator.
You are correct, operator[] is not "thread safe" because it can mutate the container. You should use the find method and compare the result to std::map::end to see if it found the item. (Also notice that find has a const version while operator[] does not).
Like others have said, versions of C++ before C++11 have no notion of threads or thread safety. However, you can feel safe using find without synchronization because it doesn't change the container, so it's only doing read operations (unless you have a weird implementation, so make sure to check the docs). As with most containers, reading from it from different threads won't cause any harm; writing to it, however, might.