std::map<int, std::bitset<256 > > thread safety w/o mutex? - c++

I have a
std::map<int, std::bitset<256 > > m;
After construction no new keys will be inserted and no keys will be removed. Can I safely assign the bitset in one thread while reading it in other threads without using a mutex?
// thread 1:
std::bitset<256> a = getBitset();
m[1] = a;
// thread 2:
std::bitset<256> b = m[1];
// or alternatively
std::bitset<256> c = m.at(1);
I think that the program will not crash but a data race could occur in the bitset. A data race would be acceptable if the read would deliver a combination of the old and the new bitset.

No, it is not safe.
operator= of the bitset is a modifying operation, and as such is not guaranteed to be free of a data race if the bitset is simultaneously accessed in another thread. And in practice, it will almost surely cause a data race, since it needs to write to the object. This is not specific to std::bitset, but generally true for pretty much all non-empty non-atomic standard library types (and most other non-empty types, as well).
Any data race will result in undefined behavior. There are no partial updates. In that sense, it should never be acceptable.
If you want to be able to do this, either wrap the bitset in a struct with a mutex to protect access to the bitset, or use something like an atomic shared_ptr that allows atomically exchanging the old bitset with a new one and delaying destruction of the old one until all references are gone.
Although I think it is not guaranteed, std::bitset may also be trivially copyable. You could check that with a static_assert on std::is_trivially_copyable_v and if true, you could also use a std::atomic or std::atomic_ref which will allow accessing the bitset atomically (i.e. free of data race) and will probably use a lock only if the underlying architecture doesn't support atomic access to an object of the corresponding size.
Also note that std::bitset<256> b = m[1]; and m[1] = a; also cause undefined behavior, because operator[] of std::map is also a modifying operation and is not specified to be free of data races, even if it doesn't insert a new element into the map. You would need to use e.g. the find() member function instead.
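To make the mutex option concrete, here is a minimal sketch, assuming all keys are inserted before any thread starts. The names GuardedBits, BitMap, store and load are made up for illustration; they are not from the question. Both helpers go through find(), so the map structure itself is only ever read.

```cpp
#include <bitset>
#include <map>
#include <mutex>

// Hypothetical wrapper: each mapped bitset carries its own mutex.
struct GuardedBits {
    mutable std::mutex mtx;   // mutable so that const readers can lock it
    std::bitset<256> bits;
};

using BitMap = std::map<int, GuardedBits>;

// Writer side: look the key up with find() rather than operator[],
// then assign the bitset while holding the element's mutex.
// Returns false if the key is not present (nothing is inserted).
inline bool store(BitMap& m, int key, const std::bitset<256>& value) {
    BitMap::iterator it = m.find(key);
    if (it == m.end()) return false;
    std::lock_guard<std::mutex> lock(it->second.mtx);
    it->second.bits = value;
    return true;
}

// Reader side: copy the bitset out while holding the same mutex.
// Returns false if the key is not present.
inline bool load(const BitMap& m, int key, std::bitset<256>& out) {
    BitMap::const_iterator it = m.find(key);
    if (it == m.end()) return false;
    std::lock_guard<std::mutex> lock(it->second.mtx);
    out = it->second.bits;
    return true;
}
```

Because std::mutex is neither copyable nor movable, the entries have to be constructed in place (for example with m[key] or emplace) during setup, before the reader and writer threads are started.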

Related

Can I make the data of an entire C++ class be std::atomic<>

I have a class, used for data storage, of which there is only a single instance.
The caller is message driven and has become too large and is a prime candidate for refactoring, such that each message is handled by a separate thread. However, these could then compete to read/write the data.
If I were using mutexes (mutices?), I would only use them on write operations. I don't think that matters here, as the data are atomic, not the functions which access the data.
Is there any easy way to make all of the data atomic? Currently it consists of simple types, vectors and objects of other classes. If I have to add std::atomic<> to every sub-field, I may as well use mutexes.
std::atomic requires the type to be trivially copyable. Since you are saying std::vector is involved, that makes it impossible to use it, either on the whole structure or the std::vector itself.
The purpose of std::atomic is to be able to atomically replace the whole value of the object. You cannot do something like access individual members or so on.
From the limited context you gave in your question, I think std::mutex is the correct approach. Each object that should be independently accessible should have its own mutex protecting it.
Also note that the mutex generally needs to protect writes and reads, since a read happening unsynchronized with a write is a data race and causes undefined behavior, not only unsynchronized writes.
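As a rough sketch of that approach, assuming a storage class that holds a vector: the class name DataStore and its members are invented for illustration, since the question shows no code. The point is that every accessor, reading or writing, takes the same mutex.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical data-storage class; one mutex guards all of its state.
class DataStore {
public:
    // Write access: lock, then modify.
    void append(int value) {
        std::lock_guard<std::mutex> lock(mtx_);
        values_.push_back(value);
    }

    // Read access also locks, and returns a copy so the caller never
    // touches the internal vector outside the critical section.
    std::vector<int> snapshot() const {
        std::lock_guard<std::mutex> lock(mtx_);
        return values_;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mtx_);
        return values_.size();
    }

private:
    mutable std::mutex mtx_;    // mutable: const readers must lock too
    std::vector<int> values_;
};
```

Returning copies costs some allocation, but it keeps references to the protected data from escaping the lock. If parts of the data are truly independent, each part could get its own class like this with its own mutex.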

thread safety in std::map

Is it safe to use std::map without a lock in a multi-threaded environment, where it is guaranteed that two threads will never be manipulating the same entry in the map?
There is already a question on this, but I am particularly interested in the case where multiple threads are accessing different entries in the map, particularly for unordered maps.
It is safe as long as none of the threads are modifying the map. It is also safe if threads are modifying different elements of the map (provided the elements themselves don't cause race conditions by, for example, modifying some global state):
In 17.6.5.9 Data race avoidance, the standard library guarantees that concurrent const access to containers is safe (at least as far as the containers go; if the elements allow mutation via const access, there could be data races at the element level).
In 23.2.2 Container data races further guarantees are made: non-const concurrent access is safe if the modifications/reads are to different elements of the container [1].
As soon as you have one thread making modifications to the container or to the same element in the container while others read or write, you are open to race conditions and undefined behaviour.
[1] With the exception of std::vector<bool>.
Threads accessing only const members of a map will not race with each other. This is specified in the requirements on library types, at the beginning of the library specification.
Threads accessing non-const members can race with threads accessing const or non-const members.
In other words, they're like pretty much any other object and have no extra thread safety guarantees. The standard library does not currently contain special thread safe containers.
As per C++11 23.2.2 Container data races /2:
Notwithstanding (17.6.5.9), implementations are required to avoid data races when the contents of the contained object in different elements in the same sequence, excepting vector<bool>, are modified concurrently.
Section 17.6.5.9 simply states the limitations applied to the implementation so that it won't cause data races.
That text basically means you have to handle your own race conditions, the containers themselves don't do this.
Is it safe to use std::map without a lock in a multi-threaded environment, where it is guaranteed that two threads will never be manipulating the same entry in the map?
You can freely manipulate different existing entries in the map (only the values, as the map APIs forbid mutation of the keys) but should use some synchronisation facility to write out the changes before any other thread attempts to access or mutate them.
For unordered_map, insert (even by []), emplace, erase, reserve, rehash (explicit or automatic), operator=, clear can't be done safely while other threads are doing more than accessing/mutating elements that they had already found the addresses of, as the above functions can modify the underlying hash table data structure and per-bucket linked lists that track elements. "more than" includes things like find, [] even on an existing element, at, equal_range, even empty, size and load_factor and all the bucket operations.
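A small sketch of the "different existing entries" case, assuming the table is fully populated before the threads start and each thread only touches its own key. The function name bump is hypothetical. It uses at(), which never inserts, so the hash table structure is left alone.

```cpp
#include <stdexcept>
#include <unordered_map>

// Add delta to the value stored under an existing key.
// at() throws std::out_of_range for a missing key instead of inserting,
// so this never rehashes or relinks buckets.
inline void bump(std::unordered_map<int, long>& counters, int key, long delta) {
    counters.at(key) += delta;
}
```

Two threads calling bump with different keys modify different elements, which is the case the container guarantees cover. Two threads using the same key would still race on the long itself, and any thread inserting or erasing at the same time would break the assumption entirely.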

`pthread_mutex_t`, `sem_t` in a `std::map`

Basically I'm maintaining a set of states for a bunch of objects:
#define SUBSCRIPTION_TYPE int
std::map< SUBSCRIPTION_TYPE , bool > is_object_valid;
And I need to protect each element in is_object_valid with its own mutex_t (rather than one big lock). As valid values of SUBSCRIPTION_TYPE are sparse (say, dozens of values ranging from 0 to 10000000), a std::map is preferred over std::vector, a C-style array, etc.
I'm trying to achieve something like:
std::map< SUBSCRIPTION_TYPE , pthread_mutex_t > mutex_array;
But it doesn't seem to work. (Well, a data race may occur when the std::map is being updated.)
So what's the best way to achieve this? Must I write a thread-safe subscription allocator that maps SUBSCRIPTION_TYPE into consecutive integers so that I can store the mutex_ts in an array?
If any thread is modifying the map itself (inserting, etc.), you
need to protect all accesses to the map. After that: if the
member is just a bool, how much processing can you be doing on
it that adding this time to the time the map level mutex is held
would change anything?
Otherwise: if you need a mutex per object, the simple solution
would be to put them into same object as the one on the map.
But is mutex_t copyable? pthread_mutex_t and std::mutex
aren't. This could make the insertion code overly complex,
since you can't initialize the pthread_mutex_t, or construct
the std::mutex, before the object is inserted. (In C++11, you
could use emplace to solve this problem; contents of a map
don't have to be copyable if you use emplace.) In C++03,
however, you'll have to separate allocation from initialization;
the struct which contains your mapped value and the mutex will
in fact have to be declared with raw memory for the mutex, and
then placement new used to initialize it using the iterator you
get back from insert.
If you have multiple threads reading and writing to the mutex_array, you will need another mutex to guard it.
Are you sure you will have multiple threads writing to the mutex_array?
The other thing is, instead of having two maps, you can have a map<subscription_type, object_struct>
struct object_struct {
    bool valid;
    pthread_mutex_t mutex;
};
And then have a single overarching mutex to guard that map.
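Putting the two suggestions together, a sketch could look like the following. It swaps pthread_mutex_t for std::mutex (so it assumes C++11), and the names ObjectState and Registry are made up for illustration. It also assumes entries are never erased, which is what makes it safe to keep using a pointer to an element after the map-level lock has been released.

```cpp
#include <map>
#include <mutex>
#include <tuple>
#include <utility>

// Hypothetical mapped value: the flag plus the mutex that guards it.
struct ObjectState {
    bool valid = false;
    std::mutex mtx;
};

class Registry {
public:
    // Insert a new entry. std::mutex cannot be copied or moved, so the
    // value is constructed in place via piecewise emplace.
    void add(int id) {
        std::lock_guard<std::mutex> lock(map_mtx_);
        objects_.emplace(std::piecewise_construct,
                         std::forward_as_tuple(id),
                         std::forward_as_tuple());
    }

    // Returns false if the id is unknown.
    bool set_valid(int id, bool value) {
        ObjectState* obj = lookup(id);
        if (!obj) return false;
        std::lock_guard<std::mutex> lock(obj->mtx);
        obj->valid = value;
        return true;
    }

    // Unknown ids are reported as not valid.
    bool is_valid(int id) {
        ObjectState* obj = lookup(id);
        if (!obj) return false;
        std::lock_guard<std::mutex> lock(obj->mtx);
        return obj->valid;
    }

private:
    // The map-level mutex is held only for the lookup. std::map nodes do
    // not move on insertion, so the pointer stays usable as long as
    // nothing is erased (an assumption of this sketch).
    ObjectState* lookup(int id) {
        std::lock_guard<std::mutex> lock(map_mtx_);
        std::map<int, ObjectState>::iterator it = objects_.find(id);
        return it == objects_.end() ? nullptr : &it->second;
    }

    std::mutex map_mtx_;
    std::map<int, ObjectState> objects_;
};
```

For a plain bool the per-object mutex is probably overkill, as noted above; the layout only starts to pay off when the work done under the per-object lock is long enough that a single map-level lock would become a bottleneck.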

Threads and string literals

Is it valid (defined behavior) to access a string literal simultaneously with multiple threads? Given a function like this:
const char* give()
{
    return "Hello, World!";
}
Would it be safe to call the function and dereference the pointer simultaneously?
Edit: Many answers. Will accept the first one who can show me the section out of the standard.
According to the standard:
C++11 1.10/3: The value of an object visible to a thread T at a particular point is the initial value of the object, a value assigned to the object by T, or a value assigned to the object by another thread, according to the rules below.
A string literal, like any other constant object, cannot legally be assigned to; it has static storage duration, and so is initialised before the program starts; therefore, all threads will see its initial value at all times.
Older standards had nothing to say about threads; so if your compiler doesn't support the C++11 threading model then you'll have to consult its documentation for any thread-safety guarantees. However, it's hard to imagine any implementation under which access to immutable objects were not thread-safe.
Yes, it's safe. Why wouldn't it be? It would be unsafe if you'd try to modify the string, but that's illegal anyway.
It is always safe to access immutable data from multiple threads. String literals are an example of immutable data (since it's illegal to modify them at run-time), so it is safe to access them from multiple threads.
As long as you only read data, you can access it from as many threads as you want. When data needs to be changed, that's when it gets complicated.
This depends on the implementation of the C Compiler. But I do not know of an implementation where concurrent read accesses might be unsafe, so in practice this is safe.
String literals are (conceptually) stored in read only memory and initialised on loading (rather than at runtime). It's therefore safe to access them from multiple threads at any time.
Note that more complex structures might not be initialised at load time, and so multiple thread access might have the possibility of issues immediately after the creation of the object.
But string literals are completely safe.

is `std::map<..> a; blah = a[abcd];` thread safe if abcd was not created before this call?

So we created a map. We want to get some_type blah = map_variable[some_not_inserted_yet_value]; this would add a new item to the map if one was not previously created. So I wonder whether a read is really thread-safe with std::map, or whether the only thread-safe way is try{ ...find(..)->second...?
The idea that calling find(...)->second is thread-safe depends heavily on your view of thread safety. If you simply mean that it won't crash, then as long as no one is mutating the dictionary at the same time you're reading it, I suppose you're okay.
That said, indeed, no matter what your minimum thread safety requirements are, calling the operator[] method is inherently not thread-safe as it can mutate the collection.
If a method has no const overload, it means it can mutate the object, so unless the documentation indicates methods are thread-safe, the method is very unlikely to be.
Then again, a const method might not be thread-safe as well, because your object could depend on non-const global state or have mutable fields, so you'll want to be very, very careful if you use unsynchronized classes as if they were.
If you're 100% sure that the map contains the key, then it is technically thread-safe if all other threads are also only invoking read-only methods on the map. Note however, that there is no const version of map<k,v>::operator[](const k&).
The correct way to access the map in a thread-safe fashion is indeed:
map<k,v>::const_iterator match = mymap.find(key);
if ( match != mymap.end() ) {
    // found item.
}
As stated before, this only applies if all concurrent access is read-only. One way this can be guaranteed is to use a readers-writers lock.
Note that in C++03, there is no mention of threads in the standard, so even that is not guaranteed to be thread-safe. Make sure to check your implementation's documentation.
Standard library containers have no notion of thread safety. You have to synchronize concurrent read/write access to the container yourself.
try has nothing to do with multithreading. It's used for exception handling.
find does not throw an exception if the key is not found. If the key is not found, find returns the map's end() iterator.
You are correct, operator[] is not "thread safe" because it can mutate the container. You should use the find method and compare the result to std::map::end to see if it found the item. (Also notice that find has a const version while operator[] does not).
Like others have said, the version of C++ before C++11 has no notion of threads or thread safety. However, you can feel safe using find without synchronization because it doesn't change the container, so it's only doing read operations (unless you have a weird implementation, so make sure to check the docs). As with most containers, reading from it from different threads won't cause any harm, however writing to it might.
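One way to keep operator[] out of the read path is to route lookups through a helper that takes the map by const reference, since operator[] will not even compile on a const map. The sketch below uses a hypothetical name, find_ptr, and assumes that no thread modifies the map while readers are running.

```cpp
#include <map>

// Read-only lookup: returns a pointer to the mapped value, or nullptr
// if the key is absent. Taking the map by const reference guarantees
// that nothing here can insert into it.
template <class K, class V>
const V* find_ptr(const std::map<K, V>& m, const K& key) {
    typename std::map<K, V>::const_iterator it = m.find(key);
    return it == m.end() ? nullptr : &it->second;
}
```

A caller checks the result before dereferencing, e.g. if (const int* v = find_ptr(m, key)) { use *v }. A missing key leaves the map unchanged rather than silently creating an entry.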