Acquiring a lock by checking a condition and rechecking it - C++

Is something like this valid?

std::vector<std::vector<int>> data;
std::shared_mutex m;
...
// AreAllVectorsEmpty:
//   std::all_of(data.begin(), data.end(), [](auto& v) { return v.empty(); })
void Resize() {
    if (!AreAllVectorsEmpty()) {
        m.lock();
        if (!AreAllVectorsEmpty()) {
            data.resize(new_size);
        }
        m.unlock();
    }
}
I am checking AreAllVectorsEmpty() and then, if the condition succeeds, taking the lock and checking the same condition again before doing the resize.
Would this be thread safe? Resize is only called by one thread, but other threads manipulate elements of data.
Is it a requirement that AreAllVectorsEmpty have a memory fence or acquire semantics?
Edit: Other threads would of course block when m.lock is acquired by Resize.
Edit: Let's also assume new_size is large enough that reallocation happens.
Edit: Update code for shared_mutex.
Edit: AreAllVectorsEmpty is iterating over the data vector. Nobody else modifies the data vector itself, but data[0], data[1], etc. are modified by other threads. My assumption is that since data[0]'s size variable is inside the vector and is a simple integer, it is safe to access data[0].size(), data[1].size(), etc. in the Resize thread. AreAllVectorsEmpty is iterating over data and checking vector.empty().

I would use a shared_mutex and use:
a shared lock in all threads that just read the vector (while reading the vector)
a unique lock in this thread when resizing the vector
I think first checking for the size, then resizing it, is safe, provided that this is the only thread that modifies the contents of the vector.
A lock automatically implies a memory barrier, otherwise the lock would not make much sense.
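A minimal sketch of the scheme described in this answer, keeping the names from the question (AreAllVectorsEmpty, new_size, data, m); ReadFront is a hypothetical reader added purely for illustration, and the concrete sizes are placeholders:

```cpp
#include <algorithm>
#include <mutex>
#include <shared_mutex>
#include <vector>

std::vector<std::vector<int>> data(4);
std::shared_mutex m;
std::size_t new_size = 8;  // assumed large enough to force reallocation

bool AreAllVectorsEmpty() {
    return std::all_of(data.begin(), data.end(),
                       [](const std::vector<int>& v) { return v.empty(); });
}

// Hypothetical reader thread: takes a shared lock while touching data[i]
int ReadFront(std::size_t i) {
    std::shared_lock<std::shared_mutex> lock(m);
    return data[i].empty() ? -1 : data[i].front();
}

// Resizing thread: the check and the resize both happen under the
// exclusive lock, so no reader can observe the reallocation mid-flight
void Resize() {
    std::unique_lock<std::shared_mutex> lock(m);
    if (!AreAllVectorsEmpty()) {
        data.resize(new_size);
    }
}
```

Note that here the condition is evaluated under the exclusive lock rather than before it, which sidesteps the question of whether the unlocked check races with writers.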

The answer depends entirely on how AreAllVectorsEmpty is implemented.
If it just checks a flag that can be set atomically, then yes, it is safe. If it iterates over the vector you intend to change (or other commonly used containers), then no, it is not safe: what happens to outstanding iterators if the vector reallocates internally?
If doing the latter, you need a read/write lock mechanism, have a look at shared mutexes.
You'd then acquire the shared lock before checking, and in case of modification, the exclusive lock.
Be aware that if AreAllVectorsEmpty uses some independent data structure (other than the mentioned atomic flag), you might have to protect that one with a separate mutex as well.

The standard does not seem to guarantee that this works; compare http://en.cppreference.com/w/cpp/container#Thread_safety. Does it work with your specific compiler and STL? You would need to look into the sources, but I would not rely on it.
This brings me to the question: why do you want to do it? For performance reasons? Have you measured performance? Is it really a measurable performance hit when you lock before calling AreAllVectorsEmpty?
BTW, please don't directly lock the mutex, please use a std::lock_guard.
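For instance, the manual lock()/unlock() pair in the question could be replaced by a scope-bound guard (a sketch; new_size is a placeholder value):

```cpp
#include <mutex>
#include <shared_mutex>
#include <vector>

std::vector<std::vector<int>> data;
std::shared_mutex m;
std::size_t new_size = 16;  // placeholder

void Resize() {
    // the guard unlocks in its destructor, even if resize() throws
    std::lock_guard<std::shared_mutex> lock(m);
    data.resize(new_size);
}
```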

// AreAllVectorsEmpty: std::all_of(data.begin(), data.end(), [](auto& v) { return v.empty(); })
You are accessing internals of the inner vectors (calling empty()) while, at the same time, another thread could insert elements into one of the inner vectors -> data race.

Related

Iterating over vector in one thread while other may potentially change reference

I have a vector allow_list that is periodically updated in a thread while another serves a function that checks if a certain string is in that allow_list via:
if (std::find(allow_list->begin(), allow_list->end(), target_string) != allow_list->end()) {
    allow = true;
}
Now, the other thread may do something like this
// Some operation to a vector called allow_list_updated
allow_list = allow_list_updated;
Should I add a mutex here to lock and unlock before and after these operations? My intuition tells me it's "ok" and shouldn't crash and burn but this seems like undefined behavior to me.
You have a race condition and you need to lock. Simple rule: if one thread can read a variable that another thread writes non-atomically, you have a race on that variable. Another point: you need to lock the whole vector. If you have lots of reads and rare writes, std::shared_mutex might be a good idea. If allow_list is only updated at the edges, a list would be a better option for allow_list = allow_list_updated, since swapping a list only requires swapping the head and tail. Another potential advantage of a list is the lack of false sharing. Whatever you do, the container and its protection should live in one class.
When you update a vector, all iterators may become invalid, because the vector may reallocate its contents in memory. It may work sometimes, but it will eventually segfault when you access an item that has moved. The other reason is that if you delete elements, your iterator could end up pointing out of bounds or could skip over entries. So you definitely need to perform some locking in both threads; which kind of lock depends on the rest of your code.
I would also recommend either std::swap or std::move instead of allow_list = allow_list_updated, depending on whether allow_list_updated can be discarded after the change; it's much faster. If you're updating this list frequently, you'll probably want to use std::swap, keep the two lists in scope somewhere, and just .clear() and std::swap() them on each update. This helps combat memory fragmentation. Example:
class updater
{
public:
    std::vector<std::string> allowed;
    std::vector<std::string> allowed_updated;

    void update()
    {
        // #TODO: do your update to this->allowed_updated; use
        // this->allowed_updated.reserve() if you know how many items there will be
        std::swap(this->allowed, this->allowed_updated);
        this->allowed_updated.clear();
    }
};
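For completeness, here is a sketch that combines the swap idea with a mutex (the mutex and the is_allowed helper are additions for illustration; the snippet above leaves the locking to the reader):

```cpp
#include <algorithm>
#include <mutex>
#include <string>
#include <vector>

class updater {
public:
    // Readers take the lock only for the duration of the lookup
    bool is_allowed(const std::string& target) {
        std::lock_guard<std::mutex> lock(m_);
        return std::find(allowed_.begin(), allowed_.end(), target) != allowed_.end();
    }

    // Build the new list outside the lock, then swap under it;
    // the swap itself is O(1), so readers are blocked only briefly
    void update(std::vector<std::string> fresh) {
        allowed_updated_ = std::move(fresh);
        std::lock_guard<std::mutex> lock(m_);
        std::swap(allowed_, allowed_updated_);
        allowed_updated_.clear();  // clear() keeps the capacity for reuse
    }

private:
    std::mutex m_;
    std::vector<std::string> allowed_;
    std::vector<std::string> allowed_updated_;
};
```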

Atomic/not-atomic mix, any guarantees?

Let's say I have a GUI thread with code like this:
std::vector<int> vec;
std::atomic<bool> data_ready{false};
std::thread th([&data_ready, &vec]() {
    // we get data
    vec.push_back(something);
    data_ready = true;
});
draw_progress_dialog();
while (!data_ready) {
    process_not_user_events();
    sleep_a_little();
}
// is it safe to use vec here?
As you can see, I do not protect vec with any kind of lock, but I do not use vec in two threads at the same moment; the only problem is memory access reordering.
Is it impossible, according to the C++11 standard, that some modifications to vec happen after data_ready = true;?
It is not clear (to me) from the documentation whether memory ordering is relevant only for other atomics, or not.
Plus, a question: is the default memory order what I want, or do I have to change the memory order?
As long as the memory order used is at least acquire/release (the default, seq_cst, is even stronger), you are guaranteed to see all updates the writing thread made before setting the flag to true (not just the ones to atomic variables) as soon as you observe that the flag is true.
So yes, this is fine.
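A stripped-down, runnable version of that pattern (42 stands in for the real data, and std::this_thread::yield() stands in for process_not_user_events(); the explicit release/acquire orders make the synchronization visible, though the defaults are at least as strong):

```cpp
#include <atomic>
#include <thread>
#include <vector>

int publish_and_wait() {
    std::vector<int> vec;
    std::atomic<bool> data_ready{false};

    std::thread th([&] {
        vec.push_back(42);                                  // plain, non-atomic write...
        data_ready.store(true, std::memory_order_release);  // ...published by this store
    });

    while (!data_ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // stand-in for process_not_user_events()
    }
    // the acquire load synchronized with the release store,
    // so the push_back is guaranteed to be visible here
    int result = vec[0];
    th.join();
    return result;
}
```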

std::vector multithreaded synchronization with one reader and one writer: Only locking when resizing

I have a vector (a large buffer of strings) which I hope to read from a network thread while it is being written to by the main thread.
As vector is not thread-safe, the typical approach would be to lock the access to it from both threads.
However, my insight has been that this should only be strictly necessary when a particular write resizes the vector, which causes a realloc, which causes any concurrent readers to have the carpet yanked out from under them. At all other times, provided that the concurrent reader is not reading the entry currently being written to, reading and writing to the vector should be allowed to happen concurrently.
So, I'd want to (somehow) only lock the mutex from the write thread when I know that the write I am about to do will cause a resize operation that will realloc the vector.
Is this practical? Can I deduce based on the value returned by capacity and the current size, for certain whether or not the next push_back will resize?
I am assuming you are writing into the vector by doing push_back.
You can use boost::shared_mutex, which allows multiple readers and one writer.
vector<string> vBuffer;
boost::shared_mutex _access;

void reader()
{
    // get shared access
    boost::shared_lock<boost::shared_mutex> lock(_access);
    // now we have shared access
}

void writer(int index, const string& data)
{
    // get upgradable access
    boost::upgrade_lock<boost::shared_mutex> lock(_access);
    if (vBuffer.size() <= index) // test here whether you can safely write
    {
        // get exclusive access
        boost::upgrade_to_unique_lock<boost::shared_mutex> uniqueLock(lock);
        // now we have exclusive access
        vBuffer.resize(index + 1);
    }
    vBuffer[index] = data;
}
I adapted this example:
Example for boost shared_mutex (multiple reads/one write)?
Hope that helps, feel free to correct me if I am wrong
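On the question of predicting reallocation: yes, this can be deduced. A push_back reallocates exactly when size() equals capacity(); if size() < capacity(), the standard guarantees no reallocation occurs. A one-line sketch (assuming no other thread touches the vector between the check and the push_back):

```cpp
#include <vector>

// true iff the next push_back on v will reallocate
bool next_push_back_reallocates(const std::vector<int>& v) {
    return v.size() == v.capacity();
}
```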

C++ Vector is this thread-safe? multithreading

So i have multiple threads accessing this function to retrieve database information, is it thread safe?
vector<vector<string> > Database::query(const char* query)
{
    pthread_rwlock_wrlock(&mylock); // Write-lock
    ...
    vector<vector<string> > results;
    results.push...
    pthread_rwlock_unlock(&mylock); // Unlock
    return results;
}
For editors -> sometimes 'fixing' > > to >> is not a good idea, but thanks for the rest.
Since results is a local variable, it is in itself safe to use without locks, since there will be a unique copy per thread (it is on the stack, the contents of the vector dynamically allocated in some way, etc). So as long as your database is thread safe, you don't need any locks at all. If the DB is not threadsafe, you need to protect that, of course.
As noted in the other answer, if for some reason the creation of a string throws std::bad_alloc, for example, you need to deal with the fallout of that and make sure the lock is unlocked (unless you really wish to deadlock all the other threads!).
Generally speaking, multiple threads can hold "read" locks, only one thread can hold the "write" lock, and no "read" locks may be held while there is a "write" lock.
It means that while mylock is held locked inside query method, no-one else can have it locked for either read or write, so it is thread-safe. You can read more about readers-writer lock here. Whether you need that mutex locked in there or not is another question.
The code is not exception-safe, however. You must employ RAII in order to unlock a mutex automatically, including on stack unwinding.
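For example, a minimal RAII guard for the pthread lock used above might look like this (WriteLockGuard is an illustrative name, not a standard class; in C++17 you could instead use std::shared_mutex with std::lock_guard and std::shared_lock):

```cpp
#include <pthread.h>

// Minimal RAII guard: locks in the constructor, unlocks in the destructor,
// so the lock is released on any exit path, including stack unwinding
class WriteLockGuard {
public:
    explicit WriteLockGuard(pthread_rwlock_t& l) : lock_(l) {
        pthread_rwlock_wrlock(&lock_);
    }
    ~WriteLockGuard() { pthread_rwlock_unlock(&lock_); }
    WriteLockGuard(const WriteLockGuard&) = delete;
    WriteLockGuard& operator=(const WriteLockGuard&) = delete;

private:
    pthread_rwlock_t& lock_;
};
```

In query(), the explicit lock/unlock pair would then be replaced by a single `WriteLockGuard guard(mylock);` at the top of the critical section.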
It is thread-safe because results is created as a local variable, so only one thread will ever access any given instance of results within this method.
If you need a thread-safe vector for some other reason, see this answer on Threadsafe Vector Class for C++.

C++ Access to vector from multiple threads

In my program I have some threads running. Each thread gets a pointer to some object (in my program, a vector), and each thread modifies the vector.
And sometimes my program fails with a segfault. I thought it occurred because thread A begins doing something with the vector while thread B hasn't finished operating on it. Can that be true?
How am I supposed to fix it? Thread synchronization? Or maybe I should add a flag VectorIsInUse and set it to true while operating on the vector?
vector, like all STL containers, is not thread-safe. You have to explicitly manage the synchronization yourself. A std::mutex or boost::mutex could be used to synchronize access to the vector.
Do not use a flag as this is not thread-safe:
Thread A checks value of isInUse flag and it is false
Thread A is suspended
Thread B checks value of isInUse flag and it is false
Thread B sets isInUse to true
Thread B is suspended
Thread A is resumed
Thread A still thinks isInUse is false and sets it true
Thread A and Thread B now both have access to the vector
Note that each thread will have to lock the vector for the entire time it needs to use it. This includes modifying the vector and using the vector's iterators, as iterators can become invalidated if the element they refer to is erased with erase() or the vector undergoes an internal reallocation. For example do not:
mtx.lock();
std::vector<std::string>::iterator i = the_vector.begin();
mtx.unlock();
// 'i' can become invalid if the `vector` is modified.
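A sketch of the correct pattern, keeping the guard alive for the whole traversal (total_length is an illustrative function; mtx and the_vector match the names in the snippet above):

```cpp
#include <mutex>
#include <string>
#include <vector>

std::mutex mtx;
std::vector<std::string> the_vector;

// Hold the lock for the entire loop so no other thread can
// invalidate the iterator mid-traversal
std::size_t total_length() {
    std::lock_guard<std::mutex> lock(mtx);
    std::size_t n = 0;
    for (std::vector<std::string>::iterator i = the_vector.begin();
         i != the_vector.end(); ++i) {
        n += i->size();
    }
    return n;
}
```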
If you want a container that is safe to use from many threads, you need to use a container that is explicitly designed for the purpose. The interface of the Standard containers is not designed for concurrent mutation or any kind of concurrency, and you cannot just throw a lock at the problem.
You need something like TBB or PPL which has concurrent_vector in it.
That's why pretty much every class library that offers threads also has synchronization primitives such as mutexes/locks. You need to set up one of these and acquire/release the lock around every operation on the shared item (read AND write operations, since you need to prevent reads from occurring during a write too, not just prevent multiple writes happening concurrently).