Is the following a correct implementation of a blocking queue?
Correct as in:
Thread-safe, i.e., it has the semantics of a queue in a multithreaded scenario.
If empty, it blocks when trying to retrieve an element.
If full (std::u8::MAX elements in this case), it blocks when trying to add a new element.
From my static analysis, I would say it is safe from deadlocks; but I know how tricky synchronization can be.
Is there anything obviously wrong with this implementation that would prevent me from using it in production? (hypothetical scenario)
If this implementation is correct, what are the performance drawbacks compared to a proper BlockingQueue, and why?
Code
use std::collections::VecDeque;
use crate::Semaphore;
use std::sync::Mutex;
pub struct BlockingQueue<T> {
    // Counts stored elements; poll() blocks on it while the queue is empty.
    non_empty_queue: Semaphore,
    // Counts free slots; offer() blocks on it while the queue is full.
    bounded_queue: Semaphore,
    data: Mutex<VecDeque<T>>,
}

impl<T> BlockingQueue<T> {
    pub fn poll(&self) -> T {
        self.non_empty_queue.decrement();
        let mut lock_guard = self.data.lock().expect("Unable to acquire lock ...");
        let result = lock_guard.pop_back().expect("Major flaw!");
        self.bounded_queue.increment();
        result
    }

    pub fn offer(&self, t: T) {
        self.bounded_queue.decrement();
        let mut lock_guard = self.data.lock().expect("Unable to acquire lock ...");
        lock_guard.push_front(t);
        self.non_empty_queue.increment();
    }

    pub fn new() -> BlockingQueue<T> {
        BlockingQueue {
            non_empty_queue: Semaphore::new(0),
            bounded_queue: Semaphore::new(std::u8::MAX),
            data: Mutex::new(VecDeque::new()),
        }
    }
}
Notes:
The Semaphore is of my own creation. Let's just assume it is correctly implemented.
bounded_queue attempts to prevent insertions beyond std::u8::MAX elements.
non_empty_queue attempts to prevent "pops" when empty.
Related
I'm using asio for async IO, but there are times when I'd like to "escape" the async world and get my data back into the regular synchronous world.
For instance, consider that I have a std::deque<string> _data that is being used in my async process (in a single thread always running in the background), and for which I've created async functions to read / write from it.
What is the "natural" way to read from this deque in a synchronous way from another thread?
So far I've used atomics to do this but this feels a bit "wrong".
For example:
std::string getDataSync()
{
    std::atomic<int> signal = 0;
    std::string str;

    asio::post(io_context, [this, &signal, &str] {
        str = _data.front();
        _data.pop_front();
        signal = 1;
    });

    while (signal == 0) { } // busy-wait until the posted handler has run
    return str;
}
Is it ok to do this?
Does asio provide anything cleaner for this kind of operation?
Thanks
If you want to synchronize two threads, then you have to use synchronization primitives (like std::atomic). Asio doesn't provide more advanced primitives, but the STL (and Boost) is full of them. For your simple example, you might want to use std::future and std::promise to move the top item of the deque to another thread.
Here is a small example. I assume that you don't want to access the deque directly from the other thread, just the top item. I also assume that you are running io_context::run in another thread.
#include <boost/asio.hpp>
#include <future>
#include <iostream>
#include <string>
#include <thread>

std::string pop_from_queue() { return "hello world"; }

int main() {
    auto context = boost::asio::io_context{};
    auto promise = std::promise<std::string>{};
    auto result = promise.get_future();

    boost::asio::post(context,
                      [&promise] { promise.set_value(pop_from_queue()); });

    auto thread = std::thread{[&context] { context.run(); }};
    std::cout << result.get(); // blocks until the posted handler sets the value
    thread.join();
}
I'm trying to run a list of futures concurrently (instead of in sequence) in Rust async-await (being stabilized soon), until any of them resolves to true.
Imagine having a Vec<File>, and a future to run for each file yielding a bool (may be unordered). Here would be a simple sequenced implementation.
async fn my_function(files: Vec<File>) -> bool {
    // Run the future on each file, return early if we received true
    for file in files {
        if long_future(file).await {
            return true;
        }
    }

    false
}

async fn long_future(file: File) -> bool {
    // Some long-running task here...
}
This works, but I'd like to run a few of these futures concurrently to speed up the process. I came across buffer_unordered() (on Stream), but couldn't figure out how to implement this.
As I understand it, something like join can be used as well to run futures concurrently, given that you have a multithreaded pool. But I don't see how that could efficiently be used here.
I attempted something like this, but couldn't get it to work:
let any_true = futures::stream::iter(files)
    .buffer_unordered(4) // Run up to 4 concurrently
    .map(|file| long_future(file).await)
    .filter(|stop| stop) // Only propagate true values
    .next() // Return early on first true
    .is_some();
Along with that, I'm looking for something like any as used in iterators, to replace the if-statement or the filter().next().is_some() combination.
How would I go about this?
I think that you should be able to use select_ok, as mentioned by Some Guy. An example, in which I've replaced the files with a bunch of u32 for illustration:
use futures::future::FutureExt;

async fn long_future(file: u32) -> bool {
    true
}

async fn handle_file(file: u32) -> Result<(), ()> {
    let should_stop = long_future(file).await;
    // Would be better if there were something more descriptive here
    if should_stop {
        Ok(())
    } else {
        Err(())
    }
}

async fn tims_answer(files: Vec<u32>) -> bool {
    let waits = files.into_iter().map(|f| handle_file(f).boxed());
    let any_true = futures::future::select_ok(waits).await.is_ok();
    any_true
}
I have a blocking queue (it would be really hard for me to change its implementation), and I want to test that it actually blocks. In particular, the pop methods must block if the queue is empty and unblock as soon as a push is performed. See the following pseudo C++11 code for the test:
BlockingQueue queue; // empty queue

thread pushThread([&]
{
    sleep(large_delay);
    queue.push();
});

queue.pop();
Obviously it is not perfect, because it may happen that the whole thread pushThread is executed and terminates before pop is called, even if the delay is large; and the larger the delay, the longer I have to wait for the test to finish.
How can I properly ensure that pop is executed before push is called, and that it blocks until push returns?
I do not believe this is possible without adding some extra state and interfaces to your BlockingQueue.
Proof goes something like this. You want to wait until the reading thread is blocked on pop. But there is no way to distinguish between that and the thread being about to execute the pop. This remains true no matter what you put just before or after the call to pop itself.
If you really want to fix this with 100% reliability, you need to add some state inside the queue, guarded by the queue's mutex, that means "someone is waiting". The pop call then has to update that state just before it atomically releases the mutex and goes to sleep on the internal condition variable. The push thread can obtain the mutex and wait until "someone is waiting". To avoid a busy loop here, you will want to use the condition variable again.
All of this machinery is nearly as complicated as the queue itself, so maybe you will want to test it, too... This sort of multi-threaded code is where concepts like "code coverage" -- and arguably even unit testing itself -- break down a bit. There are just too many possible interleavings of operations.
In practice, I would probably go with your original approach of sleeping.
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <initializer_list>
#include <mutex>
#include <utility>

template<class T>
struct async_queue {
    T pop() {
        auto l = lock();
        ++wait_count; // advertise that a reader is (about to be) waiting
        cv.wait( l, [&]{ return !data.empty(); } );
        --wait_count;
        auto r = std::move(data.front());
        data.pop_front();
        return r;
    }
    void push(T in) {
        {
            auto l = lock();
            data.push_back( std::move(in) );
        }
        cv.notify_one();
    }
    void push_many(std::initializer_list<T> in) {
        {
            auto l = lock();
            for (auto&& x : in)
                data.push_back( x );
        }
        cv.notify_all();
    }
    std::size_t readers_waiting() const {
        return wait_count;
    }
    std::size_t data_waiting() const {
        auto l = lock();
        return data.size();
    }
private:
    std::deque<T> data; // std::deque rather than std::queue: pop() needs front()/pop_front()
    std::condition_variable cv;
    mutable std::mutex m;
    std::atomic<std::size_t> wait_count{0};
    auto lock() const { return std::unique_lock<std::mutex>(m); }
};
or somesuch.
In the push thread, busy wait on readers_waiting until it reaches 1.
At that point the reader has taken the lock, bumped the counter, and is inside cv.wait by the time the lock can be acquired again. Do a push.
In theory an infinitely slow reader thread could have gotten into cv.wait and still be evaluating the predicate lambda by the time you call push, but an infinitely slow reader thread is no different from a blocked one...
This does, however, deal with slow thread startup and the like.
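For example, the test could look roughly like this (a sketch against the async_queue above; 42 is just an arbitrary payload):

#include <cassert>
#include <thread>

void test_pop_blocks_until_push() {
    async_queue<int> queue;

    std::thread reader([&] {
        int v = queue.pop(); // should block here until the push below happens
        assert(v == 42);
    });

    // Spin until the reader has announced itself inside pop().
    while (queue.readers_waiting() < 1)
        std::this_thread::yield();

    // The reader now either still holds the queue lock or is parked in cv.wait,
    // so push() cannot complete before pop() has effectively started.
    queue.push(42);
    reader.join();
}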
Using readers_waiting and data_waiting for anything other than debugging is usually code smell.
You can use a std::condition_variable to accomplish this. The documentation page on cppreference.com actually shows a very nice producer-consumer example which should be exactly what you are looking for: http://en.cppreference.com/w/cpp/thread/condition_variable
EDIT: Actually the German version of cppreference.com has an even better example :-) http://de.cppreference.com/w/cpp/thread/condition_variable
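The core of that example boils down to something like this (a minimal sketch, not the exact cppreference listing):

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

std::mutex m;
std::condition_variable cv;
std::queue<int> items;

void producer() {
    for (int i = 0; i < 5; ++i) {
        {
            std::lock_guard<std::mutex> lock(m);
            items.push(i);
        }
        cv.notify_one(); // wake the consumer after releasing the lock
    }
}

void consumer() {
    for (int received = 0; received < 5; ++received) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return !items.empty(); }); // blocks while the queue is empty
        std::cout << items.front() << '\n';
        items.pop();
    }
}

int main() {
    std::thread c(consumer), p(producer);
    p.join();
    c.join();
}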
I have a shared Vec<CacheChange>. Whenever a new CacheChange is written I want to wake up readers. I recall that a Condvar is good for signaling when a predicate/situation is ready, namely, when the Vec is modified.
So I spent some time creating a Monitor abstraction to own the Vec and provide wait and lock semantics.
The problem now is I don't know when to reset the Condvar. What is a good way to give readers a reasonable amount of time to hit the predicate and work their way to holding the lock before closing the Condvar? Am I approaching Condvars the wrong way?
This is Rust code but this more a question of fundamentals for exact concurrent access/notification between multiple readers.
use std::sync;

pub struct Monitor<T>(
    sync::Arc<MonitorInner<T>>
);

struct MonitorInner<T> {
    data: sync::Mutex<T>,
    predicate: (sync::Mutex<bool>, sync::Condvar)
}

impl<T> Monitor<T> {
    pub fn wait(&self) -> Result<(), sync::PoisonError<sync::MutexGuard<bool>>> {
        let mut open = try!(self.0.predicate.0.lock());
        while !*open {
            open = try!(self.0.predicate.1.wait(open));
        }
        Ok(())
    }

    pub fn lock(&self) -> Result<sync::MutexGuard<T>, sync::PoisonError<sync::MutexGuard<T>>> {
        self.0.data.lock()
    }

    pub fn reset(&mut self) -> Result<(), sync::PoisonError<sync::MutexGuard<bool>>> {
        let mut open = try!(self.0.predicate.0.lock());
        *open = false;
        Ok(())
    }

    pub fn wakeup_all(&mut self) -> Result<(), sync::PoisonError<sync::MutexGuard<bool>>> {
        let mut open = try!(self.0.predicate.0.lock());
        *open = true;
        self.0.predicate.1.notify_all();
        Ok(())
    }
}
After the first wakeup call, my readers are able to miss reads, probably because they are still holding the data lock while the predicate has been toggled again. I've seen this in my test code with just one reader and one writer.
Then there's the complication of when to reset the Monitor; ideally it would be locked after all readers had their chance to look at the data. This could cause deadlock issues if the readers ignore their monitors (there is no guarantee they will service every wakeup call).
Do I need to use some kind of reader tracking system with timeouts and track when new data arrives while monitor reads are still being serviced? Is there an existing paradigm I should be aware of?
The simplest solution is to use a counter instead of a boolean.
struct MonitorInner<T> {
    data: sync::Mutex<T>,
    signal: sync::Condvar,
    counter: sync::atomic::AtomicUsize,
}
Then, every time an update is done, the counter is incremented. It is never reset, so there is no question about when to reset.
Of course, it means that readers should remember the value of the counter the last time they were woken up.
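Since this is more about the general pattern than about Rust, here is a minimal sketch of the counter idea in C++ (names are illustrative; the counter lives under the data mutex instead of being atomic, which is the simplest way to pair it with a condition variable):

#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <vector>

struct Monitor {
    std::mutex m;
    std::condition_variable cv;
    std::vector<int> data;        // stand-in for Vec<CacheChange>
    std::uint64_t generation = 0;

    void write(int change) {
        {
            std::lock_guard<std::mutex> lock(m);
            data.push_back(change);
            ++generation;         // never reset, so there is nothing to "close" again
        }
        cv.notify_all();
    }

    // Each reader keeps its own last_seen value and passes it in.
    void wait_for_new(std::uint64_t& last_seen) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return generation > last_seen; });
        last_seen = generation;
        // read `data` here while still holding the lock, or copy out what is needed
    }
};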
I am trying to make some central data structure of a large codebase multithreaded.
The access interfaces were changed to represent read/write locks, which may be up- and downgraded:
Before:
Container& container = state.getContainer();
auto value = container.find( "foo" )->bar;
container.clear();
Now:
ReadContainerLock container = state.getContainer();
auto value = container.find( "foo" )->bar;
{
    // Upgrade read lock to write lock
    WriteContainerLock write = state.upgrade( container );
    write.clear();
} // Downgrades write lock to read lock
Using an actual std::mutex for the locking (instead of an r/w implementation) works fine but brings no performance benefit (it actually degrades runtime).
Actual changes to the data are relatively rare, so it seems very desirable to go with the read/write concept. The big issue now is that I cannot seem to find any library which implements the read/write concept, supports upgrade and downgrade, and works on Windows, OSX and Linux alike.
Boost has BOOST_THREAD_PROVIDES_SHARED_MUTEX_UPWARDS_CONVERSIONS but does not seem to support downgrading or (blocking) atomic upgrading from shared to unique.
Is there any library out there, that supports the desired feature set?
EDIT:
Sorry for being unclear. Of course I mean multiple-readers/single-writer lock semantics.
The question has changed since I answered. As the previous answer is still useful, I will leave it up.
The new question seems to be "I want a (general purpose) reader writer lock where any reader can be upgraded to a writer atomically".
This cannot be done without deadlocks, or the ability to roll back operations (transactional reads), which is far from general-purpose.
Suppose you have Alice and Bob. Both want to read for a while, then they both want to write.
Alice and Bob both get a read lock. They then upgrade to a write lock. Neither can progress, because a write lock cannot be acquired while a read lock is acquired. You cannot unlock the read lock, because then the state Alice read while read locked may not be consistent with the state after the write lock is acquired.
This can only be solved with the possibility that the read->write upgrade can fail, or the ability to roll back all operations in a read (so Alice can "unread", Bob can advance, then Alice can re-read and try to get the write lock).
Writing type-safe transactional code isn't really supported in C++. You can do it manually, but beyond simple cases it is error prone. Other forms of transactional rollbacks can also be used. None of them are general purpose reader-writer locks.
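To make the Alice-and-Bob scenario concrete, here is what a naive "upgrade" on top of std::shared_mutex does (a deliberately deadlocking sketch; the names are only illustrative):

#include <atomic>
#include <shared_mutex>
#include <thread>

std::shared_mutex m;
std::atomic<int> readers_in{0};

void alice_or_bob() {
    m.lock_shared();                   // read for a while...
    ++readers_in;
    while (readers_in.load() < 2) { }  // wait until both hold the read lock
    m.lock();                          // "upgrade": needs every reader gone,
                                       // but neither will release -> deadlock
    m.unlock();
    m.unlock_shared();
}

int main() {
    std::thread alice(alice_or_bob), bob(alice_or_bob);
    alice.join();                      // never returns
    bob.join();
}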
You can roll your own. If the states are R, U, W and {} (read, upgradable, write and no lock), these are transitions you can easily support:
{} -> R|U|W
R|U|W -> {}
U->W
W->U
U->R
and implied by the above:
W->R
which I think satisfies your requirements.
The "missing" transition is R->U, which is what lets us have multiple-readers safely. At most one reader (the upgrade reader) has the right to upgrade to write without releasing their read lock. While they are in that upgrade state they do not block other threads from reading (but they do block other threads from writing).
Here is a sketch. There is a shared_mutex A; and a mutex B;.
B represents the right to upgrade to write and the right to read while you hold it. All writers also hold a B, so you cannot both have the right to upgrade to write while someone else has the right to write.
Transitions look like:
{}->R = read(A)
{}->W = lock(B) then write(A)
{}->U = lock(B)
U->W = write(A)
W->U = unwrite(A)
U->R = read(A) then unlock(B)
W->R = W->U->R
R->{} = unread(A)
W->{} = unwrite(A) then unlock(B)
U->{} = unlock(B)
This simply requires std::shared_mutex and std::mutex, and a bit of boilerplate to write up the locks and the transitions.
If you want to be able to spawn a write lock while the upgrade lock "remains in scope" extra work needs to be done to "pass the upgrade lock back to the read lock".
Here are some bonus try transitions, inspired by @HowardHinnant below:
R->try U = return try_lock(B) && unread(A)
R->try W = return R->try U->W
Here is an upgradeable_mutex with no try operations:
#include <mutex>
#include <shared_mutex>

struct upgradeable_mutex {
    std::mutex u;
    std::shared_timed_mutex s;

    enum class state {
        unlocked,
        shared,
        aspiring,
        unique
    };

    // one step at a time:
    template<state start, state finish>
    void transition_up() {
        transition_up<start, (state)((int)finish - 1)>();
        transition_up<(state)((int)finish - 1), finish>();
    }
    // one step at a time:
    template<state start, state finish>
    void transition_down() {
        transition_down<start, (state)((int)start - 1)>();
        transition_down<(state)((int)start - 1), finish>();
    }

    void lock();
    void unlock();
    void lock_shared();
    void unlock_shared();
    void lock_aspiring();
    void unlock_aspiring();
    void aspiring_to_unique();
    void unique_to_aspiring();
    void aspiring_to_shared();
    void unique_to_shared();
};
template<>
void upgradeable_mutex::transition_up<
    upgradeable_mutex::state::unlocked, upgradeable_mutex::state::shared
>() {
    s.lock_shared();
}

template<>
void upgradeable_mutex::transition_down<
    upgradeable_mutex::state::shared, upgradeable_mutex::state::unlocked
>() {
    s.unlock_shared();
}

template<>
void upgradeable_mutex::transition_up<
    upgradeable_mutex::state::unlocked, upgradeable_mutex::state::aspiring
>() {
    u.lock();
}

template<>
void upgradeable_mutex::transition_down<
    upgradeable_mutex::state::aspiring, upgradeable_mutex::state::unlocked
>() {
    u.unlock();
}

template<>
void upgradeable_mutex::transition_up<
    upgradeable_mutex::state::aspiring, upgradeable_mutex::state::unique
>() {
    s.lock();
}

template<>
void upgradeable_mutex::transition_down<
    upgradeable_mutex::state::unique, upgradeable_mutex::state::aspiring
>() {
    s.unlock();
}

template<>
void upgradeable_mutex::transition_down<
    upgradeable_mutex::state::aspiring, upgradeable_mutex::state::shared
>() {
    s.lock_shared(); // take shared ownership of A before giving up B
    u.unlock();
}

void upgradeable_mutex::lock() {
    transition_up<state::unlocked, state::unique>();
}
void upgradeable_mutex::unlock() {
    transition_down<state::unique, state::unlocked>();
}
void upgradeable_mutex::lock_shared() {
    transition_up<state::unlocked, state::shared>();
}
void upgradeable_mutex::unlock_shared() {
    transition_down<state::shared, state::unlocked>();
}
void upgradeable_mutex::lock_aspiring() {
    transition_up<state::unlocked, state::aspiring>();
}
void upgradeable_mutex::unlock_aspiring() {
    transition_down<state::aspiring, state::unlocked>();
}
void upgradeable_mutex::aspiring_to_unique() {
    transition_up<state::aspiring, state::unique>();
}
void upgradeable_mutex::unique_to_aspiring() {
    transition_down<state::unique, state::aspiring>();
}
void upgradeable_mutex::aspiring_to_shared() {
    transition_down<state::aspiring, state::shared>();
}
void upgradeable_mutex::unique_to_shared() {
    transition_down<state::unique, state::shared>();
}
I attempt to get the compiler to work out some of the above transitions "for me" with the transition_up and transition_down trick. I think I can do better, and it did increase code bulk significantly.
Having it 'auto-write' the unlocked-to-unique, and unique-to-(unlocked|shared) was all I got out of it. So probably not worth it.
Creating smart RAII objects that use the above is a bit tricky, as they have to support some transitions that the default unique_lock and shared_lock do not support.
You could just write aspiring_lock and then do conversions in there (either as operator unique_lock, or as methods that return said, etc), but the ability to convert from unique_lock&& down to shared_lock is exclusive to upgradeable_mutex and is a bit tricky to work with implicit conversions...
live example.
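A rough sketch of such an aspiring_lock (plus a scoped upgrade guard) on top of the upgradeable_mutex above, leaving out the implicit conversions:

struct aspiring_lock {
    explicit aspiring_lock(upgradeable_mutex& m) : m_(m) { m_.lock_aspiring(); }
    ~aspiring_lock() { m_.unlock_aspiring(); }
    aspiring_lock(aspiring_lock const&) = delete;
    aspiring_lock& operator=(aspiring_lock const&) = delete;
    upgradeable_mutex& mutex() { return m_; }
private:
    upgradeable_mutex& m_;
};

// While this guard is alive, the aspiring lock is upgraded to unique.
struct upgrade_guard {
    explicit upgrade_guard(aspiring_lock& l) : m_(l.mutex()) { m_.aspiring_to_unique(); }
    ~upgrade_guard() { m_.unique_to_aspiring(); }
    upgrade_guard(upgrade_guard const&) = delete;
    upgrade_guard& operator=(upgrade_guard const&) = delete;
private:
    upgradeable_mutex& m_;
};

// usage:
//   upgradeable_mutex mtx;
//   aspiring_lock al(mtx);       // may later upgrade; does not block readers
//   {
//       upgrade_guard wg(al);    // exclusive access within this scope
//       // ... modify ...
//   }                            // back to aspiring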
Here's my usual suggestion: Seqlock
You can have a single writer and many readers concurrently. Writers compete using a spinlock. A single writer doesn't need to compete so is cheaper.
Readers are truly only reading. They're not writing any state variables, counters, etc. This means you don't really know how many readers there are. But also, there is no cache-line ping-pong, so you get the best possible performance in terms of latency and throughput.
What's the catch? The data pretty much has to be POD. It doesn't really have to be POD, but it cannot be invalidated (no deleting std::map nodes), as readers may read it while it's being written.
It's only after the fact that readers discover the data is possibly bad and they have to re-read.
Yes, writers don't wait for readers so there's no concept of upgrade/downgrade. You can unlock one and lock the other. You pay less than with any sort of mutex but the data may have changed in the process.
I can go into more detail if you like.
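For reference, a minimal single-writer seqlock over a tiny payload could look like this (a sketch; the payload goes through relaxed atomics so the deliberately racy reads stay well-defined, and writers are assumed to already be serialized, e.g. by the spinlock mentioned above):

#include <atomic>
#include <cstdint>
#include <utility>

struct seqlock_pair {
    void write(int a, int b) {                       // single writer at a time
        std::uint32_t s = seq.load(std::memory_order_relaxed);
        seq.store(s + 1, std::memory_order_relaxed); // odd: write in progress
        std::atomic_thread_fence(std::memory_order_release);
        x.store(a, std::memory_order_relaxed);
        y.store(b, std::memory_order_relaxed);
        seq.store(s + 2, std::memory_order_release); // even again: write finished
    }

    std::pair<int, int> read() const {
        std::uint32_t s0, s1;
        int a, b;
        do {
            s0 = seq.load(std::memory_order_acquire);
            a = x.load(std::memory_order_relaxed);
            b = y.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire);
            s1 = seq.load(std::memory_order_relaxed);
        } while (s0 != s1 || (s0 & 1));              // torn or in-progress: re-read
        return {a, b};
    }

private:
    std::atomic<std::uint32_t> seq{0};
    std::atomic<int> x{0}, y{0};
};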
The std::shared_mutex (as implemented in Boost if not available on your platform(s)) provides an alternative for the problem.
For atomic upgrade lock semantics, the Boost upgrade lock may be the best cross-platform alternative.
It does not have the upgrade and downgrade locking mechanism you are looking for, but to get an exclusive lock, the shared access can be relinquished first and exclusive access sought afterwards.
// assumes a shared_lock with shared access has been obtained
ReadContainerLock container = state.getContainer();
auto value = container.find( "foo" )->bar;
{
    container.shared_mutex().unlock_shared();
    // Upgrade read lock to write lock
    std::unique_lock<std::shared_mutex> write(container.shared_mutex());
    // container work...
    write.unlock();
    container.shared_mutex().lock_shared();
} // Downgrades write lock to read lock
A utility class can be used to cause the re-locking of the shared_mutex at the end of the scope:
struct re_locker {
    re_locker(std::shared_mutex& m) : m_(m) { m_.unlock_shared(); }
    ~re_locker() { m_.lock_shared(); }

    // delete the copy and move constructors and assignment operator (redacted for simplicity)

private:
    std::shared_mutex& m_;
};
// ...
auto value = container.find( "foo" )->bar;
{
    re_locker re_lock(container.shared_mutex());
    // Upgrade read lock to write lock
    std::unique_lock<std::shared_mutex> write(container.shared_mutex());
    // container work...
} // Downgrades write lock to read lock
Depending on what exception guarantees you want or require, you may need to add a "can re-lock" flag to the re_locker to either do the re-lock or not if an exception is thrown during the container operations/work.
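One possible shape for that flag (a sketch; whether to skip the re-lock on an exception path is the caller's decision):

#include <shared_mutex>

struct re_locker {
    explicit re_locker(std::shared_mutex& m) : m_(m) { m_.unlock_shared(); }
    ~re_locker() { if (relock_) m_.lock_shared(); }

    // Call relock(false) (e.g. from a catch block) if the shared lock
    // should not be re-acquired during unwinding.
    void relock(bool want) { relock_ = want; }

    re_locker(re_locker const&) = delete;
    re_locker& operator=(re_locker const&) = delete;

private:
    std::shared_mutex& m_;
    bool relock_ = true;
};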