The title says it all. I don't understand why std::queue (or, in general, any queue) is not thread-safe by nature, given that no iterators are involved as with other data structures.
According to the common rule that a conflict arises when
- at least one thread is writing to ...
- and another thread is reading from a shared resource,
I should have gotten a conflict in the following example code:
#include "stdafx.h"
#include <queue>
#include <thread>
#include <iostream>
struct response
{
static int & getCount()
{
static int theCount = 0;
return theCount;
}
int id;
};
std::queue<response> queue;
// generate 100 response objects and push them into the queue
void produce()
{
for (int i = 0; i < 100; i++)
{
response r;
r.id = response::getCount()++;
queue.push(r);
std::cout << "produced: " << r.id << std::endl;
}
}
// get the 100 first responses from the queue
void consume()
{
int consumedCounter = 0;
for (;;)
{
if (!queue.empty())
{
std::cout << "consumed: " << queue.front().id << std::endl;
queue.pop();
consumedCounter++;
}
if (consumedCounter == 100)
break;
}
}
int _tmain(int argc, _TCHAR* argv[])
{
std::thread t1(produce);
std::thread t2(consume);
t1.join();
t2.join();
return 0;
}
Everything seems to be working fine:
- No integrity violated / data corrupted
- The order in which the consumer gets the elements is correct (0<1<2<3<4...). Of course, the order in which the producer and the consumer print is random, as there is no signaling involved.
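Imagine you check !queue.empty(), enter the next block, and, before you get to call queue.front(), another thread removes (pops) the one and only element, so you query an empty queue. More fundamentally, calling push() on one thread while another thread calls empty(), front() or pop() without synchronization is a data race, which is undefined behaviour even if a particular run appears to work.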
Using a synchronized queue like the following
#pragma once
#include <deque>
#include <mutex>
#include <condition_variable>
template <typename T>
class SharedQueue
{
public:
SharedQueue();
~SharedQueue();
T& front();
void pop_front();
void push_back(const T& item);
void push_back(T&& item);
int size();
bool empty();
private:
std::deque<T> queue_;
std::mutex mutex_;
std::condition_variable cond_;
};
template <typename T>
SharedQueue<T>::SharedQueue(){}
template <typename T>
SharedQueue<T>::~SharedQueue(){}
template <typename T>
T& SharedQueue<T>::front()
{
std::unique_lock<std::mutex> mlock(mutex_);
while (queue_.empty())
{
cond_.wait(mlock);
}
return queue_.front();
}
template <typename T>
void SharedQueue<T>::pop_front()
{
std::unique_lock<std::mutex> mlock(mutex_);
while (queue_.empty())
{
cond_.wait(mlock);
}
queue_.pop_front();
}
template <typename T>
void SharedQueue<T>::push_back(const T& item)
{
std::unique_lock<std::mutex> mlock(mutex_);
queue_.push_back(item);
mlock.unlock(); // unlock before notification to minimize mutex contention
cond_.notify_one(); // notify one waiting thread
}
template <typename T>
void SharedQueue<T>::push_back(T&& item)
{
std::unique_lock<std::mutex> mlock(mutex_);
queue_.push_back(std::move(item));
mlock.unlock(); // unlock before notification to minimize mutex contention
cond_.notify_one(); // notify one waiting thread
}
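template <typename T>
int SharedQueue<T>::size()
{
std::unique_lock<std::mutex> mlock(mutex_);
return static_cast<int>(queue_.size()); // mlock is released on return
}
// thread-safe emptiness check (declared in the class above)
template <typename T>
bool SharedQueue<T>::empty()
{
std::unique_lock<std::mutex> mlock(mutex_);
return queue_.empty();
}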
The call to front() waits until it has an element and locks the underlying queue so only one thread may access it at a time.
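One caveat with the SharedQueue above: front() and pop_front() are two separate calls, and front() hands out a reference that is used after the mutex has been released. With more than one consumer, another thread can pop in between. A minimal sketch of one way around this, assuming a hypothetical BlockingQueue class with a hypothetical wait_and_pop() member (neither name is from the code above), merges both steps into one locked operation and returns the element by value:

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical variant of SharedQueue: front() and pop_front() are merged
// into one operation so no other consumer can interleave between them.
template <typename T>
class BlockingQueue
{
public:
    void push_back(T item)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(item));
        }
        cond_.notify_one(); // notify after releasing the mutex
    }

    // Waits for an element, then removes it and returns it by value,
    // all while holding the mutex.
    T wait_and_pop()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        T item = std::move(queue_.front());
        queue_.pop_front();
        return item;
    }

private:
    std::deque<T> queue_;
    std::mutex mutex_;
    std::condition_variable cond_;
};

// One producer (the calling thread) and two consumers.
// Returns the sum of all consumed values.
int consume_all(int count)
{
    assert(count >= 0);
    BlockingQueue<int> q;
    std::mutex sum_mutex;
    int sum = 0;
    auto consumer = [&](int n) {
        for (int i = 0; i < n; ++i) {
            int v = q.wait_and_pop();
            std::lock_guard<std::mutex> lock(sum_mutex);
            sum += v;
        }
    };
    std::thread c1(consumer, count / 2);
    std::thread c2(consumer, count - count / 2);
    for (int i = 0; i < count; ++i)
        q.push_back(i);
    c1.join();
    c2.join();
    return sum;
}
```

Calling consume_all(100) pushes 0 to 99 and lets two consumers drain the queue; each value is handed to exactly one of them.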
I am currently tasked with writing a generic thread pool in C++, with multiple worker threads and one scheduling thread. To support running any kind of function within one pool, I used variadic templates and parameter packs, with which I have little experience.
My thread pool and worker classes look like this:
template<typename R, typename ...A>
class Worker {
public:
// starts thread that runs during lifetime of worker object
Worker() : thread_([this] {
while (!stopped) {
// run task that worker has been set to and remove it thereafter
if (task_ != NULL) {
idle = false;
task_(std::get<A>(args_)...);
task_ = NULL;
}
idle = true;
}
}) { }
~Worker() {
stop();
}
void stop() {
stopped = true;
thread_.join();
}
bool idling() {
return idle;
}
void set_work(std::function<R(A...)> task, std::tuple<A...> args) {
task_ = task;
args_ = args;
}
private:
std::thread thread_;
std::function<R(A...)> task_;
std::tuple<A...> args_;
bool idle = false;
bool stopped = false;
};
template<typename R, typename ...A>
class ThreadPool {
public:
// pool runs scheduling thread which assigns queued tasks to idling workers
ThreadPool(size_t num_workers) : workers(num_workers), num_workers_(num_workers), runner([this, num_workers] {
while(!stopped) {
for (size_t i = 0; i < num_workers; i++) {
if (workers[i].idling() && !q.empty()) {
workers[i].set_work(q.front().first, q.front().second);
q.pop();
}
}
}
}) { }
void add_task(std::function<R(A...)> task, A... args) {
q.push({task, std::make_tuple(args...)});
}
size_t tasks_left() {
return q.size();
}
size_t workers_idling() {
size_t n = 0;
for (size_t i = 0; i < num_workers_; i++) {
if (workers[i].idling()) n++;
}
return n;
}
void stop() {
for (size_t i = 0; i < num_workers_; i++) {
workers[i].stop();
}
stopped = true;
runner.join();
}
private:
std::vector<Worker<R, A...>> workers;
std::queue<std::pair<std::function<R(A...)>, std::tuple<A...>>> q;
std::thread runner;
bool stopped = false;
size_t num_workers_;
};
The first hurdle I encountered was that I was not able to use references as variadic types, so I passed the whole object instead.
But any class that does not specify a default constructor produces the following error: https://pastebin.com/ye6enTD3.
And for any class that does have one, the member variables are not consistently the same as those of the object I passed to the worker.
I would appreciate your help on this topic.
I would start out with something like this.
Note that stopping a thread pool that still has work queued requires design decisions.
And yes, you will need mutexes and condition variables to make everything work and synchronize. A good thread pool implementation is not trivial.
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <queue>
#include <mutex>
#include <condition_variable>
#include <iostream>
class task_itf
{
public:
virtual ~task_itf() = default;
virtual void operator()() = 0;
};
template<typename retval_t>
struct task_t final :
public task_itf
{
public:
explicit task_t(std::function<retval_t()> fn) :
m_task{ fn }
{
}
void operator()() override
{
m_task();
}
auto future()
{
return m_task.get_future();
}
private:
std::packaged_task<retval_t ()> m_task;
};
class threadpool_t
{
public:
threadpool_t() :
m_running{ true }
{
}
template<typename fn_t>
auto schedule(fn_t fn) -> std::future<decltype(fn())>
{
using retval_t = decltype(fn());
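auto task = std::make_shared<task_t<retval_t>>(fn);
// take the future before publishing the task, so no worker thread
// can be running the task while we call get_future()
auto future = task->future();
{
std::scoped_lock<std::mutex> lock{ m_mtx };
m_queue.push(task);
}
m_queued.notify_one();
return future;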
}
private:
// .. todo let the threads pickup queue entries one by one.
// if a thread is finished with a task and there are entries
// in the queue it can immediately pickup the next.
// otherwise wait for signal on m_queued;
std::mutex m_mtx;
std::condition_variable m_queued;
bool m_running;
// shared_ptr, because we hand over task to another thread later
std::queue<std::shared_ptr<task_itf>> m_queue;
};
int main()
{
threadpool_t pool;
pool.schedule([] {std::cout << "Hello world"; });
}
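The todo in threadpool_t (let the threads pick up queue entries one by one) could be filled in roughly as follows. This is only a sketch under some assumptions: simple_pool_t, worker_loop and run_demo are hypothetical names, tasks are type-erased as std::function<void()> instead of the task_itf hierarchy, and the destructor drains the queue before stopping, which is one possible answer to the design decision mentioned above.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class simple_pool_t
{
public:
    explicit simple_pool_t(std::size_t num_threads)
    {
        for (std::size_t i = 0; i < num_threads; ++i)
            m_threads.emplace_back([this] { worker_loop(); });
    }

    // Assumption: remaining tasks are finished before the pool stops.
    ~simple_pool_t()
    {
        {
            std::scoped_lock lock{ m_mtx };
            m_running = false;
        }
        m_queued.notify_all();
        for (auto& t : m_threads)
            t.join();
    }

    template<typename fn_t>
    auto schedule(fn_t fn) -> std::future<decltype(fn())>
    {
        using retval_t = decltype(fn());
        // shared_ptr because packaged_task is move-only, while
        // std::function requires a copyable callable
        auto task = std::make_shared<std::packaged_task<retval_t()>>(std::move(fn));
        auto future = task->get_future();
        {
            std::scoped_lock lock{ m_mtx };
            m_queue.push([task] { (*task)(); });
        }
        m_queued.notify_one();
        return future;
    }

private:
    void worker_loop()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock lock{ m_mtx };
                // sleep until there is work or the pool is shutting down
                m_queued.wait(lock, [this] { return !m_running || !m_queue.empty(); });
                if (m_queue.empty())
                    return; // stopped and nothing left to do
                job = std::move(m_queue.front());
                m_queue.pop();
            }
            job(); // run the task outside the lock
        }
    }

    std::mutex m_mtx;
    std::condition_variable m_queued;
    bool m_running{ true };
    std::queue<std::function<void()>> m_queue;
    std::vector<std::thread> m_threads;
};

// Schedules two small tasks on two workers and adds up their results.
int run_demo()
{
    simple_pool_t pool{ 2 };
    auto a = pool.schedule([] { return 20; });
    auto b = pool.schedule([] { return 22; });
    assert(a.valid() && b.valid());
    return a.get() + b.get();
}
```

The workers sleep on the condition variable instead of spinning, and each task runs outside the lock so that a long task does not block schedule().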
There is a use case in which different types of triggers are received, and the requirements are:
- Keep only 5 records of the same trigger, and overwrite the oldest one when a new trigger is received while 5 records are already stored.
- The order of the triggers should be preserved.
- Maintain the first-in, first-out behaviour of a queue.
- Be able to remove a single trigger anywhere in the records while maintaining the order.
#include <array>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>
template <typename T, std::size_t kSize>
class kqueue
{
private:
    // main queue mutex (mutable so that const members can lock it)
    mutable std::mutex m_mainMutex;
    // array of mutexes to prevent data races on the sub queues
    std::array<std::mutex, kSize> m_mutexArray;
    // number of queues inside a queue
    uint32_t K = kSize;
    // main queue that holds all data
    // T stands for data type
    // uint8_t is the data type for sub queue number
    std::vector<std::pair<T, uint8_t> > m_mainQueue;
    // Array of sub queues to hold the positions of each sub queue data
    // in the main queue.
    std::array<std::queue<uint8_t>, kSize> m_subQueues;
public:
    kqueue(); // a constructor cannot be virtual
    virtual ~kqueue();
    // void push(T Data);
    void pop();
    bool isEmpty();
    void pushToK(uint8_t K, T Data);
    void popFromK(uint8_t K);
    bool isKEmpty(uint8_t K);
    typename std::vector<std::pair<T, uint8_t> >::const_iterator front() const;
}; // end of class kqueue
// members of a class template are defined with the full template header
template <typename T, std::size_t kSize>
kqueue<T, kSize>::kqueue()
{
}
// main class destructor "can be overridden"
template <typename T, std::size_t kSize>
kqueue<T, kSize>::~kqueue()
{
}
// /* this function isn't right as the push function isn't identifying which sub queue to push in */
// template <typename T, std::size_t kSize>
// void kqueue<T, kSize>::push(T Data)
// {
//     std::lock_guard<std::mutex> lock(m_mainMutex);
//     m_mainQueue.emplace_back(Data);
// }
template <typename T, std::size_t kSize>
void kqueue<T, kSize>::pop()
{
    std::lock_guard<std::mutex> lock(m_mainMutex);
    // erasing begin() of an empty vector is undefined behaviour
    if (!m_mainQueue.empty())
        m_mainQueue.erase(m_mainQueue.begin());
}
template <typename T, std::size_t kSize>
void kqueue<T, kSize>::pushToK(uint8_t K, T Data)
{
    const std::lock_guard<std::mutex> lock(m_mutexArray[K]);
    {
        std::lock_guard<std::mutex> mainLock(m_mainMutex);
        m_mainQueue.emplace_back(Data, K);
    }
}
template <typename T, std::size_t kSize>
void kqueue<T, kSize>::popFromK(uint8_t K)
{
    // locks targeted mutex to prevent data racing on such queue
    const std::lock_guard<std::mutex> lock(m_mutexArray[K]);
    {
        const std::lock_guard<std::mutex> mainLock(m_mainMutex);
        for (auto i = m_mainQueue.begin(); i != m_mainQueue.end(); i++)
        {
            if (std::get<1>(*i) == K)
            {
                // erase invalidates i, so leave the loop right away
                m_mainQueue.erase(i);
                m_mainQueue.shrink_to_fit();
                break;
            }
        }
    }
}
template <typename T, std::size_t kSize>
bool kqueue<T, kSize>::isEmpty()
{
    const std::lock_guard<std::mutex> lock(m_mainMutex);
    return m_mainQueue.empty();
}
template <typename T, std::size_t kSize>
bool kqueue<T, kSize>::isKEmpty(uint8_t K)
{
    const std::lock_guard<std::mutex> lock(m_mutexArray[K]);
    return m_subQueues[K].empty();
}
template <typename T, std::size_t kSize>
typename std::vector<std::pair<T, uint8_t> >::const_iterator kqueue<T, kSize>::front() const
{
    const std::lock_guard<std::mutex> lock(m_mainMutex);
    return m_mainQueue.begin();
}
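For comparison, the four requirements can also be met with a single container and a single mutex. The following is only a sketch under assumptions: trigger_log, push, pop, pop_kind and size are hypothetical names, the trigger type is identified by a plain int, and "overwrite the oldest one" is interpreted as dropping the oldest record of that type and appending the new one at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical sketch: all records live in one std::list in arrival order,
// guarded by a single mutex. `kind` identifies the trigger type.
template <typename T, std::size_t kMaxPerKind = 5>
class trigger_log
{
    static_assert(kMaxPerKind > 0, "at least one record per kind is required");
public:
    using record = std::pair<int, T>; // (kind, data)

    // Appends a record. If `kind` already holds kMaxPerKind records,
    // the oldest record of that kind is dropped first.
    void push(int kind, T data)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto same_kind = [kind](const record& r) { return r.first == kind; };
        auto count = std::count_if(m_records.begin(), m_records.end(), same_kind);
        if (static_cast<std::size_t>(count) >= kMaxPerKind)
            m_records.erase(std::find_if(m_records.begin(), m_records.end(), same_kind));
        m_records.emplace_back(kind, std::move(data));
    }

    // Removes and returns the oldest record of any kind (FIFO).
    std::optional<record> pop()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_records.empty())
            return std::nullopt;
        record r = std::move(m_records.front());
        m_records.pop_front();
        return r;
    }

    // Removes the oldest record of `kind`, leaving the others in order.
    bool pop_kind(int kind)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = std::find_if(m_records.begin(), m_records.end(),
                               [kind](const record& r) { return r.first == kind; });
        if (it == m_records.end())
            return false;
        m_records.erase(it);
        return true;
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_records.size();
    }

private:
    mutable std::mutex m_mutex;
    std::list<record> m_records;
};

// Pushes seven records of kind 0 and one of kind 1.
void demo()
{
    trigger_log<int> log;
    for (int i = 0; i < 7; ++i)
        log.push(0, i);   // kind 0 keeps only its 5 newest values
    log.push(1, 100);
    assert(log.size() == 6);
}
```

A std::list keeps erasure in the middle cheap once the element is found and never invalidates other iterators. The linear scans are acceptable as long as the number of stored records stays small.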
I implemented a ThreadPool to test my knowledge of C++ concurrency. However, when I run the following code, it does not make progress, and my Mac becomes extremely slow and eventually stops responding. When I checked the monitor later, I found that kernel_task had launched several clang processes, each running at nearly 100% CPU. I've carefully gone through the code several times, but I am still unable to locate the problem.
Here's the test code for ThreadPool. When I run this code, nothing is printed to the terminal. Worse still, even if I cancel the process (via Ctrl+C), kernel_task creates several clang processes later and my computer crashes.
// test code for ThreadPool
#include <iostream>
#include <functional>
#include <future>
#include "thread_pool.hpp"
int task() {
static std::atomic<int> i = 1;
std::cout << i.fetch_add(1, std::memory_order_relaxed) << "task\n";
return i.load(std::memory_order_relaxed);
}
int main() {
ThreadPool<int()> thread_pool(1);
auto f1 = thread_pool.submit(task, false);
std::cout << "hello" << '\n';
auto f2 = thread_pool.submit(task, false);
std::cout << f1.get() << '\n';
std::cout << f2.get() << '\n';
}
Here's the definition of ThreadPool.
// thread_pool.hpp
#include <atomic>
#include <algorithm>
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <thread>
#include "queue.hpp"
template<typename Func>
class ThreadPool {
public:
ThreadPool(std::size_t=std::thread::hardware_concurrency()); // should I minus one here for the main thread?
~ThreadPool();
template<typename... Args,
typename ReturnType=typename std::result_of<std::decay_t<Func>(std::decay_t<Args>...)>::type>
std::future<ReturnType> submit(Func f, bool local=true);
private:
void worker_thread();
void run_task();
using LocalThreadType = std::queue<std::packaged_task<Func>>;
static thread_local LocalThreadType local_queue; // local queue, not used for now
using ThreadSafeQueue = LockBasedQueue<std::packaged_task<int()>,
std::list<std::packaged_task<int()>>>;
std::shared_ptr<ThreadSafeQueue> shared_queue;
std::atomic_bool done;
std::vector<std::thread> threads;
};
template<typename Func>
ThreadPool<Func>::ThreadPool(std::size_t n): done(false) {
threads.emplace_back(&ThreadPool::worker_thread, this);
}
template<typename Func>
ThreadPool<Func>::~ThreadPool() {
done.store(true, std::memory_order_relaxed);
for (auto& t: threads)
t.join();
}
template<typename Func>
template<typename...Args,
typename ReturnType>
std::future<ReturnType> ThreadPool<Func>::submit(Func f, bool local) {
auto result = local? post_task(f, local_queue):
post_task(f, *shared_queue);
return result;
}
template<typename Func>
void ThreadPool<Func>::run_task() {
if (!local_queue.empty()) {
auto task = std::move(local_queue.front());
local_queue.pop();
task();
}
else {
std::packaged_task<Func> task;
auto flag = shared_queue->try_pop(task);
if (flag)
task();
else {
using namespace std::chrono_literals;
std::this_thread::sleep_for(1s);
}
}
}
template<typename Func>
void ThreadPool<Func>::worker_thread() {
while (!done.load(std::memory_order_relaxed)) {
run_task();
}
}
template<typename Func>
thread_local typename ThreadPool<Func>::LocalThreadType ThreadPool<Func>::local_queue = {};
Here's the definition of post_task and LockBasedQueue, which have passed the test code in the next code block.
// queue.hpp
#include <mutex>
#include <list>
#include <deque>
#include <queue>
#include <memory>
#include <future>
#include <condition_variable>
template<typename T, typename Container>
class LockBasedQueue; // forward declaration
template<typename Func, typename... Args,
typename ReturnType=typename std::result_of<std::decay_t<Func>(std::decay_t<Args>...)>::type,
typename Container=std::list<Func>,
typename ThreadQueue=LockBasedQueue<std::packaged_task<ReturnType(Args...)>, Container>>
std::future<ReturnType> post_task(Func f, ThreadQueue& task_queue) {
std::packaged_task<ReturnType(Args...)> task(f);
std::future res = task.get_future();
task_queue.push(std::move(task)); // packaged_task is not copyable
return res;
}
// the general template is omitted and it's not needed in this context
template<typename T>
class LockBasedQueue<T, std::list<T>> {
public:
// constructors
LockBasedQueue(): head(std::make_unique<Node>()), tail(head.get()) {}
LockBasedQueue(LockBasedQueue&&);
// assignments
LockBasedQueue& operator=(LockBasedQueue&&);
// general purpose operations
void swap(LockBasedQueue&);
bool empty() const;
std::size_t size() const;
// queue operations
void push(const T&);
void push(T&&);
template <typename... Args>
void emplace(Args&&... args);
T pop();
bool try_pop(T&);
// delete front() and back(), these functions may waste notifications. To enable these function, one should replace notify_one() with notify_all() in push() and emplace()
T& front() = delete;
const T& front() const = delete;
T& back() = delete;
const T& back() const = delete;
private:
struct Node {
std::unique_ptr<T> data; // data is a pointer as it may be empty
std::unique_ptr<Node> next;
};
Node* get_tail() {
std::lock_guard l(tail_mutex);
return tail;
}
Node* get_head() {
std::lock_guard l(head_mutex);
return head.get();
}
std::unique_lock<std::mutex> get_head_lock() {
std::unique_lock l(head_mutex);
data_cond.wait(l, [this] { return head.get() != get_tail(); });
return l;
}
T pop_data() {
auto data = std::move(*head->data);
head = std::move(head->next); // we move head to the next so that the tail is always valid
return std::move(data);
}
std::unique_ptr<Node> head;
std::mutex head_mutex;
Node* tail;
std::mutex tail_mutex;
std::condition_variable data_cond;
};
template<typename T>
LockBasedQueue<T, std::list<T>>::LockBasedQueue(
LockBasedQueue<T, std::list<T>>&& other) {
{
std::scoped_lock l(head_mutex, other.head_mutex);
head(std::move(other.data_queue));
}
{
std::lock_guard l(tail_mutex);
tail = head.get();
}
{
std::lock_guard l(other.tail_mutex);
other.tail = nullptr;
}
}
template<typename T>
LockBasedQueue<T, std::list<T>>&
LockBasedQueue<T, std::list<T>>::operator=(
LockBasedQueue<T, std::list<T>>&& rhs) {
{
std::scoped_lock l(head_mutex, rhs.head_mutex);
head(std::move(rhs.data_queue));
}
{
std::lock_guard l(tail_mutex);
tail = head.get();
}
{
std::lock_guard l(rhs.tail_mutex);
rhs.tail = nullptr;
}
}
template<typename T>
void LockBasedQueue<T, std::list<T>>::swap(
LockBasedQueue<T, std::list<T>>& other) {
{
std::scoped_lock l(head_mutex, other.head_mutex);
head(std::move(other.data_queue));
}
{
std::lock_guard l(tail_mutex);
tail = head.get();
}
{
std::lock_guard l(other.tail_mutex);
other.tail = other.head.get();
}
}
template<typename T>
inline bool LockBasedQueue<T, std::list<T>>::empty() const {
return get_head() == get_tail();
}
template<typename T>
std::size_t LockBasedQueue<T, std::list<T>>::size() const {
int n = 0;
std::lock_guard l(tail_mutex); // do not use get_tail() here to avoid race condition
for (auto p = get_head(); p != tail; p = p->next.get())
++n;
return n;
}
template<typename T>
void LockBasedQueue<T, std::list<T>>::push(const T& data) {
push(T(data));
}
template<typename T>
void LockBasedQueue<T, std::list<T>>::push(T&& data) {
{
auto p = std::make_unique<Node>();
std::lock_guard l(tail_mutex);
tail->data = std::make_unique<T>(std::move(data)); // we add data to the current tail, this allows us to move head to the next when popping
tail->next = std::move(p);
tail = tail->next.get();
}
data_cond.notify_one();
}
template<typename T>
template<typename...Args>
void LockBasedQueue<T, std::list<T>>::emplace(Args&&... args) {
{
auto p = std::make_unique<Node>();
std::lock_guard l(tail_mutex);
tail->data = std::make_unique<T>(std::forward<Args>(args)...);
tail->next = std::move(p);
tail = tail->next.get();
}
data_cond.notify_one();
}
template<typename T>
T LockBasedQueue<T, std::list<T>>::pop() {
auto l(get_head_lock());
return pop_data();
}
template<typename T>
bool LockBasedQueue<T, std::list<T>>::try_pop(T& data) {
std::lock_guard l(head_mutex);
if (head.get() == get_tail())
return false;
data = pop_data();
return true;
}
Here's the code I used to test LockBasedQueue and post_task. The following test code works without any problem.
// test code for LockBasedQueue and post_task
#include <iostream>
#include <functional>
#include <future>
#include "queue.hpp"
LockBasedQueue<std::packaged_task<int()>, std::list<std::packaged_task<int()>>> task_queue; // thread safe queue, which handles locks inside
void task_execution_thread() {
bool x = true;
while (x) { // for debugging purpose, we only execute this loop once
auto task = task_queue.pop(); // Returns the front task and removes it from queue. Waits if task_queue is empty
task(); // execute task
x = false;
}
}
int task() {
static std::atomic<int> i = 1;
std::cout << i.fetch_add(1, std::memory_order_relaxed) << "task\n";
return i.load(std::memory_order_relaxed);
}
int main() {
std::thread t1(task_execution_thread);
std::thread t2(task_execution_thread);
auto f1 = post_task(task, task_queue);
auto f2 = post_task(task, task_queue);
std::cout << "f1: " << f1.get() << '\n';
std::cout << "f2: " << f2.get() << '\n';
t1.join();
t2.join();
}
I tested the code using g++ -std=c++2a on macOS 11.2.3.
shared_queue is default-initialised, which for a std::shared_ptr means it holds a null pointer, so calling methods through it is undefined behaviour. Initialising it in the constructor of ThreadPool:
ThreadPool<Func>::ThreadPool(std::size_t n) : done(false), shared_queue(std::make_shared<ThreadSafeQueue>()) {
makes your code work: https://godbolt.org/z/P9G1T5
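The difference can be seen in isolation. In this sketch, BrokenPool, FixedPool and try_push are hypothetical stand-ins that reduce ThreadPool to the one member that matters, with std::queue<int> taking the place of ThreadSafeQueue:

```cpp
#include <cassert>
#include <memory>
#include <queue>

// Hypothetical stand-in: the shared_ptr member is never initialised explicitly.
struct BrokenPool
{
    std::shared_ptr<std::queue<int>> shared_queue; // default-initialised: null
};

// Hypothetical stand-in: the constructor allocates the queue.
struct FixedPool
{
    FixedPool() : shared_queue(std::make_shared<std::queue<int>>()) {}
    std::shared_ptr<std::queue<int>> shared_queue; // owns a real queue
};

// Pushes a value only if the pool actually owns a queue.
template <typename Pool>
bool try_push(Pool& pool, int value)
{
    if (!pool.shared_queue) // dereferencing a null shared_ptr would be undefined behaviour
        return false;
    pool.shared_queue->push(value);
    return true;
}

void demo()
{
    BrokenPool broken;
    FixedPool fixed;
    assert(!try_push(broken, 1)); // nothing to push into
    assert(try_push(fixed, 1));
}
```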
I have a class which has several readers and several writers.
I want to use a read/write lock (using shared_mutex).
All the examples and information about this lock acquire and release the lock in the same function:
std::shared_mutex
I want to use shared_mutex in this way (for reasons I won't explain here):
Lock4Read();
UnLock4Read();
Lock4Write();
UnLock4Write();
So I can lock the object I need, do my logic, and release it at the end (in another function).
How can I do it?
I know I can do it using the Linux pthread_rwlock_rdlock, but can I do it using shared_mutex?
I'm not quite sure that you really want to do something like that, but check this; maybe it will be helpful for you :)
#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <thread>
class ThreadMom {
public:
void Lock4Read() { mutex_.lock_shared(); }
void UnLock4Read() { mutex_.unlock_shared(); }
void Lock4Write() { mutex_.lock(); }
void UnLock4Write() { mutex_.unlock(); }
private:
std::shared_mutex mutex_;
};
template <typename T> class Value {
public:
T get() const {return value_;}
void set(const T& value) {value_ = value;}
private:
T value_;
};
int main() {
ThreadMom mom;
Value<int> value;
value.set(0);
auto increment_and_print = [&mom, &value](int which) {
for (int i = 0; i < 3; i++) {
mom.Lock4Write();
value.set(i * which);
mom.UnLock4Write();
mom.Lock4Read();
std::cout << std::this_thread::get_id() << ' ' << value.get() << '\n';
mom.UnLock4Read();
}
};
std::thread thread1(increment_and_print, 1);
std::thread thread2(increment_and_print, 2);
thread1.join();
thread2.join();
}
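If the goal is only to acquire the lock in one function and release it in another, the standard lock wrappers may already be enough: std::unique_lock and std::shared_lock are movable, so a lock taken in one place can be handed to another function, which releases it when the wrapper goes out of scope. A minimal sketch, where Guarded, lock_for_read, lock_for_write, finish_write and read_value are hypothetical names:

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <utility>

class Guarded
{
public:
    // The returned wrapper owns the lock; it can be moved elsewhere and
    // releases the mutex when it is destroyed or unlock() is called.
    std::shared_lock<std::shared_mutex> lock_for_read() const
    {
        return std::shared_lock<std::shared_mutex>(mutex_);
    }
    std::unique_lock<std::shared_mutex> lock_for_write()
    {
        return std::unique_lock<std::shared_mutex>(mutex_);
    }

    int value = 0; // by convention only touched while holding mutex_

private:
    mutable std::shared_mutex mutex_;
};

// Takes over a write lock that was acquired elsewhere and updates the value.
void finish_write(Guarded& g, std::unique_lock<std::shared_mutex> lock, int v)
{
    assert(lock.owns_lock());
    g.value = v;
} // the write lock is released when `lock` is destroyed

// Reads the value under a shared (reader) lock.
int read_value(const Guarded& g)
{
    auto lock = g.lock_for_read();
    return g.value;
}
```

A caller would write auto w = g.lock_for_write(); and later finish_write(g, std::move(w), 7);. Compared with explicit Lock4Write()/UnLock4Write() calls, the mutex is still released if an exception is thrown between the two points.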