std queue pop a moved std string in multithreading - c++

I am currently implementing a string processor. I used to run it on a single thread, but that is rather slow, so I would like to use multiple threads to speed it up. I have now run into some problems I cannot solve on my own.
I use a thread-safe queue to implement the producer and the consumers. The push and pop methods of the queue are shown below (the complete file was linked in the original post):
template <typename Tp>
void ThreadSafeQueue<Tp>::enqueue(Tp &&data) {
std::lock_guard<std::mutex> lk(mtx);
q.emplace(std::forward<Tp>(data));
cv.notify_one();
}
template <typename Tp>
bool ThreadSafeQueue<Tp>::dequeue(Tp &data) {
std::unique_lock<std::mutex> lk(mtx);
while (!broken && q.empty()) {
cv.wait(lk);
}
if (!broken && !q.empty()) {
data = std::move(q.front());
q.pop();
}
return !broken;
}
When I use this class to store strings (i.e. Tp = std::string), the problem occurs. I use it like this:
producer:
__prepare_data__(raw_data)
std::vector<std::thread> vec_threads;
for(int i=0;i<thread_num;++i)
{
vec_threads.emplace_back(consumer,std::ref(raw_data),std::ref(processed_data));
}
for(int i=0;i<thread_num;++i)
{
if(vec_threads[i].joinable())
{
vec_threads[i].join();
}
__collect_data__(processed_data)
}
and consumer:
std::string buf;
while(dequeue(buf))
{
__process__(buf)
}
In the code above, all values passed to the consumer threads are passed by reference (i.e. using the std::ref wrapper), so the __collect_data__ procedure is valid.
I do not run into any problem in these cases:
The number of string pieces is small. (This does not mean the strings themselves are short.)
Only one consumer is working.
I do run into the problem in these cases:
The number of strings is large, in the millions or so.
Two or more consumers are working.
The failure varies between these two:
"Corrupted double-linked list", followed by a dump of memory information. GDB told me the line causing the problem is the pop in the dequeue method.
A plain segmentation fault. GDB told me the problem occurred when the consumer threads were joining.
The first case happens most frequently, so I would like to ask, as the title indicates: does popping an already moved-from std::string cause undefined behavior? If you have any other insights, please let me know!

While there are issues with your code, there are none that explain your crash. I suggest you investigate your data processing code, not your queue.
For reference, your logic around queue shutdown is slightly wrong. For example, shutdown waits on the condition variable until the queue is empty but the dequeue operation does not notify on that variable. So you might deadlock.
It is easier to just ignore the "broken" flag in the dequeue operation until the queue is empty. That way the worker threads will drain the queue before quitting. Also, don't let the shutdown block until empty. If you want to wait until all threads are done with the queue, just join the threads.
Something like this:
template <typename Tp>
bool ThreadSafeQueue<Tp>::dequeue(Tp &data) {
std::unique_lock<std::mutex> lk(mtx);
while (!broken && q.empty()) {
cv.wait(lk);
}
if (q.empty())
return false; // broken
data = std::move(q.front());
q.pop();
return true;
}
template <typename Tp>
void ThreadSafeQueue<Tp>::shutdown() {
std::unique_lock<std::mutex> lk(mtx);
broken = true;
cv.notify_all();
}
There are other minor issues, for example it is in practice more efficient (and safe) to unlock mutexes before notifying the condition variables so that the woken threads do not race with the waking thread on acquiring/releasing the mutex. But that is not a correctness issue.
I also suggest you delete the move constructor on the queue. You rightfully noted that it shouldn't be called. Better make sure that it really isn't.
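Putting these suggestions together, a minimal sketch of such a queue might look like the following. It is an illustration rather than the asker's actual class: the member names (mtx, cv, q, broken) follow the snippets above, while the by-value enqueue, the deleted copy operations and the run_demo helper are assumptions made for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

template <typename Tp>
class ThreadSafeQueue {
public:
    ThreadSafeQueue() = default;
    // Declaring the copy operations as deleted also suppresses the
    // implicit move operations, so the queue can be neither copied nor moved.
    ThreadSafeQueue(const ThreadSafeQueue &) = delete;
    ThreadSafeQueue &operator=(const ThreadSafeQueue &) = delete;

    void enqueue(Tp data) {
        {
            std::lock_guard<std::mutex> lk(mtx);
            q.push(std::move(data));
        }
        cv.notify_one();  // notify after the mutex has been released
    }

    // Returns false only once the queue is shut down AND fully drained.
    bool dequeue(Tp &data) {
        std::unique_lock<std::mutex> lk(mtx);
        cv.wait(lk, [this] { return broken || !q.empty(); });
        if (q.empty())
            return false;  // broken and nothing left to hand out
        data = std::move(q.front());
        q.pop();  // popping a moved-from element is fine
        return true;
    }

    // Does not block; callers join their threads to wait for completion.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lk(mtx);
            broken = true;
        }
        cv.notify_all();
    }

private:
    std::mutex mtx;
    std::condition_variable cv;
    std::queue<Tp> q;
    bool broken = false;
};

// Hypothetical driver: pushes `items` strings, lets `consumers` threads
// drain them, and returns how many strings were dequeued in total.
std::size_t run_demo(std::size_t items, unsigned consumers) {
    ThreadSafeQueue<std::string> queue;
    std::vector<std::size_t> counts(consumers, 0);
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < consumers; ++i) {
        threads.emplace_back([&queue, &counts, i] {
            std::string buf;
            while (queue.dequeue(buf))
                ++counts[i];  // each thread writes only its own slot
        });
    }
    for (std::size_t n = 0; n < items; ++n)
        queue.enqueue("item " + std::to_string(n));
    queue.shutdown();
    for (auto &t : threads)
        t.join();
    std::size_t total = 0;
    for (std::size_t c : counts)
        total += c;
    return total;
}
```

Because dequeue keeps handing out items after shutdown until the queue is empty, every string pushed before shutdown is processed exactly once, regardless of the number of consumers.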

Related

An odd use of conditional variable with local mutex

Poring through the legacy code of an old and large project, I found that it used an odd method of creating a thread-safe queue, something like this:
template < typename _Msg>
class WaitQue: public QWaitCondition
{
public:
typedef _Msg DataType;
void wakeOne(const DataType& msg)
{
QMutexLocker lock_(&mx);
que.push(msg);
QWaitCondition::wakeOne();
}
void wait(DataType& msg)
{
/// wait if empty.
{
QMutex wx; // WHAT?
QMutexLocker cvlock_(&wx);
if (que.empty())
QWaitCondition::wait(&wx);
}
{
QMutexLocker _wlock(&mx);
msg = que.front();
que.pop();
}
}
unsigned long size() {
QMutexLocker lock_(&mx);
return que.size();
}
private:
std::queue<DataType> que;
QMutex mx;
};
wakeOne is used from threads as a kind of "posting" function, and wait is called from other threads and waits indefinitely until a message appears in the queue. In some cases the roles of the threads are reversed at different stages, using separate queues.
Is it even legal to use a QMutex this way, by creating a local one? I can sort of understand why someone would do that to dodge a deadlock while reading the size of que, but how does it even work? Is there a simpler and more idiomatic way to achieve this behavior?
It's legal to have a local mutex, but it normally makes no sense.
As you've worked out, in this case it is wrong. You should be using the member:
void wait(DataType& msg)
{
QMutexLocker cvlock_(&mx);
while (que.empty())
QWaitCondition::wait(&mx);
msg = que.front();
que.pop();
}
Notice also that you must have while instead of if around the call to QWaitCondition::wait. This is for complex reasons about (possible) spurious wake up - the Qt docs aren't clear here. But more importantly the fact that the wake and the subsequent reacquire of the mutex is not an atomic operation means you must recheck the variable queue for emptiness. It could be this last case where you previously were getting deadlocks/UB.
Consider the scenario of an empty queue and a caller of wait (thread 1) that enters QWaitCondition::wait. This thread blocks. Then thread 2 comes along, adds an item to the queue and calls wakeOne. Thread 1 gets woken up and tries to reacquire the mutex. However, thread 3 comes along in your implementation of wait, takes the mutex before thread 1, sees the queue isn't empty, processes the single item and moves on, releasing the mutex. Then thread 1, which has been woken up, finally acquires the mutex, returns from QWaitCondition::wait and tries to process... an empty queue. Yikes.

How could I quit a C++ blocking queue?

After reading some other articles, I got to know that I could implement a c++ blocking queue like this:
template<typename T>
class BlockingQueue {
public:
std::mutex mtx;
std::condition_variable not_full;
std::condition_variable not_empty;
std::queue<T> queue;
size_t capacity{5};
BlockingQueue()=default;
BlockingQueue(int cap):capacity(cap) {}
BlockingQueue(const BlockingQueue&)=delete;
BlockingQueue& operator=(const BlockingQueue&)=delete;
void push(const T& data) {
std::unique_lock<std::mutex> lock(mtx);
while (queue.size() >= capacity) {
not_full.wait(lock, [&]{return queue.size() < capacity;});
}
queue.push(data);
not_empty.notify_all();
}
T pop() {
std::unique_lock<std::mutex> lock(mtx);
while (queue.empty()) {
not_empty.wait(lock, [&]{return !queue.empty();});
}
T res = queue.front();
queue.pop();
not_full.notify_all();
return res;
}
bool empty() {
std::unique_lock<std::mutex> lock(mtx);
return queue.empty();
}
size_t size() {
std::unique_lock<std::mutex> lock(mtx);
return queue.size();
}
void set_capacity(const size_t capacity) {
this->capacity = (capacity > 0 ? capacity : 10);
}
};
This works for me, but I do not know how to shut it down if I start it in a background thread:
void main() {
BlockingQueue<float> q;
bool stop{false};
auto fun = [&] {
std::cout << "before entering loop\n";
while (!stop) {
q.push(1);
}
std::cout << "after entering loop\n";
};
std::thread t_bg(fun);
t_bg.detach();
// Some other tasks here
stop = true;
// How could I shut it down before quitting here, or could I simply let the operating system do that when the whole program is over?
}
The problem is that when I want to shut down the background thread, it might be sleeping because the queue is full and the push operation is blocked. How can I stop it when I want the background thread to stop?
One easy way would be to add a flag that you set from outside when you want to abort a pop() operation that's already blocked. And then you'd have to decide what an aborted pop() is going to return. One way is for it to throw an exception, another would be to return an std::optional<T>. Here's the first method (I'll only write the changed parts.)
Add this type wherever you think is appropriate:
struct AbortedPopException {};
Add this to your class fields:
mutable std::atomic<bool> abort_flag{false};
Also add this method:
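void abort () {
std::lock_guard<std::mutex> lock(mtx); // set the flag under the lock so a waiter cannot miss it
abort_flag = true;
not_empty.notify_all(); // wake any thread blocked in pop()
not_full.notify_all(); // and any thread blocked in push()
}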
Change the while loop in the pop() method like this (you don't need the while at all: the overload of wait() that takes a predicate re-checks it after every wake-up and only returns once it is true, so the loop is already inside the wait):
not_empty.wait(lock, [this]{return !queue.empty() || abort_flag;});
if (abort_flag)
throw AbortedPopException{};
That's it (I believe.)
In your main(), when you want to shut the "consumer" down you can call abort() on your queue. But you'll have to handle the thrown exception there as well. It's your "exit" signal, basically.
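For comparison, the std::optional alternative mentioned above could be sketched roughly as follows (this needs C++17). The class name AbortableQueue, the bool-returning push and the blocked_producer_demo helper are assumptions made for this illustration; the sketch also makes push abortable, since in the question it is the producer that ends up blocked.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>

template <typename T>
class AbortableQueue {  // hypothetical name, not from the question
public:
    explicit AbortableQueue(std::size_t cap) : capacity(cap) {}

    // Returns false if the queue was aborted before space became available.
    bool push(T value) {
        std::unique_lock<std::mutex> lock(mtx);
        not_full.wait(lock, [this] { return aborted || queue.size() < capacity; });
        if (aborted)
            return false;
        queue.push(std::move(value));
        lock.unlock();
        not_empty.notify_one();
        return true;
    }

    // Returns std::nullopt instead of throwing when the queue was aborted.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mtx);
        not_empty.wait(lock, [this] { return aborted || !queue.empty(); });
        if (aborted)
            return std::nullopt;
        std::optional<T> result(std::move(queue.front()));
        queue.pop();
        lock.unlock();
        not_full.notify_one();
        return result;
    }

    // Sets the flag under the lock, then wakes every blocked push() and pop().
    void abort() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            aborted = true;
        }
        not_empty.notify_all();
        not_full.notify_all();
    }

private:
    std::mutex mtx;
    std::condition_variable not_full, not_empty;
    std::queue<T> queue;
    std::size_t capacity;
    bool aborted = false;
};

// Hypothetical demo: the producer fills a queue of capacity 2 and then
// blocks inside push(); abort() releases it so that join() can return.
int blocked_producer_demo() {
    AbortableQueue<float> q(2);
    int pushed = 0;  // written by the producer, read only after join()
    std::thread producer([&] {
        while (q.push(1.0f))
            ++pushed;
    });
    std::optional<float> first = q.pop();  // blocks until one item exists
    assert(first.has_value());
    q.abort();
    producer.join();
    assert(!q.pop().has_value());  // an aborted queue no longer hands out items
    return pushed;
}
```

In this sketch an aborted queue discards whatever is still stored in it; whether remaining items should be drained first is a design decision the original answer leaves open.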
Some side notes:
Don't detach from threads! Especially here, where AFAICT there is no reason for it (and some actual danger too). Just signal them to exit (in any manner appropriate) and join() them.
Your stop flag should be atomic. You read from it in your background thread and write to it from your main thread, and those can (and in fact do) overlap in time, so... data race!
I don't understand why you have a "full" state and "capacity" in your queue. Think about whether they are necessary.
UPDATE 1: In response to OP's comment about detaching... Here's what happens in your main thread:
You spawn the "producer" thread (i.e. the one that pushes stuff onto the queue).
Then you do all the work you want to do (e.g. consuming the stuff on the queue).
At some point, perhaps at the end of main(), you signal the thread to stop (e.g. by setting the stop flag to true).
Then, and only then, you join() the thread.
It is true that your main thread will block while it is waiting for the thread to pick up the "stop" signal, exit its loop, and return from its thread function, but that's a very, very short wait. And you have nothing else to do. More importantly, you'll know that your thread exited cleanly and predictably, and from that point on you know for certain that the thread won't be running (not important for you here, but it could be critical for some other threaded task).
That is the pattern that you usually want to follow when spawning worker threads that loop over a short task.
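The signal-then-join pattern described above can be sketched in a few lines. The function name run_until and the iteration counter are hypothetical stand-ins: the counter plays the role of the short task, and the main thread's spin loop plays the role of its own work.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs a worker until it has completed at least
// `min_iterations` short tasks, then stops it and returns the final count.
long run_until(long min_iterations) {
    std::atomic<bool> stop{false};     // atomic: written by main, read by worker
    std::atomic<long> iterations{0};

    // 1. Spawn the worker; it loops over a short task until told to stop.
    std::thread worker([&] {
        while (!stop) {
            ++iterations;  // stand-in for one short unit of work
        }
    });

    // 2. Do the main thread's own work (here: wait for some progress).
    while (iterations < min_iterations)
        std::this_thread::yield();

    // 3. Signal the worker to stop...
    stop = true;

    // 4. ...and only then join. After this line the worker is guaranteed
    //    to have exited, so the locals it references can be destroyed safely.
    worker.join();
    return iterations;
}
```

With a detached thread, step 4 is impossible, and the worker could still be touching stop and iterations after they have gone out of scope.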
Update 2: About "full" and "capacity" of the queue. That's fine. It's certainly your decision. No problem with that.
Update 3: About "throwing" vs. returning an "empty" object to signal an aborted "blocking pop()". I don't think there is anything wrong with throwing like that, especially since it is very rare (it happens just once, at the end of the producer/consumer operation). However, if all T types that you want to store in your queue have an "invalid" or "empty" state, then you certainly can use that. But throwing is more general, if more "icky" to some people.

How to test my blocking queue actually blocks

I have a blocking queue (it would be really hard for me to change its implementation), and I want to test that it actually blocks. In particular, the pop methods must block if the queue is empty and unblock as soon as a push is performed. See the following pseudo C++11 code for the test:
BlockingQueue queue; // empty queue
thread pushThread([]
{
sleep(large_delay);
queue.push();
});
queue.pop();
Obviously it is not perfect, because the whole pushThread may execute and terminate before pop is called, even if the delay is large, and the larger the delay, the longer I have to wait for the test to finish.
How can I properly ensure that pop is executed before push is called and that it blocks until push returns?
I do not believe this is possible without adding some extra state and interfaces to your BlockingQueue.
Proof goes something like this. You want to wait until the reading thread is blocked on pop. But there is no way to distinguish between that and the thread being about to execute the pop. This remains true no matter what you put just before or after the call to pop itself.
If you really want to fix this with 100% reliability, you need to add some state inside the queue, guarded by the queue's mutex, that means "someone is waiting". The pop call then has to update that state just before it atomically releases the mutex and goes to sleep on the internal condition variable. The push thread can obtain the mutex and wait until "someone is waiting". To avoid a busy loop here, you will want to use the condition variable again.
All of this machinery is nearly as complicated as the queue itself, so maybe you will want to test it, too... This sort of multi-threaded code is where concepts like "code coverage" -- and arguably even unit testing itself -- break down a bit. There are just too many possible interleavings of operations.
In practice, I would probably go with your original approach of sleeping.
template<class T>
struct async_queue {
T pop() {
auto l = lock();
++wait_count;
cv.wait( l, [&]{ return !data.empty(); } );
--wait_count;
auto r = std::move(data.front());
data.pop_front();
return r;
}
void push(T in) {
{
auto l = lock();
data.push_back( std::move(in) );
}
cv.notify_one();
}
void push_many(std::initializer_list<T> in) {
{
auto l = lock();
for (auto&& x: in)
data.push_back( x );
}
cv.notify_all();
}
std::size_t readers_waiting() {
return wait_count;
}
std::size_t data_waiting() const {
auto l = lock();
return data.size();
}
private:
std::deque<T> data; // std::deque provides the push_back/pop_front used above
std::condition_variable cv;
mutable std::mutex m;
std::atomic<std::size_t> wait_count{0};
auto lock() const { return std::unique_lock<std::mutex>(m); }
};
or somesuch.
In the push thread, busy wait on readers_waiting until it reaches 1.
At that point the reader holds (or has held) the lock and has incremented the counter, so it is about to enter cv.wait or is already inside it. Since push needs the same lock, it cannot proceed until cv.wait has released it. Do a push.
In theory an infinitely slow reader thread could have gotten into cv.wait and still be evaluating the predicate for the first time when you call push, but an infinitely slow reader thread is no different from a blocked one...
This does, however, deal with slow thread startup and the like.
Using readers_waiting and data_waiting for anything other than debugging is usually code smell.
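A test built on this idea might look roughly like the sketch below. It uses a trimmed-down copy of the queue above (backed by std::deque); the function pop_blocks_until_push, the value 42 and the 50 ms grace period are assumptions chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <future>
#include <mutex>
#include <thread>
#include <utility>

template <class T>
struct async_queue {
    T pop() {
        std::unique_lock<std::mutex> l(m);
        ++wait_count;  // published while the lock is held
        cv.wait(l, [&] { return !data.empty(); });
        --wait_count;
        T r = std::move(data.front());
        data.pop_front();
        return r;
    }
    void push(T in) {
        {
            std::lock_guard<std::mutex> l(m);
            data.push_back(std::move(in));
        }
        cv.notify_one();
    }
    std::size_t readers_waiting() const { return wait_count; }

private:
    std::deque<T> data;
    std::condition_variable cv;
    mutable std::mutex m;
    std::atomic<std::size_t> wait_count{0};
};

// Hypothetical test: returns true if pop() stayed blocked while the queue
// was empty and returned the pushed value once push() was called.
bool pop_blocks_until_push() {
    async_queue<int> queue;
    std::atomic<bool> pushed{false};

    std::future<bool> reader = std::async(std::launch::async, [&] {
        int v = queue.pop();
        return v == 42 && pushed.load();  // must not return before the push
    });

    // Busy wait until the reader has entered pop().
    while (queue.readers_waiting() < 1)
        std::this_thread::yield();

    // The queue is still empty, so the reader cannot have finished.
    bool still_blocked =
        reader.wait_for(std::chrono::milliseconds(50)) == std::future_status::timeout;

    pushed = true;
    queue.push(42);
    return still_blocked && reader.get();
}
```

The short wait_for is only a sanity check that pop() has not returned early; the ordering itself is established by readers_waiting, not by the delay.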
You can use a std::condition_variable to accomplish this. The cppreference.com page actually shows a very nice consumer-producer example which should be exactly what you are looking for: http://en.cppreference.com/w/cpp/thread/condition_variable
EDIT: Actually the German version of cppreference.com has an even better example :-) http://de.cppreference.com/w/cpp/thread/condition_variable

Thread pool stuck on wait condition

My C++ program gets stuck when using this thread pool class:
class ThreadPool {
unsigned threadCount;
std::vector<std::thread> threads;
std::list<std::function<void(void)> > queue;
std::atomic_int jobs_left;
std::atomic_bool bailout;
std::atomic_bool finished;
std::condition_variable job_available_var;
std::condition_variable wait_var;
std::mutex wait_mutex;
std::mutex queue_mutex;
std::mutex mtx;
void Task() {
while (!bailout) {
next_job()();
--jobs_left;
wait_var.notify_one();
}
}
std::function<void(void)> next_job() {
std::function<void(void)> res;
std::unique_lock<std::mutex> job_lock(queue_mutex);
// Wait for a job if we don't have any.
job_available_var.wait(job_lock, [this]()->bool { return queue.size() || bailout; });
// Get job from the queue
mtx.lock();
if (!bailout) {
res = queue.front();
queue.pop_front();
}else {
// If we're bailing out, 'inject' a job into the queue to keep jobs_left accurate.
res = [] {};
++jobs_left;
}
mtx.unlock();
return res;
}
public:
ThreadPool(int c)
: threadCount(c)
, threads(threadCount)
, jobs_left(0)
, bailout(false)
, finished(false)
{
for (unsigned i = 0; i < threadCount; ++i)
threads[i] = std::move(std::thread([this, i] { this->Task(); }));
}
~ThreadPool() {
JoinAll();
}
void AddJob(std::function<void(void)> job) {
std::lock_guard<std::mutex> lock(queue_mutex);
queue.emplace_back(job);
++jobs_left;
job_available_var.notify_one();
}
void JoinAll(bool WaitForAll = true) {
if (!finished) {
if (WaitForAll) {
WaitAll();
}
// note that we're done, and wake up any thread that's
// waiting for a new job
bailout = true;
job_available_var.notify_all();
for (auto& x : threads)
if (x.joinable())
x.join();
finished = true;
}
}
void WaitAll() {
std::unique_lock<std::mutex> lk(wait_mutex);
if (jobs_left > 0) {
wait_var.wait(lk, [this] { return this->jobs_left == 0; });
}
lk.unlock();
}
};
gdb says (when I interrupt the blocked execution) that it was stuck in (std::unique_lock&, ThreadPool::WaitAll()::{lambda()#1})+58>
I'm using g++ v5.3.0 with support for c++14 (-std=c++1y)
How can I avoid this problem?
Update
I've edited (rewrote) the class: https://github.com/edoz90/threadpool/blob/master/ThreadPool.h
The issue here is a race condition on your job count. You're using one mutex to protect the queue, and another to protect the count, which is semantically equivalent to the queue size. Clearly the second mutex is redundant (and improperly used), as is the jobs_left variable itself.
Every method that deals with the queue has to gain exclusive access to it (even JoinAll to read its size), so you should use the same queue_mutex in the three bits of code that tamper with it (JoinAll, AddJob and next_job).
Btw, splitting the code at next_job() is pretty awkward IMO. You would avoid calling a dummy function if you handled the worker thread body in a single function.
EDIT:
As other comments have already stated, you would probably be better off getting your eyes off the code and reconsidering the problem globally for a while.
The only thing you need to protect here is the job queue, so you need only one mutex.
Then there is the problem of waking up the various actors, which requires a condition variable since C++ basically does not give you any other useable synchronization object.
Here again you don't need more than one variable. Terminating the thread pool is equivalent to dequeueing the jobs without executing them, which can be done any which way, be it in the worker threads themselves (skipping execution if the termination flag is set) or in the JoinAll function (clearing the queue after gaining exclusive access).
Last but not least, you might want to invalidate AddJob once someone decided to close the pool, or else you could get stuck in the destructor while someone keeps feeding in new jobs.
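A minimal sketch of such a pool, with one mutex, one condition variable and an AddJob equivalent that refuses work once the pool is closing, could look like this. The names SimplePool, add_job, join_all and run_jobs are hypothetical, and this variant lets the workers drain the remaining jobs on shutdown rather than discarding them, which is only one of the options described above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class SimplePool {  // hypothetical name, not the asker's class
public:
    explicit SimplePool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            threads.emplace_back([this] { worker(); });
    }
    ~SimplePool() { join_all(); }

    // Returns false once the pool is closing, so nobody can keep feeding it.
    bool add_job(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            if (closing)
                return false;
            jobs.push_back(std::move(job));
        }
        cv.notify_one();
        return true;
    }

    // Stops accepting jobs, lets the workers drain the queue, then joins them.
    void join_all() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            closing = true;
        }
        cv.notify_all();
        for (auto &t : threads)
            if (t.joinable())
                t.join();
    }

private:
    void worker() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mtx);
                cv.wait(lock, [this] { return closing || !jobs.empty(); });
                if (jobs.empty())
                    return;  // closing and nothing left to do
                job = std::move(jobs.front());
                jobs.pop_front();
            }
            job();  // run the job without holding the mutex
        }
    }

    std::mutex mtx;  // the only mutex: it protects jobs and closing
    std::condition_variable cv;
    std::deque<std::function<void()>> jobs;
    bool closing = false;
    std::vector<std::thread> threads;  // declared last: started after the rest exists
};

// Hypothetical driver: runs `count` trivial jobs and returns how many ran.
int run_jobs(int count, unsigned workers) {
    std::atomic<int> done{0};
    SimplePool pool(workers);
    for (int i = 0; i < count; ++i)
        pool.add_job([&done] { ++done; });
    pool.join_all();
    bool accepted = pool.add_job([] {});  // rejected: the pool is closed
    assert(!accepted);
    return done;
}
```

There is no separate job counter here: "all jobs are finished" is simply "the queue is empty and every worker has been joined".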
I think you need to keep it simple.
You seem to be using one mutex too many. There is queue_mutex, which you use when you add and process jobs.
So what is the need for another, separate mutex when you are waiting on reading the queue?
Why can't you just use a condition variable with the same queue_mutex to read the queue in your WaitAll() method?
Update
I would also recommend using a lock_guard instead of the unique_lock in your WaitAll. There really isn't a need to lock the queue_mutex beyond the WaitAll under exceptional conditions. If you exit the WaitAll exceptionally it should be released regardless.
Update2
Ignore my update above. Since you are using a condition variable, you can't use a lock_guard in WaitAll. But if you are using a unique_lock, always go with the try_to_lock version, especially if you have more than a couple of control paths.

Communication b/w two threads over a common datastructure. Design Issue

I currently have two threads, a producer and a consumer. The producer is a static method that inserts data into a static Deque-type container and informs the consumer through a boost::condition_variable that an object has been inserted. The consumer then reads the data from the deque and removes it from the container.
Here is an abstract of what is happening. This is the code for the consumer and producer
//Static Method : This is the producer. Different classes add data to the container using this method
void C::Add_Data(obj a)
{
try
{
int a = MyContainer.size();
UpdateTextBoxA("Current Size is " + a);
UpdateTextBoxB("Running");
MyContainer.push_back(a);
condition_consumer.notify_one(); //This condition is static member
UpdateTextBoxB("Stopped");
}
catch (std::exception& e)
{
std::string err = e.what();
}
}//end method
//Consumer Method - Runs in a separate independent thread
void C::Read_Data()
{
while(true)
{
boost::mutex::scoped_lock lock(mutex_c);
while(MyContainer.size()!=0)
{
try
{
obj a = MyContainer.front();
....
....
....
MyContainer.pop_front();
}
catch (std::exception& e)
{
std::string err = e.what();
}
}
condition_consumer.wait(lock);
}
}//end method
Objects are inserted into the deque very quickly, about 500 objects a second. While running this I noticed that TextBoxB always showed "Stopped", while I believe it was supposed to toggle between "Running" and "Stopped". It is also very slow. Any suggestions on what I might not have considered or might be doing wrong?
1) You should do MyContainer.push_back(a); under the mutex, otherwise you get a data race, which is undefined behaviour (you may also need to protect MyContainer.size() with the mutex, depending on its type and the C++ standard/compiler version you use).
2) void C::Read_Data() should be:
void C::Read_Data()
{
scoped_lock slock(mutex_c);
while(true) // you may also need some exit condition/mechanism
{
condition_consumer.wait(slock,[&]{return !MyContainer.empty();});
// at this line MyContainer.empty()==false and slock is locked
// so you may pop value from deque
}
}
3) You are mixing logic of concurrent queue with logic of producing/consuming. Instead you may isolate concurrent queue part to stand-alone entity:
LIVE DEMO
// C++98
template<typename T>
class concurrent_queue
{
queue<T> q;
mutable mutex m;
mutable condition_variable c;
public:
void push(const T &t)
{
(lock_guard<mutex>(m)),
q.push(t),
c.notify_one();
}
void pop(T &result)
{
unique_lock<mutex> u(m);
while(q.empty())
c.wait(u);
result = q.front();
q.pop();
}
};
Thanks for your reply. Could you explain the second parameter in the conditional wait statement [&]{return !MyContainer.empty();}
There is a second overload of condition_variable::wait which takes a predicate as its second parameter. It basically waits while that predicate is false, which takes care of spurious wake-ups.
[&]{return !MyContainer.empty();} is a lambda function, a feature new in C++11 that lets you define functions in place. If you don't have C++11, just write a stand-alone predicate or use the one-argument version of wait with a manual while loop:
while(MyContainer.empty()) condition_consumer.wait(lock);
One question: in your 3rd point you suggested that I isolate the entire queue, while my method for adding to the queue is static and the consumer (queue reader) runs forever in a separate thread. Could you tell me why that is a flaw in my design?
There is no problem with "runs forever" or with static. You can even make a static concurrent_queue<T> member if your design requires that.
The flaw is that multithreaded synchronization is coupled with other kinds of work. When you have a concurrent_queue, all synchronization is isolated inside that primitive, and the code which produces/consumes data is not polluted with locks and waits:
concurrent_queue<int> c;
thread producer([&]
{
for(int i=0;i!=100;++i)
c.push(i);
});
thread consumer([&]
{
int x;
do{
c.pop(x);
std::cout << x << std::endl;
}while(x!=11);
});
producer.join();
consumer.join();
As you can see, there is no "manual" synchronization of push/pop, and the code is much cleaner.
Moreover, when you decouple your components in this way, you can test them in isolation, and they become more reusable.