I created the following code for a project where I don't have access to any modern C++ threading libraries like boost. My desire is to have the ability to have the lock automatically release when it leaves scope.
The Shared lock works fine. If a thread acquires it, nothing else can acquire it until the first thread releases it. The Scoped one does not work though.
Here's some output showing what I mean. I gave each thread a distinct name, had them instantiate the Scoped lock with the same Shared lock, print 'acquired', sleep for five seconds, print 'released', then leave scope. Instead of getting the acquire/release pairs I'd expect, I get four 'acquired's in quick succession, a five second gap, then the 'released's. I even changed the lock in Scoped to a pointer, and printed the address before acquiring it, just to make sure I wasn't crazy. It looks like it's the same Shared object, but the lock isn't preventing multiple accesses.
Lock '140734928395200'.
acquired: !!!!!
Lock '140734928395200'.
acquired: -------
Lock '140734928395200'.
acquired: ***************
Lock '140734928395200'.
acquired: ##
released: !!!!!
released: -------
released: ***************
released: ##
Here's the source code for Lock.h:
#include <pthread.h>
namespace Lock
{
class Shared
{
public:
Shared()
{
::pthread_mutex_init(&(this->mutex), nullptr);
}
~Shared()
{
}
void acquire()
{
::pthread_mutex_lock(&(this->mutex));
}
void release()
{
::pthread_mutex_unlock(&(this->mutex));
}
private:
pthread_mutex_t mutex;
};
class Scoped
{
public:
Scoped(Lock::Shared& lock) : lock(lock)
{
this->lock.acquire();
}
virtual ~Scoped()
{
this->lock.release();
}
private:
Lock::Shared& lock;
};
};
Here's my main.cc file for testing. I'm building with:
g++ -std=c++11 -o try -pthread main.cc && ./try
with g++4.7 on an up to date Ubuntu system.
#include <pthread.h>
#include <iostream>
#include "Lock.h"
#include <unistd.h>
struct data
{
data(std::string name, Lock::Shared& lock) : name(name), lock(lock) { ; }
std::string name;
Lock::Shared& lock;
};
void* doStuff(void* v)
{
data* d = (data*)v;
for (int i = 0; i < 5; i++)
{
Lock::Scoped(d->lock);
//d->lock->acquire();
std::cout << "acquired: " << d->name << std::endl;
::sleep(5);
std::cout << "released: " << d->name << std::endl;
//d->lock->release();
::sleep(1);
}
}
int main(int argc, char* argv[])
{
pthread_t fred;
pthread_t barney;
pthread_t wilma;
pthread_t betty;
Lock::Shared lock;
data f("##", lock);
data b("***************", lock);
data w("-------", lock);
data e("!!!!!", lock);
::pthread_create(&fred, nullptr, doStuff, (void*)&f);
::pthread_create(&barney, nullptr, doStuff, (void*)&b);
::pthread_create(&wilma, nullptr, doStuff, (void*)&w);
::pthread_create(&betty, nullptr, doStuff, (void*)&e);
::pthread_join(fred, nullptr);
::pthread_join(barney, nullptr);
::pthread_join(wilma, nullptr);
::pthread_join(betty, nullptr);
return 0;
}
The problem is:
for (int i = 0; i < 5; i++)
{
Lock::Scoped(d->lock);
which creates a temporary Lock::Scoped that is constructed and destroyed immediately, so it has no synchronizing effect. Change to:
for (int i = 0; i < 5; i++)
{
Lock::Scoped lk(d->lock);
The problem is here:
Lock::Scoped(d->lock);
This creates an unnamed temporary that goes out of scope right away.
To fix, give it a name:
Lock::Scoped lck(d->lock);
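The difference between the two spellings can be checked directly. The following is a minimal sketch, not the original Lock.h: Shared, ScopedGuard, isHeld and the two heldAfter... functions are hypothetical names introduced for illustration, and the mutex is assumed to be a default (non-recursive) pthread mutex so that pthread_mutex_trylock reports whether it is currently locked.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical holder, standing in for the object reached through d->lock.
struct Shared
{
    Shared()
    {
        int rc = ::pthread_mutex_init(&mutex, nullptr);
        assert(rc == 0);
        (void)rc;
    }
    ~Shared() { ::pthread_mutex_destroy(&mutex); }
    pthread_mutex_t mutex;
};

// Hypothetical RAII guard: locks in the constructor, unlocks in the destructor.
class ScopedGuard
{
public:
    explicit ScopedGuard(pthread_mutex_t& m) : mutex(m) { ::pthread_mutex_lock(&mutex); }
    ~ScopedGuard() { ::pthread_mutex_unlock(&mutex); }
    ScopedGuard(const ScopedGuard&) = delete;
    ScopedGuard& operator=(const ScopedGuard&) = delete;
private:
    pthread_mutex_t& mutex; // reference to the shared mutex, never a copy
};

// Returns true if the (non-recursive) mutex is locked at the time of the call.
bool isHeld(pthread_mutex_t& m)
{
    if (::pthread_mutex_trylock(&m) == 0)
    {
        ::pthread_mutex_unlock(&m);
        return false;
    }
    return true;
}

bool heldAfterUnnamedTemporary(Shared* s)
{
    ScopedGuard(s->mutex);        // temporary: locked and unlocked within this statement
    return isHeld(s->mutex);      // the mutex is already free again
}

bool heldAfterNamedGuard(Shared* s)
{
    ScopedGuard guard(s->mutex);  // named: lives until the end of the function
    return isHeld(s->mutex);      // the mutex is still locked here
}
```

With a Shared object s, heldAfterUnnamedTemporary(&s) should report false and heldAfterNamedGuard(&s) should report true, which matches the four 'acquired' lines printed in quick succession in the question.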
I'm trying to write a thread pool in c++ that fulfills the following criteria:
a single writer occasionally writes a new input value, and once it does, many threads concurrently access this same value, and each spit out a random floating point number.
each worker thread uses the same function, so there's no reason to build a thread-safe queue for all the different functions. I store the common function inside the thread_pool class.
these functions are by far the most computationally-intensive aspect of the program. Any locks that prevent these functions from doing their work is the primary thing I'm trying to avoid.
the floating point output from all these functions is simply averaged.
the user has a single function called thread_pool::start_work that changes this shared input, and tells all the workers to work for a fixed number of tasks.
thread_pool::start_work returns std::future
Below is what I have so far. It can be built and run with g++ test_tp.cpp -std=c++17 -lpthread; ./a.out Unfortunately it either deadlocks or does the work too many (or sometimes too few) times. I am thinking that it's because m_num_comps_done is not thread-safe. There are chances that all the threads skip over the last count, and then they all end up yielding. But isn't this variable atomic?
#include <vector>
#include <thread>
#include <mutex>
#include <shared_mutex>
#include <queue>
#include <atomic>
#include <future>
#include <iostream>
#include <numeric>
/**
* @class join_threads
* @brief RAII thread killer
*/
class join_threads
{
std::vector<std::thread>& m_threads;
public:
explicit join_threads(std::vector<std::thread>& threads_)
: m_threads(threads_) {}
~join_threads() {
for(unsigned long i=0; i < m_threads.size(); ++i) {
if(m_threads[i].joinable())
m_threads[i].join();
}
}
};
// how remove the first two template parameters ?
template<typename func_input_t, typename F>
class thread_pool
{
using func_output_t = typename std::result_of<F(func_input_t)>::type;
static_assert( std::is_floating_point<func_output_t>::value,
"function output type must be floating point");
unsigned m_num_comps;
std::atomic_bool m_done;
std::atomic_bool m_has_an_input;
std::atomic<int> m_num_comps_done; // need to be atomic? why?
F m_f; // same function always used
func_input_t m_param; // changed occasionally by a single writer
func_output_t m_working_output; // many reader threads average all their output to get this
std::promise<func_output_t> m_out;
mutable std::shared_mutex m_mut;
mutable std::mutex m_output_mut;
std::vector<std::thread> m_threads;
join_threads m_joiner;
void worker_thread() {
while(!m_done)
{
if(m_has_an_input){
if( m_num_comps_done.load() < m_num_comps - 1 ) {
std::shared_lock<std::shared_mutex> lk(m_mut);
func_output_t tmp = m_f(m_param); // long time
m_num_comps_done++;
// quick
std::lock_guard<std::mutex> lk2(m_output_mut);
m_working_output += tmp / m_num_comps;
}else if(m_num_comps_done.load() == m_num_comps - 1){
std::shared_lock<std::shared_mutex> lk(m_mut);
func_output_t tmp = m_f(m_param); // long time
m_num_comps_done++;
std::lock_guard<std::mutex> lk2(m_output_mut);
m_working_output += tmp / m_num_comps;
m_num_comps_done++;
try{
m_out.set_value(m_working_output);
}catch(std::future_error& e){
std::cout << "future_error caught: " << e.what() << "\n";
}
}else{
std::this_thread::yield();
}
}else{
std::this_thread::yield();
}
}
}
public:
/**
* @brief ctor spawns working threads
*/
thread_pool(F f, unsigned num_comps)
: m_num_comps(num_comps)
, m_done(false)
, m_has_an_input(false)
, m_joiner(m_threads)
, m_f(f)
{
unsigned const thread_count=std::thread::hardware_concurrency(); // should I subtract one?
try {
for(unsigned i=0; i<thread_count; ++i) {
m_threads.push_back( std::thread(&thread_pool::worker_thread, this));
}
} catch(...) {
m_done=true;
throw;
}
}
~thread_pool() {
m_done=true;
}
/**
* @brief changes the shared data member,
* resets the num_comps_left variable,
* resets the accumulator thing to 0, and
* resets the promise object
*/
std::future<func_output_t> start_work(func_input_t new_param) {
std::unique_lock<std::shared_mutex> lk(m_mut);
m_param = new_param;
m_num_comps_done = 0;
m_working_output = 0.0;
m_out = std::promise<func_output_t>();
m_has_an_input = true; // only really matters just after initialization
return m_out.get_future();
}
};
double slowSum(std::vector<double> nums) {
// std::this_thread::sleep_for(std::chrono::milliseconds(200));
return std::accumulate(nums.begin(), nums.end(), 0.0);
}
int main(){
// construct
thread_pool<std::vector<double>, std::function<double(std::vector<double>)>>
le_pool(slowSum, 1000);
// add work
auto ans = le_pool.start_work(std::vector<double>{1.2, 3.2, 4213.1});
std::cout << "final answer is: " << ans.get() << "\n";
std::cout << "it should be 4217.5\n";
return 1;
}
You check the "done" count, then get the lock. This allows multiple threads to be waiting for the lock. In particular, there might not be a thread that enters the second if body.
The other side of that is because you have all threads running all the time, the "last" thread may not get access to its exclusive section early (before enough threads have run) or even late (because additional threads are waiting at the mutex in the first loop).
To fix the first issue, since the second if block has all of the same code that is in the first if block, you can have just one block that checks the count to see if you've reached the end and should set the out value.
The second issue requires you to check m_num_comps_done a second time after acquiring the mutex.
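One way to sketch both fixes together is shown below. This is not the original thread_pool: runAveraged is a hypothetical free function, the input is fixed for the whole run (so the shared_mutex is left out), and an atomic fetch_add is swapped in to hand out each computation exactly once instead of re-checking the count under the mutex. There is a single code path, and the completion check happens while the output mutex is held, so exactly one thread sets the promise.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: calls f(param) exactly numComps times across
// numThreads workers and returns the average of the results.
double runAveraged(const std::function<double(double)>& f, double param,
                   unsigned numComps, unsigned numThreads)
{
    assert(numComps > 0 && numThreads > 0); // otherwise the promise is never set

    std::atomic<unsigned> claimed{0}; // number of computations handed out so far
    unsigned done = 0;                // protected by outputMutex
    double sum = 0.0;                 // protected by outputMutex
    std::mutex outputMutex;
    std::promise<double> result;
    std::future<double> fut = result.get_future();

    auto worker = [&] {
        // fetch_add returns each index once, so no computation runs twice
        // and none is skipped.
        while (claimed.fetch_add(1) < numComps)
        {
            double tmp = f(param); // long computation, no lock held

            std::lock_guard<std::mutex> lk(outputMutex);
            sum += tmp / numComps;
            // Checked under the lock: only the thread that finishes the
            // last computation sees done == numComps.
            if (++done == numComps)
                result.set_value(sum);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < numThreads; ++i)
        threads.emplace_back(worker);

    double answer = fut.get();
    for (auto& t : threads)
        t.join();
    return answer;
}
```

A call such as runAveraged(f, 8.0, 1024, 4) should invoke f exactly 1024 times regardless of how the threads interleave.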
I'm working on an assignment for school, one of the requirements of which is that I cannot use global variables, but I do need static variables for shared memory. The premise of the assignment is to use the pthread library and semaphores to ensure that created threads execute in reverse order. I've gotten it to work with global static semaphore/condvar/mutex as such:
#include <pthread.h>
#include <stdio.h>
#include <iostream>
#include <semaphore.h>
using namespace std;
#define NUM 5
static sem_t threadCounter;
static pthread_cond_t nextThreadCond = PTHREAD_COND_INITIALIZER;
static pthread_cond_t makingThreadCond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t makingThreadMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t nextThreadMutex = PTHREAD_MUTEX_INITIALIZER;
void *wait_func(void *args)
{
// cout<<"Waiting"<<endl;
// pthread_cond_wait(&makingThreadCond, &makingThreadMutex);
// cout<<"Woke up"<<endl;
int tid = *((int *)args);
int val;
sem_getvalue(&threadCounter, &val);
// cout << tid << ":" << val << endl;
while (tid != val-1)
{
pthread_cond_wait(&nextThreadCond, &nextThreadMutex);
sem_getvalue(&threadCounter, &val);
// cout<<"evaluating condition in"<<tid<<", val is "<<val<<endl;
}
sem_wait(&threadCounter); // decrement threadCounter
// cout << "after decrement" << endl;
sem_getvalue(&threadCounter, &val);
// cout << "decremented val "<<val << endl;
cout<<"Exiting thread #"<<tid<<endl;
pthread_mutex_unlock(&nextThreadMutex);
// cout<<"after nextThreadMutex unlock"<<endl;
pthread_cond_broadcast(&nextThreadCond);
// cout<<"after nextThreadCond broadcast"<<endl;
}
int main()
{
pthread_t tid[NUM];
if (sem_init(&threadCounter, 0, NUM) < 0)
{
cout << "Failed to init sem" << endl;
}
for (int i = 0; i < NUM; i++)
{
int *argId = (int *)malloc(sizeof(*argId));
*argId = i;
if (pthread_create(&tid[i], NULL, wait_func, argId))
{
cout << "Couldn't make thread " << i << endl;
}
}
for (int i = 0; i < NUM; i++)
{
pthread_join(tid[i], NULL);
}
}
But, as I said, this isn't allowed, so I tried to convert it to share them through a struct passed in via the pthread_create argument, as such:
#include <pthread.h>
#include <stdio.h>
#include <iostream>
#include <semaphore.h>
using namespace std;
#define NUM 5
struct args
{
int tid;
sem_t* sem;
pthread_cond_t* cond;
pthread_mutex_t* mut;
};
void *wait_func(void *args_ptr)
{
// cout<<"Waiting"<<endl;
// pthread_cond_wait(&makingThreadCond, &makingThreadMutex);
// cout<<"Woke up"<<endl;
struct args* args = (struct args*) args_ptr;
int tid = (args->tid);
pthread_cond_t cond = *(args->cond);
pthread_mutex_t mut = *(args->mut);
sem_t sem = *(args->sem);
int val;
sem_getvalue(&sem, &val);
// cout << tid << ":" << val << endl;
while (tid != val - 1)
{
pthread_cond_wait(&cond, &mut);
sem_getvalue(&sem, &val);
// cout<<"evaluating condition in"<<tid<<", val is "<<val<<endl;
}
sem_wait(&sem); // decrement threadCounter
// cout << "after decrement" << endl;
sem_getvalue(&sem, &val);
// cout << "decremented val "<<val << endl;
cout << "Exiting thread #" << tid << endl;
pthread_mutex_unlock(&mut);
// cout<<"after nextThreadMutex unlock"<<endl;
pthread_cond_broadcast(&cond);
// cout<<"after nextThreadCond broadcast"<<endl;
}
int main()
{
static sem_t threadCounter;
static pthread_cond_t nextThreadCond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t nextThreadMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_t tid[NUM];
if (sem_init(&threadCounter, 0, NUM) < 0)
{
cout << "Failed to init sem" << endl;
}
for (int i = 0; i < NUM; i++)
{
int *argId = (int *)malloc(sizeof(*argId));
*argId = i;
struct args args;
args.tid = *argId;
args.sem = &threadCounter;
args.cond = &nextThreadCond;
args.mut = &nextThreadMutex;
if (pthread_create(&tid[i], NULL, wait_func, &args))
{
cout << "Couldn't make thread " << i << endl;
}
}
// cout << "Before posting sem" << endl;
// sem_post(&makingThreads);
// cout << "Sem posetd" << endl;
// cout<<"Broadcasting"<<endl;
// pthread_cond_broadcast(&makingThreadCond);
for (int i = 0; i < NUM; i++)
{
pthread_join(tid[i], NULL);
}
}
This gets stuck immediately with "Exiting thread #4" twice. I would think that the second code is equivalent to the first, just without global variables but there must be something I'm missing.
struct args args;
This declares an object inside the scope of your for loop. When execution reaches the end of the for loop, this object gets destroyed -- like any other object that's declared locally within a function or within some inner scope -- and this happens before either the loop starts again from the beginning, or if the for loop stops iterating altogether. Either way, as soon as execution reaches the next } this object goes away. It is gone for good. It gets destroyed. It is no more. It joins the choir-invisible. It becomes an ex-object.
But before that happens, before the end of this loop, the following occurs:
if (pthread_create(&tid[i], NULL, wait_func, &args))
So you start a new execution thread, and pass it a pointer to this object, which is about to meet its maker.
And as soon as pthread_create() returns, that's the end of the loop and your args object is gone, and the abovementioned happens: it gets destroyed; it is no more; it joins the choir-invisible; and it becomes an ex-object.
And the C and the C++ standards give you absolutely no guarantees whatsoever, that your new execution thread actually starts running, and reaches the point where it reads this pointer, and what it's pointing to, before the end of this loop gets reached.
And, more likely than not, each new execution thread doesn't get around to reading the pointer to the args object, in the main execution thread, until long after it gets destroyed. So it grabs stuff from a pointer to a destroyed object. Goodbye.
As such, this execution thread's actions become undefined behavior.
This explains the random, unpredictable behavior that you've observed.
The usual approach is to malloc or new everything that gets passed to your new execution thread, and pass to the execution thread a pointer to the newed or malloced object.
It is also possible to carefully write some code that will make the main execution thread stop and wait until the new execution thread retrieves whatever it needs to do, and then proceeds on its own. A bunch more code will be needed to implement that approach, if you so choose.
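The second approach, where the main thread waits until the new thread has retrieved what it needs, could look roughly like the sketch below. It assumes POSIX unnamed semaphores are available (as on Linux); StartArgs, worker, startAll and the out array are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <semaphore.h>
#include <vector>

// Hypothetical argument block that lives on the starting thread's stack.
struct StartArgs
{
    int tid;
    sem_t* copied; // posted by the new thread once it has copied everything
    int* out;      // array where each thread records the tid it received
};

void* worker(void* p)
{
    StartArgs* a = static_cast<StartArgs*>(p);
    // Copy everything needed out of *a before signalling.
    int tid = a->tid;
    int* out = a->out;
    sem_t* copied = a->copied;
    sem_post(copied); // from here on, *a may already be destroyed

    out[tid] = tid;   // uses only the local copies
    return nullptr;
}

// Starts n threads, passing each a pointer to a loop-local StartArgs.
// Returns true if all n threads were started.
bool startAll(int n, int* out)
{
    assert(n >= 0);
    sem_t copied;
    if (sem_init(&copied, 0, 0) != 0)
        return false;

    std::vector<pthread_t> tids(n);
    int started = 0;
    for (int i = 0; i < n; ++i)
    {
        StartArgs args{i, &copied, out};
        if (pthread_create(&tids[i], nullptr, worker, &args) != 0)
            break;
        ++started;
        // Block until the new thread has copied args; retry if interrupted.
        while (sem_wait(&copied) != 0 && errno == EINTR) { }
    } // args is destroyed here, but only after the thread is done with it

    for (int i = 0; i < started; ++i)
        pthread_join(tids[i], nullptr);
    sem_destroy(&copied);
    return started == n;
}
```

The cost of this design is that thread creation becomes serialized: each pthread_create is followed by a wait, which is the extra code the answer alludes to.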
Your code also has evidence of your initial attempts to take this approach:
int *argId = (int *)malloc(sizeof(*argId));
*argId = i;
struct args args;
args.tid = *argId;
mallocing this pointer, assigning to it, then copying it to args.tid accomplishes absolutely nothing useful. The same thing can be done simply by:
struct args args;
args.tid = i;
The only thing that malloc does is leak memory. Furthermore, this whole args object, declared as a local variable in the for loop's inner scope, is doomed for the reasons explained above.
P.S. When taking the "malloc the entire args object" approach, this also will leak memory unless you also take measures to diligently free the malloced object, when it is appropriate to do so.
You are passing a pointer to the local variable args to pthread_create. The variable's lifetime ends when the for loop iteration ends and the pointer becomes dangling.
The thread may be accessing it later though, causing undefined behavior.
You need to allocate args dynamically (but not argId) and pass that to the thread. The thread function must then ensure that the pointer is deleted. Also, don't give your variables the same name as a type; that is very confusing. The struct keyword in a variable declaration is generally not needed in C++ (as long as variables and types have different names) and may cause other issues when used without reason, so drop it and name things differently.
struct Args
{
int tid;
sem_t* sem;
pthread_cond_t* cond;
pthread_mutex_t* mut;
};
//...
auto args = new Args{i, &threadCounter, &nextThreadCond, &nextThreadMutex};
if (pthread_create(&tid[i], NULL, wait_func, args))
{
cout << "Couldn't make thread " << i << endl;
}
and at the end of the thread function delete the pointer:
void *wait_func(void *args_ptr)
{
auto args = static_cast<Args*>(args_ptr);
//...
delete args;
}
static_cast is safer than the C style cast, since it is much more restricted in the types it can cast between and e.g. can't accidentally drop a const or anything similar.
None of the variables seem to have a reason to be static either in the global or local case.
pthread_cond_t cond = *(args->cond);
pthread_mutex_t mut = *(args->mut);
This tries to create a new condition variable and mutex and initialize it based on the value of the condition variable and mutex pointed to. That doesn't make sense and won't work.
while (tid != val - 1)
{
pthread_cond_wait(&cond, &mut);
sem_getvalue(&sem, &val);
// cout<<"evaluating condition in"<<tid<<", val is "<<val<<endl;
Here, you pass to pthread_cond_wait a pointer to the local condition variable and mutex you created above rather than a pointer to the shared one. Look at this code:
int a;
foo(&a);
void foo(int* a)
{
int b = *a;
bar(&b); // If bar changes what its argument points to, only b changes, not the caller's a!
}
See the problem? You passed bar a pointer to b, not a pointer to the caller's a. So if bar changes the thing the pointer points to, it won't modify a, only the local copy b.
Don't try to create mutexes or condition variables that are copies of other mutexes or condition variables. It doesn't make semantic sense and it won't work.
Instead, you can do this:
pthread_cond_t* cond = (args->cond);
pthread_mutex_t* mut = (args->mut);
Now you can pass cond and mut to pthread_cond_wait, and you'll be passing pointers to the shared synchronization objects.
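Putting the answers together, a version without globals might look like the sketch below. It is one possible arrangement rather than the asker's final code: the order vector and the runInReverse function are hypothetical additions so the exit order can be inspected, the Args object is heap-allocated per thread, and the mutex is locked before pthread_cond_wait is called (the wait requires the caller to hold the mutex, which the code in the question never does).

```cpp
#include <cassert>
#include <pthread.h>
#include <semaphore.h>
#include <vector>

struct Args
{
    int tid;
    sem_t* sem;
    pthread_cond_t* cond;
    pthread_mutex_t* mut;
    std::vector<int>* order; // hypothetical addition: records the exit order
};

void* wait_func(void* args_ptr)
{
    Args* args = static_cast<Args*>(args_ptr);

    pthread_mutex_lock(args->mut); // must be held before pthread_cond_wait
    int val = 0;
    sem_getvalue(args->sem, &val);
    while (args->tid != val - 1)
    {
        // Pointers to the shared objects, not to local copies.
        pthread_cond_wait(args->cond, args->mut);
        sem_getvalue(args->sem, &val);
    }
    sem_wait(args->sem);               // decrement; the value is at least 1 here
    args->order->push_back(args->tid); // protected by the mutex
    pthread_mutex_unlock(args->mut);
    pthread_cond_broadcast(args->cond);

    delete args; // the thread owns its argument block
    return nullptr;
}

// Starts n threads and returns the order in which they exited.
std::vector<int> runInReverse(int n)
{
    sem_t counter;
    pthread_cond_t cond;
    pthread_mutex_t mut;
    int rc = sem_init(&counter, 0, n);
    assert(rc == 0);
    rc = pthread_cond_init(&cond, nullptr);
    assert(rc == 0);
    rc = pthread_mutex_init(&mut, nullptr);
    assert(rc == 0);

    std::vector<int> order;
    std::vector<pthread_t> tids(n);
    for (int i = 0; i < n; ++i)
    {
        Args* args = new Args{i, &counter, &cond, &mut, &order};
        rc = pthread_create(&tids[i], nullptr, wait_func, args);
        assert(rc == 0); // a missing thread would leave the others waiting
    }
    for (int i = 0; i < n; ++i)
        pthread_join(tids[i], nullptr);

    pthread_mutex_destroy(&mut);
    pthread_cond_destroy(&cond);
    sem_destroy(&counter);
    (void)rc;
    return order;
}
```

Because the counter is only read and decremented while the mutex is held, a thread cannot miss the broadcast between checking the value and going to sleep.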
I have simple code: first thread pushes std::strings to the std::list, and second thread pops std::strings from this std::list. All std::lists operations are protected with std::mutex m. This code permanently prints error to console: "Error: lst.begin() == lst.end()".
If I replace std::lock_guard with construction m.lock() and m.unlock() the code begins work correctly. What is wrong with std::lock_guard?
#include <iostream>
#include <thread>
#include <mutex>
#include <list>
#include <string>
std::mutex m;
std::list<std::string> lst;
void f2()
{
for (int i = 0; i < 5000; ++i)
{
std::lock_guard<std::mutex> { m };
lst.push_back(std::to_string(i));
}
m.lock();
lst.push_back("-1"); // last list's element
m.unlock();
}
void f1()
{
std::string str;
while (true)
{
m.lock();
if (!lst.empty())
{
if (lst.begin() == lst.end())
{
std::cerr << "Error: lst.begin() == lst.end()" << std::endl;
}
str = lst.front();
lst.pop_front();
m.unlock();
if (str == "-1")
{
break;
}
}
else
{
m.unlock();
std::this_thread::yield();
}
}
}
// tested in MSVS2017
int main()
{
std::thread tf2{ f2 };
f1();
tf2.join();
}
You did not obey CppCoreGuidelines CP.44: Remember to name your lock_guards and unique_locks :).
In
for (int i = 0; i < 5000; ++i)
{
std::lock_guard<std::mutex> { m };
lst.push_back(std::to_string(i));
}
you are only creating a temporary std::lock_guard object which is created and destroyed immediately. You need to name the object like in
{
std::lock_guard<std::mutex> lg{ m };
lst.push_back(std::to_string(i));
}
so that the lock guard lives until the end of the block.
And as you already recognized (CppCoreGuidelines):
Use RAII lock guards (lock_guard, unique_lock, shared_lock), never call mutex.lock and mutex.unlock directly (RAII)
If you are using Microsoft Visual Studio, I recommend using the code analysis and activating at least the Microsoft Native Recommended Rules. If you do this you will get a compiler analysis warning.
warning C26441: Guard objects must be named (cp.44).
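Following the guideline quoted above, the consumer can also be written without direct m.lock() and m.unlock() calls by giving the guard its own block. This is a sketch under assumptions: SharedList, produce, consume and transfer are hypothetical names, and the globals from the question are moved into a struct so the example is self-contained.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical bundle of the mutex and the list it protects.
struct SharedList
{
    std::mutex m;
    std::list<std::string> lst;
};

void produce(SharedList& s, int count)
{
    for (int i = 0; i < count; ++i)
    {
        std::lock_guard<std::mutex> lg{ s.m }; // named: held until the closing brace
        s.lst.push_back(std::to_string(i));
    }
    std::lock_guard<std::mutex> lg{ s.m };
    s.lst.push_back("-1"); // last list element
}

// Pops elements until the "-1" sentinel arrives; returns how many came before it.
int consume(SharedList& s)
{
    int received = 0;
    std::string str;
    while (true)
    {
        bool got = false;
        {
            std::lock_guard<std::mutex> lg{ s.m };
            if (!s.lst.empty())
            {
                str = std::move(s.lst.front());
                s.lst.pop_front();
                got = true;
            }
        } // mutex released here, before yielding or comparing
        if (!got)
        {
            std::this_thread::yield();
            continue;
        }
        if (str == "-1")
            break;
        ++received;
    }
    return received;
}

// Runs one producer thread and consumes on the calling thread.
int transfer(int count)
{
    assert(count >= 0);
    SharedList s;
    std::thread producer(produce, std::ref(s), count);
    int n = consume(s);
    producer.join();
    return n;
}
```

The inner braces in consume play the role of the manual unlock: the guard's destructor runs at the closing brace on every path out of the block.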
I'm trying to stop multiple worker threads using a std::atomic_flag. Starting from Issue using std::atomic_flag with worker thread the following works:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>
std::atomic_flag continueFlag;
std::thread t;
void work()
{
while (continueFlag.test_and_set(std::memory_order_relaxed)) {
std::cout << "work ";
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
void start()
{
continueFlag.test_and_set(std::memory_order_relaxed);
t = std::thread(&work);
}
void stop()
{
continueFlag.clear(std::memory_order_relaxed);
t.join();
}
int main()
{
std::cout << "Start" << std::endl;
start();
std::this_thread::sleep_for(std::chrono::milliseconds(200));
std::cout << "Stop" << std::endl;
stop();
std::cout << "Stopped." << std::endl;
return 0;
}
Trying to rewrite into multiple worker threads:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>
#include <memory>
struct thread_data {
std::atomic_flag continueFlag;
std::thread thread;
};
std::vector<thread_data> threads;
void work(int threadNum, std::atomic_flag &continueFlag)
{
while (continueFlag.test_and_set(std::memory_order_relaxed)) {
std::cout << "work" << threadNum << " ";
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
void start()
{
const unsigned int numThreads = 2;
for (int i = 0; i < numThreads; i++) {
////////////////////////////////////////////////////////////////////
//PROBLEM SECTOR
////////////////////////////////////////////////////////////////////
thread_data td;
td.continueFlag.test_and_set(std::memory_order_relaxed);
td.thread = std::thread(&work, i, td.continueFlag);
threads.push_back(std::move(td));
////////////////////////////////////////////////////////////////////
//PROBLEM SECTOR
////////////////////////////////////////////////////////////////////
}
}
void stop()
{
//Flag stop
for (auto &data : threads) {
data.continueFlag.clear(std::memory_order_relaxed);
}
//Join
for (auto &data : threads) {
data.thread.join();
}
threads.clear();
}
int main()
{
std::cout << "Start" << std::endl;
start();
std::this_thread::sleep_for(std::chrono::milliseconds(200));
std::cout << "Stop" << std::endl;
stop();
std::cout << "Stopped." << std::endl;
return 0;
}
My issue is the "Problem Sector" above, namely creating the threads. I cannot wrap my head around how to instantiate the threads and pass the variables to the worker thread.
The error right now is referencing this line threads.push_back(std::move(td)); with error Error C2280 'thread_data::thread_data(const thread_data &)': attempting to reference a deleted function.
Trying to use unique_ptr like this:
auto td = std::make_unique<thread_data>();
td->continueFlag.test_and_set(std::memory_order_relaxed);
td->thread = std::thread(&work, i, td->continueFlag);
threads.push_back(std::move(td));
This gives the error 'std::atomic_flag::atomic_flag(const std::atomic_flag &)': attempting to reference a deleted function, at the line td->thread = std::thread(&work, i, td->continueFlag); Am I fundamentally misunderstanding the use of std::atomic_flag? Is it really both immovable and uncopyable?
Your first approach was actually closer to the truth. The problem is that it passed a reference to an object within the local for loop scope to each thread, as a parameter. But, of course, once the loop iteration ended, that object went out of scope and got destroyed, leaving each thread with a reference to a destroyed object, resulting in undefined behavior.
Nobody cared about the fact that you moved the object into the std::vector, after creating the thread. The thread received a reference to a locally-scoped object, and that's all it knew. End of story.
Moving the object into the vector first, and then passing to each thread a reference to the object in the std::vector will not work either. As soon as the vector internally reallocates, as part of its natural growth, you'll be in the same pickle.
What needs to happen is to have the entire threads array created first, before actually starting any std::threads. If the RAII principle is religiously followed, that means nothing more than a simple call to std::vector::resize().
Then, in a second loop, iterate over the fully-cooked threads array, and go and spawn off a std::thread for each element in the array.
I was almost there with my unique_ptr solution. I just needed to pass the call as a std::ref() as such:
std::vector<std::unique_ptr<thread_data>> threads;
void start()
{
const unsigned int numThreads = 2;
for (int i = 0; i < numThreads; i++) {
auto td = std::make_unique<thread_data>();
td->continueFlag.test_and_set(std::memory_order_relaxed);
td->thread = std::thread(&work, i, std::ref(td->continueFlag));
threads.push_back(std::move(td));
}
}
However, inspired by Sam above I also figured a non-pointer way:
std::vector<thread_data> threads;
void start()
{
const unsigned int numThreads = 2;
//create new vector, resize doesn't work as it tries to assign/copy which atomic_flag
//does not support
threads = std::vector<thread_data>(numThreads);
for (int i = 0; i < numThreads; i++) {
auto& t = threads.at(i);
t.continueFlag.test_and_set(std::memory_order_relaxed);
t.thread = std::thread(&work, i, std::ref(t.continueFlag));
}
}
I've written my own version of a thread-safe queue. However, when I run this program, it hangs/deadlocks.
I'm wondering why it hangs forever.
void concurrentqueue::addtoQueue(const int number)
{
locker currentlock(lock_for_queue);
numberlist.push(number);
pthread_cond_signal(&queue_availability_condition);
}
int concurrentqueue::getFromQueue()
{
int number = 0;
locker currentlock(lock_for_queue);
if ( empty() )
{
pthread_cond_wait(&queue_availability_condition,&lock_for_queue);
}
number = numberlist.front();
numberlist.pop();
return number;
}
bool concurrentqueue::empty()
{
return numberlist.empty();
}
I've written the locker class as an RAII wrapper.
class locker
{
public:
locker(pthread_mutex_t& lockee): target(lockee)
{
pthread_mutex_lock(&target);
}
~locker()
{
pthread_mutex_unlock(&target);
}
private:
pthread_mutex_t target;
};
My writer/reader thread code is very simple. The writer thread adds to the queue and the reader thread reads from the queue.
void * writeintoqueue(void* myqueue)
{
void *t = 0;
concurrentqueue *localqueue = (concurrentqueue *) myqueue;
for ( int i = 0; i < 10 ; ++i)
{
localqueue->addtoQueue(i*10);
}
pthread_exit(t);
}
void * readfromqueue(void* myqueue)
{
void *t = 0;
concurrentqueue *localqueue = (concurrentqueue *) myqueue;
int number = 0;
for ( int i = 0 ; i < 10 ; ++i)
{
number = localqueue->getFromQueue();
std::cout << "The number from the queue is " << number << std::endl;
}
pthread_exit(t);
}
This is definitely not safe:
if ( empty() )
{
pthread_cond_wait(&queue_availability_condition,&lock_for_queue);
}
If another thread that was not previously waiting calls getFromQueue() after addtoQueue() has signalled the condition variable and exited, but before the waiting thread has acquired the lock, then that thread can take the value first. The waiting thread then wakes up expecting the queue to have values in it and finds it empty. pthread_cond_wait may also return spuriously. You must recheck that the queue is not empty after waking up.
Change the if into a while:
while ( empty() )
{
pthread_cond_wait(&queue_availability_condition,&lock_for_queue);
}
Reformulating spong's comment as an answer: your locker class should NOT be copying the pthread_mutex_t by value. You should use a reference or a pointer instead, e.g.:
class locker
{
public:
locker(pthread_mutex_t& lockee): target(lockee)
{
pthread_mutex_lock(&target);
}
~locker()
{
pthread_mutex_unlock(&target);
}
private:
pthread_mutex_t& target; // <-- this is a reference
};
The reason for this is that all pthreads data types should be treated as opaque types -- you don't know what's in them and should not copy them. The library does things like looking at a particular memory address to determine if a lock is held, so if there are two copies of a variable that indicates if the lock is held, odd things could happen, such as multiple threads appearing to succeed in locking the same mutex.
I tested your code, and it also deadlocked for me. I then ran it through Valgrind, and although it did not deadlock in that case (due to different timings, or maybe Valgrind only simulates one thread at a time), Valgrind reported numerous errors. After fixing locker to use a reference instead, it ran without deadlocking and without generating any errors in Valgrind.
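For reference, here is a sketch of the queue with both fixes applied: locker holds a reference, and getFromQueue rechecks the condition in a while loop. The class layout is an assumption, since the question does not show the concurrentqueue declaration; the constructor, destructor and the readTenWhileWriting driver are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>
#include <queue>

class locker
{
public:
    explicit locker(pthread_mutex_t& lockee) : target(lockee)
    {
        pthread_mutex_lock(&target);
    }
    ~locker()
    {
        pthread_mutex_unlock(&target);
    }
private:
    pthread_mutex_t& target; // reference: locks the caller's mutex, not a copy
};

// Assumed layout; the original class declaration is not shown in the question.
class concurrentqueue
{
public:
    concurrentqueue()
    {
        int rc = pthread_mutex_init(&lock_for_queue, nullptr);
        assert(rc == 0);
        rc = pthread_cond_init(&queue_availability_condition, nullptr);
        assert(rc == 0);
        (void)rc;
    }
    ~concurrentqueue()
    {
        pthread_cond_destroy(&queue_availability_condition);
        pthread_mutex_destroy(&lock_for_queue);
    }
    void addtoQueue(const int number)
    {
        locker currentlock(lock_for_queue);
        numberlist.push(number);
        pthread_cond_signal(&queue_availability_condition);
    }
    int getFromQueue()
    {
        locker currentlock(lock_for_queue);
        while (numberlist.empty()) // recheck after every wakeup
        {
            pthread_cond_wait(&queue_availability_condition, &lock_for_queue);
        }
        int number = numberlist.front();
        numberlist.pop();
        return number;
    }
private:
    std::queue<int> numberlist;
    pthread_mutex_t lock_for_queue;
    pthread_cond_t queue_availability_condition;
};

void* writeintoqueue(void* myqueue)
{
    concurrentqueue* localqueue = static_cast<concurrentqueue*>(myqueue);
    for (int i = 0; i < 10; ++i)
    {
        localqueue->addtoQueue(i * 10);
    }
    return nullptr;
}

// Hypothetical driver: reads ten values while a writer thread fills the
// queue and returns their sum, or -1 if the thread could not be started.
int readTenWhileWriting()
{
    concurrentqueue q;
    pthread_t writer;
    if (pthread_create(&writer, nullptr, writeintoqueue, &q) != 0)
        return -1;
    int sum = 0;
    for (int i = 0; i < 10; ++i)
    {
        sum += q.getFromQueue();
    }
    pthread_join(writer, nullptr);
    return sum;
}
```

The writer pushes 0, 10, ..., 90, so the driver should return 450 whichever thread happens to run first.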
See also Debugging with pthreads.