Switching from global static variables to static variables breaks code - c++

I'm working on an assignment for school, one of the requirements of which is that I cannot use global variables, but I do need static variables for shared memory. The premise of the assignment is to use the pthread library and semaphores to ensure that created threads execute in reverse order. I've gotten it to work with global static semaphore/condvar/mutex as such:
#include <pthread.h>
#include <stdio.h>
#include <iostream>
#include <semaphore.h>
using namespace std;
#define NUM 5
static sem_t threadCounter;
static pthread_cond_t nextThreadCond = PTHREAD_COND_INITIALIZER;
static pthread_cond_t makingThreadCond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t makingThreadMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t nextThreadMutex = PTHREAD_MUTEX_INITIALIZER;
void *wait_func(void *args)
{
// cout<<"Waiting"<<endl;
// pthread_cond_wait(&makingThreadCond, &makingThreadMutex);
// cout<<"Woke up"<<endl;
int tid = *((int *)args);
int val;
sem_getvalue(&threadCounter, &val);
// cout << tid << ":" << val << endl;
while (tid != val-1)
{
pthread_cond_wait(&nextThreadCond, &nextThreadMutex);
sem_getvalue(&threadCounter, &val);
// cout<<"evaluating condition in"<<tid<<", val is "<<val<<endl;
}
sem_wait(&threadCounter); // decrement threadCounter
// cout << "after decrement" << endl;
sem_getvalue(&threadCounter, &val);
// cout << "decremented val "<<val << endl;
cout<<"Exiting thread #"<<tid<<endl;
pthread_mutex_unlock(&nextThreadMutex);
// cout<<"after nextThreadMutex unlock"<<endl;
pthread_cond_broadcast(&nextThreadCond);
// cout<<"after nextThreadCond broadcast"<<endl;
}
int main()
{
pthread_t tid[NUM];
if (sem_init(&threadCounter, 0, NUM) < 0)
{
cout << "Failed to init sem" << endl;
}
for (int i = 0; i < NUM; i++)
{
int *argId = (int *)malloc(sizeof(*argId));
*argId = i;
if (pthread_create(&tid[i], NULL, wait_func, argId))
{
cout << "Couldn't make thread " << i << endl;
}
}
for (int i = 0; i < NUM; i++)
{
pthread_join(tid[i], NULL);
}
}
But this isn't allowed, as I said, so I tried to convert it so that the objects are shared through a struct passed in via the pthread_create arguments, as such:
#include <pthread.h>
#include <stdio.h>
#include <iostream>
#include <semaphore.h>
using namespace std;
#define NUM 5
struct args
{
int tid;
sem_t* sem;
pthread_cond_t* cond;
pthread_mutex_t* mut;
};
void *wait_func(void *args_ptr)
{
// cout<<"Waiting"<<endl;
// pthread_cond_wait(&makingThreadCond, &makingThreadMutex);
// cout<<"Woke up"<<endl;
struct args* args = (struct args*) args_ptr;
int tid = (args->tid);
pthread_cond_t cond = *(args->cond);
pthread_mutex_t mut = *(args->mut);
sem_t sem = *(args->sem);
int val;
sem_getvalue(&sem, &val);
// cout << tid << ":" << val << endl;
while (tid != val - 1)
{
pthread_cond_wait(&cond, &mut);
sem_getvalue(&sem, &val);
// cout<<"evaluating condition in"<<tid<<", val is "<<val<<endl;
}
sem_wait(&sem); // decrement threadCounter
// cout << "after decrement" << endl;
sem_getvalue(&sem, &val);
// cout << "decremented val "<<val << endl;
cout << "Exiting thread #" << tid << endl;
pthread_mutex_unlock(&mut);
// cout<<"after nextThreadMutex unlock"<<endl;
pthread_cond_broadcast(&cond);
// cout<<"after nextThreadCond broadcast"<<endl;
}
int main()
{
static sem_t threadCounter;
static pthread_cond_t nextThreadCond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t nextThreadMutex = PTHREAD_MUTEX_INITIALIZER;
pthread_t tid[NUM];
if (sem_init(&threadCounter, 0, NUM) < 0)
{
cout << "Failed to init sem" << endl;
}
for (int i = 0; i < NUM; i++)
{
int *argId = (int *)malloc(sizeof(*argId));
*argId = i;
struct args args;
args.tid = *argId;
args.sem = &threadCounter;
args.cond = &nextThreadCond;
args.mut = &nextThreadMutex;
if (pthread_create(&tid[i], NULL, wait_func, &args))
{
cout << "Couldn't make thread " << i << endl;
}
}
// cout << "Before posting sem" << endl;
// sem_post(&makingThreads);
// cout << "Sem posetd" << endl;
// cout<<"Broadcasting"<<endl;
// pthread_cond_broadcast(&makingThreadCond);
for (int i = 0; i < NUM; i++)
{
pthread_join(tid[i], NULL);
}
}
This gets stuck immediately with "Exiting thread #4" twice. I would think that the second code is equivalent to the first, just without global variables but there must be something I'm missing.

struct args args;
This declares an object inside the scope of your for loop. When execution reaches the end of the loop body, this object gets destroyed, like any other object that's declared locally within a function or within some inner scope, and this happens before the loop either starts again from the beginning or stops iterating altogether. Either way, as soon as execution reaches the next } this object goes away. It is gone for good. It gets destroyed. It is no more. It joins the choir-invisible. It becomes an ex-object.
But before that happens, before the end of this loop, the following occurs:
if (pthread_create(&tid[i], NULL, wait_func, &args))
So you start a new execution thread, and pass it a pointer to this object, which is about to meet its maker.
And as soon as pthread_create() returns, that's the end of the loop and your args object is gone, and the abovementioned happens: it gets destroyed; it is no more; it joins the choir-invisible; and it becomes an ex-object.
And the C and the C++ standards give you absolutely no guarantees whatsoever, that your new execution thread actually starts running, and reaches the point where it reads this pointer, and what it's pointing to, before the end of this loop gets reached.
And, more likely than not, each new execution thread doesn't get around to reading the pointer to the args object, in the main execution thread, until long after it gets destroyed. So it grabs stuff from a pointer to a destroyed object. Goodbye.
As such, this execution thread's actions become undefined behavior.
This explains the random, unpredictable behavior that you've observed.
The usual approach is to malloc or new everything that gets passed to your new execution thread, and pass to the execution thread a pointer to the newed or malloced object.
It is also possible to carefully write some code that will make the main execution thread stop and wait until the new execution thread retrieves whatever it needs to do, and then proceeds on its own. A bunch more code will be needed to implement that approach, if you so choose.
Your code also has evidence of your initial attempts to take this approach:
int *argId = (int *)malloc(sizeof(*argId));
*argId = i;
struct args args;
args.tid = *argId;
mallocing this pointer, assigning to it, then copying it to args.tid accomplishes absolutely nothing useful. The same thing can be done simply by:
struct args args;
args.tid = i;
The only thing that malloc does is leak memory. Furthermore, this whole args object, declared as a local variable in the for loop's inner scope, is doomed for the reasons explained above.
P.S. When taking the "malloc the entire args object" approach, this also will leak memory unless you also take measures to diligently free the malloced object, when it is appropriate to do so.
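The alternative mentioned above, making the main thread wait until the new thread has taken what it needs, could look roughly like the following sketch. It assumes a POSIX unnamed semaphore as the handshake; StartArgs, worker, and count_correct_ids are hypothetical names that do not appear in the original code.

```cpp
#include <pthread.h>
#include <semaphore.h>
#include <cassert>

// Hypothetical argument bundle; it lives on the creating thread's stack.
struct StartArgs {
    int tid;        // id assigned by the creating thread
    int* seen;      // where the worker records the id it received
    sem_t* copied;  // posted by the worker once it has its own copy
};

static void* worker(void* p) {
    // Copy everything first; the pointed-to object stays valid until sem_post.
    StartArgs local = *static_cast<StartArgs*>(p);
    sem_post(local.copied);   // the creating thread may now destroy its object
    *local.seen = local.tid;  // from here on only the private copy is used
    return nullptr;
}

// Starts kNum threads and returns how many of them saw the id meant for them.
int count_correct_ids() {
    const int kNum = 5;
    pthread_t tids[kNum];
    int seen[kNum];
    sem_t copied;
    int rc = sem_init(&copied, 0, 0);
    assert(rc == 0);
    for (int i = 0; i < kNum; ++i) {
        seen[i] = -1;
        StartArgs args{i, &seen[i], &copied};  // local to this iteration
        rc = pthread_create(&tids[i], nullptr, worker, &args);
        assert(rc == 0);
        sem_wait(&copied);  // do not leave this scope before the thread has copied args
    }
    int correct = 0;
    for (int i = 0; i < kNum; ++i) {
        pthread_join(tids[i], nullptr);
        if (seen[i] == i) ++correct;
    }
    sem_destroy(&copied);
    return correct;
}
```

The cost of this design is that thread creation becomes serialized: each pthread_create is followed by a wait, which is usually acceptable for a handful of threads.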

You are passing a pointer to the local variable args to pthread_create. The variable's lifetime ends when the for loop iteration ends and the pointer becomes dangling.
The thread may be accessing it later though, causing undefined behavior.
You need to allocate args dynamically (but not argId) and pass that to the thread. The thread function must then ensure that the pointer gets deleted. Also, don't give your variables the same name as a type; that is very confusing. The struct keyword in a variable declaration is generally not needed in C++ (as long as variables and types have different names) and may cause other issues when used without reason, so don't use it and name things differently.
struct Args
{
int tid;
sem_t* sem;
pthread_cond_t* cond;
pthread_mutex_t* mut;
};
//...
auto args = new Args{i, &threadCounter, &nextThreadCond, &nextThreadMutex};
if (pthread_create(&tid[i], NULL, wait_func, args))
{
cout << "Couldn't make thread " << i << endl;
}
and at the end of the thread function delete the pointer:
void *wait_func(void *args_ptr)
{
auto args = static_cast<Args*>(args_ptr);
//...
delete args;
}
static_cast is safer than the C style cast, since it is much more restricted in the types it can cast between and e.g. can't accidentally drop a const or anything similar.
None of the variables seem to have a reason to be static either in the global or local case.

pthread_cond_t cond = *(args->cond);
pthread_mutex_t mut = *(args->mut);
This tries to create a new condition variable and mutex and initialize it based on the value of the condition variable and mutex pointed to. That doesn't make sense and won't work.
while (tid != val - 1)
{
pthread_cond_wait(&cond, &mut);
sem_getvalue(&sem, &val);
// cout<<"evaluating condition in"<<tid<<", val is "<<val<<endl;
Here, you pass to pthread_cond_wait a pointer to the local condition variable and mutex you created above rather than a pointer to the shared one. Look at this code:
int a;
foo(&a);
void foo(int* a)
{
int b = *a;
bar (&b); // If bar changes what this pointer points to (b), that will not affect a!
}
See the problem? You passed bar a pointer to b, not a. So if bar changes the thing the pointer points to, it won't be modifying a, only the local copy b.
Don't try to create mutexes or condition variables that are copies of other mutexes or condition variables. It doesn't make semantic sense and it won't work.
Instead, you can do this:
pthread_cond_t* cond = (args->cond);
pthread_mutex_t* mut = (args->mut);
Now you can pass cond and mut to pthread_cond_wait, and you'll be passing pointers to the shared synchronization objects.
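Putting these points together, a minimal sketch of the reverse-order program might look like the one below. It passes pointers to the shared objects rather than copies, allocates each thread's arguments on the heap, and holds the mutex around pthread_cond_wait. Note that it swaps the semaphore-as-counter for a plain int guarded by the mutex, which is an assumption of this sketch and may not satisfy the assignment's requirements; Shared, ThreadArgs, and run_reverse are hypothetical names.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical bundle of the state shared by all threads.
struct Shared {
    pthread_mutex_t mut;
    pthread_cond_t cond;
    int next;                // id of the thread allowed to finish next
    std::vector<int> order;  // ids in the order the threads finished
};

struct ThreadArgs {
    int tid;
    Shared* shared;  // pointer to the shared state, never a copy of it
};

static void* wait_func(void* p) {
    ThreadArgs* args = static_cast<ThreadArgs*>(p);
    Shared* s = args->shared;
    pthread_mutex_lock(&s->mut);  // the mutex must be held when waiting
    while (s->next != args->tid) {
        pthread_cond_wait(&s->cond, &s->mut);
    }
    s->order.push_back(args->tid);
    --s->next;                         // hand the turn to the next lower id
    pthread_cond_broadcast(&s->cond);  // wake the others so they re-check
    pthread_mutex_unlock(&s->mut);
    delete args;  // the thread owns its heap-allocated arguments
    return nullptr;
}

// Runs n threads and returns the order in which they finished.
std::vector<int> run_reverse(int n) {
    Shared shared;
    pthread_mutex_init(&shared.mut, nullptr);
    pthread_cond_init(&shared.cond, nullptr);
    shared.next = n - 1;
    std::vector<pthread_t> tids(n);
    for (int i = 0; i < n; ++i) {
        ThreadArgs* args = new ThreadArgs{i, &shared};
        int rc = pthread_create(&tids[i], nullptr, wait_func, args);
        assert(rc == 0);
    }
    for (int i = 0; i < n; ++i) {
        pthread_join(tids[i], nullptr);
    }
    pthread_cond_destroy(&shared.cond);
    pthread_mutex_destroy(&shared.mut);
    return shared.order;
}
```

Calling run_reverse(5) should yield the ids 4, 3, 2, 1, 0 regardless of the order in which the threads happen to start.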

Related

C++ pthread deadlock solution

I am trying to use pthreads to print out numbers in order like:
0
1
2
3
4
I want to do this without using a global variable, only local variables, but I ran into a deadlock and I don't know how to fix it yet.
This is my code:
#include <pthread.h>
#include <iostream>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>
static pthread_mutex_t bsem; // Mutex semaphore
static pthread_cond_t waitTurn = PTHREAD_COND_INITIALIZER; // Condition variable to control the turn
//static int turn; // Index to control access to the turn array
static int nthreads; // Number of threads from input
struct SFE{
int turn;
int thread;
};
void *thread_function(void *void_ptr_argv)
{
SFE *threadNum = (SFE *) void_ptr_argv;
pthread_mutex_lock(&bsem);
// if its not our turn then wait
while(threadNum->turn != threadNum->thread){
pthread_cond_wait(&waitTurn, &bsem);
}
pthread_mutex_unlock(&bsem);
std::cout << "I am Thread " << threadNum->turn << std::endl;
pthread_mutex_lock(&bsem);
threadNum->turn++;
pthread_cond_broadcast(&waitTurn);
pthread_mutex_unlock(&bsem);
return nullptr;
}
int main()
{
std::cin >> nthreads;
pthread_mutex_init(&bsem, NULL); // Initialize bsem to 1
pthread_t *tid= new pthread_t[nthreads];
SFE threadNumber;
threadNumber.turn = 0;
for(int i=0;i<nthreads;i++)
{
// initialize the thread number here (remember to follow the rules from the specifications of the assignment)
threadNumber.thread = i;
pthread_create(&tid[i], nullptr, thread_function, (void*)&threadNumber);
}
for(int i = 0; i < nthreads; i++)
{
pthread_join(tid[i], nullptr);
}
return 0;
}
I am expecting a simple way to solve my problem
Think about threadNum->thread. All threads receive a pointer to the same single threadNumber object, so each thread sees an unspecified value of threadNum->thread between 0 and nthreads - 1, depending on when it happens to read it. Only the last thread is guaranteed to see the correct number, nthreads - 1.
You should update
struct SFE{
int *turn;
int thread;
};
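Allocate an array of nthreads SFE objects, named threadNumber in the code below, and pass &threadNumber[i] to the i-th thread. The thread function then has to read and update the turn through the pointer, as *threadNum->turn.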
int turn = 0;
for(int i=0;i<nthreads;i++)
{
// initialize the thread number here (remember to follow the rules from the specifications of the assignment)
threadNumber[i].turn = &turn;
threadNumber[i].thread = i;
pthread_create(&tid[i], nullptr, thread_function, &threadNumber[i]);
}
Your code is mostly C-style. Without iostream and new, it would be plain C. If you use the C++ facilities, all of this will be much easier.
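As an illustration of that last remark, here is a hedged sketch of the same "take turns in order" idea using std::thread, std::mutex, and std::condition_variable. It is not the assignment's required pthread solution; run_in_order is a hypothetical helper that records the order instead of printing it, so the result can be checked.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Starts nthreads workers and returns the ids in the order they took their turn.
std::vector<int> run_in_order(int nthreads) {
    assert(nthreads >= 0);
    std::mutex m;
    std::condition_variable turnChanged;
    int turn = 0;            // id of the thread whose turn it is
    std::vector<int> order;  // protected by m
    std::vector<std::thread> workers;
    for (int id = 0; id < nthreads; ++id) {
        // id is captured by value, so every thread keeps its own number.
        workers.emplace_back([&, id] {
            std::unique_lock<std::mutex> lk(m);
            turnChanged.wait(lk, [&] { return turn == id; });
            order.push_back(id);
            ++turn;
            turnChanged.notify_all();  // let the next thread re-check the turn
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return order;
}
```

Because each lambda captures its id by value, there is no shared "thread number" object that the creating loop can overwrite.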

"Segmentation fault (core dumped)" while using pthread_create

So I've got a problem: when I try to create the last thread, it always says that the core is dumped. It doesn't matter whether I create 5 or 2 threads. Here is my code:
UPD: Now I can't create more than 3 threads, and the threads don't run the functions that I want them to (consume and produce)
UPD_2: Now I've got a message like this: terminate called after throwing an instance of 'terminate called recursively
terminate called recursively
Aborted (core dumped)
#include<cstdlib>
#include <iostream>
#include <string>
#include <mutex>
#include <pthread.h>
#include <condition_variable>
#define NUM_THREADS 4
using namespace std;
struct thread_data
{
int thread_id;
int repeat;
};
class our_monitor{
private:
int buffer[100];
mutex m;
int n = 0, lo = 0, hi = 0;
condition_variable in,out;
unique_lock<mutex> lk;
public:
our_monitor():lk(m)
{
}
void insert(int val, int repeat)
{
in.wait(lk, [&]{return n <= 100-repeat;});
for(int i=0; i<repeat; i++)
{
buffer[hi] = val;
hi = (hi + 1) % 100; //ring buffer
n = n +1; //one more item in buffer
}
lk.unlock();
out.notify_one();
}
int remove(int repeat)
{
out.wait(lk, [&]{return n >= repeat;});
int val;
for(int i=0; i<repeat; i++)
{
val = buffer[lo];
lo = (lo + 1) % 100;
n -= 1;
}
lk.unlock();
in.notify_one();
return val;
}
};
our_monitor mon;
void* produce(void *threadarg)
{
struct thread_data *my_data;
my_data = (struct thread_data *) threadarg;
cout<<"IN produce after paramiters"<< my_data->repeat<<endl;
int item;
item = rand()%100 + 1;
mon.insert(item, my_data->repeat);
cout<< "Item: "<< item << " Was prodused by thread:"<< my_data->thread_id << endl;
}
void* consume(void *threadarg)
{
struct thread_data *my_data;
my_data = (struct thread_data *) threadarg;
cout<<"IN consume after paramiters"<< my_data->repeat<<endl;
int item;
item = mon.remove(my_data->repeat);
if(item) cout<< "Item: "<< item << " Was consumed by thread:"<< my_data->thread_id << endl;
}
int main()
{
our_monitor *mon = new our_monitor();
pthread_t threads[NUM_THREADS];
thread_data td[NUM_THREADS];
int rc;
int i;
for( i = 0; i < NUM_THREADS; i++ )
{
td[i].thread_id = i;
td[i].repeat = rand()%5 + 1;
if(i % 2 == 0)
{
cout << "main() : creating produce thread, " << i << endl;
rc = pthread_create(&threads[i], NULL, produce, (void*) &td[i]);
if (rc)
{
cout << "Error:unable to create thread," << rc << endl;
exit(-1);
}
} else
{
cout << "main() : creating consume thread, " << i << endl;
rc = pthread_create(&threads[i], NULL, consume, (void *)&td[i]);
if (rc)
{
cout << "Error:unable to create thread," << rc << endl;
exit(-1);
}
}
}
pthread_join(threads[0], NULL);
pthread_join(threads[1], NULL);
pthread_join(threads[2], NULL);
//pthread_exit(NULL);
}
From cppref regarding std::condition_variable.wait(...)
"Calling this function if lock.mutex() is not locked by the current
thread is undefined behavior."
http://en.cppreference.com/w/cpp/thread/condition_variable/wait
Unfortunately, the program doesn't crash on line 47, but on line 55, where you unlock the lock that wasn't locked.
Lock the lock when you enter your functions. I've done a quick check of the rest of your logic, and I'm like 85% sure it's otherwise ok.
While I have you here, this is not strictly necessary, but it's good practice. std::lock_guard and std::unique_lock automatically lock the mutex when they are constructed and unlock it when they leave scope. This helps simplify exception handling and weird function returns. I recommend you get rid of lk as a member variable and use it as a scoped local variable instead.
void insert(int val, int repeat)
{
{ // Scoped. Somewhat pedantic in this case, but it's always best to signal after the mutex is unlocked
std::unique_lock<std::mutex> lk(m);
in.wait(lk, [&]{return n <= 100-repeat;});
for(int i=0; i<repeat; i++)
{
buffer[hi] = val;
hi = (hi + 1) % 100; //ring buffer
n = n +1; //one more item in buffer
}
}
out.notify_one();
}
Ok, now for the final issue. The cool thing about producer/consumer is that we could produce and consume at the same time. However, we just locked our functions so this is no longer possible. What you can do now is move your condition lock/wait/unlock/work/signal inside the for loop
in pseudocode:
// produce:
while (true)
{
{
unique_lock lk(m)
wait(lk, predicate)
}
produce 1
signal
}
This is equivalent to using semaphores (which the C++11 standard library doesn't have, but you can easily make your own from a mutex and a condition variable, as shown above).
// produce:
semaphore in(100);
semaphore out(0);
while (true)
{
in.down(1) // Subtracts 1 from in.count. Blocks when in.count == 0 (meaning the buffer is full)
produce 1
out.up(1) // Adds 1 to out.count
}
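A homemade counting semaphore of the kind alluded to above could be sketched as follows. This is only one possible shape, built from std::mutex and std::condition_variable; the Semaphore class and the transfer_sum demo (one producer, one consumer, a four-slot ring buffer) are hypothetical and not taken from the question's code.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical counting semaphore built on a mutex and a condition variable.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}
    void down() {  // blocks while the count is zero, then decrements it
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return count_ > 0; });
        --count_;
    }
    void up() {  // increments the count and wakes one waiter
        {
            std::lock_guard<std::mutex> lk(m_);
            ++count_;
        }
        cv_.notify_one();  // signal after the mutex is released
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// One producer sends 1..items through a small ring buffer to one consumer;
// returns the sum of everything the consumer received.
int transfer_sum(int items) {
    const int kCapacity = 4;
    int buffer[kCapacity];
    int lo = 0, hi = 0;  // lo is used only by the consumer, hi only by the producer
    Semaphore freeSlots(kCapacity), usedSlots(0);
    int sum = 0;
    std::thread producer([&] {
        for (int i = 1; i <= items; ++i) {
            freeSlots.down();  // wait for room in the buffer
            buffer[hi] = i;
            hi = (hi + 1) % kCapacity;
            usedSlots.up();    // announce one more item
        }
    });
    std::thread consumer([&] {
        for (int i = 0; i < items; ++i) {
            usedSlots.down();  // wait for an item
            sum += buffer[lo];
            lo = (lo + 1) % kCapacity;
            freeSlots.up();    // announce one more free slot
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

With several producers or several consumers, the indices hi and lo would need their own protection; the sketch sidesteps that by using exactly one of each.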
When main ends, td goes out of scope and ceases to exist. But you passed pointers into it to threads. You need to make sure td continues to exist as long as any threads might be using it.

std::atomic_flag to stop multiple threads

I'm trying to stop multiple worker threads using a std::atomic_flag. Starting from Issue using std::atomic_flag with worker thread the following works:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>
std::atomic_flag continueFlag;
std::thread t;
void work()
{
while (continueFlag.test_and_set(std::memory_order_relaxed)) {
std::cout << "work ";
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
void start()
{
continueFlag.test_and_set(std::memory_order_relaxed);
t = std::thread(&work);
}
void stop()
{
continueFlag.clear(std::memory_order_relaxed);
t.join();
}
int main()
{
std::cout << "Start" << std::endl;
start();
std::this_thread::sleep_for(std::chrono::milliseconds(200));
std::cout << "Stop" << std::endl;
stop();
std::cout << "Stopped." << std::endl;
return 0;
}
Trying to rewrite into multiple worker threads:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>
#include <memory>
struct thread_data {
std::atomic_flag continueFlag;
std::thread thread;
};
std::vector<thread_data> threads;
void work(int threadNum, std::atomic_flag &continueFlag)
{
while (continueFlag.test_and_set(std::memory_order_relaxed)) {
std::cout << "work" << threadNum << " ";
std::this_thread::sleep_for(std::chrono::milliseconds(10));
}
}
void start()
{
const unsigned int numThreads = 2;
for (int i = 0; i < numThreads; i++) {
////////////////////////////////////////////////////////////////////
//PROBLEM SECTOR
////////////////////////////////////////////////////////////////////
thread_data td;
td.continueFlag.test_and_set(std::memory_order_relaxed);
td.thread = std::thread(&work, i, td.continueFlag);
threads.push_back(std::move(td));
////////////////////////////////////////////////////////////////////
//PROBLEM SECTOR
////////////////////////////////////////////////////////////////////
}
}
void stop()
{
//Flag stop
for (auto &data : threads) {
data.continueFlag.clear(std::memory_order_relaxed);
}
//Join
for (auto &data : threads) {
data.thread.join();
}
threads.clear();
}
int main()
{
std::cout << "Start" << std::endl;
start();
std::this_thread::sleep_for(std::chrono::milliseconds(200));
std::cout << "Stop" << std::endl;
stop();
std::cout << "Stopped." << std::endl;
return 0;
}
My issue is "Problem Sector" in above. Namely creating the threads. I cannot wrap my head around how to instantiate the threads and passing the variables to the work thread.
The error right now is referencing this line threads.push_back(std::move(td)); with error Error C2280 'thread_data::thread_data(const thread_data &)': attempting to reference a deleted function.
Trying to use unique_ptr like this:
auto td = std::make_unique<thread_data>();
td->continueFlag.test_and_set(std::memory_order_relaxed);
td->thread = std::thread(&work, i, td->continueFlag);
threads.push_back(std::move(td));
Gives error std::atomic_flag::atomic_flag(const std::atomic_flag &)': attempting to reference a deleted function at line td->thread = std::thread(&work, i, td->continueFlag);. Am I fundamentally misunderstanding the use of std::atomic_flag? Is it really both immovable and uncopyable?
Your first approach was actually closer to the truth. The problem is that it passed a reference to an object within the local for loop scope to each thread, as a parameter. But, of course, once the loop iteration ended, that object went out of scope and got destroyed, leaving each thread with a reference to a destroyed object, resulting in undefined behavior.
Nobody cared about the fact that you moved the object into the std::vector, after creating the thread. The thread received a reference to a locally-scoped object, and that's all it knew. End of story.
Moving the object into the vector first, and then passing to each thread a reference to the object in the std::vector will not work either. As soon as the vector internally reallocates, as part of its natural growth, you'll be in the same pickle.
What needs to happen is to have the entire threads array created first, before actually starting any std::threads. If the RAII principle is religiously followed, that means nothing more than a simple call to std::vector::resize().
Then, in a second loop, iterate over the fully-cooked threads array, and go and spawn off a std::thread for each element in the array.
I was almost there with my unique_ptr solution. I just needed to pass the call as a std::ref() as such:
std::vector<std::unique_ptr<thread_data>> threads;
void start()
{
const unsigned int numThreads = 2;
for (int i = 0; i < numThreads; i++) {
auto td = std::make_unique<thread_data>();
td->continueFlag.test_and_set(std::memory_order_relaxed);
td->thread = std::thread(&work, i, std::ref(td->continueFlag));
threads.push_back(std::move(td));
}
}
However, inspired by Sam above I also figured a non-pointer way:
std::vector<thread_data> threads;
void start()
{
const unsigned int numThreads = 2;
//create new vector, resize doesn't work as it tries to assign/copy which atomic_flag
//does not support
threads = std::vector<thread_data>(numThreads);
for (int i = 0; i < numThreads; i++) {
auto& t = threads.at(i);
t.continueFlag.test_and_set(std::memory_order_relaxed);
t.thread = std::thread(&work, i, std::ref(t.continueFlag));
}
}
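The reason std::ref is needed is that std::thread copies (or moves) its arguments into storage owned by the new thread, so a function taking a non-const reference cannot bind to them directly. A minimal sketch, with the hypothetical names increment and run_with_ref:

```cpp
#include <cassert>
#include <functional>
#include <thread>

void increment(int& value) {
    ++value;
}

// Returns the counter after a worker thread has incremented it once.
int run_with_ref() {
    int counter = 0;
    // std::thread t(increment, counter);  // would not compile: the thread's
    //                                     // private copy cannot bind to int&
    std::thread t(increment, std::ref(counter));  // pass a reference wrapper instead
    t.join();
    return counter;
}
```

The same rule applies to std::atomic_flag, with the extra twist that it cannot be copied at all, which is why the attempt without std::ref fails with a deleted copy constructor error.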

Accessing random number engine from multiple threads

This is my first question, so please forgive me any violations against your policy. I want to have one global random number engine per thread, to which purpose I've devised the following scheme: Each thread I start gets a unique index from an atomic global int. There is a static vector of random engines, whose i-th member is thought to be used by the thread with the index i. If the index is greater than the vector size, elements are added to it in a synchronized manner. To prevent performance penalties, I check twice if the index is greater than the vector size: once in an unsynced manner, and once more after locking the mutex. So far so good, but the following example fails with all sorts of errors (heap corruption, malloc-errors, etc.).
#include<vector>
#include<thread>
#include<mutex>
#include<atomic>
#include<random>
#include<iostream>
using std::cout;
std::atomic_uint INDEX_GEN{};
std::vector<std::mt19937> RNDS{};
float f = 0.0f;
std::mutex m{};
class TestAThread {
public:
TestAThread() :thread(nullptr){
cout << "Calling constructor TestAThread\n";
thread = new std::thread(&TestAThread::run, this);
}
TestAThread(TestAThread&& source) : thread(source.thread){
source.thread = nullptr;
cout << "Calling move constructor TestAThread. My ptr is " << thread << ". Source ptr is" << source.thread << "\n";
}
TestAThread(const TestAThread& source) = delete;
~TestAThread() {
cout << "Calling destructor TestAThread. Pointer is " << thread << "\n";
if (thread != nullptr){
cout << "Deleting thread pointer\n";
thread->join();
delete thread;
thread = nullptr;
}
}
void run(){
int index = INDEX_GEN.fetch_add(1);
std::uniform_real_distribution<float> uniformRnd{ 0.0f, 1.0f };
while (true){
if (index >= RNDS.size()){
m.lock();
// add randoms in a synchronized manner.
while (index >= RNDS.size()){
cout << "index is " << index << ", size is " << RNDS.size() << std::endl;
RNDS.emplace_back();
}
m.unlock();
}
f += uniformRnd(RNDS[index]);
}
}
std::thread* thread;
};
int main(int argc, char* argv[]){
std::vector<TestAThread> threads;
for (int i = 0; i < 10; ++i){
threads.emplace_back();
}
cout << f;
}
What am I doing wrong?!
Obviously f += ... would be a race-condition regardless of the right-hand side, but I suppose you already knew that.
The main problem that I see is your use of the global std::vector<std::mt19937> RNDS. Your mutex-protected critical section only encompasses adding new elements; not accessing existing elements:
... uniformRnd(RNDS[index]);
That's not thread-safe because resizing RNDS in another thread could cause RNDS[index] to be moved into a new memory location. In fact, this could happen after the reference RNDS[index] is computed but before uniformRnd gets around to using it, in which case what uniformRnd thinks is a Generator& will be a dangling pointer, possibly to a newly-created object. In any event, uniformRnd's operator() makes no guarantee about data races [Note 1], and neither does RNDS's operator[].
You could get around this problem by:
computing a reference (or pointer) to the generator within the protected section (which cannot be contingent on whether the container's size is sufficient), and
using a std::deque instead of a std::vector, which does not invalidate references when it is resized (unless the referenced object has been removed from the container by the resizing).
Something like this (focusing on the race condition; there are other things I'd probably do differently):
std::mt19937& get_generator(int index) {
std::lock_guard<std::mutex> l(m);
if (index >= RNDS.size()) RNDS.resize(index + 1);
return RNDS[index];
}
void run(){
int index = INDEX_GEN.fetch_add(1);
auto& gen = get_generator(index);
std::uniform_real_distribution<float> uniformRnd{ 0.0f, 1.0f };
while (true) {
/* Do something with uniformRnd(gen); */
}
}
[1] The prototype for operator() of uniformRnd is template< class Generator > result_type operator()( Generator& g );. In other words, the argument must be a mutable reference, which means that it is not implicitly thread-safe; only const& arguments to standard library functions are free of data races.
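A fuller sketch of the two suggestions above (look up the engine once under the lock, and store the engines in a std::deque) might look like this. The names engines, enginesMutex, nextIndex, and count_valid_samples are hypothetical, and seeding each engine from its index is an assumption made only to keep the example deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

std::deque<std::mt19937> engines;  // growing at the end keeps existing references valid
std::mutex enginesMutex;
std::atomic_uint nextIndex{0};

// Returns a reference to the engine reserved for the given index, creating it if needed.
std::mt19937& get_generator(std::size_t index) {
    std::lock_guard<std::mutex> lock(enginesMutex);
    while (engines.size() <= index) {
        // Assumption: seed each engine from its position.
        engines.emplace_back(static_cast<std::mt19937::result_type>(engines.size() + 1));
    }
    return engines[index];
}

// Each thread draws samplesPerThread values from its own engine; returns how
// many values fell inside [0, 1] in total.
int count_valid_samples(int numThreads, int samplesPerThread) {
    std::vector<int> valid(numThreads, 0);  // one slot per thread, no sharing
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&valid, t, samplesPerThread] {
            std::size_t index = nextIndex.fetch_add(1);
            std::mt19937& gen = get_generator(index);  // looked up once, under the lock
            std::uniform_real_distribution<float> dist(0.0f, 1.0f);
            for (int i = 0; i < samplesPerThread; ++i) {
                float x = dist(gen);
                if (x >= 0.0f && x <= 1.0f) ++valid[t];
            }
        });
    }
    int total = 0;
    for (int t = 0; t < numThreads; ++t) {
        workers[t].join();
        total += valid[t];
    }
    return total;
}
```

After the lookup, each thread touches only its own engine and its own result slot, so the sampling loop itself needs no locking.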

homemade scoped lock doesn't lock

I created the following code for a project where I don't have access to any modern C++ threading libraries like boost. My desire is to have the ability to have the lock automatically release when it leaves scope.
The Shared lock works fine. If a thread acquires it, nothing else can acquire it until the first thread releases it. The Scoped one does not work though.
Here's some output showing what I mean. I gave each thread a distinct name, had them instantiate the Scoped lock with the same Shared lock, print 'acquired', sleep for five seconds, print 'released', then leave scope. Instead of getting the acquire/release pairs I'd expect, I get four 'acquired's in quick succession, a five second gap, then the 'released's. I even changed the lock in Scoped to a pointer, and printed the address before acquiring it, just to make sure I wasn't crazy. It looks like it's the same Shared object, but the lock isn't preventing multiple accesses.
Lock '140734928395200'.
acquired: !!!!!
Lock '140734928395200'.
acquired: -------
Lock '140734928395200'.
acquired: ***************
Lock '140734928395200'.
acquired: ##
released: !!!!!
released: -------
released: ***************
released: ##
Here's the source code for Lock.h:
#include <pthread.h>
namespace Lock
{
class Shared
{
public:
Shared()
{
::pthread_mutex_init(&(this->mutex), nullptr);
}
~Shared()
{
}
void acquire()
{
::pthread_mutex_lock(&(this->mutex));
}
void release()
{
::pthread_mutex_unlock(&(this->mutex));
}
private:
pthread_mutex_t mutex;
};
class Scoped
{
public:
Scoped(Lock::Shared& lock) : lock(lock)
{
this->lock.acquire();
}
virtual ~Scoped()
{
this->lock.release();
}
private:
Lock::Shared& lock;
};
};
Here's my main.cc file for testing. I'm building with:
g++ -std=c++11 -o try -pthread main.cc && ./try
with g++4.7 on an up to date Ubuntu system.
#include <pthread.h>
#include <iostream>
#include "Lock.h"
#include <unistd.h>
struct data
{
data(std::string name, Lock::Shared& lock) : name(name), lock(lock) { ; }
std::string name;
Lock::Shared& lock;
};
void* doStuff(void* v)
{
data* d = (data*)v;
for (int i = 0; i < 5; i++)
{
Lock::Scoped(d->lock);
//d->lock->acquire();
std::cout << "acquired: " << d->name << std::endl;
::sleep(5);
std::cout << "released: " << d->name << std::endl;
//d->lock->release();
::sleep(1);
}
}
int main(int argc, char* argv[])
{
pthread_t fred;
pthread_t barney;
pthread_t wilma;
pthread_t betty;
Lock::Shared lock;
data f("##", lock);
data b("***************", lock);
data w("-------", lock);
data e("!!!!!", lock);
::pthread_create(&fred, nullptr, doStuff, (void*)&f);
::pthread_create(&barney, nullptr, doStuff, (void*)&b);
::pthread_create(&wilma, nullptr, doStuff, (void*)&w);
::pthread_create(&betty, nullptr, doStuff, (void*)&e);
::pthread_join(fred, nullptr);
::pthread_join(barney, nullptr);
::pthread_join(wilma, nullptr);
::pthread_join(betty, nullptr);
return 0;
}
The problem is:
for (int i = 0; i < 5; i++)
{
Lock::Scoped(d->lock);
which creates a temporary Lock::Scoped that is constructed and destructed immediately, so it does not have the intended synchronization effect. Change it to:
for (int i = 0; i < 5; i++)
{
Lock::Scoped lk(d->lock);
The problem is here:
Lock::Scoped(d->lock);
This creates an unnamed temporary that goes out of scope right away.
To fix, give it a name:
Lock::Scoped lck(d->lock);
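The difference between the unnamed temporary and the named object can be observed without any threads. In the following sketch, Guard is a hypothetical stand-in for Lock::Scoped that flips a flag instead of locking a mutex:

```cpp
#include <cassert>

struct State {
    bool held = false;
};

// Hypothetical stand-in for Lock::Scoped: sets the flag while it is alive.
class Guard {
public:
    explicit Guard(bool& held) : held_(held) { held_ = true; }
    ~Guard() { held_ = false; }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
private:
    bool& held_;
};

bool held_after_temporary() {
    State s;
    Guard(s.held);  // unnamed temporary: destroyed at the end of this statement
    return s.held;  // the "lock" has already been released here
}

bool held_after_named() {
    State s;
    Guard g(s.held);  // named object: lives until the end of the function
    return s.held;    // still held while the return value is computed
}
```

Here held_after_temporary() returns false and held_after_named() returns true, mirroring why the original loop body ran without the lock.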