Thread synchronization between data pointed by vectors of std::shared_ptr - c++

I'm pretty new to concurrent programming and I have a specific issue to which I could not find a solution by browsing the internet.
Basically I have this situation (schematic pseudocode):
void fun1(std::vector<std::shared_ptr<SmallObj>>& v) {
for(int i=0; i<v.size(); i++)
.. read and write on *v[i] ..
}
void fun2(std::vector<std::shared_ptr<SmallObj>>& w) {
for(int i=0; i<w.size(); i++)
.. just read on *w[i] ..
}
int main() {
std::vector<std::shared_ptr<SmallObj>> tot;
for(int iter=0; iter<iterMax; iter++) {
for(int nObj=0; nObj<nObjMax; nObj++)
.. create a SmallObj in the heap and store a shared_ptr in tot ..
std::vector<std::shared_ptr<SmallObj>> v, w;
.. copy elements of "tot" in v and w ..
fun1(v);
fun2(w);
}
return 0;
}
What I want to do is operate concurrently, spawning two threads to execute fun1 and fun2, but I need to regulate access to the SmallObjs using some locking mechanism. How can I do it? In the literature I can only find examples of using mutexes to lock access to a specific object or a portion of code, but not to the same variables pointed to by different objects (in this case v and w).
Thank you very much, and sorry for my ignorance on the matter.

I need to regulate the access to the SmallObjs using some locking mechanism. How can I do it?
Use getters and setters for your data members. Use a std::mutex (or a std::recursive_mutex depending on whether recursive locking is needed) data member to guard the accesses, then always lock with a lock guard.
Example (also see the comments in the code):
class SmallObject{
int getID() const{
std::lock_guard<std::mutex> lck(m_mutex);
return ....;
}
void setID(int id){
std::lock_guard<std::mutex> lck(m_mutex);
....;
}
MyType calculate() const{
std::lock_guard<std::mutex> lck(m_mutex);
//HERE is a GOTCHA if `m_mutex` is a `std::mutex`
int k = this->getID(); //Ooopsie... Deadlock
//To do the above, change the declaration of `m_mutex` from
//std::mutex to std::recursive_mutex
}
private:
..some data
mutable std::mutex m_mutex;
};
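If the recursive-mutex route feels heavy, an alternative (a sketch of mine, not part of the answer above) is to keep a private unlocked helper and let each public method take the lock exactly once:

```cpp
#include <mutex>

class SmallObject {
public:
    int getID() const {
        std::lock_guard<std::mutex> lck(m_mutex);
        return getIDUnlocked();
    }
    void setID(int id) {
        std::lock_guard<std::mutex> lck(m_mutex);
        m_id = id;
    }
    int calculate() const {
        std::lock_guard<std::mutex> lck(m_mutex);
        // Calls the unlocked helper, so a plain std::mutex suffices
        // and there is no self-deadlock.
        return getIDUnlocked() * 2;  // placeholder computation
    }
private:
    // Callers must already hold m_mutex.
    int getIDUnlocked() const { return m_id; }
    int m_id = 0;
    mutable std::mutex m_mutex;
};
```

Recursive mutexes are slower and can hide layering mistakes; the helper pattern keeps a plain std::mutex and makes the locking discipline explicit.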

The simplest solution is to hold an std::mutex for the whole vector:
#include <mutex>
#include <thread>
#include <vector>
void fun1(std::vector<std::shared_ptr<SmallObj>>& v,std::mutex &mtx) {
for(int i=0; i<v.size(); i++)
//Anything you can do before read/write of *v[i]...
{
std::lock_guard<std::mutex> guard(mtx);
//read-write *v[i]
}
//Anything you can do after read/write of *v[i]...
}
void fun2(std::vector<std::shared_ptr<SmallObj>>& w,std::mutex &mtx) {
for(int i=0; i<w.size(); i++) {
//Anything that can happen before reading *w[i]
{
std::lock_guard<std::mutex> guard(mtx);
//read *w[i]
}
//Anything that can happen after reading *w[i]
}
}
int main() {
std::mutex mtx;
std::vector<std::shared_ptr<SmallObj>> tot;
for(int iter=0; iter<iterMax; iter++) {
for(int nObj=0; nObj<nObjMax; nObj++)
.. create a SmallObj in the heap and store a shared_ptr in tot ..
std::vector<std::shared_ptr<SmallObj>> v, w;
.. copy elements of "tot" in v and w ..
std::thread t1([&v,&mtx] { fun1(v,mtx); });
std::thread t2([&w,&mtx] { fun2(w,mtx); });
t1.join();
t2.join();
}
return 0;
}
However, you will realistically only get parallelism from the work done in the before/after sections of the loops in fun1() and fun2().
You could further increase parallelism by introducing more locks.
For example, you might get away with just two mutexes that control the odd and even elements:
void fun1(std::vector<std::shared_ptr<SmallObj>>& v,std::mutex& mtx0,std::mutex& mtx1){
for(size_t i{0};i<v.size();++i){
{
std::lock_guard<std::mutex> guard(i%2==0?mtx0:mtx1);
//read-write *v[i]
}
}
}
With a similar format for fun2().
You might be able to reduce contention by working from opposite ends of the vectors, or by using try_lock, moving on to subsequent elements and 'coming back' to a locked element when it becomes available.
That matters most when an iteration of one function takes much longer than an iteration of the other, and there is some advantage in getting the results from the 'faster' one before the other finishes.
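As a rough sketch of the try_lock idea (the SmallObj definition and the reuse of the odd/even mutex split are assumptions for illustration):

```cpp
#include <memory>
#include <mutex>
#include <vector>

struct SmallObj { int value = 0; };  // assumed definition

// Visit every element once; skip elements whose mutex is busy and
// come back to them on a later pass.
void fun1(std::vector<std::shared_ptr<SmallObj>>& v,
          std::mutex& mtx0, std::mutex& mtx1) {
    std::vector<std::size_t> pending;
    for (std::size_t i = 0; i < v.size(); ++i)
        pending.push_back(i);
    while (!pending.empty()) {
        std::vector<std::size_t> still;
        for (std::size_t i : pending) {
            std::unique_lock<std::mutex> guard(i % 2 == 0 ? mtx0 : mtx1,
                                               std::try_to_lock);
            if (guard.owns_lock())
                v[i]->value += 1;    // read-write *v[i]
            else
                still.push_back(i);  // busy: retry on the next pass
        }
        pending.swap(still);
    }
}
```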
Alternatives:
It's obviously possible to add a std::mutex to each object.
Whether that works, or is even necessary, depends on what is actually done in fun1 and fun2, as well as on how those mutexes are managed.
If it's necessary to lock before either loop starts, there may in fact be no benefit to parallelism, because one of fun1() or fun2() will essentially wait for the other to finish and the two will effectively run in series.
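A per-object mutex could look something like this (a sketch; SmallObj's members are made up):

```cpp
#include <mutex>

// Each object carries its own lock, so two threads only contend
// when they touch the same object.
struct SmallObj {
    void update(int delta) {
        std::lock_guard<std::mutex> guard(mtx);
        data += delta;
    }
    int read() {
        std::lock_guard<std::mutex> guard(mtx);
        return data;
    }
private:
    int data = 0;
    std::mutex mtx;
};
```

fun1 and fun2 would then simply call v[i]->update(...) and w[i]->read() with no external mutex; the caveat above still applies if whole-loop atomicity is required.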

Related

Passing vector with std::ref to threads does not seem to update the actual vector rather threads only updating local vector copy

I am facing a strange issue when passing a vector by std::ref to a pool of threads.
I cannot put the exact code but I have something like :
Approach A
void myfunc(int start, int end, std::vector<double>& result)
{
for(int i=start; i<end; ++i)
{
result[i] = ...
}
}
vector<double> myvec(N);
const size_t num_threads = std::thread::hardware_concurrency();
vector<std::thread> threads;
threads.reserve(num_threads);
for(int idx=0; idx<num_threads; ++idx)
threads.push_back(std::thread(myfunc, idx*nloop/num_threads, (idx+1)*nloop/num_threads, std::ref(myvec)));
for(std::thread& t : threads)
{
if(t.joinable())
t.join();
}
some_read_operation(myvec)...
But if I explicitly create threads and divide the tasks by hard-coding the ranges, I get the expected values.
This works:
Approach B
std::thread t1(myfunc, 0, nloop/2, std::ref(myvec));
std::thread t2(myfunc, nloop/2, nloop, std::ref(myvec));
t1.join();
t2.join();
some_read_operation(myvec)...
I want to parallelize my loop in a more generic way; I'm not sure where I am going wrong here.
In approach A, it seems like each thread updates the result[] elements locally, but the updates are not reflected in myvec in the function that creates the threads, even though I am passing std::ref(myvec).
Any help would be great.
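As a sanity check, here is a self-contained sketch of the generic approach showing that std::ref does share one vector across all threads (myfunc's body is a stand-in):

```cpp
#include <functional>
#include <thread>
#include <vector>

void myfunc(int start, int end, std::vector<double>& result) {
    for (int i = start; i < end; ++i)
        result[i] = i * 2.0;
}

// Generic version of approach A: each thread writes its own index range
// into the one shared vector, passed via std::ref (no copies are made).
std::vector<double> runParallel(int nloop, int num_threads) {
    std::vector<double> myvec(nloop);
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (int idx = 0; idx < num_threads; ++idx)
        threads.push_back(std::thread(myfunc,
                                      idx * nloop / num_threads,
                                      (idx + 1) * nloop / num_threads,
                                      std::ref(myvec)));
    for (std::thread& t : threads)
        if (t.joinable())
            t.join();
    return myvec;
}
```

The index ranges are disjoint, so the threads never write the same element and no mutex is needed here.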

C++ thread safe class, not working as expected

I am trying to implement a thread-safe class. I use a lock_guard in the setter and getter for each member variable.
#include <iostream>
#include <omp.h>
#include <mutex>
#include <vector>
#include <map>
#include <string>
class B {
std::mutex m_mutex;
std::vector<int> m_vec;
std::map<int,std::string> m_map;
public:
void insertInVec(int x) {
std::lock_guard<std::mutex> lock(m_mutex);
m_vec.push_back(x);
}
void insertInMap(int x, const std::string& s) {
std::lock_guard<std::mutex> lock(m_mutex);
m_map[x] = s;
}
const std::string& getValAtKey(int k) {
std::lock_guard<std::mutex> lock(m_mutex);
return m_map[k];
}
int getValAtIdx(int i) {
std::lock_guard<std::mutex> lock(m_mutex);
return m_vec[i];
}
};
int main() {
B b;
#pragma omp parallel for num_threads(4)
for (int i = 0; i < 100000; i++) {
b.insertInVec(i);
b.insertInMap(i, std::to_string(i));
}
std::cout << b.getValAtKey(20) << std::endl;
std::cout << b.getValAtIdx(20) << std::endl;
return 0;
}
When I run this code, the output from the map is correct but the output from the vector is garbage. I get the output:
20
50008
Of course, the second output changes on each run.
1. What is wrong with this code? (I also have to consider the scenario where there are multiple instances of class B being used from multiple threads.)
2. Do I need a separate mutex for each member variable? Like
std::mutex vec_mutex;
std::mutex map_mutex;
I don't understand why you think the output is garbage. Your loop is executed by 4 threads, so one possible division of the work is:
Thread 1: 0 <= i < 25000
Thread 2: 25000 <= i < 50000
Thread 3: 50000 <= i < 75000
Thread 4: 75000 <= i < 100000
Each thread does a push_back of its i values to the vector. Suppose thread 1 starts and writes "0, 1, 2, 3, 4, 5, 6, 7, 8, 9", then thread 2 writes "25000, 25001", and then thread 3 writes "50000, 50001, 50002, 50003, 50004, 50005, 50006, 50007, 50008". You end up with value 50008 at index 20. Of course, other thread interleavings are also possible, and you might instead see values like 25003 or 75004.
The output you see is fine, only your expectations are off.
You add elements into the vector via:
void insertInVec(int x) {
std::lock_guard<std::mutex> lock(m_mutex);
m_vec.push_back(x);
}
and then retrieve them via:
int getValAtIdx(int i) {
std::lock_guard<std::mutex> lock(m_mutex);
return m_vec[i];
}
Because the loop is executed in parallel, there is no guarantee that the values are inserted in the order you expect. Whichever thread grabs the mutex first inserts its values first. If you wanted to insert the values in some specified order, you would need to resize the vector upfront and then use something along the lines of:
void setInVecAtIndex(int x,size_t index) {
std::lock_guard<std::mutex> lock(m_mutex); // maybe not needed, see PS
m_vec[index] = x;
}
So this isn't the problem with your code. However, I see two problems:
getValAtKey returns a reference to the value in the map. It is a const reference, but that does not prevent somebody else from modifying the value via a call to insertInMap. Returning a reference here defeats the purpose of the lock: using that reference is not thread-safe. To make it thread-safe you need to return a copy.
You forgot to protect the compiler-generated methods. For an overview, see What are all the member-functions created by compiler for a class? Does that happen all the time?. The compiler-generated methods will not use your getters and setters, and hence are not thread-safe by default. You should either define them yourself or delete them (see also the rule of 3/5).
PS: Accessing different elements in a vector from different threads needs no synchronization. As long as you do not resize the vector and only access different elements you do not need the mutex if you can ensure that no two threads access the same index.
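Putting the two fixes together, a sketch of the adjusted class might look like this (the find-based lookup is my choice so the getter can be const and return a copy):

```cpp
#include <map>
#include <mutex>
#include <string>
#include <vector>

class B {
    mutable std::mutex m_mutex;
    std::vector<int> m_vec;
    std::map<int, std::string> m_map;
public:
    B() = default;
    // The compiler-generated copy operations would bypass the lock,
    // so forbid them (or write locking versions yourself).
    B(const B&) = delete;
    B& operator=(const B&) = delete;

    void insertInVec(int x) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_vec.push_back(x);
    }
    void insertInMap(int x, const std::string& s) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_map[x] = s;
    }
    // Returns a copy: a reference into the map would escape the lock.
    std::string getValAtKey(int k) const {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_map.find(k);
        return it != m_map.end() ? it->second : std::string{};
    }
    int getValAtIdx(int i) const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_vec.at(i);
    }
};
```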

Multithreading - Passing variables between methods of different classes

I am working on a project which requires multithreading. I have three threads, two of which run in parallel and one asynchronously, as shown in the example code. I have a few questions regarding the variables and boost::shared_mutex:
Is there a more elegant approach to pass vectors and other variables between methods?
We are having problems with the boost::shared_mutex in locking critical sections. Is there a better method in this case?
Thank you for your help. Sorry for the length of the code.
// A.h
class A{
private:
std::vector<int> a_data;
public:
int capture_a(std::vector<int> &data_transfer,boost::shared_mutex &_access,int &flag);
};
// A.cpp
A::capture_a(std::vector<int> &a_data_transfer,boost::shared_mutex &_access,int &a_flag)
{
// collect data
while(true)
{
//data input
a_data.push_back(data);
if(a_data.size() > N) //a certain limit
{
// save a_data
a_flag = 1;
boost::unique_lock< boost::shared_mutex > lock(_access);
//swap with empty array
// a_data_transfer should be empty
a_data_transfer.swap(a_data);
}
if(boost::thread_interrupted&)
{
std::cout << " Thread interrupted" << std::endl;
return 0;
}
}
}
// B.h
class B{
private:
std::vector<int> b_data;
public:
int capture_b(std::vector<int> &b_data_transfer, boost::shared_mutex &_access,int &a_flag,int &b_flag);
};
// B.cpp
B::capture_b(std::vector<int> &b_data_transfer, boost::shared_mutex &_access, int &a_flag,int &b_flag)
{
// collect data
while(true)
{
//data input
b_data.push_back(data);
if(a_flag == 1) //a_flag is true
{
boost::unique_lock< boost::shared_mutex > lock(_access);
b_data_transfer.swap(b_data);
b_flag = 1;
// save data
a_flag = 0;
}
if(boost::thread_interrupted&)
{
std::cout << " Thread interrupted" << std::endl;
return 0;
}
}
}
// C.h
class C
{
private:
std::vector<int> c_data;
public:
int compute_c(std::vector<int> &a_data,std::vector<int> &b_data,boost::shared_mutex &_access, int &b_flag);
}
// C.cpp
C::compute_c(std::vector<int> &a_data,std::vector<int> &b_data,boost::shared_mutex &_access,int &b_flag)
{
while(true) {
if(b_flag == 1)
{
boost::unique_lock< boost::shared_mutex > lock(_access);
// compute on c
c_data = a_data + b_data; // for example
// save c_data
b_flag = 0;
a_data.clear();
b_data.clear();
}
if(boost::thread_interrupted&)
{
std::cout << " Thread interrupted" << std::endl;
return 0;
}
}
}
int main()
{
std::vector<int> a_data_transfer, b_data_transfer;
boost::shared_mutex _access;
int a_flag = 0, b_flag = 0;
boost::thread t1(&A::capture_a,boost::ref(a_data_transfer),boost::ref(_access),boost::ref(a_flag));
boost::thread t2(&B::capture_b,boost::ref(b_data_transfer),boost::ref(_access),boost::ref(a_flag),boost::ref(b_flag));
boost::thread t3(&C::compute_c,boost::ref(a_data_transfer),boost::ref(b_data_transfer),boost::ref(_access),boost::ref(b_flag));
// Wait for Enter
char ch;
cin.get(ch);
// Ask thread to stop
t1.interrupt();
t2.interrupt();
t3.interrupt();
// Join - wait when thread actually exits
t1.join();
t2.join();
t3.join();
}
**********EDIT*************
What I am trying to achieve is:
A and B should run in parallel, and when a certain criterion is met in A, a_data and b_data should be transferred to C. After transferring the data, the vectors should keep collecting new data.
C should take in a_data and b_data and perform a computation when the flag is true.
Problem with boost::shared_mutex: we want a_data_transfer to be empty when swapping. That happens sometimes, but not all the time. I need a way to ensure it happens for the code to run properly.
Is there a more elegant approach to pass vectors and other variables between methods?
Well... elegance is a matter of taste. But you might want to encapsulate the shared data in a class or struct containing:
The shared data
The mutex (or mutexes) that protect it
The methods that deal with the shared data and lock it appropriately
That would reduce the amount of data you need to pass to each thread.
You probably already know this, but it is important to realize that a thread is not a code element, just a scheduling concept. The fact that modern libraries represent threads as objects is only for our convenience.
We are having problems with the boost::shared_mutex in locking critical sections. Is there a better method in this case?
Without further details on the actual problems is difficult to say.
Some notes:
shared_mutex is a read/write mutex, intended to allow multiple simultaneous readers but only one writer. From the code it seems that you only write (unique_lock), so you might be able to use a simpler mutex.
In my view, mutexes exist to protect pieces of data, and you should take care to lock for the minimum amount of time, and only while actually accessing the shared data, balanced against the need to make a set of operations atomic. You have a single mutex that protects two vectors. That's fine, but you might want to think about whether that's what you need.
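A sketch of that encapsulation (names invented here, and a plain std::mutex instead of boost::shared_mutex since only unique locking is used):

```cpp
#include <mutex>
#include <vector>

// Bundle the shared data, its mutex, and the operations on it, so a
// thread only needs a reference to one object instead of several loose ones.
class SharedChannel {
public:
    void push(int x) {
        std::lock_guard<std::mutex> lock(mtx);
        data.push_back(x);
    }
    // Swap the accumulated data out into `out`, leaving the internal
    // buffer empty so the producer can keep collecting.
    void drainInto(std::vector<int>& out) {
        std::lock_guard<std::mutex> lock(mtx);
        out.swap(data);
    }
private:
    std::mutex mtx;
    std::vector<int> data;
};
```

Because drainInto swaps under the lock, the transfer vector is guaranteed to receive exactly the accumulated data, addressing the "a_data_transfer should be empty" concern from the question.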

Am I using this deque in a thread safe manner?

I'm trying to understand multithreading in C++. In the following bit of code, will the deque tempData declared in retrieve() always have every element processed once and only once, or could there be multiple copies of tempData across threads with stale data, causing some elements to be processed multiple times? I'm not sure whether passing by reference actually ensures there is only one copy in this case.
static mutex m;
void AudioAnalyzer::analysisThread(deque<shared_ptr<AudioAnalysis>>& aq)
{
while (true)
{
m.lock();
if (aq.empty())
{
m.unlock();
break;
}
auto aa = aq.front();
aq.pop_front();
m.unlock();
if (false) //testing
{
retrieveFromDb(aa);
}
else
{
analyzeAudio(aa);
}
}
}
void AudioAnalyzer::retrieve()
{
deque<shared_ptr<AudioAnalysis>>tempData(data);
vector<future<void>> futures;
for (int i = 0; i < NUM_THREADS; ++i)
{
futures.push_back(async(bind(&AudioAnalyzer::analysisThread, this, _1), ref(tempData)));
}
for (auto& f : futures)
{
f.get();
}
}
Looks OK to me.
Threads share memory, and since the reference to tempData comes through as a pointer in each thread, every thread sees exactly the same pointer value and the same single copy of tempData. [You can check that if you like with a bit of global code or some logging.]
The mutex then ensures single-threaded access, at least within the worker threads.
One problem: somewhere there must be a push onto the deque, and that may need to be protected by the same mutex as well. [Obviously the push_back onto the futures vector is just local.]
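The producer side could then be sketched like this (the enqueue helper and the AudioAnalysis fields are assumptions, not from the question):

```cpp
#include <deque>
#include <memory>
#include <mutex>

struct AudioAnalysis { int id = 0; };  // assumed definition

static std::mutex m;  // the same mutex the worker threads use

// Any producer that pushes onto the shared deque must hold the same
// mutex as the consumers; otherwise push_back and pop_front can race.
void enqueue(std::deque<std::shared_ptr<AudioAnalysis>>& aq, int id) {
    auto aa = std::make_shared<AudioAnalysis>();
    aa->id = id;
    std::lock_guard<std::mutex> lock(m);
    aq.push_back(aa);
}
```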

Multi Threading Using Boost C++ - Synchronisation Issue

I would like to do multithreading where thread ONE passes data to 4-5 worker threads which process the data, and once ALL worker threads are finished I would like to continue. I'm using Boost to realize that; however, I have a synchronisation problem: at one point the program stops and doesn't continue working.
I used OpenMP before and that works nicely, but I would like to set the thread priorities individually and I could not figure out how to do that with OpenMP, so I worked on my own solution:
I would be very glad if someone could give hints to find the bug in this code or could help me find another approach to the problem.
Thank you,
KmgL
#include <QCoreApplication>
#include <boost/thread.hpp>
#define N_CORE 6
#define N_POINTS 10
#define N_RUNS 100000
class Sema{
public:
Sema(int _n =0): m_count(_n),m_mut(),m_cond(){}
void set(int _n)
{
boost::unique_lock<boost::mutex> w_lock(m_mut);
m_count = -_n;
}
void wait()
{
boost::unique_lock<boost::mutex> lock(m_mut);
while (m_count < 0)
{
m_cond.wait(lock);
}
--m_count;
}
void post()
{
boost::unique_lock<boost::mutex> lock(m_mut);
++m_count;
m_cond.notify_all();
}
private:
boost::condition_variable m_cond;
boost::mutex m_mut;
int m_count;
};
class Pool
{
private:
boost::thread m_WorkerThread;
boost::condition_variable m_startWork;
bool m_WorkerRun;
bool m_InnerRun;
Sema * m_sem;
std::vector<int> *m_Ep;
std::vector<int> m_ret;
void calc()
{
unsigned int no_pt(m_Ep->size());
std::vector<int> c_ret;
for(unsigned int i=0;i<no_pt;i++)
c_ret.push_back(100 + m_Ep->at(i));
m_ret = c_ret;
}
void run()
{
boost::mutex WaitWorker_MUTEX;
while(m_WorkerRun)
{
boost::unique_lock<boost::mutex> u_lock(WaitWorker_MUTEX);
m_startWork.wait(u_lock);
calc();
m_sem->post();
}
}
public:
Pool():m_WorkerRun(false),m_InnerRun(false){}
~Pool(){}
void start(Sema * _sem){
m_WorkerRun = true;
m_sem = _sem;
m_ret.clear();
m_WorkerThread = boost::thread(&Pool::run, this);}
void stop(){m_WorkerRun = false;}
void join(){m_WorkerThread.join();}
void newWork(std::vector<int> &Ep)
{
m_Ep = &Ep;
m_startWork.notify_all();
}
std::vector<int> getWork(){return m_ret;}
};
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
Pool TP[N_CORE];
Sema _sem(0);
for(int k=0;k<N_CORE;k++)
TP[k].start(&_sem);
boost::this_thread::sleep(boost::posix_time::milliseconds(10));
std::vector<int> V[N_CORE];
for(int k=0;k<N_CORE;k++)
for(int i=0;i<N_POINTS;i++)
{
V[k].push_back((k+1)*1000+i);
}
for(int j=0;j<N_RUNS;j++)
{
_sem.set(N_CORE);
for(int k=0;k<N_CORE;k++)
{
TP[k].newWork(V[k]);
}
_sem.wait();
for(int k=0;k<N_CORE;k++)
{
V[k].clear();
V[k]=TP[k].getWork();
if(V[k].size()!=N_POINTS)
std::cout<<"ERROR: "<<"V["<<k<<"].size(): "<<V[k].size()<<std::endl;
}
if((j+1)%100==0)
std::cout<<"LOOP: "<<j+1<<std::endl;
}
std::cout<<"FINISHED: "<<std::endl;
return a.exec();
}
You have a race between the calls to Pool::newWork() and Pool::run().
You have to remember that signaling/broadcasting a condition variable is not a sticky event. If your thread is not waiting on the condition variable at the time of the signaling, the signal is lost. This is what can happen in your program: there is nothing that prevents your main thread from calling Pool::newWork() on each of your Pool objects before they have had time to call wait() on the condition variable.
To solve this, you need to move boost::mutex WaitWorker_MUTEX as a class member instead of it being a local variable. Pool::newWork() needs to grab that mutex before doing updates:
boost::unique_lock<boost::mutex> u_lock(WaitWorker_MUTEX);
m_Ep = &Ep;
m_startWork.notify_one(); // no need to use notify_all()
Since you're using a condition variable in Pool::run(), you need to handle spurious wakeups. I would recommend setting m_Ep to NULL when you construct the object and every time you're done with a work item:
boost::unique_lock<boost::mutex> u_lock(WaitWorker_MUTEX);
while (1) {
while (m_Ep == NULL && m_WorkerRun) {
m_startWork.wait(u_lock);
}
if (!m_WorkerRun) {
return;
}
calc();
m_sem->post();
m_Ep = NULL;
}
stop() will need to grab the mutex and notify:
boost::unique_lock<boost::mutex> u_lock(WaitWorker_MUTEX);
m_WorkerRun = false;
m_startWork.notify_one();
These changes should make the 10ms sleep you have unnecessary. You do not seem to call Pool::stop() or Pool::join(); you should change your code to call them.
You'll also get better performance by working on m_ret in Pool::calc() than copying the result at the end. You're also doing copies when you return the work. You might want Pool::getWork() to return a const ref to m_ret.
I have not run this code, so there might be other issues, but it should help you move forward.
It seems from your code that you may be wondering why condition variables need to go hand in hand with a mutex (you declare a local mutex in Pool::run()). I hope my fix makes that clearer.
It could be done with Boost futures: start the threads, then wait for all of them to finish. No other synchronization is needed.
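A sketch of the futures idea, using std::async from the standard library rather than Boost futures (addOffset stands in for Pool::calc()):

```cpp
#include <future>
#include <vector>

// Stand-in for Pool::calc(): add 100 to every element.
std::vector<int> addOffset(std::vector<int> v) {
    for (int& x : v)
        x += 100;
    return v;
}

// One async task per "core"; future::get() both waits for the task
// and hands back its result, so no semaphore or condition variable
// is needed.
std::vector<std::vector<int>> runAll(int nCore, int nPoints) {
    std::vector<std::future<std::vector<int>>> futures;
    for (int k = 0; k < nCore; ++k) {
        std::vector<int> v;
        for (int i = 0; i < nPoints; ++i)
            v.push_back((k + 1) * 1000 + i);
        futures.push_back(std::async(std::launch::async, addOffset, v));
    }
    std::vector<std::vector<int>> results;
    for (auto& f : futures)
        results.push_back(f.get());
    return results;
}
```

For repeated rounds (N_RUNS in the question) this launches fresh tasks each round, which costs thread startup time but removes all the hand-rolled semaphore logic.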