how to insert vector only once in multiple thread

how to insert vector only once in multiple thread - c++

I have below code snippet.
std::vector<int> g_vec;
void func()
{
//I add double check to avoid thread need lock every time.
if(g_vec.empty())
{
//lock
if(g_vec.empty())
{
//insert items into g_vec
}
//unlock
}
...
}
func will be called by multiple thread, and I want g_vec will be inserted items only once which is a bit similar as singleton instance. And about singleton instance, I found there is a DCLP issue.
Question:
1. My above code snippet is thread safe, is it has DCLP issue?
2. If not thread safe, how to modify it?

Your code has a data race.
The first check outside the lock is not synchronized with the insertion inside the lock. That means, you may end up with one thread reading the vector (through .empty()) while another thread is writing the vector (through .insert()), which is by definition a data race and leads to undefined behavior.
A solution for exactly this kind of problem is given by the standard in form of call_once.
#include<mutex>
std::vector<int> g_vec;
std::once_flag g_flag;
void func()
{
std::call_once(g_flag, [&g_vec](){ g_vec.insert( ... ); });
}

In your example, it could happen that second reentrant thread will find a non empty half initialized vector, that it's something that you won`t want anyway. You should use a flag, and mark it when initialization job is completed. Better a standard one, but a simple static int will do the job as well
std::vector<int> g_vec;
void func()
{
//I add double check to avoid thread need lock every time.
static int called = 0;
if(!called)
{
lock()
if(!called)
{
//insert items into g_vec
called = 1;
}
unlock()
}
...
}

Related

How to initiate a thread in a class in C++ 14?

class ThreadOne {
public:
ThreadOne();
void RealThread();
void EnqueueJob(s_info job);
std::queue<s_info> q_jobs;
private:
H5::H5File* targetFile = new H5::H5File("file.h5", H5F_ACC_TRUNC);
std::condition_variable cv_condition;
std::mutex m_job_q_;
};
ThreadOne::ThreadOne() {
}
void ThreadOne::RealThread() {
while (true) {
std::unique_lock<std::mutex> lock(m_job_q_);
cv_condition.wait(lock, [this]() { return !this->q_jobs.empty(); });
s_info info = std::move(q_jobs.front());
q_jobs.pop();
lock.unlock();
//* DO THE JOB *//
}
}
void ThreadOne::EnqueueJob(s_info job) {
{
std::lock_guard<std::mutex> lock(m_job_q_);
q_jobs.push(std::move(job));
}
cv_condition.notify_one();
}
ThreadOne *tWrite = new ThreadOne();
I want to make a thread and send it a pointer of an array and its name as a struct(s_info), and then make the thread write it into a file. I think that it's better than creating a thread whenever writing is needed.
I could make a thread pool and allocate jobs to it, but it's not allowed to write the same file concurrently in my situation, I think that just making a thread will be enough and the program will still do CPU-bound jobs when writing job is in process.
To sum up, this class (hopefully) gets array pointers and their dataset names, puts them in q_jobs and RealThread writes the arrays into a file.
I referred to a C++ thread pool program and the program initiates threads like this:
std::vector<std::thread> vec_worker_threads;
vector_worker_threads.reserve(num_threads_);
vector_worker_threads.emplace_back([this]() { this->RealThread(); });
I'm new to C++ and I understand what the code above does, but I don't know how to initiate RealThread in my class without a vector. How can I make an instance of the class that has a thread(RealThread) that's already ready inside it?

From what I can gather, and as already discussed in the comments, you simply want a std::thread member for ThreadOne:
class ThreadOne {
std::thread thread;
public:
~ThreadOne();
//...
};
//...
ThreadOne::ThreadOne() {
thread = std::thread{RealThread, this};
}
ThreadOne::~ThreadOne() {
// (potentially) notify thread to finish first
if(thread.joinable())
thread.join();
}
//...
ThreadOne tWrite;
Note that I did not start the thread in the member-initializer-list of the constructor in order to avoid the thread accessing other members that have not been initialized yet. (The default constructor of std::thread does not start any thread.)
I also wrote a destructor which will wait for the thread to finish and join it. You must always join threads before destroying the std::thread object attached to it, otherwise your program will call std::terminate and abort.
Finally, I replaced tWrite from being a pointer to being a class type directly. There is probably no reason for you to use dynamic allocation there and even if you have a need for it, you should be using
auto tWrite = std::make_unique<ThreadOne>();
or equivalent, instead, so that you are not going to rely on manually deleteing the pointer at the correct place.
Also note that your current RealThread function seems to never finish. It must return at some point, probably after receiving a notification from the main thread, otherwise thread.join() will wait forever.

Is it necessary to lock?

I met a question in leetcode. I have viewed some solutions in the discuss. But my solution is different from others because I do not use the lock in the method first. I wonder whether my code is correct. Besides, can you give me some advice about my code?
I think it is not necessary to use unique_lock in method void first(function<void()> printFirst) like void second(function<void()> printSecond), is it right?
class Foo {
public:
Foo() {
}
void first(function<void()> printFirst) {
// cout<<1<<endl;
// printFirst() outputs "first". Do not change or remove this line.
// mtx.lock();
printFirst();
flag=1;
// mtx.unlock();
cond.notify_all();
// cout<<11<<endl;
}
void second(function<void()> printSecond) {
// cout<<2<<endl;
{
unique_lock<mutex> lk(mtx);
cond.wait(lk,[this](){return flag==1;});
// printSecond() outputs "second". Do not change or remove this line.
printSecond();
flag=2;
}
// cout<<22<<endl;
cond.notify_all();
}
void third(function<void()> printThird) {
// cout<<3<<endl;
unique_lock<mutex> lk(mtx);
cond.wait(lk,[this](){return flag==2;});
// printThird() outputs "third". Do not change or remove this line.
printThird();
flag=3;
// cout<<33<<endl;
}
mutex mtx;
condition_variable cond;
int flag=0;
};

Obviously your three element functions are supposed to be called by different threads. Thus you need to lock the mutex in each thread to protect the common variable flag from concurrent access. So you should uncomment mtx.lock() and mtx.unlock() in first to protect it there as well. Functions second and third apply a unique_lock as an alternative for that.
Always make sure to unlock the mutex before calling cond.notify_all() either by calling mtx.unlock() before or making the unique_lock a local variable of an inner code block as in second.
Further advice
Put a private: before the element variables at the bottom of your class definition to protect them from outside access. That will ensure that flag cannot be altered without locking the mutex.

It is necessary.
That your code produced correct output in this instance is besides the point. It is entirely possibly that printFirst might not be complete by the timeprintSecond is called. You need the mutex to prevent this and stop printSecond and printThird. from being ran at the same time.

The flag checking condition in second() or third() may be evaluated at the same time as first() assigns 1 to flag.
Rewrite this
cond.wait(lk, [this](){return flag==1;});
like this and it may be easier to see:
while(!(flag==1)) cond.wait(lk);
It does the same thing as your wait() with a lambda.
flag is supposed to be read while the mutex is held - but first does not care about mutexes and assigns to flag whenever it pleases. For non-atomic types this is a disaster. It may work 10000000 times (and probably will) - but when things actually happen at the same time (because you let it) - boom - Undefined Behaviour

When would getters and setters with mutex be thread safe?

Consider the following class:
class testThreads
{
private:
int var; // variable to be modified
std::mutex mtx; // mutex
public:
void set_var(int arg) // setter
{
std::lock_guard<std::mutex> lk(mtx);
var = arg;
}
int get_var() // getter
{
std::lock_guard<std::mutex> lk(mtx);
return var;
}
void hundred_adder()
{
for(int i = 0; i < 100; i++)
{
int got = get_var();
set_var(got + 1);
sleep(0.1);
}
}
};
When I create two threads in main(), each with a thread function of hundred_adder modifying the same variable var, the end result of the var is always different i.e. not 200 but some other number.
Conceptually speaking, why is this use of mutex with getter and setter functions not thread-safe? Do the lock-guards fail to prevent the race-condition to var? And what would be an alternative solution?

Thread a: get 0
Thread b: get 0
Thread a: set 1
Thread b: set 1
Lo and behold, var is 1 even though it should've been 2.
It should be obvious that you need to lock the whole operation:
for(int i = 0; i < 100; i++){
std::lock_guard<std::mutex> lk(mtx);
var += 1;
}
Alternatively, you could make the variable atomic (even a relaxed one could do in your case).

int got = get_var();
set_var(got + 1);
Your get_var() and set_var() themselves are thread safe. But this combined sequence of get_var() followed by set_var() is not. There is no mutex that protects this entire sequence.
You have multiple concurrent threads executing this. You have multiple threads calling get_var(). After the first one finishes it and unlocks the mutex, another thread can lock the mutex immediately and obtain the same value for got that the first thread did. There's absolutely nothing that prevents multiple threads from locking and obtaining the same got, concurrently.
Then both threads will call set_var(), updating the mutex-protected int to the same value.
That's just one possibility that can happen here. You could easily have multiple threads acquiring the mutex sequentially and thus incrementing var by several values, only to be followed by some other, stalled thread, that called get_var() several seconds ago, and only now getting around to calling set_var(), thus resetting var to a much smaller value.

The code show in thread-safe in a sense that it will never set or get partial value of the variable.
But your usage of the methods does not guarantee that value will correctly change: reading and writing from multiple threads can collide with each other. Both threads read the value (11), both increment it (to 12) and than both set to the same (12) - now you counted 2 but effectively incremented only once.
Option to fix:
provide "safe increment" operation
provide equivalent of InterlockedCompareExchange to make sure value you are updating correspond to original one and retry as necessary
wrap calling code into separate mutex or use other synchronization mechanism to prevent operations to intermix.

Why don't you just use std::atomic for the shared data (var in this case)? That will be more safe efficient.

This is an absolute classic.
One thread obtains the value of var, releases the mutex and another obtains the same value before the first thread has chance to update it.
Consequently the process risks losing increments.
There are three obvious solutions:
void testThreads::inc_var(){
std::lock_guard<std::mutex> lk(mtx);
++var;
}
That's safe because the mutex is held until the variable is updated.
Next up:
bool testThreads::compare_and_inc_var(int val){
std::lock_guard<std::mutex> lk(mtx);
if(var!=val) return false;
++var;
return true;
}
Then write code like:
int val;
do{
val=get_var();
}while(!compare_and_inc_var(val));
This works because the loop repeats until it confirms it's updating the value it read. This could result in live-lock though in this case it has to be transient because a thread can only fail to make progress because another does.
Finally replace int var with std::atomic<int> var and either use ++var or var.compare_exchange(val,val+1) or var.fetch_add(1); to update it.
NB: Notice compare_exchange(var,var+1) is invalid...
++ is guaranteed to be atomic on std::atomic<> types but despite 'looking' like a single operation in general no such guarantee exists for int.
std::atomic<> also provides appropriate memory barriers (and ways to hint what kind of barrier is needed) to ensure proper inter-thread communication.
std::atomic<> should be a wait-free, lock-free implementation where available. Check your documentation and the flag is_lock_free().

What is critical resources to a std::mutex object

I am new to concurrency and I am having doubts in std::mutex. Say I've a int a; and books are telling me to declare a mutex amut; to get exclusive access over a. Now my question is how a mutex object is recognizing which critical resources it has to protect ? I mean which variable?
say I've two variables int a,b; now i declare mutex abmut; Now abmut will protect what???
both a and b or only a or b???

Your doubts are justified: it doesn't. That's your job as a programmer, to make sure you only access a if you've got the mutex. If somebody else got the mutex, do not access a or you will have the same problems you'd have without the mutex. That goes for all thread-syncronization constructs. You can use them to protect a resource. They don't do it on their own.

Mutex is more like a sign rather than a lock. When someone sees a sign saying "occupied" in a public washroom, he will wait until the user gets out and flips the sign. But you have to teach him to wait when seeing the sign. The sign itself won't prevent him from breaking in. Of course, the "wait" order is already set by mutex.lock(), so you can use it conveniently.

A std::mutex does not protect any data at all. A mutex works like this:
When you try to lock a mutex you look if the mutex is not already locked, else you wait until it is unlocked.
When you're finished using a mutex you unlock it, else threads that are waiting will do that forever.
How does that protect things? consider the following:
#include <iostream>
#include <future>
#include <vector>
struct example {
static int shared_variable;
static void incr_shared()
{
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
int example::shared_variable = 0;
int main() {
std::vector<std::future<void> > handles;
handles.reserve(10000);
for(int i = 0; i < 10000; i++) {
handles.push_back(std::async(std::launch::async, example::incr_shared));
}
for(auto& handle: handles) handle.wait();
std::cout << example::shared_variable << std::endl;
}
You might expect it to print 1000000000, but you don't really have a guarantee of that. We should include a mutex, like this:
struct example {
static int shared_variable;
static std::mutex guard;
static void incr_shared()
{
std::lock_guard<std::mutex>{ guard };
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
So what does this exactly do? First of all std::lock_guard uses RAII to call mutex.lock() when it's created and mutex.unlock when it's destroyed, this last one happens when it leaves scope (here when the function exits). So in this case only one thread can be executing the for loop because as soon as a thread passes the lock_guard it holds the lock, and we saw before that no other thread can hold it. Therefore this loop is now safe. Note that we could also put the lock_guard inside the loop, but that might make your program slow (locking and unlocking is relatively expensive).
So in conclusion, a mutex protects blocks of code, in our example the for-loop, not the variable itself. If you want variable protection, consider taking a look at std::atomic. The following example is for example again unsafe because decr_shared can be called simultaneously from any thread.
struct example {
static int shared_variable;
static std::mutex guard;
static void decr_shared() { shared_variable--; }
static void incr_shared()
{
std::lock_guard<std::mutex>{ guard };
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
This however is again safe, because now the variable itself is protected, in any code that uses it.
struct example {
static std::atomic_int shared_variable;
static void decr_shared() { shared_variable--; }
static void incr_shared()
{
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
std::atomic_int example::shared_variable{0};

A mutex doesn't inherently protect any specific variables... instead, the programmer needs to realise that they have some group of 1 or more variables that several threads may attempt to use, then use a mutex so that only one of those threads can be running such variable-using/changing code at any point in time.
Note especially that you're only protected from other threads' code accessing/modifying those variables if their code locks the same mutex during the variable access. A mutex used by only one thread protects nothing.

mutex is used to synchronize access to a resource. Say you have a data say int, where you are going to do read write operation using an getter and a setter. So both getter and setters will use the same mutex to to sync read/write operation.
both of these function will lock the mutex at the beginning and unlock it before it returns. you can use scoped_lock that will automatically unlock on its destructor.
void setter(value_type v){
boost::mutex::scoped_lock lock(mutex);
value = v;
}
value_type getter() const{
boost::mutex::scoped_lock lock(mutex);
return value;
}

Imagine you sit at a table with your friends and a delicious cake (the resources you want to guard, e.g. some integer a) in the middle. In addition you have a single tennis ball (this is our mutex). Now, only a single person can have the ball (lock the mutex using a lock_guard or similar mechanisms), but everyone can eat the cake (access the integer a).
As a group you can decide to set up a rule that only whoever has the ball may eat from the cake (only the person who has locked the mutex may access a). This person may relinquish the ball by putting it on the table (unlock the mutex), so another person can grab it (lock the mutex). This way you ensure that no one stabs another person with the fork, while frantically eating the cake.
Setting up and upholding a rule like described in the last paragraph is your job as a programmer. There is no inherent connection between a mutex and some resource (e.g. some integer a).

multithreaded program producer/consumer [boost]

I'm playing with boost library and C++. I want to create a multithreaded program that contains a producer, conumer, and a stack. The procuder fills the stack, the consumer remove items (int) from the stack. everything work (pop, push, mutex) But when i call the pop/push winthin a thread, i don't get any effect
i made this simple code :
#include "stdafx.h"
#include <stack>
#include <iostream>
#include <algorithm>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/date_time.hpp>
#include <boost/signals2/mutex.hpp>
#include <ctime>
using namespace std;
/ *
* this class reprents a stack which is proteced by mutex
* Pop and push are executed by one thread each time.
*/
class ProtectedStack{
private :
stack<int> m_Stack;
boost::signals2::mutex m;
public :
ProtectedStack(){
}
ProtectedStack(const ProtectedStack & p){
}
void push(int x){
m.lock();
m_Stack.push(x);
m.unlock();
}
void pop(){
m.lock();
//return m_Stack.top();
if(!m_Stack.empty())
m_Stack.pop();
m.unlock();
}
int size(){
return m_Stack.size();
}
bool isEmpty(){
return m_Stack.empty();
}
int top(){
return m_Stack.top();
}
};
/*
*The producer is the class that fills the stack. It encapsulate the thread object
*/
class Producer{
public:
Producer(int number ){
//create thread here but don't start here
m_Number=number;
}
void fillStack (ProtectedStack& s ) {
int object = 3; //random value
s.push(object);
//cout<<"push object\n";
}
void produce (ProtectedStack & s){
//call fill within a thread
m_Thread = boost::thread(&Producer::fillStack,this, s);
}
private :
int m_Number;
boost::thread m_Thread;
};
/* The consumer will consume the products produced by the producer */
class Consumer {
private :
int m_Number;
boost::thread m_Thread;
public:
Consumer(int n){
m_Number = n;
}
void remove(ProtectedStack &s ) {
if(s.isEmpty()){ // if the stack is empty sleep and wait for the producer to fill the stack
//cout<<"stack is empty\n";
boost::posix_time::seconds workTime(1);
boost::this_thread::sleep(workTime);
}
else{
s.pop(); //pop it
//cout<<"pop object\n";
}
}
void consume (ProtectedStack & s){
//call remove within a thread
m_Thread = boost::thread(&Consumer::remove, this, s);
}
};
int main(int argc, char* argv[])
{
ProtectedStack s;
Producer p(0);
p.produce(s);
Producer p2(1);
p2.produce(s);
cout<<"size after production "<<s.size()<<endl;
Consumer c(0);
c.consume(s);
Consumer c2(1);
c2.consume(s);
cout<<"size after consumption "<<s.size()<<endl;
getchar();
return 0;
}
After i run that in VC++ 2010 / win7
i got :
0
0
Could you please help me understand why when i call fillStack function from the main i got an effect but when i call it from a thread nothing happens?
Thank you

Your example code suffers from a couple synchronization issues as noted by others:
Missing locks on calls to some of the members of ProtectedStack.
Main thread could exit without allowing worker threads to join.
The producer and consumer do not loop as you would expect. Producers should always (when they can) be producing, and consumers should keep consuming as new elements are pushed onto the stack.
cout's on the main thread may very well be performed before the producers or consumers have had a chance to work yet.
I would recommend looking at using a condition variable for synchronization between your producers and consumers. Take a look at the producer/consumer example here: http://en.cppreference.com/w/cpp/thread/condition_variable
It is a rather new feature in the standard library as of C++11 and supported as of VS2012. Before VS2012, you would either need boost or to use Win32 calls.
Using a condition variable to tackle a producer/consumer problem is nice because it almost enforces the use of a mutex to lock shared data and it provides a signaling mechanism to let consumers know something is ready to be consumed so they don't have so spin (which is always a trade off between the responsiveness of the consumer and CPU usage polling the queue). It also does so being atomic itself which prevents the possibility of threads missing a signal that there is something to consume as explained here: https://en.wikipedia.org/wiki/Sleeping_barber_problem
To give a brief run-down of how a condition variable takes care of this...
A producer does all time consuming activities on its thread without the owning the mutex.
The producer locks the mutex, adds the item it produced to a global data structure (probably a queue of some sort), lets go of the mutex and signals a single consumer to go -- in that order.
A consumer that is waiting on the condition variable re-acquires the mutex automatically, removes the item out of the queue and does some processing on it. During this time, the producer is already working on producing a new item but has to wait until the consumer is done before it can queue the item up.
This would have the following impact on your code:
No more need for ProtectedStack, a normal stack/queue data structure will do.
No need for boost if you are using a new enough compiler - removing build dependencies is always a nice thing.
I get the feeling that threading is rather new to you so I can only offer the advice to look at how others have solved synchronization issues as it is very difficult to wrap your mind around. Confusion about what is going on in an environment with multiple threads and shared data typically leads to issues like deadlocks down the road.

The major problem with your code is that your threads are not synchronized.
Remember that by default threads execution isn't ordered and isn't sequenced, so consumer threads actually can be (and in your particular case are) finished before any producer thread produces any data.
To make sure consumers will be run after producers finished its work you need to use thread::join() function on producer threads, it will stop main thread execution until producers exit:
// Start producers
...
p.m_Thread.join(); // Wait p to complete
p2.m_Thread.join(); // Wait p2 to complete
// Start consumers
...
This will do the trick, but probably this is not good for typical producer-consumer use case.
To achieve more useful case you need to fix consumer function.
Your consumer function actually doesn't wait for produced data, it will just exit if stack is empty and never consume any data if no data were produced yet.
It shall be like this:
void remove(ProtectedStack &s)
{
// Place your actual exit condition here,
// e.g. count of consumed elements or some event
// raised by producers meaning no more data available etc.
// For testing/educational purpose it can be just while(true)
while(!_some_exit_condition_)
{
if(s.isEmpty())
{
// Second sleeping is too big, use milliseconds instead
boost::posix_time::milliseconds workTime(1);
boost::this_thread::sleep(workTime);
}
else
{
s.pop();
}
}
}
Another problem is wrong thread constructor usage:
m_Thread = boost::thread(&Producer::fillStack, this, s);
Quote from Boost.Thread documentation:
Thread Constructor with arguments
template <class F,class A1,class A2,...>
thread(F f,A1 a1,A2 a2,...);
Preconditions:
F and each An must by copyable or movable.
Effects:
As if thread(boost::bind(f,a1,a2,...)). Consequently, f and each an are copied into
internal storage for access by the new thread.
This means that each your thread receives its own copy of s and all modifications aren't applied to s but to local thread copies. It's the same case when you pass object to function argument by value. You need to pass s object by reference instead - using boost::ref:
void produce(ProtectedStack& s)
{
m_Thread = boost::thread(&Producer::fillStack, this, boost::ref(s));
}
void consume(ProtectedStack& s)
{
m_Thread = boost::thread(&Consumer::remove, this, boost::ref(s));
}
Another issues is about your mutex usage. It's not the best possible.
Why do you use mutex from Signals2 library? Just use boost::mutex from Boost.Thread and remove uneeded dependency to Signals2 library.
Use RAII wrapper boost::lock_guard instead of direct lock/unlock calls.
As other people mentioned, you shall protect with lock all members of ProtectedStack.
Sample:
boost::mutex m;
void push(int x)
{
boost::lock_guard<boost::mutex> lock(m);
m_Stack.push(x);
}
void pop()
{
boost::lock_guard<boost::mutex> lock(m);
if(!m_Stack.empty()) m_Stack.pop();
}
int size()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.size();
}
bool isEmpty()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.empty();
}
int top()
{
boost::lock_guard<boost::mutex> lock(m);
return m_Stack.top();
}

You're not checking that the producing thread has executed before you try to consume. You're also not locking around size/empty/top... that's not safe if the container's being updated.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js